Spark Meets MPI: Towards High-Performance Communication Framework for Spark using MPI
Main Authors: | , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | There are several popular Big Data processing frameworks, including Apache Spark, Dask, and Ray. Apache Spark provides an easy-to-use, high-level API in several languages, including Scala, Java, and Python. Spark supports parallel and distributed execution of user workloads, with communication handled by an event-driven framework called Netty. Past efforts, including RDMA-Spark and SparkUCX, have sought to optimize Apache Spark on High-Performance Computing (HPC) systems equipped with high-performance interconnects such as InfiniBand. In the HPC community, Message Passing Interface (MPI) libraries are widely adopted for parallelizing science and engineering applications. This paper presents MPI4Spark, which uses MPI for communication in a parallel and distributed setting on HPC systems. MPI4Spark can launch the Spark ecosystem using MPI launchers so that MPI communication can be used inside the Big Data framework. It maintains isolation for application execution on worker nodes by forking new processes using Dynamic Process Management (DPM), and it bridges the semantic differences between the event-driven communication in Spark and the application-driven communication engine in MPI. MPI4Spark also provides portability and performance benefits, as it can utilize popular HPC interconnects including InfiniBand, Omni-Path, Slingshot, and others. The performance of MPI4Spark is evaluated against RDMA-Spark and Vanilla Spark using the OSU HiBD Benchmarks (OHB) and Intel HiBench, which contain a variety of Resilient Distributed Dataset (RDD), Graph Processing, and Machine Learning workloads. The evaluation is done on three HPC systems: TACC Frontera, TACC Stampede2, and an internal cluster. MPI4Spark outperforms Vanilla Spark and RDMA-Spark by 4.23x and 2.04x, respectively, on the TACC Frontera system using 448 processing cores (8 Spark workers) for the GroupByTest benchmark in OHB. The communication performance of MPI4Spark is 13.08x and 5.56x better than Vanilla Spark and RDMA-Spark, respectively. |
---|---|
ISSN: | 2168-9253 |
DOI: | 10.1109/CLUSTER51413.2022.00022 |