The Spark application bundle deploys an Apache Spark 3.4.1 standalone cluster. The figure below shows the application creation screen for the Spark bundle.
This bundle includes:
- Apache Spark 3.4.1
- Hadoop client libraries (for HDFS access)
- Hadoop AWS libraries (for S3 access)
- Delta Lake 2.4.0 and 3.0.0rc1 libraries
- NVIDIA CUDA 11.4 (for GPU support)
- NVIDIA RAPIDS Accelerator for Apache Spark (for Spark GPU support)
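
As a rough illustration of how a job might switch on the libraries listed above, the sketch below collects the Spark properties that Delta Lake and the RAPIDS Accelerator document for enabling themselves. The helper names `bundle_library_conf` and `to_submit_args` are hypothetical and are not part of the bundle. Whether the bundle already sets any of these properties is not stated here, so treat this as an assumption to verify against the deployed cluster.

```python
def bundle_library_conf(use_delta=True, use_gpu=False):
    """Return Spark properties that enable the bundled libraries.

    Hypothetical helper: the property names come from the Delta Lake and
    RAPIDS Accelerator documentation, not from the bundle itself.
    """
    conf = {}
    if use_delta:
        # Delta Lake SQL extension and catalog implementation.
        conf["spark.sql.extensions"] = "io.delta.sql.DeltaSparkSessionExtension"
        conf["spark.sql.catalog.spark_catalog"] = (
            "org.apache.spark.sql.delta.catalog.DeltaCatalog"
        )
    if use_gpu:
        # RAPIDS Accelerator plugin for GPU-backed SQL execution.
        conf["spark.plugins"] = "com.nvidia.spark.SQLPlugin"
        conf["spark.rapids.sql.enabled"] = "true"
    return conf


def to_submit_args(conf):
    """Flatten a property dict into spark-submit style '--conf k=v' pairs."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args
```

The same dictionary could instead be applied one key at a time with `SparkSession.builder.config(key, value)` when a session is created from PySpark.
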
At application creation time, the number of Spark workers and the number of Spark GPU workers can each be specified. Per-worker resources, including the number of cores and the amount of memory, can also be set.
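
To make the relationship between these settings and a job's resource requests concrete, here is a minimal sketch that derives standard Spark properties (`spark.executor.cores`, `spark.executor.memory`, `spark.cores.max`) from the values chosen at creation time. The `ClusterShape` class and its field names are hypothetical. The sketch also assumes that GPU workers get the same cores and memory as regular workers and that each worker hosts a single executor using all of its resources, neither of which is confirmed by the bundle.

```python
from dataclasses import dataclass


@dataclass
class ClusterShape:
    """Hypothetical record of the values entered at application creation."""
    workers: int               # number of Spark workers
    gpu_workers: int           # number of Spark GPU workers
    cores_per_worker: int      # cores assigned to each worker
    memory_gb_per_worker: int  # memory (GB) assigned to each worker

    def total_workers(self) -> int:
        return self.workers + self.gpu_workers

    def total_cores(self) -> int:
        # Assumes CPU and GPU workers are sized identically.
        return self.total_workers() * self.cores_per_worker


def executor_conf(shape: ClusterShape) -> dict:
    """Map the cluster shape to per-application Spark properties.

    Assumes one executor per worker that uses the worker's full allocation;
    these values are upper bounds, and a real job may request less.
    """
    return {
        "spark.executor.cores": str(shape.cores_per_worker),
        "spark.executor.memory": f"{shape.memory_gb_per_worker}g",
        "spark.cores.max": str(shape.total_cores()),
    }
```

For example, `executor_conf(ClusterShape(workers=2, gpu_workers=1, cores_per_worker=4, memory_gb_per_worker=16))` asks for 4 cores and `16g` per executor, capped at 12 cores across the cluster.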