Spark applications run as a collection of independent processes. Each application consists of a driver process, which runs the main program, and one or more executor processes, which run Spark tasks.
Spark relies on a separate cluster manager to allocate and manage resources for these processes in multinode environments. Spark supports several cluster managers, including its built-in Standalone mode, Apache Hadoop YARN, and Kubernetes.
This reference architecture uses Kubernetes as the cluster manager.
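As a sketch of what running on Kubernetes looks like in practice, an application can be submitted with `spark-submit` pointing at the Kubernetes API server as the master URL. The API server address, namespace, image name, and application path below are placeholders, not values from this architecture:

```shell
# Submit a Spark application to a Kubernetes cluster.
# The k8s:// prefix tells spark-submit to use the Kubernetes cluster manager;
# <api-server-host> and the image/path values are illustrative placeholders.
spark-submit \
  --master k8s://https://<api-server-host>:6443 \
  --deploy-mode cluster \
  --name example-app \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.container.image=<registry>/spark:latest \
  local:///opt/spark/examples/jars/spark-examples.jar
```

In cluster deploy mode, the driver itself runs in a pod on the cluster, and Spark then requests executor pods from the Kubernetes API as needed.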