The coordinator orchestrates the entire Job Engine. It is a process that runs on one of the nodes in the cluster, and any node can act as the coordinator. Its principal responsibilities are as follows.
While the individual nodes manage work item allocation, the coordinator node takes control of the job, divides it into tasks, and distributes those tasks evenly across the nodes in the cluster. The coordinator does not address manager processes directly; it communicates through each node’s director process. For example, if the coordinator needs to communicate with a manager process running on node five, it first sends a message to node five’s director. Node five’s director then passes the message to the appropriate manager process under its control. The coordinator also periodically sends messages, through the director processes, instructing the managers to increment or decrement the number of worker threads.
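The relay pattern described above can be sketched in a few lines of Python. This is only an illustrative model, not the Job Engine's implementation: the class names, the message dictionaries, the job name `job-a`, and the round-robin distribution rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    """Hypothetical stand-in for a manager process on a node."""
    worker_threads: int = 1
    tasks: list = field(default_factory=list)

    def handle(self, message: dict) -> None:
        if message["type"] == "assign":
            self.tasks.append(message["task"])
        elif message["type"] == "adjust_workers":
            # Increment or decrement the worker count, never below zero.
            self.worker_threads = max(0, self.worker_threads + message["delta"])


@dataclass
class Director:
    """Hypothetical per-node director that relays messages to its managers."""
    managers: dict = field(default_factory=dict)  # job name -> Manager

    def relay(self, job: str, message: dict) -> None:
        self.managers[job].handle(message)


class Coordinator:
    def __init__(self, directors: dict):
        self.directors = directors  # node id -> Director

    def send(self, node: int, job: str, message: dict) -> None:
        # The coordinator only talks to the node's director, which
        # passes the message on to the right manager.
        self.directors[node].relay(job, message)

    def distribute(self, job: str, tasks: list) -> None:
        # Assumed rule: round-robin, so node shares differ by at most one task.
        nodes = sorted(self.directors)
        for i, task in enumerate(tasks):
            self.send(nodes[i % len(nodes)], job, {"type": "assign", "task": task})


def demo() -> dict:
    # Five nodes, each with one director and one manager for "job-a".
    directors = {n: Director({"job-a": Manager()}) for n in range(1, 6)}
    coordinator = Coordinator(directors)
    coordinator.distribute("job-a", [f"task-{i}" for i in range(10)])
    # Ask the manager on node five, via its director, to add two workers.
    coordinator.send(5, "job-a", {"type": "adjust_workers", "delta": 2})
    return directors
```

Calling `demo()` leaves each of the five modeled nodes with two tasks, and only node five's manager has its worker count changed.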
The coordinator is also responsible for starting and stopping jobs, and for processing work results as they are returned during job processing. Should the coordinator process die for any reason, the coordinator responsibility automatically moves to another node.
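As a rough illustration of the failover behavior, the sketch below picks a replacement coordinator when the current one is no longer among the live nodes. The text does not say how the successor is chosen, so the "lowest-numbered live node" rule and the function name `elect_coordinator` are purely hypothetical.

```python
def elect_coordinator(live_nodes, current=None):
    """Return the node id that should hold the coordinator role.

    Assumption for illustration only: the lowest-numbered live node
    takes over. The actual selection rule is not described in the text.
    """
    if not live_nodes:
        raise RuntimeError("no live nodes available to act as coordinator")
    if current in live_nodes:
        return current  # coordinator is still alive, nothing changes
    return min(live_nodes)  # role moves to another node
```

For example, with nodes 1 and 3 alive and node 2 as the failed coordinator, this sketch would hand the role to node 1.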
The coordinator node can be identified through the following CLI command:
# isi job status --verbose | grep Coordinator