Not all Job Engine jobs run equally fast. For example, a job that is based on a file system tree walk runs slower on a cluster with a very large number of small files than on a cluster with a small number of large files. Jobs that compare data across nodes, such as Dedupe, run more slowly where there are many comparisons to be made. Many factors affect speed, and true linear scaling is not always possible. If a job is running slowly, the first troubleshooting step is to determine the specific context in which the job is running.
The main methods by which jobs and their associated processes interact with the file system are through:

- Metadata, using a logical inode (LIN) scan
- The directory tree, using a tree walk
- The drives, using a drive scan
Each of these access methods has its pros and cons and suits particular jobs, and the method a job uses influences its runtime. For instance, some jobs are unaffected by cluster size, others slow down or speed up with the number of nodes in the cluster, and some are highly sensitive to file counts and directory depths.
For a number of jobs, particularly the LIN-based ones, the Job Engine provides an estimated percentage completion of the job during runtime (see Figure 15).
With LIN scans, even though the metadata is of variable size, the Job Engine can fairly accurately predict how much effort is required to scan all LINs. The data, however, can vary widely in size, so estimates of how long each task will take to process are at best a reasonable guess.
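To make this concrete, the following is a minimal sketch, not OneFS code, of how a completion percentage and a rough time estimate can be derived once the total LIN population is known. The class and method names are illustrative assumptions.

```python
import time

class LinScanProgress:
    """Best-effort progress estimate for a LIN-based job (illustrative only).

    The total LIN count is knowable up front because metadata entries can
    be enumerated cheaply; the cost of processing each file's data is not,
    so any time estimate is only a reasonable guess.
    """

    def __init__(self, total_lins: int):
        self.total_lins = total_lins
        self.processed = 0
        self.started = time.monotonic()

    def update(self, lins_done: int) -> None:
        self.processed += lins_done

    def percent_complete(self) -> float:
        if self.total_lins == 0:
            return 100.0
        return 100.0 * self.processed / self.total_lins

    def eta_seconds(self) -> float | None:
        if self.processed == 0:
            return None  # no processing rate observed yet
        elapsed = max(time.monotonic() - self.started, 1e-9)
        rate = self.processed / elapsed
        return (self.total_lins - self.processed) / rate
```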
For example, the Job Engine might know that the highest LIN is 1:0009:0000. Assuming that the job starts with a single thread on each of three nodes, the coordinator evenly divides the LINs into nine ranges: 1:0000:0000-1:0000:ffff, 1:0001:0000-1:0001:ffff, and so on, through 1:0008:0000-1:0009:0000. These nine tasks are then divided between the three nodes. However, each range might take a different time to process. For example, the first range might contain fewer actual LINs, because old LINs have been deleted, so it completes unexpectedly quickly. Perhaps the third range contains a disproportionate number of large files and so takes longer to process. And maybe the seventh range sees heavy contention with client activity, also increasing its runtime. Despite such variances, the splitting and redistribution of tasks across the node manager processes alleviate the problem, removing the need for a perfectly fair division at the outset.
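As a rough sketch of that initial division, the following illustrates splitting a LIN keyspace into equal ranges, dealing them across nodes, and bisecting an unfinished range when a node runs out of work. The helper names are assumptions for illustration; the actual Job Engine task management is considerably more involved.

```python
def split_lin_range(start: int, end: int, parts: int) -> list[tuple[int, int]]:
    """Divide a half-open [start, end) LIN range into contiguous sub-ranges.

    The ranges above (1:0000:0000-1:0000:ffff, ...) are inclusive; half-open
    intervals are used here to keep the arithmetic simple.
    """
    step = (end - start) // parts
    ranges = [(start + i * step, start + (i + 1) * step) for i in range(parts)]
    ranges[-1] = (ranges[-1][0], end)  # last range absorbs any remainder
    return ranges

# Highest LIN is 1:0009:0000, so divide 0x00000000-0x00090000 into nine tasks.
tasks = split_lin_range(0x00000000, 0x00090000, parts=9)

# Deal the nine tasks round-robin across three nodes (node 0 gets 0, 3, 6, ...).
node_tasks = {node: tasks[node::3] for node in range(3)}

def bisect_task(remaining: tuple[int, int]) -> tuple[tuple[int, int], tuple[int, int]]:
    """Split an unfinished task in half so an idle node can take one part."""
    lo, hi = remaining
    mid = (lo + hi) // 2
    return (lo, mid), (mid, hi)
```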
Priorities play a large role in job initiation, and it is possible for a high-priority job to significantly affect the running of other jobs. This effect on job initiation is by design: FlexProtect, for example, should be able to run with greater urgency than SmartPools. Sometimes, however, the effect is an inconvenience, which is why the storage administrator can manually control both the impact level and the relative priority of jobs.
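The priority convention can be pictured with a small scheduling sketch, assuming, as OneFS does, that a lower priority number means greater urgency. The job-to-priority mapping below is illustrative, not a statement of a cluster's configured values.

```python
# Lower number = more urgent, so a priority-1 job such as FlexProtect
# takes precedence over higher-numbered jobs. Values here are illustrative.
JOB_PRIORITY = {"FlexProtect": 1, "MultiScan": 4, "SmartPools": 6}

def next_job_to_start(queued: list[str]) -> str:
    """Pick the queued job with the most urgent (lowest) priority number."""
    return min(queued, key=lambda job: JOB_PRIORITY.get(job, 10))

assert next_job_to_start(["SmartPools", "FlexProtect"]) == "FlexProtect"
```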
Certain jobs, such as FlexProtect, have a corresponding variant whose name is suffixed with Lin, for example FlexProtectLin. The suffix indicates that the job automatically uses an SSD-based copy of metadata, where available, to scan the LIN tree rather than the drives themselves. Depending on the workflow, this often improves job runtime significantly.
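The selection reduces to a one-line decision, sketched below with assumed names; in practice OneFS handles this choice itself when SSD metadata acceleration is in place.

```python
def pick_variant(job: str, metadata_mirrored_on_ssd: bool) -> str:
    # Prefer the Lin variant when an SSD metadata copy exists, since
    # scanning the LIN tree from SSD avoids a slower scan of the drives.
    return f"{job}Lin" if metadata_mirrored_on_ssd else job

assert pick_variant("FlexProtect", metadata_mirrored_on_ssd=True) == "FlexProtectLin"
```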
On large clusters with multiple jobs running at HIGH impact, the job coordinator can be overwhelmed by the volume of task results sent directly from the worker threads. Certain jobs mitigate this effect by performing intermediate merging of results on individual nodes and batching delivery of those results to the coordinator (a sketch of this batching follows the list below). The jobs that support results merging include:
- AutoBalance(Lin)
- AVScan
- CloudPoolsLin
- CloudPoolsTreewalk
- Collect
- FlexProtect(Lin)
- LinCount
- MultiScan
- PermissionRepair
- QuotaScan
- SnapRevert
- SnapshotDelete
- TreeDelete
- Upgrade
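As a minimal sketch of the intermediate merging described above, the per-node accumulator below folds worker task results together and forwards a single batched message once a threshold is reached. The class, the counter-style results, and the batch threshold are all illustrative assumptions rather than Job Engine interfaces.

```python
from collections import Counter

class NodeResultMerger:
    """Per-node accumulator that batches worker results to the coordinator.

    Rather than each worker thread sending every task result directly,
    results are merged locally and delivered in batches, reducing the
    message volume the coordinator must absorb.
    """

    def __init__(self, coordinator_send, batch_size: int = 100):
        self.coordinator_send = coordinator_send  # callable receiving merged results
        self.batch_size = batch_size
        self.pending = Counter()
        self.tasks_merged = 0

    def add_task_result(self, result: dict[str, int]) -> None:
        """Fold one task's counters (e.g. LINs scanned, errors) into the batch."""
        self.pending.update(result)
        self.tasks_merged += 1
        if self.tasks_merged >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Deliver the merged batch to the coordinator and reset."""
        if self.pending:
            self.coordinator_send(dict(self.pending))
        self.pending = Counter()
        self.tasks_merged = 0

# Three task results collapse into a single coordinator message:
merger = NodeResultMerger(coordinator_send=print, batch_size=3)
for result in ({"lins": 1200}, {"lins": 900, "errors": 1}, {"lins": 1500}):
    merger.add_task_result(result)
# Prints: {'lins': 3600, 'errors': 1}
```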