Home Storage PowerScale (Isilon) Blogs

OneFS Job Execution and Node Exclusion

Thu, 06 Jan 2022 23:26:13 -0000

Read Time: 0 minutes

Up through OneFS 9.2, a job engine job was an all or nothing entity. Whenever a job ran, it involved the entire cluster – regardless of individual node type, load, or condition. As such, any nodes that were overloaded or in a degraded state could still impact the execution ability of the job at large.

To address this, OneFS 9.3 provides the capability to exclude one or more nodes from participating in running a job. This allows the temporary removal of any nodes with high load, or other issues, from the job execution pool so that jobs do not become stuck.

The majority of the OneFS job engine’s jobs have no default schedule and are typically manually started by a cluster administrator or process. Other jobs such as FSAnalyze, MediaScan, ShadowStoreDelete, and SmartPools, are normally started via a schedule. The job engine can also initiate certain jobs on its own. For example, if the SnapshotIQ process detects that a snapshot has been marked for deletion, it will automatically queue a SnapshotDelete job.

The Job Engine will also execute jobs in response to certain system event triggers. In the case of a cluster group change, for example the addition or subtraction of a node or drive, OneFS automatically informs the job engine, which responds by starting a FlexProtect job. The coordinator notices that the group change includes a newly-smart-failed device and then initiates a FlexProtect job in response.

Job administration and execution can be controlled via the WebUI, CLI, or platform API. A job can be started, stopped, paused and resumed, and this is managed via the job engines’ check-pointing system. For each of these control methods, additional administrative security can be configured using roles-based access control (RBAC).

The job engine’s impact control and work throttling mechanism can limit the rate at which individual jobs can run. Throttling is employed at a per-manager process level, so job impact can be managed both granularly and gracefully.

Every twenty seconds, the coordinator process gathers cluster CPU and individual disk I/O load data from all the nodes across the cluster. The coordinator uses this information, in combination with the job impact configuration, to decide how many threads can run on each cluster node to service each running job. This can be a fractional number, and fractional thread counts are achieved by having a thread sleep for a given percentage of each second.

Using this CPU and disk I/O load data, every sixty seconds the coordinator evaluates how busy the various nodes are and makes a job throttling decision, instructing the various job engine processes as to the action they need to take. This enables throttling to be sensitive to workloads in which CPU and disk I/O load metrics yield different results. There are also separate load thresholds tailored to the different classes of drives used in OneFS powered clusters, from capacity optimized SATA disks to flash-based SSDs.

Configuration is via the OneFS CLI and gconfig and is global, such that it applies to all jobs on startup. However, the exclusion configuration is not dynamic, and once a job is started with the final node set, there is no further reconfiguration permitted. So if a participant node is excluded, it will remain excluded until the job has completed. Similarly, if a participant needs to be excluded, the current job will have to be cancelled and a new job started. Any nodes can be excluded, including the node running the job engine’s coordinator process. The coordinator will still monitor the job, it just won’t spawn a manager for the job.

The list of participating nodes for a job are computed in three phases:

Query the cluster’s GMP group.
Call to job.get_participating_nodes to get a subset from the gmp group.
Remove the nodes listed in core.excluded_participants from the subset.

The CLI syntax for configuring an excluded nodes list on a cluster is as follows (in this example, excluding nodes one through three):

# isi_gconfig –t job-config core.excluded_participants="{1,2,3}"

The ‘excluded_participants’ are entered as a comma-separated devid value list with no spaces, specified within parentheses and double quotes. All excluded nodes must be specified in full, since there’s no aggregation. Note that, while the excluded participant configuration will be displayed via gconfig, it is not reported as part of the ‘sysctl efs.gmp.group’ output.

A job engine node exclusion configuration can be easily reset to avoid excluding any nodes by assigning the “{}” value.

# isi_gconfig –t job-config core.excluded_participants="{}"
A ‘core.excluded_participant_percent_warn’ parameter defines the maximum percentage of removed nodes.
# isi_gconfig -t job-config core.excluded_participant_percent_warn
core.excluded_participant_percent_warn (uint) = 10

This parameter defaults to 10%, above which a CELOG event warning is generated.

As many nodes as desired can be removed from the job group. CELOG informational event will notify of removed nodes. If too many nodes have been removed (the gconfig parameter sets too many node thresholds), CELOG will fire a warning event. If some nodes are removed but they’re not part of the GMP group, a different warning event will trigger.

If all nodes are removed, a CLI/pAPI error will be returned, the job will fail, and a CELOG warning will fire. For example:

# isi job jobs start LinCount

Job operation failed: The job had no participants left. Check core.excluded_participants setting and make sure there is at least one node to run the job:  Invalid argument

# isi job status

10   LinCount         Failed    2021-10-24T:20:45:23

------------------------------------------------------------------

Total: 9

Note, however, that the following core system maintenance jobs will continue to run across all nodes in a cluster even if a node exclusion has been configured:

AutoBalance
Collect
FlexProtect
MediaScan
MultiScan

Author: Nick Trimbee

Tags:

Service	Description
PowerScaleUI	The OneFS WebUI configuration interface.
Platform-API-External	External access to the OneFS platform API endpoints.
Rest Access to Namespace (RAN)	REST-ful access by HTTP to a cluster’s /ifs namespace.
RemoteService	Remote Support and In-Product Activation.
SWIFT (deprecated)	Deprecated object access to the cluster using the SWIFT protocol. This has been replaced by the S3 protocol in OneFS.

Service	Disabling impacts
WebUI	The WebUI is completely disabled, and access attempts (default TCP port 8080) are denied with the warning Service Unavailable. Please contact Administrator. If the WebUI is re-enabled, the external platform API service (Platform-API-External) is also started if it is not running. Note that disabling the WebUI does not affect the PlatformAPI service.
Platform API	External API requests to the cluster are denied, and the WebUI is disabled, because it uses the Platform-API-External service. Note that the Platform-API-Internal service is not impacted if/when the Platform-API-External is disabled, and internal pAPI services continue to function as expected. If the Platform-API-External service is re-enabled, the WebUI will remain inactive until the PowerScaleUI service is also enabled.
RAN	If RAN is disabled, the WebUI components for File System Explorer and File Browser are also automatically disabled. From the WebUI, attempts to access the OneFS file system explorer (File System > File System Explorer) fail with the warning message Browse is disabled as RAN service is not running. Contact your administrator to enable the service. This same warning also appears when attempting to access any other WebUI components that require directory selection.
RemoteService	If RemoteService is disabled, the WebUI components for Remote Support and In-Product Activation are disabled. In the WebUI, going to Cluster Management > General Settings and selecting the Remote Support tab displays the message The service required for the feature is disabled. Contact your administrator to enable the service. In the WebUI, going to Cluster Management > Licensing and scrolling to the License Activation section displays the message The service required for the feature is disabled. Contact your administrator to enable the service.
SWIFT	Deprecated object protocol and disabled by default.

Option	Description
--access-control <boolean>	Enable Access Control Authentication for the HTTP service. Access Control Authentication requires at least one type of authentication to be enabled.
--basic-authentication <boolean>	Enable Basic Authentication for the HTTP service.
--webhdfs-ran-https-port <integer>	Configure Data Services Port for the HTTP service.
--revert-webhdfs-ran-https-port	Set value to system default for --webhdfs-ran-https-port.
--dav <boolean>	Comply with Class 1 and 2 of the DAV specification (RFC 2518) for the HTTP service. All DAV clients must go through a single node. DAV compliance is NOT met if you go through SmartConnect, or using 2 or more node IPs.
--enable-access-log <boolean>	Enable writing to a log when the HTTP server is accessed for the HTTP service.
--https <boolean>	Enable the HTTPS transport protocol for the HTTP service.
--https <boolean>	Enable the HTTPS transport protocol for the HTTP service.
--integrated-authentication <boolean>	Enable Integrated Authentication for the HTTP service.
--server-root <path>	Document root directory for the HTTP service. Must be within /ifs.
--service (enabled \| disabled \| redirect \| disabled_basicfile)	Enable/disable the HTTP Service or redirect to WebUI or disabled BasicFileAccess.
--service-timeout <duration>	The amount of time (in seconds) that the server will wait for certain events before failing a request. A value of 0 indicates that the service timeout value is the Apache default.
--revert-service-timeout	Set value to system default for --service-timeout.
--inactive-timeout <duration>	Get the HTTP RequestReadTimeout directive from both the WebUI and the HTTP service.
--revert-inactive-timeout	Set value to system default for --inactive-timeout.
--session-max-age <duration>	Get the HTTP SessionMaxAge directive from both WebUI and HTTP service.
--revert-session-max-age	Set value to system default for --session-max-age.
--httpd-controlpath-redirect <boolean>	Enable or disable WebUI redirection to the HTTP service.

Your Browser is Out of Date

OneFS Job Execution and Node Exclusion

Related Blog Posts

OneFS and HTTP Security

OneFS and PowerScale F-series Management Ports