Short articles related to Dell PowerScale.
Wed, 22 Nov 2023 00:17:57 -0000
In the rapidly evolving landscape of technology, we find ourselves on the brink of a major technological leap with the integration of artificial intelligence (AI) into our daily lives. The potential impact of AI on the global economy is staggering, with forecasts predicting a contribution of some $13 trillion. While the idea of AI isn't entirely new to the security sector, which has long employed analytics to monitor and report pixel changes in CCTV footage, the integration of AI technologies such as machine and deep learning has opened up a world of possibilities. One particularly rich source of data that organizations are eager to harness is video data, which is pivotal in a variety of use cases including operational improvements for retail, marketing strategies, and the enhancement of overall customer experiences.
Industries across the board are exploring AI's ability to enhance business efficiency, underscored by the 63% of enterprise clients who consider their security data mission critical. That said, the success of AI deployments hinges on the collection and storage of data. AI models thrive on large, diverse datasets to achieve effectiveness and accuracy. For instance, when analyzing traffic patterns within a city, having access to comprehensive data spanning multiple seasons allows for more accurate planning. This necessity has led to the emergence of exceptionally large storage volumes to cater to AI's insatiable appetite for data.
A considerable portion of data – approximately 80% – collected by organizations is unstructured, including video data. Data scientists are faced with the arduous task of mapping this unstructured data into their models, owing in part to the fragmented nature of security solutions. Strikingly, over 79% of a data scientist's time is consumed by data wrangling and collection rather than actual data analysis, due to siloed data storage. Complex scenarios involving thousands of cameras pointed at different targets further complicate the application of AI models to this data.
Recent discussions in the field of AI have introduced the concept of ‘Data Fuzion,’ which underscores the importance of consolidating and harmonizing data, overcoming the current infrastructure's obstacles, and making data more accessible and usable for data science applications in the security industry. There is a significant divide between the potential for data science solutions to drive business outcomes and the actual implementation, largely attributed to – as previously mentioned – the fragmented, siloed nature of data storage and the scarcity of in-house data science expertise.
The AI solutions available today in the security domain often come as black box offerings with pre-programmed models; however, end-users are increasingly seeking low- or no-code AI tools that allow them to tailor and modify models to meet their specific organizational needs. This shift enables organizations to fine-tune AI to their precise requirements, further optimizing business outcomes. Additionally, the rise of cloud computing has presented budgetary challenges as organizations are increasingly paying for data access, leading to a trend of cloud repatriation – moving data back to on-premises environments to better manage costs and reduce latency in real-time applications.
AI is transforming the way organizations protect not only their external security but also their internal data. Dell Technologies, for example, offers a solution known as Ransomware Defender within its unstructured data offerings, an AI-based detection tool which identifies anomalies and takes action when malicious actors attempt to encrypt or delete data by modeling typical behaviors and sounding alarms when suspicious activities occur. Check out the Dell Technologies cyber security solution page for more information.
To fully harness the power of AI and navigate these complex data landscapes, organizations are turning to single-volume unstructured data solutions that embody the concept of ‘Data Fuzion.’ Dell Technologies Unstructured Data Solutions, with their petabyte-scale single-volume architecture, offer not only the ability to support this burgeoning workload but also robust cyber protection and multi-cloud capabilities. In this way, organizations can chart a seamless path towards AI adoption while ensuring data-driven security and efficiency. Visit the Dell Technologies PowerScale solutions page to learn more.
Authors: Mordekhay Shushan | Safety and Security Solution Architect & Brian Stonge | Business Development Manager, Video Safety and Security
Thu, 16 Nov 2023 20:53:16 -0000
Earlier in this series, we took a look at the architecture of the new OneFS WebUI SSO functionality. Now, we move on to its management and troubleshooting.
As we saw in the previous article, once the IdP and SP are configured, a cluster admin can enable SSO per access zone using the OneFS WebUI by navigating to Access > Authentication providers > SSO. From here, select the desired access zone and click the ‘Enable SSO’ toggle:
Or from the OneFS CLI using the following syntax:
# isi auth sso settings modify --sso-enabled 1
Once complete, the SSO configuration can be verified from a client web browser by browsing to the OneFS login screen. If all is operating correctly, redirection to the ADFS login screen will occur. For example:
After successful authentication with ADFS, cluster access is granted and the browser session is redirected back to the OneFS WebUI.
In addition to the new SSO WebUI pages, OneFS 9.5 also adds a subcommand set to 'isi auth' for configuring SSO from the CLI. This new syntax includes the 'isi auth sso settings', 'isi auth sso sp', and 'isi auth sso idps' commands used below.
With these, you can use the following procedure to configure and enable SSO using the OneFS command line.
1. Define the ADFS instance in OneFS.
Enter the following command to create the IdP account:
# isi auth ads create <domain_name> <user> --password=<password> ...
where:
Attribute | Description |
<domain_name> | Fully qualified Active Directory domain name that identifies the ADFS server. For example, idp1.isilon.com. |
<user> | The user account that has permission to join machines to the given domain. |
<password> | The password for <user>. |
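For example, a hypothetical invocation of this command; the 'administrator' join account and password are placeholders, while idp1.isilon.com is the example ADFS domain used throughout this article:
# isi auth ads create idp1.isilon.com administrator --password=<password>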
2. Next, add the IdP to the pertinent OneFS zone. Note that each of a cluster’s access zone(s) must have an IdP configured for it. The same IdP can be used for all the zones, but each access zone must be configured separately.
# isi zone zones modify --add-auth-providers
For example:
# isi zone zones modify system --add-auth-providers=lsa-activedirectoryprovider:idp1.isilon.com
3. Verify that OneFS can find users in Active Directory.
# isi auth users view idp1.isilon.com\\<username>
In the output, ensure that an email address is displayed. If not, return to Active Directory and assign email addresses to users.
4. Configure the OneFS hostname for SAML SSO.
# isi auth sso sp modify --hostname=<name>
Where <name> is the name that SAML SSO can use to represent the OneFS cluster to ADFS. SAML redirects clients to this hostname.
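For example, a minimal sketch using a placeholder cluster hostname:
# isi auth sso sp modify --hostname=cluster1.isilon.com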
5. Obtain the ADFS metadata and store it under /ifs on the cluster.
In the following example, an HTTPS GET request is issued using the 'curl' utility to obtain the metadata from the IDP and store it under /ifs on the cluster.
# curl -o /ifs/adfs.xml https://idp1.isilon.com/FederationMetadata/2007-06/FederationMetadata.xml
6. Create the IdP on OneFS using the ‘metadata-location’ path for the xml file in the previous step.
# isi auth sso idps create idp1.isilon.com --metadata-location="/ifs/adfs.xml"
7. Enable SSO:
# isi auth sso settings modify --sso-enabled=yes --zone <zone>
Use the following syntax to view the IdP configuration:
# isi auth sso idps view <idp_ID>
For example:
# isi auth sso idps view idp
ID: idp
Metadata Location: /ifs/adfs.xml
Entity ID: https://dns.isilon.com/adfs/services/trust
Login endpoint of the IDP
    URL: https://dns.isilon.com/adfs/ls/
    Binding: urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect
Logout endpoint of the IDP
    URL: https://dns.isilon.com/adfs/ls/
    Binding: urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect
Response URL: -
Type: metadata
Signing Certificate:
    Path: -
    Issuer: CN=ADFS Signing - dns.isilon.com
    Subject: CN=ADFS Signing - dns.isilon.com
    Not Before: 2023-02-02T22:22:00
    Not After: 2024-02-02T22:22:00
    Status: valid
    Value:
-----BEGIN CERTIFICATE-----
MITC9DCCAdygAwIBAgIQQQQc55appr1CtfPNj5kv+DANBgkqhk1G9w8BAQsFADA2
<snip>
If the IdP or SP signing certificate happens to expire, users will be unable to log in to the cluster with SSO, and an error message will be displayed on the login screen.
In this example, the IdP certificate has expired, as described in the alert message. When this occurs, a warning is also displayed on the SSO Authentication page, as shown here:
To correct this, download either a new signing certificate from the identity provider or a new metadata file containing the IdP certificate details. When this is complete, you can then update the cluster’s IdP configuration by uploading the XML file or the new certificate.
Similarly, if the SP certificate has expired, the following notification alert is displayed upon attempted login:
The following error message is also displayed on the WebUI SSO tab, under Access > Authentication providers > SSO, along with a link to regenerate the metadata file:
The expired SP signing key and certificate can also be easily regenerated from the OneFS CLI:
# isi auth sso sp signing-key rekey
This command will delete any existing signing key and certificate and replace them with a newly generated signing key and certificate. Make sure the newly generated certificate is added to the IDP to ensure that the IDP can verify messages sent from the cluster.
Are you sure? (yes/[no]): yes

# isi auth sso sp signing-key dump
-----BEGIN CERTIFICATE-----
MIIE6TCCAtGgAwIBAgIJAP30nSyYUz/cMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNVBAMMG1Bvd2VyU2NhbGUgU0FNTCBTaWduaWSnIEtleTAeFw0yMjExMTUwMzU0NTFaFw0yMzExMTUwMzU0NTFaMCYxJDAiBgNVBAMMG1Bvd2VyU2NhbGUgU0FNTCBTaWduaWSnIEtleTCCAilwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAMOOmYJ1aUuxvyH0nbUMurMbQubgtdpVBevy12D3qn+x7rgym8/v50da/4xpMmv/zbE0zJ0IVbWHZedibtQhLZ1qRSY/vBlaztU/nA90XQzXMnckzpcunOTG29SMO3x3Ud4*fqcP4sKhV
<snip>
When it is regenerated, either the XML file or certificate can be downloaded, and the cluster configuration updated by either metadata download or manual copy:
Finally, upload the SP details back to the identity provider.
For additional troubleshooting of OneFS SSO and authentication issues, there are some key log files to check. These include:
Log file | Information |
/var/log/isi_saml_d.log | SAML specific log messages logged by isi_saml_d. |
/var/log/apache2/webui_httpd_error.log | WebUI error messages including some SAML errors logged by the WebUI HTTP server. |
/var/log/jwt.log | Errors related to token generation logged by the JWT service. |
/var/log/lsassd.log | General authentication errors logged by the ‘lsassd’ service, such as failing to lookup users by email. |
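For example, to follow SAML activity in real time while reproducing a failed login, the standard tail and grep utilities can be pointed at the logs above (output will vary by environment):
# tail -f /var/log/isi_saml_d.log
# grep -i saml /var/log/apache2/webui_httpd_error.log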
Author: Nick Trimbee
Mon, 13 Nov 2023 17:58:49 -0000
In the previous article in this series, we took a look at the new NFS locks and waiters reporting CLI command set and API endpoints. Next, we turn our attention to some additional context, caveats, and NFSv3 lock removal.
Before the NFS locking enhancements in OneFS 9.5, the legacy CLI commands were somewhat inefficient. Their output also included other advisory domain locks such as SMB, which made the output more difficult to parse. The table below maps the new 9.5 CLI commands (and corresponding handlers) to the old NLM syntax.
Type / Command set | OneFS 9.5 and later | OneFS 9.4 and earlier |
Locks | isi nfs locks | isi nfs nlm locks |
Sessions | isi nfs nlm sessions | isi nfs nlm sessions |
Waiters | isi nfs locks waiters | isi nfs nlm locks waiters |
Note that the isi_classic nfs locks and waiters CLI commands have also been deprecated in OneFS 9.5.
When upgrading to OneFS 9.5 or later from a prior release, the legacy platform API handlers continue to function during and after the upgrade. Thus, any legacy scripts and automation are protected from this lock reporting deprecation. Additionally, while the new platform API handlers will work during a rolling upgrade in mixed mode, they will only return results for the nodes that have already been upgraded ('high nodes').
Be aware that the NFS locking CLI framework does not support partial responses. If a node is down or the cluster has a rolling upgrade in progress, query the equivalent platform API endpoint instead.
Performance-wise, on very large, busy clusters, the lock and waiter CLI commands' output may be sluggish. In such instances, the --timeout flag can be used to increase the command timeout window, and output filtering can be used to reduce the number of locks reported.
When a lock is in a transition state, there is a chance that it may not report a version. In these instances, the Version field is displayed as '-'. For example:
# isi nfs locks list -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295164422
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:03:52
Version: -
---------------------------------------------------------------
Total: 1
This behavior should be experienced very infrequently. However, if it is encountered, simply execute the CLI command again, and the lock version should be reported correctly.
When it comes to troubleshooting NFSv3/NLM issues, an NFSv3 client that consistently experiences NLM_DENIED or other lock management errors is often the result of incorrectly configured firewall rules. For example, take the following packet capture (PCAP) excerpt from an NFSv3 Linux client:
21 08:50:42.173300992 10.22.10.100 → 10.22.10.200 NLM 106 V4 LOCK Reply (Call In 19) NLM_DENIED
Often, the assumption is that only the lockd or statd ports on the server side of the firewall need to be opened, and that the client always initiates the connection. However, this is not the case. Instead, the server initially responds with what amounts to a 'let me get back to you' and later reconnects to the client. As such, if the firewall blocks access to rpcbind, lockd, or statd on the client, connection failures will likely occur.
Occasionally, it does become necessary to remove NLM locks and waiters from the cluster. Traditionally, the isi_classic nfs clients rm command was used; however, that command has limitations and is fully deprecated in OneFS 9.5 and later. Instead, the preferred method is to use the isi nfs nlm sessions CLI utility in conjunction with various other ancillary OneFS CLI commands to clear problematic locks and waiters.
Note that the isi nfs nlm sessions CLI command, available in all current OneFS versions, is zone-aware. As such, the output for the client holding the lock now shows the zone ID number at the beginning. For example:
4/tme-linux1/10.22.10.250
This represents:
Zone ID 4 / Client tme-linux1 / IP address of cluster node holding the connection.
A basic procedure to remove NLM locks and waiters from a cluster is as follows:
1. List the NFS locks and search for the pertinent filename.
In OneFS 9.5 and later, the locks list can be filtered using the --path argument.
# isi nfs locks list --path=<path> | grep <filename>
Be aware that the full path must be specified, starting with /ifs. There is no partial matching or substitution for paths in this command set.
For OneFS 9.4 and earlier, the following CLI syntax can be used:
# isi_for_array -sX 'isi nfs nlm locks list | grep <filename>'
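For example, a hypothetical full-path query using the OneFS 9.5 syntax, re-using one of the file paths from the sample output later in this article:
# isi nfs locks list --path=/ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017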
2. List the lock waiters associated with the same filename using |grep.
For OneFS 9.5 and later, the waiters list can also be filtered using the --path syntax:
# isi nfs locks waiters --path=<path> | grep <filename>
With OneFS 9.4 and earlier, the following CLI syntax can be used:
# isi_for_array -sX 'isi nfs nlm locks waiters |grep -i <filename>'
3. Confirm the client and logical inode number (LIN) being waited upon.
This can be accomplished by querying the efs.advlock.failover.lock_waiters sysctl. For example:
# isi_for_array -sX 'sysctl efs.advlock.failover.lock_waiters'
[truncated output]
...
client = { '4/tme-linux1/10.20.10.200', 0x26593d37370041 }
...
resource = 2:df86:0218
Note that for sanity checking, the isi get -L CLI utility can be used to confirm the path of a file from its LIN:
# isi get -L <LIN>
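For example, a hypothetical check using the resource LIN reported in the sysctl output above (output omitted):
# isi get -L 2:df86:0218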
4. Remove the unwanted locks which are causing waiters to stack up.
Keep in mind that the isi nfs nlm sessions command syntax is access zone-aware.
List the access zones by their IDs.
# isi zone zones list -v | grep -iE "Zone ID|name"
Once the desired zone ID has been determined, the isi_run -z CLI utility can be used to specify the appropriate zone in which to run the isi nfs nlm sessions commands:
# isi_run -z 4 -l root
Next, the isi nfs nlm sessions delete CLI command will remove the specific lock waiter which is causing the issue. The command syntax requires specifying the client hostname and node IP of the node holding the lock.
# isi nfs nlm sessions delete --zone <AZ_zone_ID> <hostname> <cluster-ip>
For example:
# isi nfs nlm sessions delete --zone 4 tme-linux1 10.20.10.200
Are you sure you want to delete all NFSv3 locks associated with client tme-linux1 against cluster IP 10.20.10.200? (yes/[no]): yes
5. Repeat the commands in step 1 to confirm that the desired NLM locks and waiters have been successfully culled.
BEFORE applying the process....
# isi_for_array -sX 'isi nfs nlm locks list | grep JUN'
TME-1: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-1: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-2: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-2: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-3: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-3: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-4: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-4: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-5: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-5: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-6: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
TME-6: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017

# isi_for_array -sX 'isi nfs nlm locks waiters | grep -i JUN'
TME-1: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-1: 4/tme-linux1/192.168.2.214 /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
TME-2 exited with status 1
TME-3 exited with status 1
TME-4 exited with status 1
TME-5 exited with status 1
TME-6 exited with status 1
AFTER...
TME-1# isi nfs nlm sessions delete --hostname=tme-linux1 --cluster-ip=192.168.2.214
Are you sure you want to delete all NFSv3 locks associated with client tme-linux1 against cluster IP 192.168.2.214? (yes/[no]): yes
TME-1#
TME-1# isi_for_array -sX 'sysctl efs.advlock.failover.locks | grep 2:ce75:0319'
TME-1 exited with status 1
TME-2 exited with status 1
TME-3 exited with status 1
TME-4 exited with status 1
TME-5 exited with status 1
TME-6 exited with status 1
TME-1#
TME-1# isi_for_array -sX 'isi nfs nlm locks list | grep -i JUN'
TME-1 exited with status 1
TME-2 exited with status 1
TME-3 exited with status 1
TME-4 exited with status 1
TME-5 exited with status 1
TME-6 exited with status 1
TME-1#
TME-1# isi_for_array -sX 'isi nfs nlm locks waiters | grep -i JUN'
TME-1 exited with status 1
TME-2 exited with status 1
TME-3 exited with status 1
TME-4 exited with status 1
TME-5 exited with status 1
TME-6 exited with status 1
Mon, 13 Nov 2023 17:56:59 -0000
Included among the plethora of OneFS 9.5 enhancements is an updated NFS lock reporting infrastructure, command set, and corresponding platform API endpoints. This new functionality includes enhanced listing and filtering options for both locks and waiters, based on NFS major version, client, LIN, path, creation time, etc. But first, some backstory.
The ubiquitous NFS protocol underwent some fundamental architectural changes between its versions 3 and 4. One of the major differences concerns the area of file locking.
NFSv4 is the most current major version of the protocol, natively incorporating file locking and thereby avoiding the need for the additional (and convoluted) RPC callback mechanisms required by prior NFS versions. With NFSv4, locking is built into the main file protocol and supports new lock types, such as range locks, share reservations, and delegations/oplocks, which emulate those found in Windows and SMB.
File lock state is maintained at the server under a lease-based model. A server defines a single lease period for all states held by an NFS client. If the client does not renew its lease within the defined period, all states associated with the client's lease may be released by the server. If released, the client may either explicitly renew its lease or simply issue a read request or other associated operation. Additionally, with NFSv4, a client can elect whether to lock the entire file or a byte range within a file.
In contrast to NFSv4, the NFSv3 protocol is stateless and does not natively support file locking. Instead, the ancillary Network Lock Manager (NLM) protocol supplies the locking layer. Since file locking is inherently stateful, NLM itself is considered stateful. For example, when an NFSv3 filesystem mounted on an NFS client receives a request to lock a file, it generates an NLM remote procedure call instead of an NFS remote procedure call.
The NLM protocol itself consists of remote procedure calls that emulate the standard UNIX file control (fcntl) arguments and outputs. Because a process blocks waiting for a lock that conflicts with another lock holder – also known as a ‘blocking lock’ – the NLM protocol has the notion of callbacks from the file server to the NLM client to notify that a lock is available. As such, the NLM client sometimes acts as an RPC server in order to receive delayed results from lock calls.
Attribute | NFSv3 | NFSv4 |
State | Stateless - A client does not technically establish a new session if it has the correct information to ask for files and so on. This allows for simple failover between OneFS nodes using dynamic IP pools. | Stateful - NFSv4 uses sessions to handle communication. As such, both client and server must track session state to continue communicating. |
Presentation | User and Group info is presented numerically - Client and Server communicate user information by numeric identifiers, allowing the same user to appear as different names between client and server. | User and Group info is presented as strings - Both the client and server must resolve the names of the numeric information stored. The server must look up names to present while the client must remap those to numbers on its end. |
Locking | File Locking is out of band - uses NLM to perform locks. This requires the client to respond to RPC messages from the server to confirm locks have been granted, etc. | File Locking is in band - No longer uses a separate protocol for file locking, instead making it a type of call that is usually compounded with OPENs, CREATEs, or WRITEs. |
Transport | Can run over TCP or UDP - This version of the protocol can run over UDP instead of TCP, leaving handling of loss and retransmission to the software instead of the operating system. We always recommend using TCP. | Only supports TCP - Version 4 of NFS has left loss and retransmission up to the underlying operating system. Can batch a series of calls in a single packet, allowing the server to process all of them and reply at the end. This is used to reduce the number of calls involved in common operations. |
Since NFSv3 is stateless, it requires more complexity to recover from failures like client and server outages and network partitions. If an NLM server crashes, NLM clients that are holding locks must reestablish them on the server when it restarts. The NLM protocol deals with this by having the status monitor on the server send a notification message to the status monitor of each NLM client that was holding locks. The initial period after a server restart is known as the grace period, during which only requests to reestablish locks are granted. Thus, clients that reestablish locks during the grace period are guaranteed to not lose their locks.
When an NLM client crashes, ideally any locks it was holding at the time are removed from the pertinent NLM server(s). The NLM protocol handles this by having the status monitor on the client send a message to each server's status monitor once the client reboots. The client reboot indication informs the server that the client no longer requires its locks. However, if the client crashes and fails to reboot, the client's locks will persist indefinitely. This is undesirable for two primary reasons: resources are leaked indefinitely, and eventually another client will want a conflicting lock on at least one of the files the crashed client had locked, leaving that client postponed indefinitely.
Therefore, having NFS server utilities to swiftly and accurately report on lock and waiter status and utilities to clear NFS lock waiters is highly desirable for administrators – particularly on clustered storage architectures.
Prior to OneFS 9.5, the old NFS locking CLI commands were somewhat inefficient and also showed other advisory domain locks, which rendered the output somewhat confusing. The following table shows the new CLI commands (and corresponding handlers) which replace the older NLM syntax.
Type / Command set | OneFS 9.4 and earlier | OneFS 9.5 |
Locks | isi nfs nlm locks | isi nfs locks |
Sessions | isi nfs nlm sessions | isi nfs nlm sessions |
Waiters | isi nfs nlm locks waiters | isi nfs locks waiters |
In OneFS 9.5 and later, the old platform API handlers still exist, so existing scripts and automation are not broken; however, the old CLI command syntax is deprecated and no longer works.
Also be aware that the isi_classic nfs locks and waiters CLI commands have also been disabled in OneFS 9.5. Attempts to run these will yield the following warning message:
# isi_classic nfs locks
This command has been disabled. Please use isi nfs for this functionality.
The new isi nfs locks CLI command output includes the following locks object fields:
Field | Description |
Client | The client host name, fully qualified domain name (FQDN), or IP address |
Client_ID | The client ID (internally generated) |
Created | The UNIX Epoch time that the lock was created |
ID | The lock ID (needed for platform API sorting; not shown in CLI output) |
LIN | The logical inode number (LIN) of the locked resource |
Lock_type | The type of lock (shared, exclusive, none) |
Path | Path of locked file |
Range | The byte range within the file that is locked |
Version | The NFS major version: v3, or v4 |
Note that the ISI_NFS_PRIV RBAC privilege is required in order to view the NFS locks or waiters via the CLI or PAPI. In addition to ‘root’, the cluster’s ‘SystemAdmin’ and ‘SecurityAdmin’ roles contain this privilege by default.
Additionally, the new locks CLI command sets have a default timeout of 60 seconds. If the cluster is very large, the timeout may need to be increased for the CLI command. For example:
# isi --timeout <timeout value> nfs locks list
The basic architecture of the enhanced NFS locks reporting framework is as follows:
The new API handlers leverage the platform API proxy, yielding increased performance over the legacy handlers. Additionally, updated syscalls have been implemented to facilitate filtering by NFS service and major version.
Since NFSv3 is stateless, the cluster does not know when a client has lost its state unless it reconnects. For maximum safety, the OneFS locking framework (lk) holds locks forever. The isi nfs nlm sessions CLI command allows administrators to manually free NFSv3 locks in such cases, and this command remains available in OneFS 9.5 as well as prior versions. NFSv3 locks may also be leaked on delete, since a valid inode is required for lock operations. As such, lkf has a lock reaper which periodically checks for locks associated with deleted files.
In OneFS 9.5 and later, current NFS locks can be viewed with the new isi nfs locks list command. This command set also provides a variety of options to limit and format the display output. In its basic form, this command generates a basic list of client IP address and the path. For example:
# isi nfs locks list
Client                                      Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250               /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250               /ifs/locks/nfsv3/10.22.10.250_2
Linux NFSv4.0 TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv4/10.22.10.250_1
Linux NFSv4.0 TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv4/10.22.10.250_2
-------------------------------------------------------------------
Total: 4
To include more information, the -v flag can be used to generate a verbose locks listing:
# isi nfs locks list -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295164422
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:03:52
Version: v3
---------------------------------------------------------------
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3
---------------------------------------------------------------
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 429516442
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:19:48
Version: v4
---------------------------------------------------------------
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295426674
Path: /ifs/locks/nfsv3/10.22.10.250_2
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:17:02
Version: v4
---------------------------------------------------------------
Total: 4
The previous syntax returns more detailed information for each lock, including client ID, LIN, path, lock type, range, created date, and NFS version.
The lock listings can also be filtered by client or client-id. Note that the --client option must be the full name in quotes:
# isi nfs locks list --client="full_name_of_client/IP_address" -v
For example:
# isi nfs locks list --client="1/TMECLI1:487722/10.22.10.250" -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3
Additionally, be aware that the CLI does not support partial names, so the full name of the client must be specified.
Filtering by NFS version can be helpful when attempting to narrow down which client has a lock. For example, to show just the NFSv3 locks:
# isi nfs locks list --version=v3
Client                          Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv3/10.22.10.250_2
-------------------------------------------------------------------
Total: 2
Note that the --version flag supports both v3 and nlm as arguments and will return the same v3 output in either case. For example:
# isi nfs locks list --version=nlm
Client                          Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250   /ifs/locks/nfsv3/10.22.10.250_2
-------------------------------------------------------------------
Total: 2
Filtering by LIN or path is also supported. For example, to filter by LIN:
# isi nfs locks list --lin=42950335042 -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3
Or by path:
# isi nfs locks list --path=/ifs/locks/nfsv3/10.22.10.250_2 -v
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295426674
Path: /ifs/locks/nfsv3/10.22.10.250_2
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:17:02
Version: v4
Be aware that the full path must be specified, starting with /ifs. There is no partial matching or substitution for paths in this command set.
Filtering can also be performed by creation time, for example:
# isi nfs locks list --created=2023-08-17T09:30:00 -v
Note that when filtering by created, the output will include all locks that were created before or at the time provided.
The --limit argument can be used to curtail the number of results returned, and it can be used in conjunction with all other query options. For example, to limit the output of the NFSv4 locks listing to one lock:
# isi nfs locks list --version=v4 --limit=1
The filter options are mutually exclusive, with the exception of --version, which can be combined with any of the other filters (for example, filtering by both created and version, as shown below). This can be helpful when troubleshooting and trying to narrow down results.
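For example, a hypothetical query combining the --version and --created filters described above:
# isi nfs locks list --version=v3 --created=2023-08-17T09:30:00 -v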
In addition to locks, OneFS 9.5 also provides the isi nfs locks waiters CLI command set. Note that waiters are specific to NFSv3 clients, and the CLI reports any v3 locks that are pending and not yet granted.
As noted above, because NFSv3 is stateless, the cluster holds NLM locks until they are released or reaped, so pending (not yet granted) lock requests can accumulate as waiters. To list them:
# isi nfs locks waiters
The waiters CLI syntax uses a similar range of query arguments as the isi nfs locks list command set.
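For example, a hypothetical waiters query re-using the --path filter and one of the file paths shown earlier:
# isi nfs locks waiters --path=/ifs/locks/nfsv3/10.22.10.250_1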
In addition to the CLI, the platform API can also be used to query both NFS locks and NFSv3 waiters. For example, using curl to view the waiters via the OneFS pAPI:
# curl -k -u <username>:<passwd> "https://localhost:8080/platform/protocols/nfs/waiters"
{
   "total" : 1,
   "waiters" :
   [
      {
         "client" : "1/TMECLI1487722/10.22.10.250",
         "client_id" : "4894369235106074",
         "created" : "1668146840",
         "id" : "1 1YUIAEIHVDGghSCHGRFHTiytr3u243567klj212-MANJKJHTTy1u23434yui-ouih23ui4yusdftyuySTDGJSDHVHGDRFhgfu234447g4bZHXhiuhsdm",
         "lin" : "4295164422",
         "lock_type" : "exclusive",
         "path" : "/ifs/locks/nfsv3/10.22.10.250_1",
         "range" : [ 0, 9223372036854775807 ],
         "version" : "v3"
      }
   ]
}
Similarly, using the platform API to show locks filtered by client ID:
# curl -k -u <username>:<passwd> "https://<address>:8080/platform/protocols/nfs/locks?client=<client_ID>"
For example:
# curl -k -u <username>:<passwd> "https://localhost:8080/platform/protocols/nfs/locks?client=1/TMECLI1487722/10.22.10.250"
{
   "locks" :
   [
      {
         "client" : "1/TMECLI1487722/10.22.10.250",
         "client_id" : "487722351064074",
         "created" : "1668146840",
         "id" : "1 1YUIAEIHVDGghSCHGRFHTiytr3u243567FCUJHBKD34NMDagNLKYGHKHGKjhklj212-MANJKJHTTy1u23434yui-ouih23ui4yusdftyuySTDGJSDHVHGDRFhgfu234447g4bZHXhiuhsdm",
         "lin" : "4295164422",
         "lock_type" : "exclusive",
         "path" : "/ifs/locks/nfsv3/10.22.10.250_1",
         "range" : [ 0, 9223372036854775807 ],
         "version" : "v3"
      }
   ],
   "total" : 1
}
Note that, as with the CLI, the platform API does not support partial name matches, so the full name of the client must be specified.
Mon, 13 Nov 2023 17:56:44 -0000
In the initial article in this series, we took a look at the OneFS SSL architecture, plus the first two steps in the basic certificate renewal or creation flow detailed below:
The following procedure includes options to complete a self-signed certificate replacement or renewal or to request an SSL replacement or renewal from a Certificate Authority (CA).
At this point, depending on the security requirements of the environment, the certificate can either be self-signed or signed by a Certificate Authority.
The following CLI syntax can be used to self-sign the certificate with the key, creating a new signed certificate which, in this instance, is valid for 1 year (365 days):
# openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
To verify that the key matches the certificate, ensure that the output of the following CLI commands return the same md5 checksum value:
# openssl x509 -noout -modulus -in server.crt | openssl md5
# openssl rsa -noout -modulus -in server.key | openssl md5
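As a convenience, this comparison can also be scripted as a single shell one-liner; this is a minimal sketch that assumes server.crt and server.key are in the current directory:
# [ "$(openssl x509 -noout -modulus -in server.crt | openssl md5)" = "$(openssl rsa -noout -modulus -in server.key | openssl md5)" ] && echo "Key matches certificate" || echo "MISMATCH"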
Next, proceed to the Add certificate to cluster section of this article once this step is complete.
If a CA is signing the certificate, ensure that the new SSL certificate is in x509 format and includes the entire certificate trust chain.
Note that the CA may return the new SSL certificate, the intermediate cert, and the root cert in different files. If this is the case, the PEM formatted certificate will need to be created manually.
Note that the correct ordering is important when creating the PEM-formatted certificate: the SSL cert must be at the top of the file, followed by the intermediate certificate(s), with the root certificate at the bottom. For example:
-----BEGIN CERTIFICATE-----
<Contents of new SSL certificate>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<Contents of intermediate certificate>
<Repeat as necessary for every intermediate certificate provided by your CA>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<Contents of root certificate file>
-----END CERTIFICATE-----
A simple method for creating the PEM formatted file from the CLI is to cat them in the correct order as follows:
# cat CA_signed.crt intermediate.crt root.crt > onefs_pem_formatted.crt
Copy the onefs_pem_formatted.crt file to /ifs/tmp and rename it to server.crt.
Note that if any of the aforementioned files are generated with a .cer extension, they should be renamed with a .crt extension instead.
The attributes and integrity of the certificate can be sanity checked with the following CLI syntax:
# openssl x509 -text -noout -in server.crt
The first step in adding the certificate involves importing the new certificate and key into the cluster:
# isi certificate server import /ifs/tmp/server.crt /ifs/tmp/server.key
Next, verify that the certificate imported successfully:
# isi certificate server list -v
The following CLI command can be used to show the names and corresponding IDs of the certificates:
# isi certificate server list -v | grep -A1 "ID:"
Set the imported certificate as default:
# isi certificate settings modify --default-https-certificate=<id_of_cert_to_set_as_default>
Confirm that the imported certificate is being used as default by verifying status of Default HTTPS Certificate:
# isi certificate settings view
If there is an unused or outdated cert, it can be deleted with the following CLI syntax:
# isi certificate server delete --id=<id_of_cert_to_delete>
Next, view the new imported cert with command:
# isi certificate server view --id=<id_of_cert>
Note that ports 8081 and 8083 still use the certificate from the local directory for SSL. Follow the steps below if you want to use the new certificates for port 8081/8083:
# isi services -a isi_webui disable
# chmod 640 server.key
# chmod 640 server.crt
# isi_for_array -s 'cp /ifs/tmp/server.key /usr/local/apache2/conf/ssl.key/server.key'
# isi_for_array -s 'cp /ifs/tmp/server.crt /usr/local/apache2/conf/ssl.crt/server.crt'
# isi services -a isi_webui enable
There are two methods for verifying the updated SSL certificate: from the OneFS CLI, or from a web browser. To verify from the CLI:
# echo QUIT | openssl s_client -connect localhost:8080
Alternatively, browse to the cluster's WebUI (typically https://<cluster_name>:8080, where <cluster_name> is the FQDN or IP address used to access the cluster's WebUI interface) and view the security details for the web page, which will contain the location and contact info, as above.
In both cases, the output includes location and contact info. For example:
Subject: C=US, ST=<yourstate>, L=<yourcity>, O=<yourcompany>, CN=isilon.example.com/emailAddress=tme@isilon.com
Additionally, OneFS provides warning of an impending certificate expiry by sending a CELOG event alert, similar to the following:
SW_CERTIFICATE_EXPIRING: X.509 certificate default is nearing expiration:
Event: 400170001
Certificate 'default' in '**' store is nearing expiration:
Note that OneFS does not attempt to automatically renew a certificate. Instead, an expiring cert has to be renewed manually, per the procedure described above.
When adding an additional certificate, the matching cert is used any time you connect to that SmartConnect name via HTTPS. If no matching certificate is found, OneFS will automatically revert to using the default self-signed certificate.
Thu, 16 Nov 2023 04:57:00 -0000
When using either the OneFS WebUI or platform API (pAPI), all communication sessions are encrypted using SSL (Secure Sockets Layer) or its successor, Transport Layer Security (TLS). In this series, we will look at how to replace or renew the SSL certificate for the OneFS WebUI.
SSL requires a certificate that serves two principal functions: it enables encrypted communication using Public Key Infrastructure (PKI), and it authenticates the identity of the certificate's holder.
Architecturally, SSL consists of four fundamental components:
SSL Component | Description |
Alert | Reports issues. |
Change cipher spec | Implements negotiated crypto parameters. |
Handshake | Negotiates crypto parameters for SSL session. Can be used for many SSL/TCP connections. |
Record | Provides encryption and MAC. |
These sit in the stack as follows:
The basic handshake process begins with a client requesting an HTTPS WebUI session to the cluster. OneFS then returns the SSL certificate and public key. The client creates a session key, encrypted with the public key it received from OneFS. At this point, only the client knows the session key. The client then sends its encrypted session key to the cluster, which decrypts it with the private key. Now both the client and OneFS know the session key. Finally, the session, encrypted using the symmetric session key, can be established. OneFS automatically defaults to the best supported version of SSL/TLS, based on the client request.
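The negotiated protocol version and cipher can be observed from any client using the standard openssl s_client utility; a minimal sketch, where <cluster_name> is a placeholder for the cluster's FQDN or IP address:
# echo QUIT | openssl s_client -connect <cluster_name>:8080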
A PowerScale cluster initially contains a self-signed certificate, which can be used as-is or replaced with a certificate issued by a third-party certificate authority (CA). If the self-signed certificate is used, upon expiry it must be replaced with either a third-party (public or private) CA-issued certificate or another self-signed certificate generated on the cluster. The following are the default locations for the server.crt and server.key files.
File | Location |
SSL certificate | /usr/local/apache2/conf/ssl.crt/server.crt |
SSL certificate key | /usr/local/apache2/conf/ssl.key/server.key |
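A quick way to inspect the subject and validity dates of the certificate currently in place is to point openssl at the default location from the table above; a minimal sketch:
# openssl x509 -noout -subject -dates -in /usr/local/apache2/conf/ssl.crt/server.crt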
The ‘isi certificate settings view’ CLI command displays all of the certificate-related configuration options. For example:
# isi certificate settings view
         Certificate Monitor Enabled: Yes
Certificate Pre Expiration Threshold: 4W2D
Default HTTPS Certificate
             ID: default
        Subject: C=US, ST=Washington, L=Seattle, O="Isilon", OU=Isilon, CN=Dell, emailAddress=tme@isilon.com
         Status: valid
The above 'Certificate Monitor Enabled' and 'Certificate Pre Expiration Threshold' configuration options govern a nightly cron job, which monitors the expiration of each managed certificate and fires a CELOG alert if a certificate is set to expire within the configured threshold. Note that the default threshold is 30 days (4W2D, which represents 4 weeks plus 2 days). The 'ID: default' configuration option indicates that this certificate is the default TLS certificate.
The basic certificate renewal or creation flow is as follows:
The steps below include options to complete a self-signed certificate replacement or renewal, or to request an SSL replacement or renewal from a Certificate Authority (CA).
Backing up the existing SSL certificate
The first task is to obtain the list of certificates by running the following CLI command, and identify the appropriate one to renew:
# isi certificate server list
ID       Name     Status   Expires
-------------------------------------------
eb0703b  default  valid    2025-10-11T10:45:52
-------------------------------------------
It's always prudent to save a backup of the original certificate and key. This can easily be accomplished using the following CLI commands, which, in this case, create the '/ifs/data/ssl_bkup' directory, set its permissions to root-only access, and copy the original key and certificate into it:
# mkdir -p /ifs/data/ssl_bkup
# chmod 700 /ifs/data/ssl_bkup
# cp /usr/local/apache2/conf/ssl.crt/server.crt /ifs/data/ssl_bkup
# cp /usr/local/apache2/conf/ssl.key/server.key /ifs/data/ssl_bkup
# cd !$
cd /ifs/data/ssl_bkup
# ls
server.crt      server.key
Renewing or creating a certificate
The next step in the process involves either the renewal of an existing certificate or creation of a certificate from scratch. In either case, first, create a temporary directory, for example /ifs/tmp:
# mkdir /ifs/tmp; cd /ifs/tmp
a) Renew an existing self-signed Certificate.
The following syntax creates a renewal certificate based on the existing ssl.key. The value of the '-days' parameter can be adjusted to generate a certificate with the desired expiration date. For example, the following command will create a one-year certificate.
# cp /usr/local/apache2/conf/ssl.key/server.key ./ ; openssl req -new -days 365 -nodes -x509 -key server.key -out server.crt
Answer the system prompts to complete the self-signed SSL certificate generation process, entering the pertinent location and contact information. For example:
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Washington
Locality Name (eg, city) []:Seattle
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon
Organizational Unit Name (eg, section) []:TME
Common Name (e.g. server FQDN or YOUR name) []:isilon.com
Email Address []:tme@isilon.com
When all the information has been successfully entered, the server.crt and server.key files will be present in the /ifs/tmp directory.
Optionally, the attributes and integrity of the certificate can be verified with the following syntax:
# openssl x509 -text -noout -in server.crt
Next, proceed directly to the ‘Add the certificate to the cluster’ steps in section 4 of this article.
b) Alternatively, a certificate and key can be generated from scratch, if preferred.
The following CLI command can be used to create a 2048-bit RSA private key:
# openssl genrsa -out server.key 2048
Generating RSA private key, 2048 bit long modulus
............+++++
...........................................................+++++
e is 65537 (0x10001)
Next, create a certificate signing request:
# openssl req -new -nodes -key server.key -out server.csr
For example:
# openssl req -new -nodes -key server.key -out server.csr -reqexts SAN -config <(cat /etc/ssl/openssl.cnf <(printf "[SAN]\nsubjectAltName=DNS:isilon.com"))
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:WA
Locality Name (eg, city) []:Seattle
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon
Organizational Unit Name (eg, section) []:TME
Common Name (e.g. server FQDN or YOUR name) []:h7001
Email Address []:tme@isilon.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:1234
An optional company name []:
#
Answer the system prompts to complete the certificate signing request (CSR) generation process, entering the pertinent location and contact information, along with a 'challenge password' of at least four characters. When completed, the server.csr and server.key files will be present in the /ifs/tmp directory.
If desired, a CSR file that includes Subject Alternative Names (SANs) can be generated for a Certificate Authority. For example, additional hostname entries can be added, separated by commas (that is, DNS:isilon.com,DNS:www.isilon.com).
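Once generated, the SAN entries in the CSR can be sanity checked with openssl before submitting it to the CA; a minimal sketch, assuming the server.csr file from the previous step:
# openssl req -noout -text -in server.csr | grep -A1 "Subject Alternative Name"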
In the next article, we will look at the certificate signing, addition, and verification steps of the process.
Fri, 10 Nov 2023 19:37:15 -0000
As on-the-wire encryption becomes increasingly commonplace, and often mandated via regulatory compliance security requirements, the policies applied in enterprise networks are rapidly shifting towards fully encrypting all traffic.
The OneFS SMB protocol implementation (lwio) has supported encryption for Windows and other SMB client connections to a PowerScale cluster since OneFS 8.1.1.
However, prior to OneFS 9.5, this did not include encrypted communications between the SMB redirector and Active Directory (AD) domain controller (DC). While Microsoft added support for SMB encryption in SMB 3.0, the redirector in OneFS 9.4 and prior releases only supported Microsoft’s earlier SMB 2.002 dialect.
When OneFS connects to Active Directory for tasks requiring remote procedure calls (RPCs), such as joining a domain, NTLM authentication, or resolving usernames and SIDs, these SMB connections are established from OneFS as the client connecting to a domain controller server.
As outlined in the Windows SMB security documentation, by default, and starting with Windows 2012 R2, domain admins can choose to encrypt access to a file share, which can include a domain controller. When encryption is enabled, only SMB3 connections are permitted.
With OneFS 9.5, the OneFS SMB redirector now supports SMB3, thereby allowing the Local Security Authority Subsystem Service (LSASS) daemon to communicate with domain controllers running Windows Server 2012 R2 and later over an encrypted session.
The OneFS redirector, also known as the ‘rdr driver’, is a stripped-down SMB client with minimal functionality, only supporting what is absolutely necessary.
Under the hood, OneFS SMB encryption and decryption use standard OpenSSL functions, and AES-128-CCM encryption is negotiated during the SMB negotiation phase.
Although everything stems from the NTLM authentication requested by the SMB server, the sequence of calls leads to the redirector establishing an SMB connection to the AD domain controller.
With OneFS 9.5, no configuration is required to enable SMB encryption in most situations, and there are no WebUI or CLI configuration settings for the redirector.
With the default OneFS configuration, the redirector supports encryption if negotiated but it does not require it. Similarly, if the Active Directory domain requires encryption, the OneFS redirector will automatically enable and use encryption. However, if the OneFS redirector is explicitly configured to require encryption and the domain controller does not support encryption, the connection will fail.
The OneFS redirector encryption settings include:
Key | Values | Description |
Smb3EncryptionEnabled | Boolean. Default is ‘1’ == Enabled | Enable or disable SMB3 encryption for OneFS redirector. |
Smb3EncryptionRequired | Boolean. Default is ‘0’ == Not required. | Require or do not require the redirector connection to be encrypted. |
MaxSmb2DialectVersion | Default is ‘max’ == SMB 3.0.2 | Set the maximum SMB dialect that the redirector will support. The maximum is currently SMB 3.0.2. |
The above keys and values are stored in the OneFS Likewise SMB registry and can be viewed and configured with the ‘lwregshell’ utility. For example, to view the SMB redirector encryption config settings:
# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt
"Smb3EncryptionEnabled" REG_DWORD 0x00000001 (1)
"Smb3EncryptionRequired" REG_DWORD 0x00000000 (0)
The following syntax can be used to enable the ‘Smb3EncryptionRequired’ parameter by setting it to ‘1’ (that is, to require encryption on redirector connections):
# /usr/likewise/bin/lwregshell set_value "[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]" "Smb3EncryptionRequired" "0x00000001"
# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt
"Smb3EncryptionEnabled" REG_DWORD 0x00000001 (1)
"Smb3EncryptionRequired" REG_DWORD 0x00000001 (1)
Similarly, to restore the ‘Smb3EncryptionRequired’ parameter’s default value of ‘0’ (that is, encryption not required):
# /usr/likewise/bin/lwregshell set_value "[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]" "Smb3EncryptionRequired" "0x00000000"
Note that, during the upgrade to OneFS 9.5, any nodes still running the old version will not be able to NTLM-authenticate if the DC they have affinity with requires encryption.
Redirector encryption is implemented in user space (in contrast to the SMB server, which runs in the kernel). Because it uses OpenSSL, it benefits from the processor's AES-NI hardware acceleration, so performance is only minimally impacted even when the number of NTLM authentications to the AD domain is very large.
Also note that redirector encryption currently supports only the AES-128-CCM encryption provided in the SMB 3.0.0 and 3.0.2 dialects. OneFS does not currently use the AES-128-GCM encryption available in the SMB 3.1.1 dialect (the latest).
When it comes to troubleshooting the redirector, the lwregshell tool can be used to verify its configuration settings. For example, to view the redirector encryption settings:
# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt
"Smb3EncryptionEnabled" REG_DWORD 0x00000001 (1)
"Smb3EncryptionRequired" REG_DWORD 0x00000000 (0)
Similarly, to find the maximum SMB version supported by the redirector:
# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i dialect
"MaxSmb2DialectVersion" REG_SZ "max"
The ‘lwsm’ CLI utility with the following syntax will confirm the status of the various lsass components:
# /usr/likewise/bin/lwsm list | grep lsass
lsass [service] running (lsass: 5164)
netlogon [service] running (lsass: 5164)
rdr [driver] running (lsass: 5164)
It can also be used to show and modify the logging level. For example:
# /usr/likewise/bin/lwsm get-log rdr
<default>: syslog LOG_CIFS at WARNING
# /usr/likewise/bin/lwsm set-log-level rdr - debug
# /usr/likewise/bin/lwsm get-log rdr
<default>: syslog LOG_CIFS at DEBUG
When finished, rdr logging can be returned to its previous log level as follows:
# /usr/likewise/bin/lwsm set-log-level rdr - warning
# /usr/likewise/bin/lwsm get-log rdr
<default>: syslog LOG_CIFS at WARNING
Additionally, the existing ‘lwio-tool’ utility has been modified in OneFS 9.5 to include functionality allowing simple test connections to domain controllers (no NTLM) via the new ‘rdr’ syntax:
# /usr/likewise/bin/lwio-tool rdr openpipe //<domain_controller>/NETLOGON
The ‘lwio-tool’ usage in OneFS 9.5 is as follows:
# /usr/likewise/bin/lwio-tool -h
Usage: lwio-tool <command> [command-args]
commands:
iotest rundown
rdr [openpipe|openfile] username@password://domain/path
srvtest transport [query|start|stop]
testfileapi [create|createnp] <path>
Author: Nick Trimbee
Thu, 31 Aug 2023 20:47:58 -0000
There has been a tremendous surge of information about artificial intelligence (AI), and generative AI (GenAI) has taken center stage as a key use case. Companies are looking to learn more about how to build architectures to successfully run AI infrastructures. In most cases, creating a GenAI solution involves fine-tuning a pretrained foundational model and deploying it as an inference service. Dell recently published a design guide, Generative AI in the Enterprise – Inferencing, which outlines the overall process.
All AI projects should start with understanding the business objectives and key performance indicators. Planning, data prep, and training make up the other phases of the cycle. At the core of the development are the systems that drive these phases – servers, GPUs, storage, and networking infrastructures. Dell is well equipped to deliver everything an enterprise needs to build, develop, and maintain analytic models that serve business needs.
GPUs and accelerators have become commonplace within AI infrastructures. They pull in data and train or fine-tune models within the computational capabilities of the GPU. As GPUs have evolved, so has their ability to handle larger models and parallel development cycles. This has left a lot of us wondering: how do we build an architecture that will support the model development my business needs? It helps to understand a few parameters.
Defining business objectives and use cases will help shape your architecture requirements.
Answering questions about these objectives and use cases helps determine how many GPUs are needed to train or fine-tune the model. Consider two main factors in GPU sizing. First is the amount of GPU memory needed to store the model parameters and optimizer state. Second is the number of floating-point operations (FLOPs) needed to execute the model. Both generally scale with model size. Large models often exceed the resources of a single GPU and require spreading a single model over multiple GPUs.
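As a rough illustration of the first factor, the short Python sketch below estimates the memory needed just to hold a model's weights and optimizer state during fine-tuning. It assumes FP16/BF16 weights with Adam optimizer state held in FP32 (a common but not universal setup) and ignores gradients, activations, and framework overhead, so treat the figures as a lower bound rather than a sizing rule.
import math

def estimate_gpu_memory_gb(num_params_billion: float,
                           bytes_per_param: int = 2,            # FP16/BF16 weights
                           optimizer_bytes_per_param: int = 12  # Adam: FP32 master copy + two moments
                           ) -> float:
    """Lower-bound memory (GB) for model weights plus optimizer state."""
    total_bytes = num_params_billion * 1e9 * (bytes_per_param + optimizer_bytes_per_param)
    return total_bytes / 1024**3

# A 13B-parameter model needs roughly 170 GB for weights and optimizer state alone,
# so it has to be sharded across several 80 GB GPUs.
for size in (7, 13, 70):
    mem = estimate_gpu_memory_gb(size)
    print(f"{size}B params: ~{mem:,.0f} GB -> at least {math.ceil(mem / 80)} x 80 GB GPUs")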
Estimating the number of GPUs needed to train or fine-tune the model helps determine the server technologies to choose. When sizing servers, it’s important to balance the right GPU density and interconnect, power consumption, PCI bus technology, external port capacity, memory, and CPU. Dell PowerEdge servers include a variety of options for GPU types and density. PowerEdge XE servers can host up to eight NVIDIA H100 GPUs in a single server (see GenAI on PowerEdge XE9680), along with the latest technologies, including NVLink, NVIDIA GPUDirect, PCIe 5.0, and NVMe disks. PowerEdge mainstream servers range from two- to four-GPU configurations, offering a variety of GPUs from different manufacturers. PowerEdge servers provide outstanding performance for all phases of model development. Visit Dell.com for more on PowerEdge servers.
Now that we understand how many GPUs are needed and the servers to host them, it’s time to tackle storage. At a minimum, the storage should have the capacity to host the training data set, the checkpoints written during model training, and any other data related to the pruning/preparation phase. The storage also needs to deliver data at the rate the GPUs request it. That required delivery rate is multiplied by the degree of parallelism: the number of models being trained in parallel and, consequently, the number of GPUs requesting data concurrently. Ideally, every GPU runs at 90% utilization or better to maximize the investment, so a storage system that supports high concurrency is well suited to these types of workloads.
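To put the concurrency point into numbers, a back-of-the-envelope estimate of aggregate read bandwidth can look like the following sketch. The per-GPU streaming rate is a hypothetical placeholder; in practice it would come from measuring your own model and data pipeline.
def aggregate_read_bandwidth_gbps(num_gpus: int,
                                  per_gpu_stream_gbps: float,
                                  target_utilization: float = 0.9) -> float:
    """Aggregate storage read bandwidth (GB/s) needed to keep every GPU fed
    at the target utilization while all of them stream training data concurrently."""
    return num_gpus * per_gpu_stream_gbps * target_utilization

# Hypothetical example: 32 GPUs each consuming about 2 GB/s of training data.
print(f"~{aggregate_read_bandwidth_gbps(32, 2.0):.0f} GB/s of concurrent reads from storage")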
Tools such as FIO, or its cousin GDSIO, are useful for understanding the speeds and feeds of a storage system and for gathering hero numbers (theoretical maximums for reads and writes), but they are not representative of the performance requirements of the AI development cycle. Data prep and staging show up on the storage as random reads and writes, while during the training/fine-tuning phase the GPUs concurrently stream reads from the storage system. Checkpoints throughout training are handled as writes back to the storage. These different points in the AI lifecycle require storage that can successfully handle all of these workloads at the scale determined by our model calculations and parallel development cycles.
Data scientists at Dell go to great lengths to understand how different types of model development affect server and storage requirements. For example, language models like BERT and GPT place relatively little demand on storage performance and resources, whereas image sequencing and DLRM models place significant, near worst-case, demands on storage performance and resources. For this reason, the Dell storage teams focus their testing and benchmarking on AI deep learning workflows based on popular image models like ResNet, using real GPUs, to understand the performance needed to deliver data to the GPUs during model training. The following image shows an architecture designed with Dell PowerEdge servers and networking with PowerScale scale-out storage.
Dell PowerScale scale-out file storage is especially suited for these workloads. Each node in a PowerScale cluster delivers equivalent performance as the cluster and workloads scale. The following images show how PowerScale performance scales linearly as GPUs are increased, while the performance of each individual GPU remains constant. The scale-out architecture of PowerScale file storage easily supports AI workflows from small to large.
Figure 1. PowerScale linear performance
Figure 2. Consistent GPU performance with scale
The predictability of PowerScale allows us to estimate the storage resources needed for model training and fine-tuning. We can easily scale these architectures based on the model type and size along with the number and type of GPUs required.
Architecting for small and large AI workloads is challenging and takes planning. Understanding performance needs and how the components in the architecture will perform as the AI workload demand scales is critical.
Author: Darren Miller
Tue, 01 Aug 2023 17:03:47 -0000
|Read Time: 0 minutes
Now in its 9th generation, Emmy-award-winning Dell PowerScale storage has been field proven in media workflows for over two decades and is the world’s most flexible1, efficient2, and secure3 scale-out NAS solution.
Our partnership with Marvel Studios is a wonderful example of the innovations we collaborate on with leading media and entertainment companies around the world—with PowerScale as the preeminent storage solution that enables data-driven workflows to accelerate content-creation pipelines.
Hear about Marvel Studios’ implementation of PowerScale directly in this educational video series from The Advanced Imaging Society:
The underlying OneFS file system leverages the foundations of clustered high-performance computing to solve the challenges of data protection at scale and client accessibility in a massively parallelized way. In practice, a single namespace that easily scales out with nodes to increase performance and capacity is a fundamentally game-changing architecture.
Media workflows require increased levels of access for the applications and users to provide for workflow collaboration in balance with security that doesn’t impede performance. Further, performance and access can’t be impeded even during hardware failures such as drive rebuilds or system upgrades to ensure that production work can continue uninterrupted while maintenance is being performed in the background.
Maximizing uptime correlates with fundamental business needs including meeting project timelines and budgets while ensuring that personnel have access to the content at the required performance levels, even during a background maintenance activity.
As a sufficiently advanced enterprise-class solution, PowerScale incorporates these capabilities to eliminate complexity and provide for increased uptime through its self-healing and self-managing functionality. (For more information, see the PowerScale OneFS Technical Overview.) This takes many of the traditional storage management burdens off the administrator’s plate, lowering the overhead and time needed to maintain storage, which is often increasing in size and scale.
While the benefits of collaboration over Ethernet-based storage are inherent in PowerScale, the user experience is also paramount in correlation with the performance of the network and underlying storage system. Operations such as playout and scrubbing need to perform reliably with no frame drops while providing response times that look and feel equivalent to working from the local workstation.
As I’ve participated in the development of media storage solutions from SCSI, Fibre Channel, iSCSI, SAN, and Ethernet, I’ve been able to test and compare solutions over the years and have closely watched the evolution and trends of these protocols in relation to their ability to support media workflows.
In 2007, I demonstrated real-time color grading with uncompressed content over 1 Gb Ethernet, the first of its kind. At the time, using Ethernet-based storage for color grading was largely unheard of, with few applications supporting it. That was more of an exercise to showcase the art of the possible in comparison with Fibre Channel-based solutions. The wider adoption of Ethernet for this particular use case was not yet of high interest because Ethernet speeds still needed to evolve. However, 1 Gb Ethernet was very appropriate for compressed media workflows and rendering, which were well aligned with the high-performance, scale-out design of PowerScale.
As 10 Gb Ethernet speeds became prevalent, there was a significant uptick in the adoption of Ethernet-based storage compared to Fibre Channel-based solutions for media use cases. I also started to see more datasets being moved over Ethernet rather than by sneaker net, physically delivering drives and tapes between locations. This led to cost and time savings for project timelines and budgets, among other benefits.
Fast-forward to 2014, when, with the OneFS 7.1.1 version supporting SMB multi-channel, we were able to use two 10 Gb Ethernet connections to support a stream of full resolution uncompressed 4K, whereas a single 10 Gb connection was only capable of supporting 2K full resolution streams. This began an adoption trend of Ethernet solutions for 4K full-resolution workflows.
In 2017, with the release of the F800 All-Flash PowerScale and OneFS 8.1, 40 Gb Ethernet speeds were supported and the floodgates opened for media workflows. Multiple full-resolution 2K and 4K workflows could run on a single shared OneFS namespace with uncompromised performance. Workload consolidation began to eliminate the need for multiple discrete storage solutions that each supported a different part of the pipeline, bringing them all together under a single unified OneFS namespace to streamline environments.
Complete pipeline transformations were taking place and began to replace iSCSI and Fibre Channel-based solutions at an accelerated pace, as those solutions were siloed within workgroups and inflexible with the emerging needs of collaboration. When the PowerScale F900 NVMe solution supporting 100 Gb Ethernet came out in 2021, the technology was set to change the industry yet again.
With the increasing prevalence of 100 Gb Ethernet over these past few years, performance parity with Fibre Channel-based solutions to support full-resolution 4K, 8K, and all related media workflows in between is no longer in question. Native Ethernet-based solutions are preferred for many reasons—including cloud capability, scale, cost, and supportability—to facilitate unstructured media datasets, leveraging the abundance of network engineering talent in comparison to Fibre Channel-trained engineers.
With reliability, performance, and shared access for collaboration delivering uncompromised benefits, we now look to several PowerScale storage capabilities that enable rich media ecosystems to be further streamlined and flourish.
There are four additional key areas of focus, and underlying feature sets, that are increasingly important to today’s media ecosystems: security, data orchestration, data movement for backup and recovery, and quality of service.
For more information about OneFS security, see Dell PowerScale OneFS: Security Considerations.
For more information about PowerScale data orchestration, see the OneFS documentation on Dell Support.
The PowerScale Backup and Recovery Guide provides more information about data movement capabilities in OneFS.
For more information about quality of service in OneFS, see this blog post: OneFS SmartQoS.
The capabilities of PowerScale storage with OneFS are delivering unparalleled scale and feature benefits that elevate the capabilities of media entertainment use cases from the highest performance workflows to highly dense archives. Standardization on this enterprise-class, secure, and collaborative platform is the key to unlocking innovation and advancing your media pipelines.
1 Based on internal analysis of publicly available information sources, February 2023. CLM-0013892.
2 Based on Dell analysis comparing efficiency-related features: data reduction, storage capacity, data protection, hardware, space, lifecycle management efficiency, and ENERGY STAR certified configurations, June 2023. CLM-008608.
3 Based on Dell analysis comparing cyber-security software capabilities offered for Dell PowerScale vs. competitive products, September 2022.
4 Dell Technologies Executive Summary of Compliance with Media Industry Security Guidelines, https://www.delltechnologies.com/asset/en-ae/products/storage/briefs-summaries/tpn-executive-summary-compliance-statement.pdf.
Author: Brian Cipponeri, Global Solutions Architect
Dell Technologies – Unstructured Data Solutions
Thu, 17 Aug 2023 20:57:36 -0000
|Read Time: 0 minutes
Welcome to the first in a series of blog posts to reveal some helpful tips and tricks when supporting media production workflows on PowerScale OneFS.
OneFS has an incredible user-drivable toolset underneath the hood that can grant you access to data so valuable to your workflow that you'll wonder how you ever lived without it.
When working on productions in the past I’ve witnessed and had to troubleshoot many issues that arise in different parts of the pipeline. Often these are in the render part of the pipeline, which is what I’m going to focus on in this blog.
Render pipelines are normally fairly straightforward in their make-up, but they require everything to be just right to ensure that you don’t starve the cluster of resources. If your cluster is at the center of all of your production operations, that resource starvation can cause a whole studio outage, impacting your creatives, losing revenue, and introducing unnecessary delays into production.
Did you know that any command run on a OneFS cluster is actually an API call to the underlying OneFS platform API? This can be observed by adding the --debug flag to any command that you run on the CLI. As shown here, this displays the call information that was sent to gather the requested information, which is helpful if you're integrating your own administration tools into your pipeline.
# isi --debug statistics client list 2023-06-22 10:24:41,086 DEBUG rest.py:80: >>>GET ['3', 'statistics', 'summary', 'client'] 2023-06-22 10:24:41,086 DEBUG rest.py:81: args={'sort': 'operation_rate,in,out,time_avg,node,protocol,class,user.name,local_name,remote_name', 'degraded': 'False', 'timeout': '15'} body={} 2023-06-22 10:24:41,212 DEBUG rest.py:106: <<<(200, {'content-type': 'application/json', 'allow': 'GET, HEAD', 'status': '200 Ok'}, b'n{\n"client" : [ ]\n}\n')
There are so many potential applications for OneFS API calls, from monitoring statistics on the cluster to using your own tools for creating shares, and so on. (We'll go deeper into the API in a future post!)
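For instance, the same client statistics resource shown in the --debug output above can be queried directly over HTTPS. The Python sketch below is a minimal illustration only: it assumes the default platform API port of 8080, HTTP basic authentication, and a self-signed certificate (hence verify=False), and the cluster name and credentials are placeholders. Adapt the API version and TLS handling to your environment.
import requests

CLUSTER = "cluster.example.com"   # placeholder cluster name or IP
AUTH = ("root", "password")       # placeholder credentials

# The same resource the CLI called above: /platform/3/statistics/summary/client
url = f"https://{CLUSTER}:8080/platform/3/statistics/summary/client"
resp = requests.get(url, auth=AUTH, verify=False, timeout=15)
resp.raise_for_status()

# The response carries a "client" list, as seen in the debug output above.
for client in resp.json().get("client", []):
    print(client)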
Production-stopping issues on a cluster are often caused by a rogue process outside the OneFS environment that is as yet unknown to us, which means we have to figure out what that process is and what it is doing.
In walks isi statistics.
By using the isi statistics command, we can very quickly see what is happening on a cluster at any given time. It can give us live reports on which user or connection is causing an issue, how much I/O they're generating, what their IP address is, which protocol they're connected with, and so on.
If the cluster is experiencing a sudden slowdown (during a render, for example), we can run a couple of simple statistics commands to show us what the cluster is doing and who's hitting it the hardest. Some examples of these commands are as follows:
Displays all nodes’ real-time statistics in a *NIX “top” style format:
# isi statistics system --n=all --format=top Node CPU SMB FTP HTTP NFS HDFS S3 Total NetIn NetOut DiskIn DiskOut All 33.7% 0.0 0.0 0.0 0.0 0.0 0.0 0.0 401.6 215.6 0.0 0.0 1 33.7% 0.0 0.0 0.0 0.0 0.0 0.0 0.0 401.6 215.6 0.0 0.0
This command displays all clients connected and shows their stats, including the UserName they are connected with. It places the users with the highest number of total Ops at the top so that you can track down the user or account that is hitting the storage the hardest.
# isi statistics client --totalby=UserName --sort=Ops Ops In Out TimeAvg Node Proto Class UserName LocalName RemoteName ----------------------------------------------------------------------------- 12.8 12.6M 1.1k 95495.8 * * * root * * -----------------------------------------------------------------------------
This command goes a bit further and breaks down all of the Ops, by type, being requested by that user. If you know which protocol the user you’re investigating is connecting with, you can also add the “--proto=<nfs/smb>” operator to the command.
# isi statistics client --user-names=root --sort=Ops Ops In Out TimeAvg Node Proto Class UserName LocalName RemoteName ---------------------------------------------------------------------------------------------- 5.8 6.1M 487.2 142450.6 1 smb2 write root 192.168.134.101 192.168.134.1 2.8 259.2 332.8 497.2 1 smb2 file_state root 192.168.134.101 192.168.134.1 2.6 985.6 549.8 10255.1 1 smb2 create root 192.168.134.101 192.168.134.1 2.6 275.0 570.6 3357.5 1 smb2 namespace_read root 192.168.134.101 192.168.134.1 0.4 85.6 28.0 3911.5 1 smb2 namespace_write root 192.168.134.101 192.168.134.1 ----------------------------------------------------------------------------------------------
The other useful command, particularly when troubleshooting ad hoc performance issues, is isi statistics heat.
This command shows the top 10 file paths that are being hit by the largest number of I/O operations.
# isi statistics heat list --totalby=path --sort=Ops | head -12 Ops Node Event Class Path ---------------------------------------------------------------------------------------------------- 141.7 * * * /ifs/ 127.8 * * * /ifs/.ifsvar 86.3 * * * /ifs/.ifsvar/modules 81.7 * * * SYSTEM (0x0) 33.3 * * * /ifs/.ifsvar/modules/tardis 28.6 * * * /ifs/.ifsvar/modules/tardis/gconfig 28.3 * * * /ifs/.ifsvar/upgrade 13.1 * * * /ifs/.ifsvar/upgrade/logs/UpgradeLog-1.db 11.9 * * * /ifs/.ifsvar/modules/tardis/namespaces/healthcheck_schedules.sqlite 10.5 * * * /ifs/.ifsvar/modules/cloud
Once you have all this information, you can now find the user or process (based on IP, UserName, and so on) and figure out what that user is doing and what's causing the render to fail or high I/O generation. In many situations, it will be an asset that is either sitting on a lower-performance tier of the cluster or, if you're using a front side render cache, an asset that is sitting outside of the pre-cached path, so the spindles in the cluster are taking the I/O hit.
For more tips and tricks that can help to save you valuable time, keep checking back. In the meantime, if you have any questions, please feel free to get in touch and I'll do my best to help!
Author: Andy Copeland
Media & Entertainment Solutions Architect
Mon, 24 Jul 2023 19:16:34 -0000
|Read Time: 0 minutes
The OneFS key manager is a backend service that orchestrates the storage of sensitive information for PowerScale clusters. To satisfy Dell’s Secure Infrastructure Ready requirements and other public and private sector security mandates, the manager provides the ability to replace, or rekey, cryptographic keys.
The quintessential consumer of OneFS key management is data-at-rest encryption (DARE). Protecting sensitive data stored on the cluster with cryptography ensures that it’s guarded against theft, in the event that drives or nodes are removed from a PowerScale cluster. DARE is a requirement for federal and industry regulations, ensuring data is encrypted when it is stored. OneFS has provided DARE solutions for many years through secure encrypted drives (SEDs) and the OneFS key management system.
A 256-bit master key (MK) encrypts the Key Manager Database (KMDB) for SED and cluster domains. In OneFS 9.2 and later, the MK for SEDs can either be stored off-cluster on a KMIP server or locally on a node (the legacy behavior).
However, there are a variety of other consumers of the OneFS key manager, in addition to DARE. These include services and protocols such as:
Service | Description |
---|---|
CELOG | Cluster event log |
CloudPools | Cluster tier to cloud service |
Email | Electronic mail |
FTP | File transfer protocol |
IPMI | Intelligent platform management interface for remote cluster console access |
JWT | JSON web tokens |
NDMP | Network data management protocol for cluster backups and DR |
Pstore | Active directory and Kerberos password store |
S3 | S3 object protocol |
SyncIQ | Cluster replication service |
SmartSync | OneFS push and pull replication cluster and cloud replication service |
SNMP | Simple network monitoring protocol |
SRS | Secure Remote Services, the legacy Dell remote support connectivity for clusters |
SSO | Single sign-on |
SupportAssist | Remote cluster connectivity to Dell Support |
OneFS 9.5 introduces a number of enhancements to the venerable key manager. Chief among them, OneFS 9.5 now provides the ability to rekey the MK, irrespective of where it is stored.
Note that when you are upgrading from an earlier OneFS release, the new rekey functionality is only available once the OneFS 9.5 upgrade has been committed.
Under the hood, each provider store in the key manager consists of secure backend storage and an MK. Entries are kept in a SQLite database or key-value store. A provider datastore uses its MK to encrypt all its entries within the store.
During the rekey process, the old MK is only deleted after a successful re-encryption with the new MK. If for any reason the process fails, the old MK is available and remains as the current MK. The rekey daemon retries the rekey every 15 minutes if the process fails.
The OneFS rekey process is as follows:
1. A new MK is generated.
2. All entries in the provider store are re-encrypted with the new MK.
3. On success, the old MK is deleted; on failure, the old MK remains the current key and the rekey is retried every 15 minutes.
To support the rekey process, the MK in OneFS 9.5 now has an ID associated with it. All entries have a new field referencing the MK ID.
During the rekey operation, there are two MK values with different IDs, and all entries in the database will associate which key they are encrypted by.
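The Python below is purely a conceptual sketch (it is not OneFS code, and the toy XOR cipher exists only to keep the example self-contained and runnable), but it illustrates the sequence just described: each entry records the ID of the MK that encrypted it, everything is re-encrypted under a new MK, and the old MK is removed only once every entry has been migrated.
import secrets
from itertools import cycle

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Stand-in XOR cipher, used only to keep this conceptual sketch self-contained."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

toy_decrypt = toy_encrypt  # XOR is symmetric

def rekey_store(store: dict, keys: dict, old_key_id: str) -> str:
    """Re-encrypt every provider-store entry under a new master key (MK).
    Entries carry the ID of the MK that encrypted them, and the old MK is
    deleted only after every entry has been migrated successfully."""
    new_key_id = secrets.token_hex(8)
    keys[new_key_id] = secrets.token_bytes(32)      # new 256-bit MK
    try:
        for name, entry in store.items():
            plaintext = toy_decrypt(entry["blob"], keys[entry["mk_id"]])
            store[name] = {"blob": toy_encrypt(plaintext, keys[new_key_id]),
                           "mk_id": new_key_id}
    except Exception:
        return old_key_id        # failure: the old MK remains current (a daemon would retry later)
    del keys[old_key_id]         # success: the old MK can now be removed
    return new_key_id

# Example: one secret encrypted under the original MK, then rekeyed.
keys = {"mk-old": secrets.token_bytes(32)}
store = {"ftp-password": {"blob": toy_encrypt(b"s3cret", keys["mk-old"]), "mk_id": "mk-old"}}
current_key_id = rekey_store(store, keys, "mk-old")
print(current_key_id, store["ftp-password"]["mk_id"])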
In OneFS 9.5, rekey configuration and management are split between the cluster keys and the SED keys:
Rekey component | Detail |
---|---|
SED | Node-local rekey of the self-encrypting drive keys on a DARE cluster, managed with the isi keymanager sed CLI commands or the WebUI; can be run on demand or on a schedule |
Cluster | Cluster-wide rekey of the keystore domains (CELOG, CloudPools, SyncIQ, and so on), managed with the isi keymanager cluster CLI commands or the WebUI; can be run on demand or on a schedule |
The SED key manager rekey operation can be managed through a DARE cluster’s CLI or WebUI, and it can either be automatically scheduled or run manually on demand. The following CLI syntax can be used to manually initiate a rekey:
# isi keymanager sed rekey start
Alternatively, to schedule a rekey operation, for example, to schedule a key rotation every two months:
# isi keymanager sed rekey modify --key-rotation=2M
The key manager status for SEDs can be viewed as follows:
# isi keymanager sed status Node Status Location Remote Key ID Key Creation Date Error Info(if any) ----------------------------------------------------------------------------- 1 LOCAL Local 1970-01-01T00:00:00 ----------------------------------------------------------------------------- Total: 1
Alternatively, from the WebUI, go to Access > Key Management > SED/Cluster Rekey, select Automatic rekey for SED keys, and configure the rekey frequency:
Note that for SED rekey operations, if a migration from local cluster key management to a KMIP server is in progress, the rekey process will begin once the migration is complete.
As mentioned previously, OneFS 9.5 also supports the rekey of cluster keystore domains. This cluster rekey operation is available through the CLI and the WebUI and may either be scheduled or run on demand. The available cluster domains can be queried by running the following CLI syntax:
# isi keymanager cluster status Domain Status Key Creation Date Error Info(if any) ---------------------------------------------------------- CELOG ACTIVE 2023-04-06T09:19:16 CERTSTORE ACTIVE 2023-04-06T09:19:16 CLOUDPOOLS ACTIVE 2023-04-06T09:19:16 EMAIL ACTIVE 2023-04-06T09:19:16 FTP ACTIVE 2023-04-06T09:19:16 IPMI_MGMT IN_PROGRESS 2023-04-06T09:19:16 JWT ACTIVE 2023-04-06T09:19:16 LHOTSE ACTIVE 2023-04-06T09:19:11 NDMP ACTIVE 2023-04-06T09:19:16 NETWORK ACTIVE 2023-04-06T09:19:16 PSTORE ACTIVE 2023-04-06T09:19:16 RICE ACTIVE 2023-04-06T09:19:16 S3 ACTIVE 2023-04-06T09:19:16 SIQ ACTIVE 2023-04-06T09:19:16 SNMP ACTIVE 2023-04-06T09:19:16 SRS ACTIVE 2023-04-06T09:19:16 SSO ACTIVE 2023-04-06T09:19:16 ---------------------------------------------------------- Total: 17
The rekey process generates a new key and re-encrypts the entries for the domain. The old key is then deleted.
Performance-wise, the rekey process does consume cluster resources (CPU and disk) as a result of the re-encryption phase, which is fairly write-intensive. As such, a good practice is to perform rekey operations outside of core business hours or during scheduled cluster maintenance windows.
During the rekey process, the old MK is only deleted once a successful re-encryption with the new MK has been confirmed. In the event of a rekey process failure, the old MK is available and remains as the current MK.
A rekey may be requested immediately or may be scheduled with a cadence. The rekey operation is available through the CLI and the WebUI. In the WebUI, go to Access > Key Management > SED/Cluster Rekey.
To start a rekey of the cluster domains immediately, from the CLI run the following syntax:
# isi keymanager cluster rekey start Are you sure you want to rekey the master passphrase? (yes/[no]):yes
Alternatively, from the WebUI, go to the SED/Cluster Rekey tab under Access > Key Management, and click Rekey Now next to Cluster keys:
A scheduled rekey of the cluster keys (excluding the SED keys) can be configured from the CLI with the following syntax:
# isi keymanager cluster rekey modify --key-rotation <integer>[YMWDhms]
Specify the frequency of the Key Rotation field as an integer, using Y for years, M for months, W for weeks, D for days, h for hours, m for minutes, and s for seconds. For example, the following command will schedule the cluster rekey operation to run every six weeks:
# isi keymanager cluster rekey view Rekey Time: 1970-01-01T00:00:00 Key Rotation: Never # isi keymanager cluster rekey modify --key-rotation 6W # isi keymanager cluster rekey view Rekey Time: 2023-04-28T18:38:45 Key Rotation: 6W
The rekey configuration can be easily reverted back to on demand from a schedule as follows:
# isi keymanager cluster rekey modify --key-rotation Never # isi keymanager cluster rekey view Rekey Time: 2023-04-28T18:38:45 Key Rotation: Never
Alternatively, from the WebUI, under the SED/Cluster Rekey tab, select the Automatic rekey for Cluster keys checkbox and specify the rekey frequency. For example:
In the event of a rekey failure, a CELOG KeyManagerRekeyFailed or KeyManagerSedsRekeyFailed event is created. Because SED rekey is a node-local operation, the KeyManagerSedsRekeyFailed event information also includes which node experienced the failure.
Additionally, current cluster rekey status can also be queried with the following CLI command:
# isi keymanager cluster status Domain Status Key Creation Date Error Info(if any) ---------------------------------------------------------- CELOG ACTIVE 2023-04-06T09:19:16 CERTSTORE ACTIVE 2023-04-06T09:19:16 CLOUDPOOLS ACTIVE 2023-04-06T09:19:16 EMAIL ACTIVE 2023-04-06T09:19:16 FTP ACTIVE 2023-04-06T09:19:16 IPMI_MGMT ACTIVE 2023-04-06T09:19:16 JWT ACTIVE 2023-04-06T09:19:16 LHOTSE ACTIVE 2023-04-06T09:19:11 NDMP ACTIVE 2023-04-06T09:19:16 NETWORK ACTIVE 2023-04-06T09:19:16 PSTORE ACTIVE 2023-04-06T09:19:16 RICE ACTIVE 2023-04-06T09:19:16 S3 ACTIVE 2023-04-06T09:19:16 SIQ ACTIVE 2023-04-06T09:19:16 SNMP ACTIVE 2023-04-06T09:19:16 SRS ACTIVE 2023-04-06T09:19:16 SSO ACTIVE 2023-04-06T09:19:16 ---------------------------------------------------------- Total: 17
Or, for SED rekey status:
# isi keymanager sed status Node Status Location Remote Key ID Key Creation Date Error Info(if any) ----------------------------------------------------------------------------- 1 LOCAL Local 1970-01-01T00:00:00 2 LOCAL Local 1970-01-01T00:00:00 3 LOCAL Local 1970-01-01T00:00:00 4 LOCAL Local 1970-01-01T00:00:00 ----------------------------------------------------------------------------- Total: 4
The rekey process also outputs to the /var/log/isi_km_d.log file, which is a useful source for additional troubleshooting.
If an error in rekey occurs, the previous MK is not deleted, so entries in the provider store can still be created and read as normal. The key manager daemon will retry the rekey operation in the background every 15 minutes until it succeeds.
Author: Nick Trimbee
Mon, 24 Jul 2023 20:08:49 -0000
|Read Time: 0 minutes
Among the slew of security enhancements introduced in OneFS 9.5 is the ability to mandate a more stringent password policy. This is required to comply with security requirements such as the U.S. military STIG, which stipulates:
Requirement | Description |
---|---|
Length | An OS or network device must enforce a minimum 15-character password length. |
Percentage | An OS must require the change of at least 50% of the total number of characters when passwords are changed. |
Position | A network device must require that when a password is changed, the characters are changed in at least eight of the positions within the password. |
Temporary password | The OS must allow the use of a temporary password for system logons with an immediate change to a permanent password. |
The OneFS password security architecture can be summarized as follows:
Within the OneFS security subsystem, authentication is handled by LSASSD, the daemon used to service authentication requests for lwiod.
Component | Description |
---|---|
LSASSD | The local security authority subsystem service (LSASS) handles authentication and identity management as users connect to the cluster. |
File provider | The file provider includes users from /etc/passwd and groups from /etc/group. |
Local provider | The local provider includes local cluster accounts such as anonymous, guest, and so on. |
SSHD | The OpenSSH Daemon provides secure encrypted communications between a client and a cluster node over an insecure network. |
pAPI | The OneFS Platform API provides programmatic interfaces to OneFS configuration and management through a RESTful HTTPS service. |
In OneFS AIMA, there are several different kinds of backend providers: Local provider, file provider, AD provider, NIS provider, and so on. Each provider is responsible for the management of users and groups inside the provider. For OneFS password policy enforcement, the local and file providers are the focus.
The local provider is based on a SamDB-style file stored under the /ifs/.ifsvar prefix path, and its provider settings can be viewed with the following CLI syntax:
# isi auth local view System
On the other hand, the file provider is based on the FreeBSD spwd.db file, and its configuration can be viewed by the following CLI command:
# isi auth file view System
Each provider stores and manages its own users. For the local provider, the isi auth users create CLI command will create a user inside the provider by default. However, for the file provider, there is no corresponding command; instead, the OneFS pw CLI command can be used to create a new file provider user.
After the user is created, the isi auth users modify <USER> CLI command can be used to change the attributes of the user for both the file and local providers. However, not all attributes are supported for both providers. For example, the file provider does not support password expiry.
The fundamental password policy CLI changes introduced in OneFS 9.5 are as follows:
Operation | OneFS 9.5 change | Details |
---|---|---|
change-password | Modified | Now requires the old password, so that OneFS can calculate how many characters (and what percentage) have changed |
reset-password | Added | Generates a temporary password that meets the current password policy for the user to log in with |
set-password | Deprecated | Does not require the old password to be provided |
A user’s password can now be set, changed, and reset by either root or admin. This is supported by the new isi auth users change-password or isi auth users reset-password CLI command syntax. The latter, for example, returns a temporary password and requires the user to change it on next login. After logging in with the temporary (albeit secure) password, OneFS immediately forces the user to change it:
# whoami admin # isi auth users reset-password user1 4$_x\d\Q6V9E:sH # ssh user1@localhost (user1@localhost) Password: (user1@localhost) Your password has expired. You are required to immediately change your password. Changing password for user1 New password: (user1@localhost) Re-enter password: Last login: Wed May 17 08:02:47 from 127.0.0.1 PowerScale OneFS 9.5.0.0 # whoami user1
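For illustration, the sort of policy-compliant temporary password that reset-password returns can be approximated with a few lines of Python using the standard secrets module. This is just a sketch of generating a random string that satisfies a length and character-class policy; it is not the algorithm OneFS itself uses.
import secrets
import string

def generate_temp_password(length: int = 15) -> str:
    """Generate a random password containing upper, lower, digit, and symbol characters."""
    classes = [string.ascii_uppercase, string.ascii_lowercase, string.digits, string.punctuation]
    alphabet = "".join(classes)
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if all(any(ch in cls for ch in candidate) for cls in classes):
            return candidate

print(generate_temp_password())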
Also in OneFS 9.5 and later, the CLI isi auth local view system command sees the addition of four new fields: Password Chars Changed, Password Percent Changed, Password Hash Type, and Max Inactivity Days.
For example:
# isi auth local view system Name: System Status: active Authentication: Yes Create Home Directory: Yes Home Directory Template: /ifs/home/%U Lockout Duration: Now Lockout Threshold: 0 Lockout Window: Now Login Shell: /bin/zsh Machine Name: Min Password Age: Now Max Password Age: 4W Min Password Length: 0 Password Prompt Time: 2W Password Complexity: - Password History Length: 0 Password Chars Changed: 0 Password Percent Changed: 0 Password Hash Type: NTHash Max Inactivity Days: 0
The following CLI command syntax configures OneFS to require a minimum password length of 15 characters, a 50% or greater change, and 8 or more characters to be altered for a successful password reset:
# isi auth local modify system --min-password-length 15 --password-chars-changed 8 --password-percent-changed 50
Next, a command is issued to create a new user, user2, with a 10-character password:
# isi auth users create user2 --password 0123456789 Failed to add user user2: The specified password does not meet the configured password complexity or history requirements
This attempt fails because the password does not meet the configured password criteria (15 chars, 50% change, 8 chars to be altered).
Instead, the password for the new account, user2, is set to an appropriate value: 0123456789abcdef. Also, the --prompt-password-change flag is used to force the user to change their password on next login.
# isi auth users create user2 --password 0123456789abcdef --prompt-password-change 1
When the user logs in to the user2 account, OneFS immediately prompts for a new password. In the following example, a non-compliant password (012345678zyxw) is entered.
0123456789abcdef -> 012345678zyxw = Failure
This change attempt fails because the new password does not meet the 15-character minimum:
# su user2 New password: Re-enter password: The specified password does not meet the configured password complexity requirements. Your password must meet the following requirements: * Must contain at least 15 characters. * Must change at least 8 characters. * Must change at least 50% of characters. New password:
Instead, a compliant password and successful change could be:
0123456789abcdef -> 0123456zyxwvuts = Success
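The arithmetic behind these pass/fail results can be sketched in a few lines of Python. This is a simplified model that compares the old and new passwords position by position; the exact comparison OneFS performs internally may differ, and the thresholds simply mirror the values configured earlier.
def check_password_change(old: str, new: str,
                          min_length: int = 15,
                          min_chars_changed: int = 8,
                          min_percent_changed: int = 50) -> list:
    """Return a list of policy violations; an empty list means the change is allowed."""
    errors = []
    if len(new) < min_length:
        errors.append(f"Must contain at least {min_length} characters.")
    # Simplified position-wise comparison: positions that differ, plus any length difference.
    same = sum(1 for a, b in zip(old, new) if a == b)
    changed = max(len(old), len(new)) - same
    if changed < min_chars_changed:
        errors.append(f"Must change at least {min_chars_changed} characters.")
    if changed * 100 < min_percent_changed * len(new):
        errors.append(f"Must change at least {min_percent_changed}% of characters.")
    return errors

print(check_password_change("0123456789abcdef", "012345678zyxw"))    # fails: too short, too few changes
print(check_password_change("0123456789abcdef", "0123456zyxwvuts"))  # passes: returns []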
The following command can also be used to change the password for a user. For example, to update user2’s password:
# isi auth users change-password user2 Current password (hit enter if none): New password: Confirm new password:
If a non-compliant password is entered, the following error is returned:
Password change failed: The specified password does not meet the configured password complexity or history requirements
When employed, OneFS hardening automatically enforces security-based configurations. The hardening engine is profile-based, and its STIG security profile is predicated on security mandates specified in the U.S. Department of Defense (DoD) Security Requirements Guides (SRGs) and Security Technical Implementation Guides (STIGs).
On applying the STIG hardening security profile to a cluster (isi hardening apply --profile=STIG), the password policy settings are automatically reconfigured to the following values:
Field | Normal value | STIG hardened |
---|---|---|
Lockout Duration | Now | Now |
Lockout Threshold | 0 | 3 |
Lockout Window | Now | 15m |
Min Password Age | Now | 1D |
Max Password Age | 4W | 8W4D |
Min Password Length | 0 | 15 |
Password Prompt Time | 2W | 2W |
Password Complexity | - | lowercase, numeric, repeat, symbol, uppercase |
Password History Length | 0 | 5 |
Password Chars Changed | 0 | 8 |
Password Percent Changed | 0 | 50 |
Password Hash Type | NTHash | SHA512 |
Max Inactivity Days | 0 | 35 |
For example:
# uname -or Isilon OneFS 9.5.0.0 # isi hardening list Name Description Status --------------------------------------------------- STIG Enable all STIG security settings Applied --------------------------------------------------- Total: 1 # isi auth local view system Name: System Status: active Authentication: Yes Create Home Directory: Yes Home Directory Template: /ifs/home/%U Lockout Duration: Now Lockout Threshold: 3 Lockout Window: 15m Login Shell: /bin/zsh Machine Name: Min Password Age: 1D Max Password Age: 8W4D Min Password Length: 15 Password Prompt Time: 2W Password Complexity: lowercase, numeric, repeat, symbol, uppercase Password History Length: 5 Password Chars Changed: 8 Password Percent Changed: 50 Password Hash Type: SHA512 Max Inactivity Days: 35
Note that Password Hash Type is changed from the default NTHash to the more secure SHA512 encoding, in addition to setting the various password criteria.
The OneFS 9.5 WebUI also sees several additions and alterations to the Password policy page. These include:
Operation | OneFS 9.5 change | Details |
---|---|---|
Policy page | Added | New Password policy page under Access > Membership and roles |
reset-password | Added | Generates a random password that meets current password policy for user to log in |
The most obvious change is the transfer of the policy configuration elements from the local provider page to a new dedicated Password policy page.
Here’s the OneFS 9.4 View a local provider page, under Access > Authentication providers > Local providers > System:
This is replaced and augmented in the OneFS 9.5 WebUI with the following page, located under Access > Membership and roles > Password policy:
New password policy configuration options are included to require uppercase, lowercase, numeric, or special characters and limit the number of contiguous repeats of a character, and so on.
When it comes to changing a password, only a permitted user can make their change. This can be performed from a couple of locations in the WebUI. First, the user options on the task bar at the top of each screen now provides a Change password option:
A pop-up warning message will also be displayed by the WebUI, informing the user when password expiration is imminent. This warning provides a Change Password link:
Clicking on the Change Password link displays the following page:
A new password complexity tool-tip message is also displayed, informing the user of safe password selection.
Note that re-login is required after a password change.
On the Users page under Access > Membership and roles > Users, the Action drop-down list now also contains a Reset Password option:
The successful reset confirmation pop-up offers both a show and copy option, while informing the cluster administrator to share the new password with the user, and for them to change their password during their next login:
The Create user page now provides an additional field that requires password confirmation. Additionally, the password complexity tool-tip message is also displayed:
The redesigned Edit user details page no longer provides a field to edit the password directly:
Instead, the Action drop-down list on the Users page now contains a Reset Password option.
Author: Nick Trimbee
Thu, 20 Jul 2023 18:27:32 -0000
|Read Time: 0 minutes
In the first article in this series, we took a look at the architecture of the new OneFS WebUI SSO functionality. Now, we move on to its provisioning and setup.
SSO on PowerScale can be configured through either the OneFS WebUI or CLI. OneFS 9.5 debuts a new dedicated WebUI SSO configuration page under Access > Authentication Providers > SSO. Alternatively, for command line aficionados, the CLI now includes a new isi auth sso command set.
Here is the overall configuration flow:
1. Upgrade to OneFS 9.5
First, ensure the cluster is running OneFS 9.5 or a later release. If upgrading from an earlier OneFS version, note that the SSO service requires this upgrade to be committed prior to configuration and use.
Next, configure an SSO administrator. In OneFS, this account requires at least one of the following privileges:
Privilege | Description |
---|---|
ISI_PRIV_LOGIN_PAPI | Required for the admin to use the OneFS WebUI to administer SSO |
ISI_PRIV_LOGIN_SSH | Required for the admin to use the OneFS CLI through SSH to administer SSO |
ISI_PRIV_LOGIN_CONSOLE | Required for the admin to use the OneFS CLI on the serial console to administer SSO |
The user account used for identity provider management should have an associated email address configured.
2. Setup Identity Provider
OneFS SSO also requires a suitable identity provider (IdP), such as ADFS, to be provisioned and available before setup.
ADFS can be configured through either the Windows GUI or command shell, and detailed information on the deployment and configuration of ADFS can be found in the Microsoft Windows Server documentation.
The Windows remote desktop utility (RDP) can be used to provision, connect to, and configure an ADFS server. For example, the following PowerShell defines an issuance authorization rule that permits all authenticated users:
$AuthRules = @" @RuleTemplate="AllowAllAuthzRule" => issue(Type = "http://schemas.microsoft.com/authorization/claims/permit", Value="true"); "@
or from the ADFS UI:
$TransformRules = @" @RuleTemplate = "LdapClaims" @RuleName = "LDAP mail" c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/ windowsaccountname", Issuer == "AD AUTHORITY"] => issue(store = "Active Directory", types = ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/ emailaddress"), query = ";mail;{0}", param = c.Value); @RuleTemplate = "MapClaims" @RuleName = "NameID" c:[Type == "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"] => issue(Type = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/ nameidentifier", Issuer = c. Issuer, OriginalIssuer = c.OriginalIssuer, Value = c.Value, ValueType = c.ValueType, Properties["http://schemas.xmlsoap.org/ws/2005/05/identity / claimproperties/format"] = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"); "@
The relying party trust for the cluster can then be created using these rule sets, referencing the cluster’s SAML metadata URL:
Add-AdfsRelyingPartyTrust -Name <cluster-name> -MetadataUrl "https://<cluster-node-ip>:8080/session/1/saml/metadata" -IssuanceAuthorizationRules $AuthRules -IssuanceTransformRules $TransformRules
or from Windows Server Manager:
3. Select Access Zone
Because OneFS SSO is zone-aware, the next step involves choosing the access zone to configure. Go to Access > Authentication providers > SSO, select an access zone (that is, the system zone), and click Add IdP.
Note that each of a cluster’s access zones must have an IdP configured for it. The same IdP can be used for all the zones, but each access zone must be configured separately.
4. Add IdP Configuration
In OneFS 9.5 and later, the WebUI SSO configuration is a wizard-driven, “guided workflow” process involving the following steps:
First, go to Access > Authentication providers > SSO, select an access zone (that is, the system zone), and then click Add IdP.
On the Add Identity Provider page, enter a unique name for the IdP. For example, Isln-IdP1 in this case:
When done, click Next, select the default Upload metadata XML option, and browse to the XML file downloaded from the ADFS system:
Alternatively, if the preference is to enter the information by hand, select Manual entry and complete the configuration form fields:
If the manual entry method is selected, you must have the IdP certificate ready to upload. With the manual entry option, the following information is required:
Field | Description |
---|---|
Binding | Select POST or Redirect binding. |
Entity ID | Unique identifier of the IdP as configured on the IdP. For example: http://idp1.isilon.com/adfs/services/trust |
Login URL | Log in endpoint for the IdP. For example: http://idp1.isilon.com/adfs/ls/ |
Logout URL | Log out endpoint for the IdP. For example: http://idp1.example.com/adfs/ls/ |
Signing Certificate | Provide the PEM encoded certificate obtained from the IdP. This certificate is required to verify messages from the IdP. |
Upload the IdP certificate:
For example:
Repeat this step for each access zone in which SSO is to be configured.
When complete, click Next to move on to the service provider configuration step.
5. Configure Service Provider
On the Service Provider page, confirm that the current access zone is carried over from the previous page.
Select Metadata download or Manual copy, depending on the chosen method of entering OneFS details about this service provider (SP) to the IdP.
Provide the hostname or IP address for the SP for the current access zone.
Click Generate to create the information (metadata) about OneFS and this access zone for use in configuring the IdP.
This generated information can now be used to configure the IdP (in this case, Windows ADFS) to accept requests from PowerScale as the SP and its configured access zone.
As shown, the WebUI page provides two methods for obtaining the information:
Method | Action |
---|---|
Metadata download | Download the XML file that contains the signing certificate, etc. |
Manual copy | Select Copy Link in the lower half of the form to copy the information to the IdP. |
Next, download the Signing Certificate.
When completed, click Next to finish the configuration.
6. Enable SSO and Verify Operation
Once the IdP and SP are configured, a cluster admin can enable SSO per access zone through the OneFS WebUI by going to Access > Authentication providers > SSO. From here, select the access zone and select the toggle to enable SSO:
Or from the OneFS CLI, use the following syntax:
# isi auth sso settings modify --sso-enabled 1
Author: Nick Trimbee
Thu, 20 Jul 2023 16:32:13 -0000
|Read Time: 0 minutes
The Security Assertion Markup Language (SAML) is an open standard for sharing security information about identity, authentication, and authorization across different systems. SAML is implemented using the Extensible Markup Language (XML) standard for sharing data. The SAML framework enables single sign-on (SSO), which in turn allows users to log in once, and their login credential can be reused to authenticate with and access other different service providers. It defines several entities including end users, service providers, and identity providers, and is used to manage identity information. For example, the Windows Active Directory Federation Services (ADFS) is one of the ubiquitous identity providers for SAML contexts.
Entity | Description |
---|---|
End user | Requires authentication prior to being allowed to use an application. |
Identity provider (IdP) | Performs authentication and passes the user's identity and authorization level to the service provider—for example, ADFS. |
Service provider (SP) | Trusts the identity provider and authorizes the given user to access the requested resource. With SAML 2.0, a PowerScale cluster is a service provider. |
SAML Assertion | XML document that the identity provider sends to the service provider that contains the user authorization. |
OneFS 9.5 introduces SAML-based SSO for the WebUI to provide a more convenient authentication method, in addition to meeting the security compliance requirements for federal and enterprise customers. In OneFS 9.5, the WebUI’s initial login page has been redesigned to support SSO and, when enabled, a new Log in with SSO button is displayed on the login page under the traditional username and password text boxes. For example:
OneFS SSO is also zone-aware in support of multi-tenant cluster configurations. As such, a separate IdP can be configured independently for each OneFS access zone.
Under the hood, OneFS SSO employs the following high-level architecture:
In OneFS 9.5, the SSO operates through HTTP REDIRECT and POST bindings, with the cluster acting as the service provider.
There are three different types of SAML Assertions—authentication, attribute, and authorization decision.
SAML SSO works by transferring the user’s identity from one place (the identity provider) to another (the service provider). This is done through an exchange of digitally signed XML documents.
A SAML Request, also known as an authentication request, is generated by the service provider to “request” an authentication.
A SAML Response is generated by the identity provider and contains the actual assertion of the authenticated user. In addition, a SAML Response may contain additional information, such as user profile information and group/role information, depending on what the service provider can support. Note that the service provider never directly interacts with the identity provider, with a browser acting as the agent facilitating any redirections.
Because SAML authentication is asynchronous, the service provider does not maintain the state of any authentication requests. As such, when the service provider receives a response from an identity provider, the response must contain all the necessary information.
The general flow is as follows:
When OneFS redirects a user to the configured IdP for login, it makes an HTTP GET request (SAMLRequest), instructing the IdP that the cluster is attempting to perform a login (SAMLAuthnRequest). When the user successfully authenticates, the IdP responds back to OneFS with an HTTP POST containing an HTML form (SAMLResponse) that indicates whether the login was successful, who logged in, plus any additional claims configured on the IdP.
On receiving the SAMLResponse, OneFS verifies the signature using the IdP’s public key (its X.509 signing certificate) to ensure that the response really came from its trusted IdP and that none of the contents have been tampered with. OneFS then extracts the identity of the user, along with any other pertinent attributes. At this point, the user is redirected back to the OneFS WebUI dashboard (landing page), as if logged into the site manually.
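For a concrete feel of what travels over the wire in the first half of this exchange, the following Python sketch builds an (unsigned) SAML AuthnRequest and packages it for the HTTP-Redirect binding: raw-DEFLATE compression, then base64, then URL encoding into a SAMLRequest query parameter. The issuer, entity IDs, and IdP login URL are placeholders for illustration, and this is not the code OneFS itself runs.
import base64
import datetime
import urllib.parse
import uuid
import zlib

IDP_LOGIN_URL = "https://idp1.isilon.com/adfs/ls/"                         # IdP login endpoint (placeholder)
SP_ENTITY_ID = "https://cluster.example.com:8080/session/1/saml/metadata"  # placeholder SP entity ID

authn_request = (
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
    f'ID="_{uuid.uuid4().hex}" Version="2.0" '
    f'IssueInstant="{datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")}">'
    f"<saml:Issuer>{SP_ENTITY_ID}</saml:Issuer>"
    "</samlp:AuthnRequest>"
)

# HTTP-Redirect binding: raw DEFLATE (no zlib header), then base64, then URL-encode.
deflater = zlib.compressobj(9, zlib.DEFLATED, -15)
deflated = deflater.compress(authn_request.encode()) + deflater.flush()
saml_request = base64.b64encode(deflated).decode()

print(IDP_LOGIN_URL + "?SAMLRequest=" + urllib.parse.quote_plus(saml_request))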
In the next article in this series, we’ll take a detailed look at the following procedure to deploy SSO on a PowerScale cluster:
Author: Nick Trimbee
Thu, 20 Jul 2023 16:23:21 -0000
|Read Time: 0 minutes
Another of the core security enhancements introduced in OneFS 9.5 is the ability to enforce strict user account security policies. This is required for compliance with both private and public sector security mandates. For example, the account policy restriction requirements expressed within the U.S. military STIG requirements stipulate:
Requirement | Description |
---|---|
Delay | The OS must enforce a delay of at least 4 seconds between logon prompts following a failed logon attempt. |
Disable | The OS must disable account identifiers (individuals, groups, roles, and devices) after 35 days of inactivity. |
Limit | The OS must limit the number of concurrent sessions to ten for all accounts and/or account types. |
To directly address these security edicts, OneFS 9.5 adds the following account policy restriction controls:
Account policy function | Details |
---|---|
Delay after failed login | Enforces a configurable delay, in seconds, before another logon attempt can be made following a failed authentication attempt |
Disable inactive accounts | Automatically disables local accounts that have been inactive for a configurable number of days |
Concurrent session limit | Limits the number of concurrent administrative sessions (SSH, serial console, WebUI, and platform API) per user |
OneFS provides a variety of access mechanisms for administering a cluster. These include SSH, serial console, WebUI, and platform API, all of which use different underlying access methods. The serial console and SSH are standard FreeBSD third-party applications and are accounted for per node, whereas the WebUI and pAPI use HTTP module extensions to facilitate access to the system and services and are accounted for cluster-wide. Before OneFS 9.5, there was no common mechanism to represent or account for sessions across these disparate applications.
Under the hood, the OneFS account security policy framework encompasses the following high-level architecture:
With SSH, there’s no explicit or reliable “log-off” event sent to OneFS, beyond actually disconnecting the connection. As such, accounting for active sessions can be problematic and unreliable, especially when connections time out or unexpectedly disconnect. However, OneFS does include an accounting database that stores records of system activities like user login and logout, which can be queried to determine active SSH sessions. Each active SSH connection has an isi_ssh_d process owned by the account associated with it, and this information can be gathered via standard syscalls. OneFS enumerates the number of SSHD processes per account to calculate the total number of active established sessions. This value is then used as part of the total concurrent administrative sessions limit. Since SSH only supports user access through the system zone, there is no need for any zone-aware accounting.
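As a rough approximation of this SSH accounting, the Python sketch below counts session daemon processes per owning user from ps output and flags any user over a limit. It is only an illustration of the enumeration concept on a FreeBSD-like system, not the mechanism OneFS itself uses, and the process names and the limit of 10 are assumptions for the example.
import subprocess
from collections import Counter

def ssh_sessions_per_user() -> Counter:
    """Approximate active SSH sessions by counting SSH daemon processes per user."""
    out = subprocess.run(["ps", "-axo", "user,comm"],
                         capture_output=True, text=True, check=True).stdout
    sessions = Counter()
    for line in out.splitlines()[1:]:                # skip the USER COMMAND header line
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() in ("sshd", "isi_ssh_d"):
            sessions[fields[0]] += 1
    return sessions

LIMIT = 10                                            # assumed concurrent administrative session limit
for user, count in ssh_sessions_per_user().items():
    status = "over limit" if count > LIMIT else "ok"
    print(f"{user}: {count} session(s) [{status}]")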
The WebUI and platform API use JSON web tokens (JWTs) for authenticated sessions. OneFS stores the JWTs in the cluster-wide kvstore, and access policy uses valid session tokens in the kvstore to account for active sessions when a user logs on through the WebUI or pAPI. When the user logs off, the associated token is removed, and a message is sent to JWT service with an explicit log off notification. If a session times out or disconnects, the JWT service will not get an event, but the tokens have a limited, short lifespan, and any expired tokens are purged from the list on a scheduled basis in conjunction with the JWT timer. OneFS enumerates the unique session IDs associated with each user’s JWT tokens in the kvstore to get a number of active WebUI and pAPI sessions to use as part of user’s session limit check.
For serial console access accounting, the process table will have information when an STTY connection is active, and OneFS extrapolates user data from it to determine the session count, similar to ssh with a syscall for process data. There is an accounting database that stores records of system activities like user login and logout, which is also queried for active console sessions. Serial console access is only from the system zone, so there is no need for zone-aware accounting.
An API call retrieves user session data from the process table and kvstore to calculate the number of active sessions per user. As such, the checking and enforcement of session limits is performed in a similar manner to the verification of user privileges for SSH, serial console, or WebUI access.
OneFS 9.5 provides the ability to enforce a configurable delay period, specified in seconds. After an unsuccessful authentication attempt, the user is denied the ability to reconnect to the cluster until the configured delay period has passed. The login delay period is defined through the FailedLoginDelayTime global attribute and, by default, OneFS is configured for no delay, with a FailedLoginDelayTime value of 0. When a cluster is placed into hardened mode with the STIG policy enacted, the delay value is automatically set to 4 seconds. Note that the delay happens in the lsass client, so the authentication service itself is not affected.
The configured failed login delay time limit can be viewed with following CLI command:
# isi auth settings global view Send NTLMv2: No Space Replacement: Workgroup: WORKGROUP Provider Hostname Lookup: disabled Alloc Retries: 5 User Object Cache Size: 47.68M On Disk Identity: native RPC Block Time: Now RPC Max Requests: 64 RPC Timeout: 30s Default LDAP TLS Revocation Check Level: none System GID Threshold: 80 System UID Threshold: 80 Min Mapped Rid: 2147483648 Group UID: 4294967292 Null GID: 4294967293 Null UID: 4294967293 Unknown GID: 4294967294 Unknown UID: 4294967294 Failed Login Delay Time: Now Concurrent Session Limit: 0
Similarly, the following syntax will configure the failed login delay time to a value of 4 seconds:
# isi auth settings global modify --failed-login-delay-time 4s # isi auth settings global view | grep -i delay Failed Login Delay Time: 4s
Similarly, when a cluster is put into STIG hardened mode, the concurrent session limit is automatically set to 10:
# isi auth settings global view | grep -i session Concurrent Session Limit: 10
The delay time after login failure can also be configured from the WebUI under Access > Settings > Global provider settings:
The valid range of the FailedLoginDelayTime global attribute is from 0 to 65535 seconds, and the delay is applied on the same cluster node where the failed login occurred.
Note that this failed login delay applies only to administrative logins.
In OneFS 9.5, any user account that has been inactive for a configurable duration can be automatically disabled. Administrative intervention is required to re-enable a deactivated user account. A user’s last activity time is determined by their previous logon, and a timer runs at midnight each night, during which all accounts deemed inactive are disabled. If a user’s last logon record is unavailable or stale, the timestamp at which the account was enabled is used as their last activity instead. If inactivity tracking is turned on after a user’s last logon (or account-enabled) time, the time at which tracking was enabled is used as the start of the inactivity period.
This feature is disabled by default in OneFS, and all users are exempt from inactivity tracking until configured otherwise. Individual accounts can be included in, or exempted from, inactivity tracking through the user-specific DisableWhenInactive attribute. For example:
# isi auth user view user1 | grep -i inactive Disable When Inactive: Yes # isi auth user modify user1 --disable-when-inactive 0 # isi auth user view user1 | grep -i inactive Disable When Inactive: No
If a cluster is put into STIG hardened mode, the value for the MaxInactivityDays parameter is automatically reconfigured to 35, meaning a user will be disabled after 35 days of inactivity. All the local users are removed from exemption when in STIG hardened mode.
Note that this functionality is limited to only the local provider and does not apply to file providers.
The inactive account disabling configuration can be viewed from the CLI with the following syntax. In this example, the MaxInactivityDays attribute is configured for 35 days:
# isi auth local view system Name: System Status: active Authentication: Yes Create Home Directory: Yes Home Directory Template: /ifs/home/%U Lockout Duration: Now Lockout Threshold: 0 Lockout Window: Now Login Shell: /bin/zsh Machine Name: Min Password Age: Now Max Password Age: 4W Min Password Length: 15 Password Prompt Time: 2W Password Complexity: - Password History Length: 0 Password Chars Changed: 8 Password Percent Changed: 50 Password Hash Type: NTHash Max Inactivity Days: 35
Inactive account disabling can also be configured from the WebUI under Access > Authentication providers > Local provider:
The valid range of the MaxInactivityDays parameter is from 0 to UINT_MAX. As such, the following CLI syntax will configure the maximum number of days a user account can be inactive before it will be disabled to 10 days:
# isi auth local modify system --max-inactivity-days 10
# isi auth local view system | grep -i inactiv
 Max Inactivity Days: 10
Setting this value to 0 days will disable the feature:
# isi auth local modify system --max-inactivity-days 0
# isi auth local view system | grep -i inactiv
 Max Inactivity Days: 0
Inactive account disabling, as well as password expiry, can also be configured granularly, per user account. For example, user1 has the Disable When Inactive setting at its default value of No:
# isi auth users view user1 Name: user1 DN: CN=user1,CN=Users,DC=GLADOS DNS Domain: - Domain: GLADOS Provider: lsa-local-provider:System Sam Account Name: user1 UID: 2000 SID: S-1-5-21-1839173366-2940572996-2365153926-1000 Enabled: Yes Expired: No Expiry: - Locked: No Email: - GECOS: - Generated GID: No Generated UID: No Generated UPN: Yes Primary Group ID: GID:1800 Name: Isilon Users Home Directory: /ifs/home/user1 Max Password Age: 4W Password Expired: No Password Expiry: 2023-06-15T17:45:55 Password Last Set: 2023-05-18T17:45:55 Password Expired: No Last Logon: - Shell: /bin/zsh UPN: user1@GLADOS User Can Change Password: Yes Disable When Inactive: No
The following CLI command will activate the account inactivity disabling setting and enable password expiry for the user1 account:
# isi auth users modify user1 --disable-when-inactive Yes --password-expires Yes
Inactive account disabling can also be configured from the WebUI under Access > Membership and roles > Users > Providers:
OneFS 9.5 can limit the number of administrative sessions active on a OneFS cluster node, and all WebUI, SSH, pAPI, and serial console sessions are accounted for when calculating the session limit. The SSH and console session count is node-local, whereas WebUI and pAPI sessions are tracked cluster-wide. As such, the formula used to calculate a node’s total active sessions is as follows:
Total active user sessions on a node = Total WebUI and pAPI sessions across the cluster + Total SSH and Console sessions on the node
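For example, assuming a configured concurrent session limit of 5 for a given user, the per-node arithmetic works out as follows (the numbers are purely illustrative):
 WebUI + pAPI sessions across the cluster: 3
 SSH + console sessions on the node: 2
 Total active sessions on the node: 3 + 2 = 5
In this scenario, a further login attempt by that user on this node would be rejected, whereas a new SSH session on another node with no local SSH or console sessions would still be within the limit.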
This feature leverages the cluster-wide session management through JWT for calculating the total number of sessions on a cluster’s node. By default, OneFS 9.5 has no configured limit, and the Concurrent Session Limit parameter has a value of 0. For example:
# isi auth settings global view Send NTLMv2: No Space Replacement: Workgroup: WORKGROUP Provider Hostname Lookup: disabled Alloc Retries: 5 User Object Cache Size: 47.68M On Disk Identity: native RPC Block Time: Now RPC Max Requests: 64 RPC Timeout: 30s Default LDAP TLS Revocation Check Level: none System GID Threshold: 80 System UID Threshold: 80 Min Mapped Rid: 2147483648 Group UID: 4294967292 Null GID: 4294967293 Null UID: 4294967293 Unknown GID: 4294967294 Unknown UID: 4294967294 Failed Login Delay Time: Now Concurrent Session Limit: 0
The following CLI syntax will configure Concurrent Session Limit to a value of 5:
# isi auth settings global modify --concurrent-session-limit 5
# isi auth settings global view | grep -i concur
 Concurrent Session Limit: 5
Once the session limit has been exceeded, attempts to connect, in this case as root through SSH, will be met with the following Access denied error message:
login as: root Keyboard-interactive authentication prompts from server: | Password: End of keyboard-interactive prompts from server Access denied password:
The concurrent sessions limit can also be configured from the WebUI under Access > Settings > Global provider settings:
However, when a cluster is put into STIG hardening mode, the concurrent session limit is automatically set to a maximum of 10 sessions.
Note that this maximum session limit is only applicable to administrative logins.
Disabling an account after a period of inactivity requires a SQLite database update every time a user successfully logs on to the OneFS cluster. After a successful logon, the logon time is recorded in the database, which is later used to compute the inactivity period.
Inactivity tracking is disabled by default in OneFS 9.5 but can easily be enabled by configuring the MaxInactivityDays attribute to a non-zero value. When inactivity tracking is enabled and many users are not exempt from it, a burst of logons within a short period of time can generate a significant number of SQLite database requests. However, OneFS consolidates multiple database updates during user logon into a single commit to minimize the overall load.
When it comes to troubleshooting OneFS account security policy configuration, there are a handful of main logfiles to check, chief among them lsassd.log.
For additional reporting detail, debug level logging can be enabled on the lsassd.log file with the following CLI command:
# /usr/likewise/bin/lwsm set-log-level lsass - debug
When finished, logging can be returned to the regular error level:
# /usr/likewise/bin/lwsm set-log-level lsass - error
Author: Nick Trimbee
Wed, 19 Jul 2023 18:16:59 -0000
|Read Time: 0 minutes
Of the many changes in OneFS 9.5, the most exciting are the performance enhancements on the NVMe-based PowerScale nodes: F900 and F600. These performance increases are the result of some significant under-the-hood changes to OneFS. In the lead-up to the National Association of Broadcasters show last April, I wanted to quantify how much of a difference the extra performance would make for Adobe Premiere Pro video editing workflows. Adobe is one of Dell’s biggest media software partners, and Premiere Pro is crucial to all sorts of media production, from broadcast to cinema.
The awesome news is that the changes to OneFS make a big difference. I saw 40% more video streams with the software upgrade: up to 140 streams of UHD ProRes422 from a single F900 node!
Broadly speaking, there were changes to three areas in OneFS that resulted in the performance boost in version 9.5. These areas are L2 cache, backend networking, and prefetch.
L2 cache -- Being smart about how and when to bypass L2 cache and read directly from NVMe is one part of the OneFS 9.5 performance story. PowerScale OneFS clusters maintain a globally accessible L2 cache for all nodes in the cluster. Manipulating L2 cache can be “expensive” computationally speaking. During a read, the cluster needs to determine what data is in cache, whether the read should be added to cache, and what data should be expired from cache. NVMe storage is so performant that bypassing the L2 cache and reading data directly from NVMe frees up cluster resources. Doing so results in even faster reads on nodes that support it.
Backend networking -- OneFS uses a private backend network for internode communication. With the massive performance of NVMe-based storage and the introduction of 100 GbE, limits were being reached on this private network. OneFS 9.5 gets around these limitations with a custom multichannel approach (similar in concept to nconnect from the NFS world, for the Linux folks out there). In OneFS 9.5, the connection channels on the backend network are bonded in a carefully orchestrated way to parallelize some aspects while still keeping a predictable message ordering.
Prefetch -- The last part of the performance boost for OneFS 9.5 comes from improved file prefetch. How OneFS prefetches file system metadata was reworked to more optimally read ahead at the different depths of the metadata tree. Efficiency was improved and “jitter” between file system processes minimized.
First, a little background on PowerScale and OneFS. PowerScale is the updated name for the Isilon product line. The new PowerScale nodes are based on Dell servers with compute, RAM, networking, and storage. PowerScale is a scale-out, clustered network-attached storage (NAS) solution. To build a OneFS file system, PowerScale nodes are joined to create a cluster. The cluster presents a single NAS file system with the aggregate resources of all the nodes in the cluster. Client systems connect using a DNS name, and OneFS SmartConnect balances client connections across the various nodes. No matter which node the client connects to, that client has the potential to access all the data on the entire cluster. Further, the client systems benefit from all the nodes acting in concert.
Even before the performance enhancements in OneFS 9.5, the NVMe-based PowerScale nodes were speedy, so a robust lab environment was going to be needed to stress the system. For this particular set of tests, I had access to 16 workstations running the latest version of Adobe Premiere Pro 2023. Each workstation ran Windows 10 with Nvidia GPU, Intel processor, and 10 GbE networking. On the storage side, the tests were performed against a minimum sized 3-node F900 PowerScale cluster with 100 GbE networking.
Adobe Premiere Pro excels at compressed video editing. The trick with compressed video is that an individual client workstation will get overwhelmed long before the storage system. As such, it is critical to evaluate whether any dropped frames are the result of storage or an overwhelmed workstation. A simple test is to take a single workstation and start playing back parallel compressed video streams, such as ProRes 422. Keeping a close watch on the workstation performance monitors, at a certain point CPU and GPU usage will spike and frames will drop. This test will show the maximum number of streams that a single workstation can handle. Because this test is all about storage performance, keeping the number of streams per workstation to a healthy range takes individual workstation performance out of the equation.
I settled on 10x streams of ProRes 422 UHD video running at 30 frames per second per workstation. Each individual video stream was ~70 MBps (560 Mbps). Running ten of these streams meant each workstation was pulling around 700 MBps (though with Premiere Pro prefetching this number was closer to 800 MBps). With this number of video streams, the workstation wasn’t working too hard, and it was well within what would fit down a 10 GbE network pipe.
Running some quick math here, 16 workstations each pulling 800-ish MBps works out to about 12.5 GBps of total throughput. That is not enough to overwhelm even a small 3-node F900 cluster. In order to stress the system, all 16 workstations were manually pointed to a single 100 GbE port on a single F900 node. Due to the clustered nature of OneFS, the clients still benefit from the entire cluster. But even with the rest of the cluster behind it, at a certain point a single F900 node is going to get overwhelmed.
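Spelling out the back-of-the-envelope numbers from the preceding paragraphs:
 1 stream of UHD ProRes 422 ≈ 70 MBps (560 Mbps)
 10 streams per workstation ≈ 700 MBps (~800 MBps with Premiere Pro prefetch)
 16 workstations x ~800 MBps ≈ 12,800 MBps, or roughly 12.5 GBps aggregate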
Figure 1. OneFS Lab configuration
The first step was to import test media for playback. Each workstation accessed its own unique set of 10x one-hour long UHD ProRes422 clips. Then a separate Premiere Pro project was created for each workstation with 10 simultaneous layers of video. The plan was to start playback one by one on each workstation and see where the tipping point was for that single PowerScale F900 node. The test was to be run first with OneFS 9.4 and then with OneFS 9.5.
Adobe Premiere Pro has a debug overlay called DogEars. In addition to showing dropped frames, DogEars provides some useful metrics about how “healthy” video playback is in Premiere Pro. Even before a system starts to drop frames, latency spikes and low prefetch buffers show when Premiere Pro is struggling to sustain playback.
The metrics in DogEars that I was focused on were the following:
Dropped frames: This metric is obvious: dropped frames are unacceptable. However, at times Premiere Pro will show single-digit dropped frames at playback start.
FramePrefetchLatency: This metric only shows up during playback. The latency starts high while the prefetch frame buffer is filling. When that buffer gets up to slightly over 300 frames, the latency drops down to around 20 to 30 milliseconds. When the storage system was overwhelmed, this prefetch latency goes well above 30 milliseconds and stays there.
CompleteAheadOfPlay: This metric also only shows up during playback. The number of frames creeps up during playback and settles in at slightly over 300 prefetched frames. The FramePrefetchLatency above will be high (in the 100ms range or so) until the 300 frames are prefetched, at which point the latency will drop down to 30ms or lower. When the storage system is stressed, Premiere Pro is never able to fill this prefetch buffer, and it never gets up to the 300+ frames.
Figure 2. Premiere Pro with Dogears overlay
With the test environment configured and the individual projects loaded, it was time to see what the system could provide.
With the PowerScale cluster running OneFS 9.4, playback was initiated on each Adobe Premiere workstation. Keep in mind that all the workstations were artificially pointed to a single node in this 3-node F900 cluster. That single F900 node running OneFS 9.4 could handle 10x of the workstations, each playing back 10x UHD streams. That’s 100x streams of UHD ProRes 422 video from one node. Not too shabby.
At 110x streams (11 workstations), no frames were dropped, but the CompleteAheadOfPlay number on all the workstations started to go below 300. Also, the FramePreFetchLatency spiked to over 100 milliseconds. Clearly, the storage node was unable to provide more performance.
After reproducing these results several times to confirm accuracy, we unmounted the storage from each workstation and upgraded the F900 cluster to OneFS 9.5. Time to see how much of a difference the OneFS 9.5 performance boost would make for Premiere Pro.
As before, each workstation loaded a unique project with unique ProRes media. At 100x streams of video, playback chugged along fine. Time to load up additional streams and see where things break. 110, 120, 130, 140… playback from the single F900 node continued to chug along with no drops and acceptable latency. It was only at 150 streams of video that playback began to suffer. By this time, that single F900 node was pumping close to 10 GBps out of that single 100 GbE NIC port. These 14x workstations were not entirely saturating the connection, but they were getting close. And the performance was a 40% bump over the OneFS 9.4 numbers. Impressive.
Figure 3. isi statistics output with 140 streams of video from a single node
These results exceeded my expectations going into the project. Getting a 40% performance boost with a code upgrade on existing hardware is impressive. This increase lined up with some of the benchmarking tools used by engineering. But performance from a benchmark tool and performance from a real-world application are often two entirely different things. Benchmark tools are particularly unreliable proxies for video playback, where small increases in latency can produce unacceptable results. Because Adobe Premiere is one of the most widely used applications with PowerScale storage, it made sense as a test platform to gauge these differences. For more information about PowerScale storage and media, check out https://Dell.to/media.
Author: Gregory Shiff
Wed, 19 Jul 2023 18:19:27 -0000
|Read Time: 0 minutes
In my role as technical lead for media workflows at Dell Technologies, I’m fortunate to partner with companies making some of the best tools for creatives. FilmLight is undeniably one of those companies. Baselight by FilmLight is used in the highest end of feature film production. I was eager to put the latest all-flash PowerScale OneFS nodes to the test and see how those storage nodes could support Baselight workflows. I’m pleased to say that PowerScale supports Baselight very well, and I’m able to share best practices for integrating PowerScale into Baselight environments.
Baselight is a color grading and image-processing system that is widely used in cinematic production. Traditionally, Baselight DI workflows are the domain of SAN or block storage. The journey towards supporting modern DI workflows on PowerScale started with OneFS’s support of NFS-over-RDMA. Using the RDMA protocol with PowerScale all flash storage allows for high throughput workflows that are unobtainable with TCP. Using RDMA for media applications is well documented in the blog and white paper: NFS over RDMA for Media Workflows.
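For reference, an NFSv3-over-RDMA mount from a Linux client generally looks something like the following. The cluster name, export path, and mount point are placeholders, and the definitive mount options for Baselight should be taken from the white paper referenced above rather than from this sketch:
# mount -t nfs -o vers=3,proto=rdma,port=20049,rsize=1048576,wsize=1048576 cluster:/ifs/media /mnt/media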
With successful RDMA testing on other color correction software complete, I was confident that we could add Baselight to the list of supported platforms. The time seemed ripe, and FilmLight agreed to work with us on getting it done. In partnership with the FilmLight team in LA, we got Baselight One up and running in the Seattle media lab.
FilmLightOS already has a driver installed that supports RDMA for the NIC in the workstation. This made configuration easy, because no additional software had to be installed to support the protocol (at least in our case). While RDMA remains the best choice for using PowerScale with Baselight, not all networks can support RDMA. The good news here is that there is another option: nconnect.
The Linux distribution that Baselight runs on also supports the NFS nconnect mount option. Nconnect allows for multiple TCP connections between the Baselight client and the PowerScale storage. Testing with nconnect demonstrated enough throughput to support 8K uncompressed playback from PowerScale. While RDMA is preferred, it is not an absolute requirement.
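Where RDMA is not an option, a comparable TCP mount using nconnect might look like the following. Again, the names and the channel count of 8 are illustrative; nconnect requires a reasonably recent Linux kernel and is capped at 16 connections per mount:
# mount -t nfs -o vers=3,nconnect=8,rsize=1048576,wsize=1048576 cluster:/ifs/media /mnt/media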
With the storage mounted and performing as expected, we set about adjusting Baselight threads and DirectIO settings to optimize the interaction of Baselight and PowerScale. The results of this testing showed that increasing BaseLight’s thread count to 16 improved performance. (These threads were unrelated to the nconnect connections mentioned above.) DirectIO is a mechanism that bypasses some caching layers in Linux. DirectIO improved Baselight’s write performance and degraded read performance. Thankfully, Baselight is flexible enough to selectively enable DirectIO only for writes.
PowerScale is an easy win for Baselight One. However, Baselight also comes in other variations: Baselight Two and Baselight X. These versions of Baselight have separate processing nodes and host UI devices to tackle the most challenging workflows. These Baselight systems share configuration files, which can cause issues with how the storage is mounted on the processing nodes as compared to the host UI nodes. When using RDMA, the processing nodes will use an RDMA mount while the host UI will use TCP. Working with the FilmLight team in LA, changes were made to support separate mount options for the processing nodes vs. the host UI node.
Getting to know Baselight and partnering with FilmLight on this project was highly satisfying. It would not have been easy to understand the finer intricacies of how Baselight interacts with storage without their help (the rendering and caching mechanisms within Baselight are awesome).
For more details about how to use PowerScale with Baselight, check out the full white paper: PowerScale OneFS: Baselight by FilmLight Best Practices and Configuration.
For more information, and the latest content on Dell Media and Entertainment storage solutions, visit us online.
Tue, 27 Jun 2023 20:37:27 -0000
|Read Time: 0 minutes
Complementary to the restricted shell itself, which was covered in the previous article in this series, OneFS 9.5 also sees the addition of a new log viewer, plus a recovery shell option.
The new isi_log_access CLI utility enables an SSH user to read, page, and query the log files in the /var/log directory. The ability to run this tool is governed by the user’s role being granted the ISI_PRIV_SYS_SUPPORT role-based access control (RBAC) privilege.
OneFS RBAC is used to explicitly limit who has access to the range of cluster configurations and operations. This granular control allows for crafting of administrative roles, which can create and manage the various OneFS core components and data services, isolating each to specific security roles or to admin only, and so on.
In this case, a cluster security administrator selects the access zone, creates a zone-aware role within it, assigns the ISI_PRIV_SYS_SUPPORT privilege for isi_log_access use, and then assigns users to the role.
Note that the integrated OneFS AuditAdmin RBAC role does not contain the ISI_PRIV_SYS_SUPPORT privilege by default. Also, the integrated RBAC roles cannot be reconfigured:
# isi auth roles modify AuditAdmin --add-priv=ISI_PRIV_SYS_SUPPORT The privileges of built-in role AuditAdmin cannot be modified
Therefore, the ISI_PRIV_SYS_SUPPORT role has to be added to a custom role.
For example, the following CLI syntax adds the user usr_admin_restricted to the rl_ssh role and adds the privilege ISI_PRIV_SYS_SUPPORT to the rl_ssh role:
# isi auth roles modify rl_ssh --add-user=usr_admin_restricted # isi auth roles modify rl_ssh --add-priv=ISI_PRIV_SYS_SUPPORT # isi auth roles view rl_ssh Name: rl_ssh Description: - Members: u_ssh_restricted u_admin_restricted Privileges ID: ISI_PRIV_LOGIN_SSH Permission: r ID: ISI_PRIV_SYS_SUPPORT Permission: r
The usr_admin_restricted user could also be added to the AuditAdmin role:
# isi auth roles modify AuditAdmin --add-user=usr_admin_restricted # isi auth roles view AuditAdmin | grep -i member Members: usr_admin_restricted
The isi_log_access tool supports the following command options and arguments:
Option | Description |
---|---|
--grep | Match a pattern against the file and display on stdout |
--help | Display the command description and usage message |
--list | List all the files in the /var/log tree |
--less | Display the file on stdout with a pager in secure_mode |
--more | Display the file on stdout with a pager in secure_mode |
--view | Display the file on stdout |
--watch | Display the end of the file and new content as it is written |
--zgrep | Match a pattern against the unzipped file contents and display on stdout |
--zview | Display an unzipped version of the file on stdout |
Here the u_admin_restricted user logs in over SSH and runs the isi_log_access utility to list the contents of the /var/log directory:
# ssh u_admin_restricted@10.246.178.121 (u_admin_restricted@10.246.178.121) Password: Last login: Wed May 3 18:02:18 2023 from 10.246.159.107 Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved. Copyright (c) 1992-2018 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. PowerScale OneFS 9.5.0.0 Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout # isi_log_access –list LAST MODIFICATION TIME SIZE FILE Mon Apr 10 14:22:18 2023 56 alert.log Fri May 5 00:30:00 2023 62 all.log Fri May 5 00:30:00 2023 99 all.log.0.gz Fri May 5 00:00:00 2023 106 all.log.1.gz Thu May 4 00:30:00 2023 100 all.log.2.gz Thu May 4 00:00:00 2023 107 all.log.3.gz Wed May 3 00:30:00 2023 99 all.log.4.gz Wed May 3 00:00:00 2023 107 all.log.5.gz Tue May 2 00:30:00 2023 100 all.log.6.gz Mon Apr 10 14:22:18 2023 56 audit_config.log Mon Apr 10 14:22:18 2023 56 audit_protocol.log Fri May 5 17:23:53 2023 82064 auth.log Sat Apr 22 12:09:31 2023 10750 auth.log.0.gz Mon Apr 10 15:31:36 2023 0 bam.log Mon Apr 10 14:22:18 2023 56 boxend.log Mon Apr 10 14:22:18 2023 56 bwt.log Mon Apr 10 14:22:18 2023 56 cloud_interface.log Mon Apr 10 14:22:18 2023 56 console.log Fri May 5 18:20:32 2023 23769 cron Fri May 5 15:30:00 2023 8803 cron.0.gz Fri May 5 03:10:00 2023 9013 cron.1.gz Thu May 4 15:00:00 2023 8847 cron.2.gz Fri May 5 03:01:02 2023 3012 daily.log Fri May 5 00:30:00 2023 101 daily.log.0.gz Fri May 5 00:00:00 2023 1201 daily.log.1.gz Thu May 4 00:30:00 2023 102 daily.log.2.gz Thu May 4 00:00:00 2023 1637 daily.log.3.gz Wed May 3 00:30:00 2023 101 daily.log.4.gz Wed May 3 00:00:00 2023 1200 daily.log.5.gz Tue May 2 00:30:00 2023 102 daily.log.6.gz Mon Apr 10 14:22:18 2023 56 debug.log Tue Apr 11 12:29:37 2023 3694 diskpools.log Fri May 5 03:01:00 2023 244566 dmesg.today Thu May 4 03:01:00 2023 244662 dmesg.yesterday Tue Apr 11 11:49:32 2023 788 drive_purposing.log Mon Apr 10 14:22:18 2023 56 ethmixer.log Mon Apr 10 14:22:18 2023 56 gssd.log Fri May 5 00:00:35 2023 41641 hardening.log Mon Apr 10 15:31:05 2023 17996 hardening_engine.log Mon Apr 10 14:22:18 2023 56 hdfs.log Fri May 5 15:51:28 2023 31359 hw_ata.log Fri May 5 15:51:28 2023 56527 hw_da.log Mon Apr 10 14:22:18 2023 56 hw_nvd.log Mon Apr 10 14:22:18 2023 56 idi.log
In addition to parsing an entire log file with the more and less flags, the isi_log_access utility can also be used to watch (that is, tail) a log. For example, the /var/log/messages log file:
% isi_log_access --watch messages 2023-05-03T18:00:12.233916-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access', 'messages'], which returned 2. 2023-05-03T18:00:23.759198-04:00 <1.5> h7001-2(id2) limited[68236]: Calling ['/usr/bin/isi_log_access']. 2023-05-03T18:00:23.797928-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access'], which returned 0. 2023-05-03T18:00:36.077093-04:00 <1.5> h7001-2(id2) limited[68236]: Calling ['/usr/bin/isi_log_access', '--help']. 2023-05-03T18:00:36.119688-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access', '--help'], which returned 0. 2023-05-03T18:02:14.545070-04:00 <1.5> h7001-2(id2) limited[68236]: Command not in list of allowed commands. 2023-05-03T18:02:50.384665-04:00 <1.5> h7001-2(id2) limited[68594]: Calling ['/usr/bin/isi_log_access', '--list']. 2023-05-03T18:02:50.440518-04:00 <1.5> h7001-2(id2) limited[68594]: Called ['/usr/bin/isi_log_access', '--list'], which returned 0. 2023-05-03T18:03:13.362411-04:00 <1.5> h7001-2(id2) limited[68594]: Command not in list of allowed commands. 2023-05-03T18:03:52.107538-04:00 <1.5> h7001-2(id2) limited[68738]: Calling ['/usr/bin/isi_log_access', '--watch', 'messages'].
As expected, the last few lines of the messages log file are displayed. These log entries include the command audit entries for the u_admin_restricted user running the isi_log_access utility with the --help, --list, and --watch arguments.
The isi_log_access utility also allows zipped log files to be read (--zview) or searched (--zgrep) without uncompressing them. For example, to find all the usr_admin entries in the zipped vmlog.0.gz file:
# isi_log_access --zgrep usr_admin vmlog.0.gz 0.0 64468 usr_admin_restricted /usr/local/bin/zsh 0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8) 0.0 64468 usr_admin_restricted /usr/local/bin/zsh 0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8) 0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd) 0.0 64331 root sshd: usr_admin_restricted [priv] (sshd) 0.0 64468 usr_admin_restricted /usr/local/bin/zsh 0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8) 0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd) 0.0 64331 root sshd: usr_admin_restricted [priv] (sshd) 0.0 64468 usr_admin_restricted /usr/local/bin/zsh 0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8) 0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd) 0.0 64331 root sshd: usr_admin_restricted [priv] (sshd) 0.0 64468 usr_admin_restricted /usr/local/bin/zsh 0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8) 0.0 64342 usr_admin_restricted sshd: u_admin_restricted@pts/3 (sshd) 0.0 64331 root sshd: usr_admin_restricted [priv] (sshd)
The purpose of the recovery shell is to allow a restricted shell user to access a regular UNIX shell and its associated command set, if needed. As such, the recovery shell is primarily designed and intended for reactive cluster recovery operations and other unforeseen support issues. Note that the isi_recovery_shell CLI command can only be run, and the recovery shell entered, from within the restricted shell.
The ISI_PRIV_RECOVERY_SHELL privilege is required for a user to elevate their shell from restricted to recovery. The following syntax can be used to add this privilege to a role, in this case the rl_ssh role:
% isi auth roles modify rl_ssh --add-priv=ISI_PRIV_RECOVERY_SHELL % isi auth roles view rl_ssh Name: rl_ssh Description: - Members: usr_ssh_restricted usr_admin_restricted Privileges ID: ISI_PRIV_LOGIN_SSH Permission: r ID: ISI_PRIV_SYS_SUPPORT Permission: r ID: ISI_PRIV_RECOVERY_SHELL Permission: r
However, note that the --restricted-shell-enabled security parameter must be set to true before a user with the ISI_PRIV_RECOVERY_SHELL privilege can enter the recovery shell. For example:
% isi security settings view | grep -i restr
 Restricted shell Enabled: No
% isi security settings modify --restricted-shell-enabled=true
% isi security settings view | grep -i restr
 Restricted shell Enabled: Yes
The restricted shell user must enter the cluster’s root password to successfully enter the recovery shell. For example:
% isi_recovery_shell -h Description: This command is used to enter the Recovery shell i.e. normal zsh shell from the PowerScale Restricted shell. This command is supported only in the PowerScale Restricted shell. Required Privilege: ISI_PRIV_RECOVERY_SHELL Usage: isi_recovery_shell [{--help | -h}]
If the root password is entered incorrectly, the following error is displayed:
% isi_recovery_shell Enter 'root' credentials to enter the Recovery shell Password: Invalid credentials. isi_recovery_shell: PAM Auth Failed
A successful recovery shell launch is as follows:
$ ssh u_admin_restricted@10.246.178.121 (u_admin_restricted@10.246.178.121) Password: Last login: Thu May 4 17:26:10 2023 from 10.246.159.107 Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved. Copyright (c) 1992-2018 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. PowerScale OneFS 9.5.0.0 Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout % isi_recovery_shell Enter 'root' credentials to enter the Recovery shell Password: %
At this point, regular shell/UNIX commands (including the vi editor) are available again:
% whoami u_admin_restricted % pwd /ifs/home/u_admin_restricted
% top | head -n 10 last pid: 65044; load averages: 0.12, 0.24, 0.29 up 24+04:17:23 18:38:39 118 processes: 1 running, 117 sleeping CPU: 0.1% user, 0.0% nice, 0.9% system, 0.1% interrupt, 98.9% idle Mem: 233M Active, 19G Inact, 2152K Laundry, 137G Wired, 60G Buf, 13G Free Swap: PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 3955 root 1 -22 r30 50M 14M select 24 142:28 0.54% isi_drive_d 5715 root 20 20 0 231M 69M kqread 5 55:53 0.15% isi_stats_d 3864 root 14 20 0 81M 21M kqread 16 133:02 0.10% isi_mcp
The specifics of the recovery shell (ZSH) for the u_admin_restricted user are reported as follows:
% printenv $SHELL _=/usr/bin/printenv PAGER=less SAVEHIST=2000 HISTFILE=/ifs/home/u_admin_restricted/.zsh_history HISTSIZE=1000 OLDPWD=/ifs/home/u_admin_restricted PWD=/ifs/home/u_admin_restricted SHLVL=1 LOGNAME=u_admin_restricted HOME=/ifs/home/u_admin_restricted RECOVERY_SHELL=TRUE TERM=xterm PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin
Shell logic conditions and scripts can be run. For example:
% while true; do uptime; sleep 5; done 5:47PM up 24 days, 3:26, 5 users, load averages: 0.44, 0.38, 0.34 5:47PM up 24 days, 3:26, 5 users, load averages: 0.41, 0.38, 0.34
ISI commands can be run, and cluster management tasks can be performed.
% isi hardening list Name Description Status --------------------------------------------------- STIG Enable all STIG security settings Not Applied --------------------------------------------------- Total: 1
For example, creating and deleting a snapshot:
% isi snap snap list ID Name Path ------------ ------------ Total: 0 % isi snap snap create /ifs/data % isi snap snap list ID Name Path -------------------- 2 s2 /ifs/data -------------------- Total: 1 % isi snap snap delete 2 Are you sure? (yes/[no]): yes
Sysctls can be read and managed:
% sysctl efs.gmp.group efs.gmp.group: <10539754> (4) :{ 1:0-14, 2:0-12,14,17, 3-4:0-14, smb: 1-4, nfs: 1-4, all_enabled_protocols: 1-4, isi_cbind_d: 1-4, lsass: 1-4, external_connectivity: 1-4 }
The restricted shell can be disabled:
% isi security settings modify --restricted-shell-enabled=false % isi security settings view | grep -i restr Restricted shell Enabled: No
However, the isi underscore (isi_*) commands, such as isi_for_array, are still not permitted to run:
% /usr/bin/isi_for_array -s uptime zsh: permission denied: /usr/bin/isi_for_array % isi_gather_info zsh: permission denied: isi_gather_info % isi_cstats isi_cstats: Syscall ifs_prefetch_lin() failed: Operation not permitted
When finished, the user can either end the session entirely with the logout command or quit the recovery shell through exit and return to the restricted shell:
% exit Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout %
Author: Nick Trimbee
Tue, 27 Jun 2023 19:59:59 -0000
|Read Time: 0 minutes
In contrast to many other storage appliances, PowerScale has always included an extensive, rich, and capable command line, drawing from its FreeBSD heritage. As such, it incorporates a choice of full UNIX shells (that is, ZSH), the ability to script in a variety of languages (Perl, Python, and so on), full data access, a variety of system and network management and monitoring tools, plus the comprehensive OneFS isi command set. However, what is a bonus for usability can also present a risk from a security point of view.
With this in mind, among the bevy of security features that debuted in OneFS 9.5 release is the addition of a restricted shell for the CLI. This shell heavily curtails access to cluster command line utilities, eliminating areas where commands and scripts could be run and files modified maliciously and unaudited.
The new restricted shell can help both public and private sector organizations to meet a variety of regulatory compliance and audit requirements, in addition to reducing the security threat surface when OneFS is administered.
Written in Python, the restricted shell constrains users to a tight subset of the commands available in the regular OneFS command line shells, plus a couple of additional utilities. These include:
CLI utility | Description |
---|---|
ISI commands | The isi or “isi space” commands, such as isi status, and so on. For the full set of isi commands, run isi --help. |
Shell commands | The supported shell commands include clear, exit, logout, and CTRL+D. |
Log access | The isi_log_access tool can be used if the user possesses the ISI_PRIV_SYS_SUPPORT privilege. |
Recovery shell | The recovery shell isi_recovery_shell can be used if the user possesses the ISI_PRIV_RECOVERY_SHELL privilege and the security setting Restricted shell Enabled is configured to true. |
For a OneFS CLI command to be audited, its handler needs to call through the platform API (pAPI). This occurs with the regular isi commands but not necessarily with the “isi underscore” commands such as isi_for_array, and so on. While some of these isi_* commands write to log files, there is no uniform or consistent auditing or logging.
On the data access side, /ifs file system auditing works through the various OneFS protocol heads (NFS, SMB, S3, and so on). So if the CLI is used with an unrestricted shell to directly access and modify /ifs, any access and changes are unrecorded and unaudited.
In OneFS 9.5, the new restricted shell is included in the permitted shells list (/etc/shells):
# grep -i restr /etc/shells /usr/local/restricted_shell/bin/restricted_shell.py
It can be easily set for a user through the CLI. For example, to configure the admin account to use the restricted shell, instead of its default of ZSH:
# isi auth users view admin | grep -i shell Shell: /usr/local/bin/zsh # isi auth users modify admin --shell=/usr/local/restricted_shell/bin/restricted_shell.py # isi auth users view admin | grep -i shell Shell: /usr/local/restricted_shell/bin/restricted_shell.py
OneFS can also be configured to limit non-root users to just the restricted shell:
# isi security settings view | grep -i restr
 Restricted shell Enabled: No
# isi security settings modify --restricted-shell-enabled=true
# isi security settings view | grep -i restr
 Restricted shell Enabled: Yes
The underlying configuration changes to support this include only allowing non-root users with approved shells in /etc/shells to log in through the console or SSH and having just /usr/local/restricted_shell/bin/restricted_shell.py in the /etc/shells config file.
Note that no users’ shells are changed when the configuration commands above are enacted. If users are intended to have shell access, their login shell must be changed before they can log in. Users will also require the privileges ISI_PRIV_LOGIN_SSH and/or ISI_PRIV_LOGIN_CONSOLE to be able to log in through SSH and the console, respectively.
While the WebUI in OneFS 9.5 does not provide a restricted shell configuration page, the restricted shell can be enabled from the platform API, in addition to the CLI. The pAPI security settings now include a restricted_shell_enabled key, which can be enabled by setting its value to 1, from its default of 0.
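For example, the setting could be toggled with a direct pAPI call along the following lines. Treat this as a sketch: the endpoint version prefix and exact URI vary by release, and <cluster> and <version> are placeholders rather than definitive values:
# curl -k -u admin -X PUT -H "Content-Type: application/json" -d '{"restricted_shell_enabled": true}' "https://<cluster>:8080/platform/<version>/security/settings"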
Be aware that, upon configuring a OneFS 9.5 cluster to run in hardened mode with the STIG profile (that is, isi hardening enable STIG), the restricted-shell-enabled security setting is automatically set to true. This means that only root and users with ISI_PRIV_LOGIN_SSH and/or ISI_PRIV_LOGIN_CONSOLE privileges and the restricted shell as their shell will be permitted to log in to the cluster. We will focus on OneFS security hardening in a future article.
So let’s take a look at some examples of the restricted shell’s configuration and operation.
First, we log in as the admin user and modify the file and local auth provider password hash types to the more secure SHA512 from their default value of NTHash:
# ssh 10.244.34.34 -l admin
# isi auth file view System | grep -i hash
 Password Hash Type: NTHash
# isi auth local view System | grep -i hash
 Password Hash Type: NTHash
# isi auth file modify System --password-hash-type=SHA512
# isi auth local modify System --password-hash-type=SHA512
Note that a cluster’s default user admin uses role-based access control (RBAC), whereas root does not. As such, the root account should be used as infrequently as possible and, ideally, considered solely as the account of last resort.
Next, the admin and root passwords are changed to generate new passwords using the SHA512 hash:
# isi auth users change-password root # isi auth users change-password admin
An rl_ssh role is created and the SSH access privilege is added to it:
# isi auth roles create rl_ssh
# isi auth roles modify rl_ssh --add-priv=ISI_PRIV_LOGIN_SSH
Then a regular user (usr_ssh_restricted) and an admin user (usr_admin_restricted) are created, each with the restricted shell as their login shell:
# isi auth users create usr_ssh_restricted --shell=/usr/local/restricted_shell/bin/restricted_shell.py --set-password
# isi auth users create usr_admin_restricted --shell=/usr/local/restricted_shell/bin/restricted_shell.py --set-password
We then assign roles to the new users. For the restricted SSH user, we add to our newly created rl_ssh role:
# isi auth roles modify rl_ssh --add-user=usr_ssh_restricted
The admin user is then added to the security admin and the system admin roles:
# isi auth roles modify SecurityAdmin --add-user=usr_admin_restricted
# isi auth roles modify SystemAdmin --add-user=usr_admin_restricted
Next, we connect to the cluster through SSH and authenticate as the usr_ssh_restricted user:
$ ssh usr_ssh_restricted@10.246.178.121 (usr_ssh_restricted@10.246.178.121) Password: Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved. Copyright (c) 1992-2018 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. PowerScale OneFS 9.5.0.0 Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout %
This account has no cluster RBAC privileges beyond SSH access, so it cannot run the various isi commands. For example, attempting to run isi status returns no data and instead warns of the need for event, job engine, and statistics privileges:
% isi status Cluster Name: h7001 __ *** Capacity and health information require *** *** the privilege: ISI_PRIV_STATISTICS. *** Critical Events: *** Requires the privilege: ISI_PRIV_EVENT. *** Cluster Job Status: __ *** Requires the privilege: ISI_PRIV_JOB_ENGINE. *** Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout %
Similarly, standard UNIX shell commands, such as pwd and whoami, are also prohibited:
% pwd Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout % whoami Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout
Indeed, without additional OneFS RBAC privileges, the only commands the usr_ssh_restricted user can actually run in the restricted shell are clear, exit, and logout:
Note that the restricted shell automatically logs out an inactive session after a short period of inactivity.
Next, we log in with the usr_admin_restricted account:
$ ssh usr_admin_restricted@10.246.178.121 (usr_admin_restricted@10.246.178.121) Password: Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved. Copyright (c) 1992-2018 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. PowerScale OneFS 9.5.0.0 Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout %
The isi commands now work because the user has the SecurityAdmin and SystemAdmin roles and privileges:
% isi auth roles list Name --------------- AuditAdmin BackupAdmin BasicUserRole SecurityAdmin StatisticsAdmin SystemAdmin VMwareAdmin rl_console rl_ssh --------------- Total: 9 Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout % isi auth users view usr_admin_restricted Name: usr_admin_restricted DN: CN=usr_admin_restricted,CN=Users,DC=H7001 DNS Domain: - Domain: H7001 Provider: lsa-local-provider:System Sam Account Name: usr_admin_restricted UID: 2003 SID: S-1-5-21-3745626141-289409179-1286507423-1003 Enabled: Yes Expired: No Expiry: - Locked: No Email: - GECOS: - Generated GID: No Generated UID: No Generated UPN: Yes Primary Group ID: GID:1800 Name: Isilon Users Home Directory: /ifs/home/usr_admin_restricted Max Password Age: 4W Password Expired: No Password Expiry: 2023-05-30T17:16:53 Password Last Set: 2023-05-02T17:16:53 Password Expires: Yes Last Logon: - Shell: /usr/local/restricted_shell/bin/restricted_shell.py UPN: usr_admin_restricted@H7001 User Can Change Password: Yes Disable When Inactive: No Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout %
However, the OneFS “isi underscore” commands are not supported under the restricted shell. For example, attempting to use the isi_for_array command:
% isi_for_array -s uname -a Allowed commands are clear ... isi ... isi_recovery_shell ... isi_log_access ... exit logout
Note that, by default, the SecurityAdmin and SystemAdmin roles do not grant the usr_admin_restricted user the privileges needed to run the new isi_log_access and isi_recovery_shell commands.
In the next article in this series, we’ll take a look at these associated isi_log_access and isi_recovery_shell utilities that are also introduced in OneFS 9.5.
Author: Nick Trimbee
Thu, 25 May 2023 14:41:59 -0000
|Read Time: 0 minutes
In the final blog in this series, we’ll focus on step five of the OneFS firewall provisioning process and turn our attention to some of the management and monitoring considerations and troubleshooting tools associated with the firewall.
One can manage and monitor the firewall in OneFS 9.5 using the CLI, platform API, or WebUI. Because data security threats come from inside an environment as well as out, such as from a rogue IT employee, a good practice is to constrain the use of all-powerful ‘root’, ‘administrator’, and ‘sudo’ accounts as much as possible. Instead of granting cluster admins full rights, a preferred approach is to use OneFS’ comprehensive authentication, authorization, and accounting framework.
OneFS role-based access control (RBAC) can be used to explicitly limit who has access to configure and monitor the firewall. A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members. For example, from the WebUI under Access > Membership and roles > Roles:
When these members log in to the cluster from a configuration interface (WebUI, platform API, or CLI), they inherit their assigned privileges.
Accessing the firewall from the WebUI and CLI in OneFS 9.5 requires the new ISI_PRIV_FIREWALL administration privilege.
# isi auth privileges -v | grep -i -A 2 firewall
          ID: ISI_PRIV_FIREWALL
 Description: Configure network firewall
        Name: Firewall
    Category: Configuration
  Permission: w
This privilege can be assigned one of four permission levels for a role, including:
Permission Indicator | Description |
– | No permission. |
R | Read-only permission. |
X | Execute permission. |
W | Write permission. |
By default, the built-in ‘SystemAdmin’ role is granted write permission to administer the firewall, while the built-in ‘AuditAdmin’ role has read permission to view the firewall configuration and logs.
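These defaults can be confirmed per role using the same kind of query shown for the custom roles below. For example:
# isi auth roles view SystemAdmin | grep -A2 -i firewall
        ID: ISI_PRIV_FIREWALL
Permission: w
# isi auth roles view AuditAdmin | grep -A2 -i firewall
        ID: ISI_PRIV_FIREWALL
Permission: r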
With OneFS RBAC, an enhanced security approach for a site could be to create two additional roles on a cluster, each with an increasing realm of trust. For example:
1. An IT ops/helpdesk role with ‘read’ access to the firewall attributes would permit monitoring and troubleshooting of the firewall, but no configuration changes:
RBAC Role | Firewall Privilege | Permission |
IT_Ops | ISI_PRIV_FIREWALL | Read |
For example:
# isi auth roles create IT_Ops
# isi auth roles modify IT_Ops --add-priv-read ISI_PRIV_FIREWALL
# isi auth roles view IT_Ops | grep -A2 -i firewall
        ID: ISI_PRIV_FIREWALL
Permission: r
2. A Firewall Admin role would provide full firewall configuration and management rights:
RBAC Role | Firewall Privilege | Permission |
FirewallAdmin | ISI_PRIV_FIREWALL | Write |
For example:
# isi auth roles create FirewallAdmin
# isi auth roles modify FirewallAdmin --add-priv-write ISI_PRIV_FIREWALL
# isi auth roles view FirewallAdmin | grep -A2 -i firewall
        ID: ISI_PRIV_FIREWALL
Permission: w
Note that when configuring OneFS RBAC, remember to remove the ‘ISI_PRIV_AUTH’ and ‘ISI_PRIV_ROLE’ privileges from all but the most trusted administrators.
Additionally, enterprise security management tools such as CyberArk can also be incorporated to manage authentication and access control holistically across an environment. These can be configured to change passwords on trusted accounts frequently (every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.
When working with the OneFS firewall, there are some upper bounds to the configurable attributes to keep in mind. These include:
Name | Value | Description |
MAX_INTERFACES | 500 | Maximum number of L2 interfaces including Ethernet, VLAN, LAGG interfaces on a node. |
MAX_SUBNETS | 100 | Maximum number of subnets within a OneFS cluster |
MAX_POOLS | 100 | Maximum number of network pools within a OneFS cluster |
DEFAULT_MAX_RULES | 100 | Default value of maximum rules within a firewall policy |
MAX_RULES | 200 | Upper limit of maximum rules within a firewall policy |
MAX_ACTIVE_RULES | 5000 | Upper limit of total active rules across the whole cluster |
MAX_INACTIVE_POLICIES | 200 | Maximum number of policies that are not applied to any network subnet or pool. They will not be written into ipfw tables. |
Be aware that, while the OneFS firewall can greatly enhance the network security of a cluster, by nature of its packet inspection and filtering activity, it does come with a slight performance penalty (generally less than 5%).
If OneFS STIG hardening (that is, from ‘isi hardening apply’) is applied to a cluster with the OneFS firewall disabled, the firewall will be automatically activated. On the other hand, if the firewall is already enabled, then there will be no change and it will remain active.
Some OneFS services allow the TCP/UDP ports on which the daemon listens to be changed. These include:
Service | CLI Command | Default Port |
NDMP | isi ndmp settings global modify --port | 10000 |
S3 | isi s3 settings global modify --https-port | 9020, 9021 |
SSH | isi ssh settings modify --port | 22 |
The default ports for these services are already configured in the associated global policy rules. For example, for the S3 protocol:
# isi network firewall rules list | grep s3 default_pools_policy.rule_s3 55 Firewall rule on s3 service allow # isi network firewall rules view default_pools_policy.rule_s3 ID: default_pools_policy.rule_s3 Name: rule_s3 Index: 55 Description: Firewall rule on s3 service Protocol: TCP Dst Ports: 9020, 9021 Src Networks: - Src Ports: - Action: allow
Note that the global policies, or any custom policies, do not auto-update if these ports are reconfigured. This means that the firewall policies must be manually updated when changing ports. For example, if the NDMP port is changed from 10000 to 10001:
# isi ndmp settings global view Service: False Port: 10000 DMA: generic Bre Max Num Contexts: 64 MSB Context Retention Duration: 300 MSR Context Retention Duration: 600 Stub File Open Timeout: 15 Enable Redirector: False Enable Throttler: False Throttler CPU Threshold: 50 # isi ndmp settings global modify --port 10001 # isi ndmp settings global view | grep -i port Port: 10001
The firewall’s NDMP rule port configuration must also be reset to 10001:
# isi network firewall rule list | grep ndmp default_pools_policy.rule_ndmp 44 Firewall rule on ndmp service allow # isi network firewall rule modify default_pools_policy.rule_ndmp --dst-ports 10001 --live # isi network firewall rule view default_pools_policy.rule_ndmp | grep -i dst Dst Ports: 10001
Note that the --live flag is specified to enact this port change immediately.
Under the hood, OneFS source-based routing (SBR) and the OneFS firewall both leverage ‘ipfw’. As such, SBR and the firewall share the single ipfw table in the kernel. However, the two features use separate ipfw table partitions.
This allows SBR and the firewall to be activated independently of each other. For example, even if the firewall is disabled, SBR can still be enabled and any configured SBR rules displayed as expected (that is, using ipfw set 0 show).
Note that the firewall’s global default policies have a rule allowing ICMP6 by default. For IPv6 enabled networks, ICMP6 is critical for the functioning of NDP (Neighbor Discovery Protocol). As such, when creating custom firewall policies and rules for IPv6-enabled network subnets/pools, be sure to add a rule allowing ICMP6 to support NDP. As discussed in a previous blog, an alternative (and potentially easier) approach is to clone a global policy to a new one and just customize its ruleset instead.
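If building a custom IPv6 ruleset from scratch rather than cloning a global policy, an ICMP6 allow rule can be added along the following lines. The policy name custom_v6_policy is hypothetical, the --protocol usage is an assumption based on the rule attributes shown elsewhere in this article, and the index should be placed ahead of any broader deny rules:
# isi network firewall rules create custom_v6_policy.rule_icmp6 --index=10 --protocol=ICMP6 --src-networks=::/0 --action allow --live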
The OneFS FTP service can work in two modes: Active and Passive. Passive mode is the default, where FTP data connections are created on top of random ephemeral ports. However, because the OneFS firewall requires fixed ports to operate, it only supports the FTP service in Active mode. Attempts to enable the firewall with FTP running in Passive mode will generate the following warning:
# isi ftp settings view | grep -i active Active Mode: No # isi network firewall settings modify --enabled yes FTP service is running in Passive mode. Enabling network firewall will lead to FTP clients having their connections blocked. To avoid this, please enable FTP active mode and ensure clients are configured in active mode before retrying. Are you sure you want to proceed and enable network firewall? (yes/[no]):
To activate the OneFS firewall in conjunction with the FTP service, first ensure that the FTP service is running in Active mode before enabling the firewall. For example:
# isi ftp settings view | grep -i enable
 FTP Service Enabled: Yes
# isi ftp settings view | grep -i active
 Active Mode: No
# isi ftp settings modify --active-mode true
# isi ftp settings view | grep -i active
 Active Mode: Yes
# isi network firewall settings modify --enabled yes
Note: Verify FTP active mode support and/or firewall settings on the client side, too.
When it comes to monitoring the OneFS firewall, the following logfiles and utilities provide a variety of information and are a good source to start investigating an issue:
Utility | Description |
/var/log/isi_firewall_d.log | Main OneFS firewall log file, which includes information from firewall daemon. |
/var/log/isi_papi_d.log | Logfile for the platform API, including firewall-related handlers. |
isi_gconfig -t firewall | CLI command that displays all firewall configuration info. |
ipfw show | CLI command that displays the ipfw table residing in the FreeBSD kernel. |
Note that the preceding files and command output are automatically included in logsets generated by the ‘isi_gather_info’ data collection tool.
You can run the isi_gconfig command with the ‘-q’ flag to identify any values that are not at their default settings. For example, the stock (default) isi_firewall_d gconfig context will not report any configuration entries:
# isi_gconfig -q -t firewall [root] {version:1}
The firewall can also be run in the foreground for additional active rule reporting and debug output. For example, first shut down the isi_firewall_d service:
# isi services -a isi_firewall_d disable The service 'isi_firewall_d' has been disabled.
Next, start up the firewall with the ‘-f’ flag.
# isi_firewall_d -f Acquiring kevents for flxconfig Acquiring kevents for nodeinfo Acquiring kevents for firewall config Initialize the firewall library Initialize the ipfw set ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead. cmd:/sbin/ipfw set enable 0 normal termination, exit code:0 isi_firewall_d is now running Loaded master FlexNet config (rev:312) Update the local firewall with changed files: flx_config, Node info, Firewall config Start to update the firewall rule... flx_config version changed! latest_flx_config_revision: new:312, orig:0 node_info version changed! latest_node_info_revision: new:1, orig:0 firewall gconfig version changed! latest_fw_gconfig_revision: new:17, orig:0 Start to update the firewall rule for firewall configuration (gconfig) Start to handle the firewall configure (gconfig) Handle the firewall policy default_pools_policy ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead. 32043 allow tcp from any to any 10000 in cmd:/sbin/ipfw add 32043 set 8 allow TCP from any to any 10000 in normal termination, exit code:0 ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead. 32044 allow tcp from any to any 389,636 in cmd:/sbin/ipfw add 32044 set 8 allow TCP from any to any 389,636 in normal termination, exit code:0 Snip...
If the OneFS firewall is enabled and some network traffic is blocked, either this or the ipfw show CLI command will often provide the first clues.
Please note that the ipfw command should NEVER be used to modify the OneFS firewall table!
For example, say a rule is added to the default pools policy denying traffic on port 9876 from all source networks (0.0.0.0/0):
# isi network firewall rules create default_pools_policy.rule_9876 --index=100 --dst-ports 9876 --src-networks 0.0.0.0/0 --action deny --live # isi network firewall rules view default_pools_policy.rule_9876 ID: default_pools_policy.rule_9876 Name: rule_9876 Index: 100 Description: Protocol: ALL Dst Ports: 9876 Src Networks: 0.0.0.0/0 Src Ports: - Action: deny
Running ipfw show and grepping for the port will show this new rule:
# ipfw show | grep 9876 32099 0 0 deny ip from any to any 9876 in
The ipfw show command output also reports the statistics of how many IP packets have matched each rule. This can be incredibly useful when investigating firewall issues. For example, a telnet session is initiated to the cluster on port 9876 from a client:
# telnet 10.224.127.8 9876 Trying 10.224.127.8... telnet: connect to address 10.224.127.8: Operation timed out telnet: Unable to connect to remote host
The connection attempt will time out because the port 9876 ‘deny’ rule will silently drop the packets. At the same time, the ipfw show command will increment its counter to report on the denied packets. For example:
# ipfw show | grep 9876 32099 9 540 deny ip from any to any 9876 in
If this behavior is not anticipated or desired, you can find the rule name by searching the rules list for the port number, in this case port 9876:
# isi network firewall rules list | grep 9876 default_pools_policy.rule_9876 100 deny
The offending rule can then be reverted to ‘allow’ traffic on port 9876:
# isi network firewall rules modify default_pools_policy.rule_9876 --action allow --live
Or easily deleted, if preferred:
# isi network firewall rules delete default_pools_policy.rule_9876 --live Are you sure you want to delete firewall rule default_pools_policy.rule_9876? (yes/[no]): yes
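Pulling the above workflow together, the following sketch traces a blocked port back to its firewall rule. The port value is hypothetical, and the rule-name lookup assumes the 'rule_<port>' naming convention used in the example above:
#!/bin/sh
# Trace a blocked port back to its OneFS firewall rule (illustrative sketch).
PORT=9876    # hypothetical port under investigation
# Check the kernel ipfw table for hit counters on that port (read-only - never modify via ipfw).
ipfw show | grep " ${PORT} "
# Find the corresponding OneFS firewall rule and inspect its configuration.
isi network firewall rules list | grep ${PORT}
isi network firewall rules view default_pools_policy.rule_${PORT}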
Author: Nick Trimbee
Tue, 13 Jun 2023 14:54:29 -0000
|Read Time: 0 minutes
PowerScale OneFS 9.6 now brings a new offering to the AWS cloud: APEX File Storage for AWS. APEX File Storage for AWS is a software-defined cloud file storage service that provides high-performance, flexible, secure, and scalable file storage for AWS environments. It is a fully customer-managed service that is designed to meet the needs of enterprise-scale file workloads running on AWS.
APEX File Storage for AWS brings the OneFS distributed file system software into the public cloud, allowing users to have the same management experience in the cloud as with their on-premises PowerScale appliance.
With APEX File Storage for AWS, you can easily deploy and manage file storage on AWS, without the need for hardware or software management. The service provides a scalable and elastic storage infrastructure that can grow or shrink, according to your actual business needs.
Some of the key features and benefits of APEX File Storage for AWS include:
The architecture of APEX File Storage for AWS is based on the OneFS distributed file system, which consists of multiple cluster nodes to provide a single global namespace. Each cluster node is an instance of OneFS software that runs on an AWS EC2 instance and provides storage capacity and compute resources. The following diagram shows the architecture of APEX File Storage for AWS.
Supported cluster configuration
APEX File Storage for AWS provides two types of cluster configurations:
Configuration items (gp3 volumes) | Supported options |
Cluster size | 4 to 6 nodes |
EC2 instance type | All nodes in a cluster must be the same instance size. The supported instance sizes are m5dn.8xlarge, m5dn.12xlarge, m5dn.16xlarge, or m5dn.24xlarge. See Amazon EC2 m5 instances for more details. |
EBS volume (disk) type | gp3 |
EBS volume (disk) counts per node | 5, 6, 10, 12, 15, 18, or 20 |
Single EBS volume sizes | 1TiB - 16TiB |
Cluster raw capacity | 24TiB - 1PiB |
Cluster protection level | +2n |
Configuration items (st1 volumes) | Supported options |
Cluster size | 4 to 6 nodes |
EC2 instance type | All nodes in a cluster must be the same instance size. The supported instance sizes are m5dn.8xlarge, m5dn.12xlarge, m5dn.16xlarge, or m5dn.24xlarge. See Amazon EC2 m5 instances for more details. |
EBS volume (disk) type | st1 |
EBS volume (disk) counts per node | 5 or 6 |
Single EBS volume sizes | 4TiB or 10TiB |
Cluster raw capacity | 80TiB - 360TiB |
Cluster protection level | +2n |
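As a sanity check, the raw capacity bounds in the st1 table above follow directly from the node, volume, and volume-size limits: the minimum is 4 nodes x 5 volumes x 4TiB = 80TiB, and the maximum is 6 nodes x 6 volumes x 10TiB = 360TiB.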
APEX File Storage for AWS can deliver up to 10GB/s of sequential read and 4GB/s of sequential write throughput as the cluster size grows. To learn more about APEX File Storage for AWS, see the APEX File Storage for AWS documentation.
Author: Lieven Lin
Wed, 17 May 2023 19:13:33 -0000
|Read Time: 0 minutes
In the previous article in this OneFS firewall series, we reviewed the upgrade, activation, and policy selection components of the firewall provisioning process.
Now, we turn our attention to the firewall rule configuration step of the process.
As stated previously, role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. So, ensure that the user account that will be used to enable and configure the OneFS firewall belongs to a role with the ‘ISI_PRIV_FIREWALL’ write privilege.
4. Configuring Firewall Rules
When the desired policy is created, the next step is to configure the rules. Clearly, the first step here is to decide which ports and services need securing or opening, beyond the defaults.
The following CLI syntax returns a list of all the firewall’s default services, plus their respective ports, protocols, and aliases, sorted by ascending port number:
# isi network firewall services list Service Name Port Protocol Aliases --------------------------------------------- ftp-data 20 TCP - ftp 21 TCP - ssh 22 TCP - smtp 25 TCP - dns 53 TCP domain UDP http 80 TCP www www-http kerberos 88 TCP kerberos-sec UDP rpcbind 111 TCP portmapper UDP sunrpc rpc.bind ntp 123 UDP - dcerpc 135 TCP epmap UDP loc-srv netbios-ns 137 UDP - netbios-dgm 138 UDP - netbios-ssn 139 UDP - snmp 161 UDP - snmptrap 162 UDP snmp-trap mountd 300 TCP nfsmountd UDP statd 302 TCP nfsstatd UDP lockd 304 TCP nfslockd UDP nfsrquotad 305 TCP - UDP nfsmgmtd 306 TCP - UDP ldap 389 TCP - UDP https 443 TCP - smb 445 TCP microsoft-ds hdfs-datanode 585 TCP - asf-rmcp 623 TCP - UDP ldaps 636 TCP sldap asf-secure-rmcp 664 TCP - UDP ftps-data 989 TCP - ftps 990 TCP - nfs 2049 TCP nfsd UDP tcp-2097 2097 TCP - tcp-2098 2098 TCP - tcp-3148 3148 TCP - tcp-3149 3149 TCP - tcp-3268 3268 TCP - tcp-3269 3269 TCP - tcp-5667 5667 TCP - tcp-5668 5668 TCP - isi_ph_rpcd 6557 TCP - isi_dm_d 7722 TCP - hdfs-namenode 8020 TCP - isi_webui 8080 TCP apache2 webhdfs 8082 TCP - tcp-8083 8083 TCP - ambari-handshake 8440 TCP - ambari-heartbeat 8441 TCP - tcp-8443 8443 TCP - tcp-8470 8470 TCP - s3-http 9020 TCP - s3-https 9021 TCP - isi_esrs_d 9443 TCP - ndmp 10000 TCP - cee 12228 TCP - nfsrdma 20049 TCP - UDP tcp-28080 28080 TCP - --------------------------------------------- Total: 55
Similarly, the following CLI command generates a list of existing rules and their associated policies, sorted in alphabetical order. For example, to show the first five rules:
# isi network firewall rules list --limit 5 ID Index Description Action ---------------------------------------------------------------------------------------------------------------------------------------------------- default_pools_policy.rule_ambari_handshake 41 Firewall rule on ambari-handshake service allow default_pools_policy.rule_ambari_heartbeat 42 Firewall rule on ambari-heartbeat service allow default_pools_policy.rule_catalog_search_req 50 Firewall rule on service for global catalog search requests allow default_pools_policy.rule_cee 52 Firewall rule on cee service allow default_pools_policy.rule_dcerpc_tcp 18 Firewall rule on dcerpc(TCP) service allow ---------------------------------------------------------------------------------------------------------------------------------------------------- Total: 5
Both the ‘isi network firewall rules list’ and the ‘isi network firewall services list’ commands also have a ‘-v’ verbose option, and can return their output in csv, list, table, or json formats with the ‘--format’ flag.
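For example, to emit the services list as JSON for scripting purposes (assuming the standard OneFS --format list option):
# isi network firewall services list --format json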
To view the detailed info for a given firewall rule, in this case the default SMB rule, use the following CLI syntax:
# isi network firewall rules view default_pools_policy.rule_smb ID: default_pools_policy.rule_smb Name: rule_smb Index: 3 Description: Firewall rule on smb service Protocol: TCP Dst Ports: smb Src Networks: - Src Ports: - Action: allow
Existing rules can be modified and new rules created and added into an existing firewall policy with the ‘isi network firewall rules create’ CLI syntax. Command options include:
Option | Description |
--action | Allow, which means pass packets.
Deny, which means silently drop packets.
Reject, which means reply with an ICMP error code. |
id | Specifies the ID of the new rule to create. The rule must be added to an existing policy. The ID can be up to 32 alphanumeric characters long and can include underscores or hyphens, but cannot include spaces or other punctuation. Specify the rule ID in the following format:
<policy_name>.<rule_name>
The rule name must be unique in the policy. |
--index | The rule index within the policy. Valid values are between 1 and 99, with lower values having higher priority. If not specified, the rule is automatically assigned the next available index (before the default rule at index 100). |
--live | The live option must only be used when creating, modifying, or deleting a rule in an active policy. Such changes take effect immediately on all network subnets and pools associated with that policy. Using the live option on a rule in an inactive policy will be rejected, and an error message will be returned. |
--protocol | Specifies the protocol matched for the inbound packets. Available values are tcp, udp, icmp, and all. If not configured, the default protocol 'all' is used. |
--dst-ports | Specifies the network ports/services provided by the storage system, identified by destination port(s). The protocol specified by --protocol is applied to these destination ports. |
--src-networks | Specifies one or more IP addresses with corresponding netmasks to be matched by this firewall rule. The correct format for this parameter is address/netmask, similar to "192.0.2.128/25". Separate multiple address/netmask pairs with commas. Use the value 0.0.0.0/0 for "any". |
--src-ports | Specifies the network ports/services identified by source port(s). The protocol specified by --protocol is applied to these source ports. |
Note that, unlike for firewall policies, there is no provision for cloning individual rules.
The following CLI syntax can be used to create new firewall rules. For example, to add ‘allow’ rules for the HTTP and SSH protocols, plus a ‘deny’ rule for port TCP 9876, into firewall policy fw_test1:
# isi network firewall rules create fw_test1.rule_http --index 1 --dst-ports http --src-networks 10.20.30.0/24,20.30.40.0/24 --action allow # isi network firewall rules create fw_test1.rule_ssh --index 2 --dst-ports ssh --src-networks 10.20.30.0/24,20.30.40.0/16 --action allow # isi network firewall rules create fw_test1.rule_tcp_9876 --index 3 --protocol tcp --dst-ports 9876 --src-networks 10.20.30.0/24,20.30.40.0/24 --action deny
When a new rule is created in a policy, if the index value is not specified, it will automatically inherit the next available number in the series (such as index=4 in this case).
# isi network firewall rules create fw_test1.rule_2049 --protocol udp --dst-ports 2049 --src-networks 30.1.0.0/16 --action deny
For a more draconian approach, a ‘deny’ rule could be created using the match-everything ‘*’ wildcard for destination ports and a 0.0.0.0/0 network and mask, which would silently drop all traffic:
# isi network firewall rules create fw_test1.rule_1234 --index=100 --dst-ports * --src-networks 0.0.0.0/0 --action deny
When modifying existing firewall rules, use the following CLI syntax, in this case to change the source network of an HTTP allow rule (index 1) in firewall policy fw_test1:
# isi network firewall rules modify fw_test1.rule_http --index 1 --protocol ip --dst-ports http --src-networks 10.1.0.0/16 --action allow
Or to modify an SSH rule (index 2) in firewall policy fw_test1, changing the action from ‘allow’ to ‘deny’:
# isi network firewall rules modify fw_test1.rule_ssh --index 2 --protocol tcp --dst-ports ssh --src-networks 10.1.0.0/16,20.2.0.0/16 --action deny
Also, to re-order the custom TCP 9876 rule from the earlier example from index 3 to index 7 in firewall policy fw_test1:
# isi network firewall rules modify fw_test1.rule_tcp_9876 --index 7
Note that all rules equal or behind index 7 will have their index values incremented by one.
When deleting a rule from a firewall policy, any rule reordering is handled automatically. If the policy has been applied to a network pool, the ‘–live’ option can be used to force the change to take effect immediately. For example, to delete the HTTP rule from the firewall policy ‘fw_test1’:
# isi network firewall rules delete fw_test1.rule_http --live
Firewall rules can also be created, modified, and deleted within a policy from the WebUI by navigating to Cluster management > Firewall Configuration > Firewall Policies. For example, to create a rule that permits SupportAssist and Secure Gateway traffic on the 10.219.0.0/16 network:
Once saved, the new rule is then displayed in the Firewall Configuration page:
5. Firewall management and monitoring.
In the next and final article in this series, we’ll turn our attention to managing, monitoring, and troubleshooting the OneFS firewall (Step 5).
Author: Nick Trimbee
Tue, 02 May 2023 17:21:12 -0000
|Read Time: 0 minutes
The new firewall in OneFS 9.5 enhances the security of the cluster and helps prevent unauthorized access to the storage system. When enabled, the default firewall configuration allows remote systems access to a specific set of default services for data, management, and inter-cluster interfaces (network pools).
The basic OneFS firewall provisioning process comprises five steps, each covered in this series: upgrade the cluster to OneFS 9.5, activate the firewall, select (or create and customize) firewall policies, configure firewall rules, and then manage and monitor the firewall.
Note that role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. In addition to the ubiquitous root, the cluster’s built-in SystemAdmin role has write privileges to configure and administer the firewall.
1. Upgrade cluster to OneFS 9.5.
First, to provision the firewall, the cluster must be running OneFS 9.5.
If you are upgrading from an earlier release, the OneFS 9.5 upgrade must be committed before enabling the firewall.
Also, be aware that configuration and management of the firewall in OneFS 9.5 requires the new ISI_PRIV_FIREWALL administration privilege.
# isi auth privilege | grep -i firewall ISI_PRIV_FIREWALL Configure network firewall
This privilege can be granted to a role with either read-only or read/write permissions. By default, the built-in SystemAdmin role is granted write privileges to administer the firewall:
# isi auth roles view SystemAdmin | grep -A2 -i firewall ID: ISI_PRIV_FIREWALL Permission: w
Additionally, the built-in AuditAdmin role has read permission to view the firewall configuration and logs, and so on:
# isi auth roles view AuditAdmin | grep -A2 -i firewall ID: ISI_PRIV_FIREWALL Permission: r
Ensure that the user account that will be used to enable and configure the OneFS firewall belongs to a role with the ISI_PRIV_FIREWALL write privilege.
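As an illustration, delegating firewall administration to a dedicated role might look something like the following sketch. The role name 'FirewallAdmin' is hypothetical, and member assignment is left to whatever RBAC workflow is already in place:
# isi auth roles create FirewallAdmin --description "Firewall administration"
# isi auth roles modify FirewallAdmin --add-priv-write ISI_PRIV_FIREWALL
# isi auth roles view FirewallAdmin | grep -A2 -i firewall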
2. Activate firewall.
The OneFS firewall can be either enabled or disabled, with the latter as the default state.
The following CLI syntax will display the firewall’s global status (in this case disabled, the default):
# isi network firewall settings view Enabled: False
Firewall activation can be easily performed from the CLI as follows:
# isi network firewall settings modify --enabled true # isi network firewall settings view Enabled: True
Or from the WebUI under Cluster management > Firewall Configuration > Settings:
Note that the firewall is automatically enabled when STIG hardening is applied to a cluster.
3. Select policies.
A cluster’s existing firewall policies can be easily viewed from the CLI with the following command:
# isi network firewall policies list ID Pools Subnets Rules ----------------------------------------------------------------------------- fw_test1 groupnet0.subnet0.pool0 groupnet0.subnet1 test_rule1 ----------------------------------------------------------------------------- Total: 1
Or from the WebUI under Cluster management > Firewall Configuration > Firewall Policies:
The OneFS firewall offers four main strategies when it comes to selecting a firewall policy: retaining the default policy, reconfiguring the default policy, cloning the default policy and reconfiguring the clone, or creating a custom policy from scratch.
We’ll consider each of these strategies in order:
a. Retaining the default policy
In many cases, the default OneFS firewall policy value provides acceptable protection for a security-conscious organization. In these instances, once the OneFS firewall has been enabled on a cluster, no further configuration is required, and the cluster administrators can move on to the management and monitoring phase.
By default, all front-end cluster interfaces (network pools) use the global default firewall policy. While this default policy can be modified, be aware that it is global: any change to it affects every network pool that uses it.
The following table describes the default firewall policies that are assigned to each interface:
Policy | Description |
Default pools policy | Contains rules for the inbound default ports for TCP and UDP services in OneFS |
Default subnets policy | Contains rules for DNS (TCP and UDP), ICMP, and ICMP6 traffic on the subnet service IPs (per the default_subnets_policy ruleset shown below) |
These can be viewed from the CLI as follows:
# isi network firewall policies view default_pools_policy ID: default_pools_policy Name: default_pools_policy Description: Default Firewall Pools Policy Default Action: deny Max Rules: 100 Pools: groupnet0.subnet0.pool0, groupnet0.subnet0.testpool1, groupnet0.subnet0.testpool2, groupnet0.subnet0.testpool3, groupnet0.subnet0.testpool4, groupnet0.subnet0.poolcava Subnets: - Rules: rule_ldap_tcp, rule_ldap_udp, rule_reserved_for_hw_tcp, rule_reserved_for_hw_udp, rule_isi_SyncIQ, rule_catalog_search_req, rule_lwswift, rule_session_transfer, rule_s3, rule_nfs_tcp, rule_nfs_udp, rule_smb, rule_hdfs_datanode, rule_nfsrdma_tcp, rule_nfsrdma_udp, rule_ftp_data, rule_ftps_data, rule_ftp, rule_ssh, rule_smtp, rule_http, rule_kerberos_tcp, rule_kerberos_udp, rule_rpcbind_tcp, rule_rpcbind_udp, rule_ntp, rule_dcerpc_tcp, rule_dcerpc_udp, rule_netbios_ns, rule_netbios_dgm, rule_netbios_ssn, rule_snmp, rule_snmptrap, rule_mountd_tcp, rule_mountd_udp, rule_statd_tcp, rule_statd_udp, rule_lockd_tcp, rule_lockd_udp, rule_nfsrquotad_tcp, rule_nfsrquotad_udp, rule_nfsmgmtd_tcp, rule_nfsmgmtd_udp, rule_https, rule_ldaps, rule_ftps, rule_hdfs_namenode, rule_isi_webui, rule_webhdfs, rule_ambari_handshake, rule_ambari_heartbeat, rule_isi_esrs_d, rule_ndmp, rule_isi_ph_rpcd, rule_cee, rule_icmp, rule_icmp6, rule_isi_dm_d
# isi network firewall policies view default_subnets_policy ID: default_subnets_policy Name: default_subnets_policy Description: Default Firewall Subnets Policy Default Action: deny Max Rules: 100 Pools: - Subnets: groupnet0.subnet0 Rules: rule_subnets_dns_tcp, rule_subnets_dns_udp, rule_icmp, rule_icmp6
Or from the WebUI under Cluster management > Firewall Configuration > Firewall Policies:
b. Reconfiguring the default policy
Depending on an organization’s threat levels or security mandates, there may be a need to restrict access to certain additional IP addresses and/or management service protocols.
If the default policy is deemed insufficient, reconfiguring the default firewall policy can be a good option if only a small number of rule changes are required. The specifics of creating, modifying, and deleting individual firewall rules are covered in the next article in this series (step 4).
Note that if new rule changes behave unexpectedly, or firewall configuration generally goes awry, OneFS does provide a “get out of jail free” card. In a pinch, the global firewall policy can be quickly and easily restored to its default values. This can be achieved with the following CLI syntax:
# isi network firewall reset-global-policy
This command will reset the global firewall policies to the original system defaults. Are you sure you want to continue? (yes/[no]):
Alternatively, the default policy can also be easily reverted from the WebUI by clicking the Reset default policies button:
c. Cloning the default policy and reconfiguring
Another option is cloning, which can be useful when batch modification or a large number of changes to the current policy are required. By cloning the default firewall policy, an exact copy of the existing policy and its rules is generated, but with a new policy name. For example:
# isi network firewall policies clone default_pools_policy clone_default_pools_policy # isi network firewall policies list | grep -i clone clone_default_pools_policy -
Cloning can also be initiated from the WebUI under Firewall Configuration > Firewall Policies > More Actions > Clone Policy:
Enter a name for the clone in the Policy Name field in the pop-up window, and click Save:
Once cloned, the policy can then be easily reconfigured to suit. For example, to modify the policy fw_test1 and change its default-action from deny-all to allow-all:
# isi network firewall policies modify fw_test1 --default-action allow-all
When modifying a firewall policy, you can use the --live CLI option to force the change to take effect immediately. Note that the --live option is only valid when modifying or deleting an active custom policy, or when modifying a default policy. Such changes take effect immediately on all network subnets and pools associated with that policy. Using the --live option on an inactive policy will be rejected, and an error message returned.
Options for creating or modifying a firewall policy include:
Option | Description |
--default-action | Automatically add one rule to deny all or allow all to the bottom of the rule set for this created policy (Index = 100). |
--max-rule-num | By default, a policy can contain a maximum of 100 rules (including one default rule), allowing a user to configure up to 99 rules. This option expands the maximum rule count to a specified value, currently capped at 200 (that is, up to 199 user-configurable rules). |
--add-subnets | Specify the network subnet(s) to add to policy, separated by a comma. |
--remove-subnets | Specify the network subnets to remove from the policy and fall back to the global policy. |
--add-pools | Specify the network pool(s) to add to policy, separated by a comma. |
--remove-pools | Specify the network pools to remove from the policy and fall back to the global policy. |
When you modify firewall policies, OneFS issues the following warning to verify the changes and help avoid the risk of a self-induced denial-of-service:
# isi network firewall policies modify --pools groupnet0.subnet0.pool0 fw_test1
Changing the Firewall Policy associated with a subnet or pool may change the networks and/or services allowed to connect to OneFS. Please confirm you have selected the correct Firewall Policy and Subnets/Pools. Are you sure you want to continue? (yes/[no]): yes
Once again, having the following CLI command handy, plus console access to the cluster, is always a prudent move:
# isi network firewall reset-global-policy
Note that adding network pools or subnets to a firewall policy causes the previous policy to be removed from them. Similarly, adding network pools or subnets back to the global default policy reverts any custom policy configuration they might have. For example, to apply the firewall policy fw_test1 to the IP pools groupnet0.subnet0.pool0 and groupnet0.subnet0.pool1:
# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall Firewall Policy: default_pools_policy # isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0, groupnet0.subnet0.pool1 # isi network pools view groupnet0.subnet0.pool0 | grep -i firewall Firewall Policy: fw_test1
Or to apply the firewall policy fw_test1 to both the IP pool groupnet0.subnet0.pool0 and the subnet groupnet0.subnet0:
# isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0 --add-subnets groupnet0.subnet0 # isi network pools view groupnet0.subnet0.pool0 | grep -i firewall Firewall Policy: fw_test1 # isi network subnets view groupnet0.subnet0 | grep -i firewall Firewall Policy: fw_test1
To reapply global policy at any time, either add the pools to the default policy:
# isi network firewall policies modify default_pools_policy --add-pools groupnet0.subnet0.pool0, groupnet0.subnet0.pool1 # isi network pools view groupnet0.subnet0.pool0 | grep -i firewall Firewall Policy: default_pools_policy # isi network subnets view groupnet0.subnet1 | grep -i firewall Firewall Policy: default_subnets_policy
Or remove the pool from the custom policy:
# isi network firewall policies modify fw_test1 --remove-pools groupnet0.subnet0.pool0, groupnet0.subnet0.pool1
You can also manage firewall policies on a network pool in the OneFS WebUI by going to Cluster configuration > Network configuration > External network > Edit pool details. For example:
Be aware that cloning is not limited to the default policy: clones can be made of any custom policy, too. For example:
# isi network firewall policies clone clone_default_pools_policy fw_test1
d. Creating a custom firewall policy
Alternatively, a custom firewall policy can also be created from scratch. This can be accomplished from the CLI using the following syntax, in this case to create a firewall policy named fw_test1:
# isi network firewall policies create fw_test1 --default-action deny # isi network firewall policies view fw_test1 ID: fw_test1 Name: fw_test1 Description: Default Action: deny Max Rules: 100 Pools: - Subnets: - Rules: -
Note that if a default-action is not specified in the CLI command syntax, it will automatically default to deny.
Firewall policies can also be configured in the OneFS WebUI by going to Cluster management > Firewall Configuration > Firewall Policies > Create Policy:
However, in contrast to the CLI, if a default-action is not specified when a policy is created in the WebUI, the automatic default is to Allow because the drop-down list works alphabetically.
If and when a firewall policy is no longer required, it can be swiftly and easily removed. For example, the following CLI syntax deletes the firewall policy fw_test1, clearing out any rules within this policy container:
# isi network firewall policies delete fw_test1 Are you sure you want to delete firewall policy fw_test1? (yes/[no]): yes
Note that the default global policies cannot be deleted.
# isi network firewall policies delete default_subnets_policy Are you sure you want to delete firewall policy default_subnets_policy? (yes/[no]): yes Firewall policy: Cannot delete default policy default_subnets_policy.
4. Configure firewall rules.
In the next article in this series, we’ll turn our attention to this step, configuring the OneFS firewall rules.
Wed, 26 Apr 2023 15:40:15 -0000
|Read Time: 0 minutes
Among the array of security features introduced in OneFS 9.5 is a new host-based firewall. This firewall allows cluster administrators to configure policies and rules on a PowerScale cluster in order to meet the network and application management needs and security mandates of an organization.
The OneFS firewall protects the cluster’s external, or front-end, network and operates as a packet filter for inbound traffic. It is available upon installation or upgrade to OneFS 9.5 but is disabled by default in both cases. However, the OneFS STIG hardening profile automatically enables the firewall and the default policies, in addition to manual activation.
The firewall generally manages IP packet filtering in accordance with the OneFS Security Configuration Guide, especially in regards to the network port usage. Packet control is governed by firewall policies, which have one or more individual rules.
Item | Description | Match | Action |
Firewall Policy | Each policy is a set of firewall rules. | Rules are matched by index in ascending order. | Each policy has a default action. |
Firewall Rule | Each rule specifies what kinds of network packets should be matched by the firewall engine and what action should be taken upon them. | Matching criteria include protocol, source ports, destination ports, and source network address. | Options are allow, deny, or reject. |
A security best practice is to enable the OneFS firewall using the default policies, with any adjustments as required. The recommended configuration process is as follows:
Step | Details |
1. Access | Ensure that the cluster uses a default SSH or HTTP port before enabling. The default firewall policies block all nondefault ports until you change the policies. |
2. Enable | Enable the OneFS firewall. |
3. Compare | Compare your cluster network port configurations against the default ports listed in Network port usage. |
4. Configure | Edit the default firewall policies to accommodate any non-standard ports in use in the cluster. NOTE: The firewall policies do not automatically update when port configurations are changed. |
5. Constrain | Limit access to the OneFS Web UI to specific administrator terminals. |
Under the hood, the OneFS firewall is built upon the ubiquitous ipfirewall, or ipfw, which is FreeBSD’s native stateful firewall, packet filter, and traffic accounting facility.
Firewall configuration and management is available through the CLI, platform API, or WebUI, and OneFS 9.5 introduces a new Firewall Configuration page to support this. Note that the firewall is only available once a cluster is running OneFS 9.5 and the feature has been manually enabled, activating the isi_firewall_d service. The firewall's configuration is split between gconfig, which handles the settings and policies, and the ipfw table, which stores the rules themselves.
The firewall gracefully handles SmartConnect dynamic IP movement between nodes since firewall policies are applied per network pool. Additionally, being network pool based allows the firewall to support OneFS access zones and shared/multitenancy models.
The individual firewall rules, which are essentially simplified wrappers around ipfw rules, work by matching packets against the 5-tuple that uniquely identifies an IPv4 UDP or TCP session: protocol, source IP address, source port, destination IP address, and destination port.
The rules are then organized within a firewall policy, which can be applied to one or more network pools.
Note that each pool can only have a single firewall policy applied to it. If there is no custom firewall policy configured for a network pool, it automatically uses the global default firewall policy.
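For example, the policy currently in effect for a particular pool can be quickly verified from the CLI (illustrative output shown for a pool still using the global default):
# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall
Firewall Policy: default_pools_policy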
When enabled, the OneFS firewall function is cluster wide, and all inbound packets from external interfaces will go through either the custom policy or default global policy before reaching the protocol handling pathways. Packets passed to the firewall are compared against each of the rules in the policy, in rule-number order. Multiple rules with the same number are permitted, in which case they are processed in order of insertion. When a match is found, the action corresponding to that matching rule is performed. A packet is checked against the active ruleset in multiple places in the protocol stack, and the basic flow is as follows:
The OneFS firewall automatically reserves 20,000 rules in the ipfw table for its custom and default policies and rules. By default, each policy can have a maximum of 100 rules, including one default rule. This translates to an effective maximum of 99 user-defined rules per policy, because the default rule is reserved and cannot be modified. As such, a maximum of 198 policies can be applied to pools or subnets since the default-pools-policy and default-subnets-policy are reserved and cannot be deleted.
Additional firewall bounds and limits to keep in mind include:
Name | Value | Description |
MAX_INTERFACES | 500 | Maximum number of Layer 2 interfaces per node (including Ethernet, VLAN, LAGG interfaces). |
MAX_SUBNETS | 100 | Maximum number of subnets within a OneFS cluster. |
MAX_POOLS | 100 | Maximum number of network pools within a OneFS cluster. |
DEFAULT_MAX_RULES | 100 | Default value of maximum rules within a firewall policy. |
MAX_RULES | 200 | Upper limit of maximum rules within a firewall policy. |
MAX_ACTIVE_RULES | 5000 | Upper limit of total active rules across the whole cluster. |
MAX_INACTIVE_POLICIES | 200 | Maximum number of policies that are not applied to any network subnet or pool. They will not be written into ipfw table. |
The firewall default global policy is ready to use out of the box and, unless a custom policy has been explicitly configured, all network pools use this global policy. Custom policies can be configured by either cloning and modifying an existing policy or creating one from scratch.
Component | Description |
Custom policy | A user-defined container with a set of rules. A policy can be applied to multiple network pools, but a network pool can only apply one policy. |
Firewall rule | An ipfw-like rule that can be used to restrict remote access. Each rule has an index that is valid within its policy. Index values range from 1 to 99, with lower numbers having higher priority. Source networks are described by IP address and netmask, and services can be expressed either by port number (for example, 80) or service name (for example, http, ssh, smb). The * wildcard can also be used to denote all services. Supported actions are allow, deny, and reject. |
Default policy | A global policy to manage all default services, used for maintaining minimum OneFS operation and management functions. While 'deny any' is the default action of the policy, the predefined service rules default to allowing remote access. All packets not matching any of the rules are automatically dropped. There are two default policies: default-pools-policy and default-subnets-policy.
Note that these two default policies cannot be deleted, but individual rule modification is permitted in each. |
Default services | The firewall’s default predefined services include the usual suspects, such as: DNS, FTP, HDFS, HTTP, HTTPS, ICMP, NDMP, NFS, NTP, S3, SMB, SNMP, SSH, and so on. A full listing is available in the isi network firewall services list CLI command output. |
For a given network pool, either the global policy or a custom policy is assigned and takes effect. Additionally, all configuration changes to either policy type are managed by gconfig and are persistent across cluster reboots.
In the next article in this series we’ll take a look at the CLI and WebUI configuration and management of the OneFS firewall.
Fri, 21 Apr 2023 17:11:00 -0000
|Read Time: 0 minutes
In this era of elevated cyber-crime and data security threats, there is increasing demand for immutable, tamper-proof snapshots. Often this need arises as part of a broader security mandate, ideally proactively, but oftentimes as a response to a security incident. OneFS addresses this requirement in the following ways:
On-cluster | Off-cluster |
Read-only SnapshotIQ snapshots, snapshot locks, and RBAC-constrained snapshot administration | SyncIQ replication of snapshots to a target cluster, ideally an air-gapped cyber vault |
At its core, OneFS SnapshotIQ generates read-only, point-in-time, space efficient copies of a defined subset of a cluster’s data.
Only the changed blocks of a file are stored when updating OneFS snapshots, ensuring efficient storage utilization. They are also highly scalable and typically take less than a second to create, while generating little performance overhead. As such, the RPO (recovery point objective) and RTO (recovery time objective) of a OneFS snapshot can be very small and highly flexible, with the use of rich policies and schedules.
OneFS Snapshots are created manually, on a schedule, or automatically generated by OneFS to facilitate system operations. But whatever the generation method, when a snapshot has been taken, its contents cannot be manually altered.
In addition to snapshot content immutability, for an enhanced level of tamper-proofing, SnapshotIQ also provides the ability to lock snapshots with the ‘isi snapshot locks’ CLI syntax. This prevents snapshots from being inadvertently deleted.
For example, a manual snapshot, ‘snaploc1’ is taken of /ifs/test:
# isi snapshot snapshots create /ifs/test --name snaploc1 # isi snapshot snapshots list | grep snaploc1 79188 snaploc1 /ifs/test
A lock is then placed on it (in this case lock ID=1):
# isi snapshot locks create snaploc1 # isi snapshot locks list snaploc1 ID ---- 1 ---- Total: 1
Attempts to delete the snapshot fail because the lock prevents its removal:
# isi snapshot snapshots delete snaploc1 Are you sure? (yes/[no]): yes Snapshot "snaploc1" can't be deleted because it is locked
The CLI command ‘isi snapshot locks delete <lock_ID>’ can be used to clear existing snapshot locks, if desired. For example, to remove the only lock (ID=1) from snapshot ‘snaploc1’:
# isi snapshot locks list snaploc1 ID ---- 1 ---- Total: 1 # isi snapshot locks delete snaploc1 1 Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes # isi snap locks view snaploc1 1 No such lock
When the lock is removed, the snapshot can then be deleted:
# isi snapshot snapshots delete snaploc1 Are you sure? (yes/[no]): yes # isi snapshot snapshots list| grep -i snaploc1 | wc -l 0
Note that a snapshot can have up to a maximum of sixteen locks on it at any time. Also, lock numbers are continually incremented and not recycled upon deletion.
Like snapshot expiration, snapshot locks can also have an expiration time configured. For example, to set a lock on snapshot ‘snaploc1’ that expires at 1am on April 1st, 2024:
# isi snap lock create snaploc1 --expires '2024-04-01T01:00:00' # isi snap lock list snaploc1 ID ---- 36 ---- Total: 1 # isi snap lock view snaploc1 36 ID: 36 Comment: Expires: 2024-04-01T01:00:00 Count: 1
Note that if the duration period of a particular snapshot lock expires but others remain, OneFS will not delete that snapshot until all the locks on it have been deleted or expired.
The following table provides an example snapshot expiration schedule, with monthly locked snapshots to prevent deletion:
Snapshot Frequency | Snapshot Time | Snapshot Expiration | Max Retained Snapshots |
Every other hour | Start at 12:00AM End at 11:59PM | 1 day | 27
Every day | At 12:00AM | 1 week | |
Every week | Saturday at 12:00AM | 1 month | |
Every month | First Saturday of month at 12:00AM | Locked |
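A schedule like the last row above implies taking and locking a snapshot each month. Since locks are applied after a snapshot exists, one approach is a small script run on the first Saturday of the month. This is a minimal sketch using only the snapshot and lock commands shown earlier; the path, naming convention, and scheduling mechanism are illustrative:
#!/bin/sh
# Create a monthly snapshot of /ifs/test and lock it (illustrative sketch).
NAME="monthly_$(date +%Y%m)"    # hypothetical naming convention, e.g. monthly_202404
# Take the snapshot.
isi snapshot snapshots create /ifs/test --name "${NAME}"
# Lock it so it cannot be deleted until the lock is removed or expires.
isi snapshot locks create "${NAME}"
# Confirm the lock is in place.
isi snapshot locks list "${NAME}"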
Read-only snapshots plus locks provide a strong level of snapshot protection on a cluster. However, anyone who can log in to the cluster with the required elevated administrator privileges can still remove locks and/or delete snapshots.
Because data security threats come from inside an environment as well as out, such as from a disgruntled IT employee or other internal bad actor, another key to a robust security profile is to constrain the use of all-powerful ‘root’, ‘administrator’, and ‘sudo’ accounts as much as possible. Instead of granting cluster admins full rights, a preferred security best practice is to leverage the comprehensive authentication, authorization, and accounting framework that OneFS natively provides.
OneFS role-based access control (RBAC) can be used to explicitly limit who has access to manage and delete snapshots. This granular control allows you to craft administrative roles that can create and manage snapshot schedules, but prevent their unlocking and/or deletion. Similarly, lock removal and snapshot deletion can be isolated to a specific security role (or to root only).
A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members.
For example, from the WebUI under Access > Membership and roles > Roles:
When these members access the cluster through the WebUI, PlatformAPI, or CLI, they inherit their assigned privileges.
The specific privileges that can be used to segment OneFS snapshot management include:
Privilege | Description |
ISI_PRIV_SNAPSHOT_ALIAS | Aliasing for snapshots |
ISI_PRIV_SNAPSHOT_LOCKS | Locking of snapshots from deletion |
ISI_PRIV_SNAPSHOT_PENDING | Upcoming snapshot based on schedules |
ISI_PRIV_SNAPSHOT_RESTORE | Restoring directory to a particular snapshot |
ISI_PRIV_SNAPSHOT_SCHEDULES | Scheduling for periodic snapshots |
ISI_PRIV_SNAPSHOT_SETTING | Service and access settings |
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT | Manual snapshots and locks |
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY | Snapshot summary and usage details |
Each privilege can be assigned one of four permission levels for a role, including:
Permission Indicator | Description |
– | No permission |
R | Read-only permission |
X | Execute permission |
W | Write permission |
The ability for a user to delete a snapshot is governed by the ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ privilege. Similarly, the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege governs lock creation and removal.
In the following example, the ‘snap’ role has ‘read’ rights for the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege, allowing a user associated with this role to view snapshot locks:
# isi auth roles view snap | grep -I -A 1 locks ID: ISI_PRIV_SNAPSHOT_LOCKS Permission: r -- # isi snapshot locks list snaploc1 ID ---- 1 ---- Total: 1
However, attempts to remove the lock ‘ID 1’ from the ‘snaploc1’ snapshot fail without write privileges:
# isi snapshot locks delete snaploc1 1 Privilege check failed. The following write privilege is required: Snapshot locks (ISI_PRIV_SNAPSHOT_LOCKS)
Write privileges are added to ‘ISI_PRIV_SNAPSHOT_LOCKS’ in the ‘snap’ role:
# isi auth roles modify snap --add-priv-write ISI_PRIV_SNAPSHOT_LOCKS # isi auth roles view snap | grep -I -A 1 locks ID: ISI_PRIV_SNAPSHOT_LOCKS Permission: w --
This allows the lock ‘ID 1’ to be successfully deleted from the ‘snaploc1’ snapshot:
# isi snapshot locks delete snaploc1 1 Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes # isi snap locks view snaploc1 1 No such lock
Using OneFS RBAC, an enhanced security approach for a site could be to create three OneFS roles on a cluster, each with an increasing realm of trust:
1. First, an IT ops/helpdesk role with ‘read’ access to the snapshot attributes would permit monitoring and troubleshooting, but no changes:
Snapshot Privilege | Description |
ISI_PRIV_SNAPSHOT_ALIAS | Read |
ISI_PRIV_SNAPSHOT_LOCKS | Read |
ISI_PRIV_SNAPSHOT_PENDING | Read |
ISI_PRIV_SNAPSHOT_RESTORE | Read |
ISI_PRIV_SNAPSHOT_SCHEDULES | Read |
ISI_PRIV_SNAPSHOT_SETTING | Read |
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT | Read |
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY | Read |
2. Next, a cluster admin role, with ‘read’ privileges for ‘ISI_PRIV_SNAPSHOT_LOCKS’ and ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ would prevent snapshot and lock deletion, but provide ‘write’ access for schedule configuration, restores, and so on.
Snapshot Privilege | Description |
ISI_PRIV_SNAPSHOT_ALIAS | Write |
ISI_PRIV_SNAPSHOT_LOCKS | Read |
ISI_PRIV_SNAPSHOT_PENDING | Write |
ISI_PRIV_SNAPSHOT_RESTORE | Write |
ISI_PRIV_SNAPSHOT_SCHEDULES | Write |
ISI_PRIV_SNAPSHOT_SETTING | Write |
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT | Read |
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY | Write |
3. Finally, a cluster security admin role (root equivalence) would provide full snapshot configuration and management, lock control, and deletion rights:
Snapshot Privilege | Description |
ISI_PRIV_SNAPSHOT_ALIAS | Write |
ISI_PRIV_SNAPSHOT_LOCKS | Write |
ISI_PRIV_SNAPSHOT_PENDING | Write |
ISI_PRIV_SNAPSHOT_RESTORE | Write |
ISI_PRIV_SNAPSHOT_SCHEDULES | Write |
ISI_PRIV_SNAPSHOT_SETTING | Write |
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT | Write |
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY | Write |
Note that when configuring OneFS RBAC, remember to remove the ‘ISI_PRIV_AUTH’ and ‘ISI_PRIV_ROLE’ privileges from all but the most trusted administrators.
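As an illustration, the first (IT ops/helpdesk) role above could be built along the following lines. This is a sketch only: the role name is hypothetical, and the '--add-priv-read' flag is assumed to be the read-only counterpart of the '--add-priv-write' option shown earlier in this article.
#!/bin/sh
# Hypothetical read-only snapshot monitoring role (illustrative sketch).
isi auth roles create SnapMonitor --description "Read-only snapshot monitoring"
# Grant read-only permission on each snapshot privilege (flag assumed - see note above).
for PRIV in ISI_PRIV_SNAPSHOT_ALIAS ISI_PRIV_SNAPSHOT_LOCKS \
            ISI_PRIV_SNAPSHOT_PENDING ISI_PRIV_SNAPSHOT_RESTORE \
            ISI_PRIV_SNAPSHOT_SCHEDULES ISI_PRIV_SNAPSHOT_SETTING \
            ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY
do
    isi auth roles modify SnapMonitor --add-priv-read "${PRIV}"
done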
Additionally, enterprise security management tools such as CyberArk can also be incorporated to manage authentication and access control holistically across an environment. These can be configured to frequently change passwords on trusted accounts (that is, every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.
While this article focuses exclusively on OneFS snapshots, the expanded use of RBAC granular privileges for enhanced security is germane to most key areas of cluster management and data protection, such as SyncIQ replication, and so on.
In addition to using snapshots for its own checkpointing system, SyncIQ, the OneFS data replication engine, supports snapshot replication to a target cluster.
OneFS SyncIQ replication policies contain an option for triggering a replication policy when a snapshot of the source directory is completed. Additionally, at the onset of a new policy configuration, when the “Whenever a Snapshot of the Source Directory is Taken” option is selected, a checkbox appears to enable any existing snapshots in the source directory to be replicated. More information is available in this SyncIQ paper.
File data is arguably the most difficult to protect, because:
The Cyber Security Framework (CSF) from the National Institute of Standards and Technology (NIST) categorizes the threat-through-recovery process into five functions: Identify, Protect, Detect, Respond, and Recover.
Within the ‘Protect’ phase, there are two core aspects:
Feature | Description |
Access control | Where the core data protection functions are being executed. Assess who actually needs write access. |
Immutability | Having immutable snapshots, replica versions, and so on. Augmenting backup strategy with an archiving strategy with SmartLock WORM. |
Encryption | Encrypting both data in-flight and data at rest. |
Anti-virus | Integrating with anti-virus/anti-malware protection that does content inspection. |
Security advisories | Dell Security Advisories (DSA) inform customers about fixes to common vulnerabilities and exposures. |
The combination of OneFS snapshots and SyncIQ replication allows for granular data recovery. This means that only the affected files are recovered, while the most recent changes are preserved for the unaffected data. While an on-prem air-gapped cyber vault can still provide secure network isolation, in the event of an attack, the ability to failover to a fully operational ‘clean slate’ remote site provides additional security and peace of mind.
We’ll explore PowerScale cyber protection and recovery in more depth in a future article.
Author: Nick Trimbee
Fri, 21 Apr 2023 16:41:36 -0000
|Read Time: 0 minutes
The previous article in this series looked at an overview of OneFS SupportAssist. Now, we’ll turn our attention to its core architecture and operation.
Under the hood, SupportAssist relies on the following infrastructure and services:
Service | Name |
ESE | Embedded Service Enabler. |
isi_rice_d | Remote Information Connectivity Engine (RICE). |
isi_crispies_d | Coordinator for RICE Incidental Service Peripherals including ESE Start. |
Gconfig | OneFS centralized configuration infrastructure. |
MCP | Master Control Program – starts, monitors, and restarts OneFS services. |
Tardis | Configuration service and database. |
Transaction journal | Task manager for RICE. |
Of these, ESE, isi_crispies_d, isi_rice_d, and the Transaction Journal are new in OneFS 9.5 and exclusive to SupportAssist. By contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.
The Remote Information Connectivity Engine (RICE) represents the new SupportAssist ecosystem for OneFS to connect to the Dell backend. The high level architecture is as follows:
Dell’s Embedded Service Enabler (ESE) is at the core of the connectivity platform and acts as a unified communications broker between the PowerScale cluster and Dell Support. ESE runs as a OneFS service and, on startup, looks for an on-premises gateway server. If none is found, it connects back to the connectivity pipe (SRS). The collector service then interacts with ESE to send telemetry, obtain upgrade packages, transmit alerts and events, and so on.
Depending on the available resources, ESE provides a base functionality with additional optional capabilities to enhance serviceability. ESE is multithreaded, and each payload type is handled by specific threads. For example, events are handled by event threads, binary and structured payloads are handled by web threads, and so on. Within OneFS, ESE gets installed to /usr/local/ese and runs as ‘ese’ user and group.
The responsibilities of isi_rice_d include listening for network changes, getting eligible nodes elected for communication, monitoring notifications from CRISPIES, and engaging Task Manager when ESE is ready to go.
The Task Manager is a core component of the RICE engine. Its responsibility is to watch the incoming tasks that are placed into the journal and to assign workers to step through the tasks until completion. It controls the resource utilization (Python threads) and distributes tasks that are waiting on a priority basis.
The ‘isi_crispies_d’ service exists to ensure that ESE is only running on the RICE active node, and nowhere else. It acts, in effect, like a specialized MCP just for ESE and RICE-associated services, such as IPA. This entails starting ESE on the RICE active node, re-starting it if it crashes on the RICE active node, and stopping it and restarting it on the appropriate node if the RICE active instance moves to another node. We are using ‘isi_crispies_d’ for this, and not MCP, because MCP does not support a service running on only one node at a time.
The core responsibilities of ‘isi_crispies_d’ include:
The state of ESE, and of other RICE service peripherals, is stored in the OneFS tardis configuration database so that it can be checked by RICE. Similarly, ‘isi_crispies_d’ monitors the OneFS Tardis configuration database to see which node is designated as the RICE ‘active’ node.
The ‘isi_telemetry_d’ daemon is started by MCP and runs when SupportAssist is enabled. It does not have to be running on the same node as the active RICE and ESE instance. Only one instance of ‘isi_telemetry_d’ will be active at any time, and the other nodes will be waiting for the lock.
You can query the current status and setup of SupportAssist on a PowerScale cluster by using the ‘isi supportassist settings view’ CLI command. For example:
# isi supportassist settings view Service enabled: Yes Connection State: enabled OneFS Software ID: ELMISL08224764 Network Pools: subnet0:pool0 Connection mode: direct Gateway host: - Gateway port: - Backup Gateway host: - Backup Gateway port: - Enable Remote Support: Yes Automatic Case Creation: Yes Download enabled: Yes
You can also do this from the WebUI by navigating to Cluster management > General settings > SupportAssist:
You can enable or disable SupportAssist by using the ‘isi services’ CLI command set. For example:
# isi services isi_supportassist disable The service 'isi_supportassist' has been disabled. # isi services isi_supportassist enable The service 'isi_supportassist' has been enabled. # isi services -a | grep supportassist isi_supportassist SupportAssist Monitor Enabled
You can check the core services, as follows:
# ps -auxw | grep -e 'rice' -e 'crispies' | grep -v grep root 8348 9.4 0.0 109844 60984 - Ss 22:14 0:00.06 /usr/libexec/isilon/isi_crispies_d /usr/bin/isi_crispies_d root 8183 8.8 0.0 108060 64396 - Ss 22:14 0:01.58 /usr/libexec/isilon/isi_rice_d /usr/bin/isi_rice_d
Note that when a cluster is provisioned with SupportAssist, ESRS can no longer be used. However, customers that have not previously connected their clusters to Dell Support can still provision ESRS, but will be presented with a message encouraging them to adopt the best practice of using SupportAssist.
Additionally, SupportAssist in OneFS 9.5 does not currently support IPv6 networking, so clusters deployed in IPv6 environments should continue to use ESRS until SupportAssist IPv6 integration is introduced in a future OneFS release.
Author: Nick Trimbee
Tue, 18 Apr 2023 20:07:18 -0000
|Read Time: 0 minutes
In this final article in the OneFS SupportAssist series, we turn our attention to management and troubleshooting.
Once the provisioning process described earlier in this series is complete, the isi supportassist settings view CLI command reports the status and health of SupportAssist operations on the cluster.
# isi supportassist settings view Service enabled: Yes Connection State: enabled OneFS Software ID: xxxxxxxxxx Network Pools: subnet0:pool0 Connection mode: direct Gateway host: - Gateway port: - Backup Gateway host: - Backup Gateway port: - Enable Remote Support: Yes Automatic Case Creation: Yes Download enabled: Yes
This can also be obtained from the WebUI by going to Cluster management > General settings > SupportAssist:
There are some caveats and considerations to keep in mind when upgrading to OneFS 9.5 and enabling SupportAssist, including:
Also, note that Secure Remote Services can no longer be used after SupportAssist has been provisioned on a cluster.
SupportAssist has a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support and backend services through the Embedded Service Enabler (ESE). These workflows include CELOG events; in-product activation (IPA) information; CloudIQ telemetry data; Isi-Gather-info (IGI) logsets; and provisioning, configuration, and authentication data to ESE and the various backend services.
Activity | Information |
Events and alerts | SupportAssist can be configured to send CELOG events. |
Diagnostics | The OneFS isi diagnostics gather and isi_gather_info logfile collation and transmission commands have a SupportAssist option. |
HealthChecks | HealthCheck definitions are updated using SupportAssist. |
License Activation | The isi license activation start command uses SupportAssist to connect. |
Remote Support | Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters. |
Telemetry | CloudIQ telemetry data is sent using SupportAssist. |
Once SupportAssist is up and running, it can be configured to send CELOG events and attachments through ESE to CLM. This can be managed by the isi event channels CLI command syntax. For example:
# isi event channels list ID Name Type Enabled ----------------------------------------------- 1 RemoteSupport connectemc No 2 Heartbeat Self-Test heartbeat Yes 3 SupportAssist supportassist No ----------------------------------------------- Total: 3 # isi event channels view SupportAssist ID: 3 Name: SupportAssist Type: supportassist Enabled: No
Or from the WebUI:
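To switch the channel on from the CLI, something along the following lines should work (a sketch that assumes the standard 'isi event channels modify' syntax for enabling a channel):
# isi event channels modify SupportAssist --enabled true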
In OneFS 9.5, SupportAssist provides an option to send telemetry data to CloudIQ. This can be enabled from the CLI as follows:
# isi supportassist telemetry modify --telemetry-enabled 1 --telemetry-persist 0 # isi supportassist telemetry view Telemetry Enabled: Yes Telemetry Persist: No Telemetry Threads: 8 Offline Collection Period: 7200
Or in the SupportAssist WebUI:
Also in OneFS 9.5, the isi diagnostics gather and isi_gather_info CLI commands both include a --supportassist upload option for log gathers, which also allows them to continue to function through a new “emergency mode” when the cluster is unhealthy. For example, to start a gather from the CLI that will be uploaded through SupportAssist:
# isi diagnostics gather start --supportassist 1
Similarly, for ISI gather info:
# isi_gather_info --supportassist
Or to explicitly avoid using SupportAssist for ISI gather info log gather upload:
# isi_gather_info --nosupportassist
This can also be configured from the WebUI at Cluster management > General configuration > Diagnostics > Gather:
PowerScale License Activation (previously known as In-Product Activation) facilitates the management of the cluster's entitlements and licenses by communicating directly with Software Licensing Central through SupportAssist.
To activate OneFS product licenses through the SupportAssist WebUI:
Note that it can take up to 24 hours for the activation to occur.
Alternatively, cluster license activation codes (LAC) can also be added manually.
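From the CLI, a minimal sketch of the equivalent activation workflow uses the commands referenced above (additional arguments may be required in your environment):
# isi license list
# isi license activation start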
When it comes to troubleshooting SupportAssist, the basic process flow is as follows:
The OneFS components and services involved in this flow are:
Component | Info |
---|---|
ESE | Embedded Service Enabler |
isi_rice_d | Remote Information Connectivity Engine (RICE) |
isi_crispies_d | Coordinator for RICE Incidental Service Peripherals including ESE Start |
Gconfig | OneFS centralized configuration infrastructure |
MCP | Master Control Program; starts, monitors, and restarts OneFS services |
Tardis | Configuration service and database |
Transaction journal | Task manager for RICE |
Of these, ESE, isi_crispies_d, isi_rice_d, and the transaction journal are new in OneFS 9.5 and exclusive to SupportAssist. In contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.
For its connectivity, SupportAssist elects a single leader node within the subnet pool, and NANON nodes are automatically avoided. Ports 443 and 8443 are required to be open for bi-directional communication between the cluster and Connectivity Hub, and port 9443 is used for communicating with a gateway. The SupportAssist ESE component communicates with a number of Dell backend services:
Debugging backend issues might involve one or more services, and Dell Support can assist with this process.
The main log files for investigating and troubleshooting SupportAssist issues and idiosyncrasies are isi_rice_d.log and isi_crispies_d.log. There is also an ese_log, which can be useful, too. These logs can be found at:
Component | Logfile location | Info |
---|---|---|
Rice | /var/log/isi_rice_d.log | Per node |
Crispies | /var/log/isi_crispies_d.log | Per node |
ESE | /ifs/.ifsvar/ese/var/log/ESE.log | Cluster-wide for single instance ESE |
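To quickly scan these logs for problems, standard tools can be combined with the isi_for_array utility to cover every node. For example, a simple sketch:
# isi_for_array 'grep -i error /var/log/isi_rice_d.log'
# isi_for_array 'grep -i error /var/log/isi_crispies_d.log'
# grep -i error /ifs/.ifsvar/ese/var/log/ESE.log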
Debug level logging can be configured from the CLI as follows:
# isi_for_array isi_ilog -a isi_crispies_d --level=debug+
# isi_for_array isi_ilog -a isi_rice_d --level=debug+
Note that the OneFS log gathers (such as the output from the isi_gather_info utility) will capture all the above log files, plus the pertinent SupportAssist Gconfig contexts and Tardis namespaces, for later analysis.
If needed, the Rice and ESE configurations can also be viewed as follows:
# isi_gconfig -t ese
[root] {version:1}
ese.mode (char*) = direct
ese.connection_state (char*) = disabled
ese.enable_remote_support (bool) = true
ese.automatic_case_creation (bool) = true
ese.event_muted (bool) = false
ese.primary_contact.first_name (char*) =
ese.primary_contact.last_name (char*) =
ese.primary_contact.email (char*) =
ese.primary_contact.phone (char*) =
ese.primary_contact.language (char*) =
ese.secondary_contact.first_name (char*) =
ese.secondary_contact.last_name (char*) =
ese.secondary_contact.email (char*) =
ese.secondary_contact.phone (char*) =
ese.secondary_contact.language (char*) =
(empty dir ese.gateway_endpoints)
ese.defaultBackendType (char*) = srs
ese.ipAddress (char*) = 127.0.0.1
ese.useSSL (bool) = true
ese.srsPrefix (char*) = /esrs/{version}/devices
ese.directEndpointsUseProxy (bool) = false
ese.enableDataItemApi (bool) = true
ese.usingBuiltinConfig (bool) = false
ese.productFrontendPrefix (char*) = platform/16/supportassist
ese.productFrontendType (char*) = webrest
ese.contractVersion (char*) = 1.0
ese.systemMode (char*) = normal
ese.srsTransferType (char*) = ISILON-GW
ese.targetEnvironment (char*) = PROD
# isi_gconfig -t rice
[root] {version:1}
rice.enabled (bool) = false
rice.ese_provisioned (bool) = false
rice.hardware_key_present (bool) = false
rice.supportassist_dismissed (bool) = false
rice.eligible_lnns (char*) = []
rice.instance_swid (char*) =
rice.task_prune_interval (int) = 86400
rice.last_task_prune_time (uint) = 0
rice.event_prune_max_items (int) = 100
rice.event_prune_days_to_keep (int) = 30
rice.jnl_tasks_prune_max_items (int) = 100
rice.jnl_tasks_prune_days_to_keep (int) = 30
rice.config_reserved_workers (int) = 1
rice.event_reserved_workers (int) = 1
rice.telemetry_reserved_workers (int) = 1
rice.license_reserved_workers (int) = 1
rice.log_reserved_workers (int) = 1
rice.download_reserved_workers (int) = 1
rice.misc_task_workers (int) = 3
rice.accepted_terms (bool) = false
(empty dir rice.network_pools)
rice.telemetry_enabled (bool) = true
rice.telemetry_persist (bool) = false
rice.telemetry_threads (uint) = 8
rice.enable_download (bool) = true
rice.init_performed (bool) = false
rice.ese_disconnect_alert_timeout (int) = 14400
rice.offline_collection_period (uint) = 7200
The -q flag can also be used in conjunction with the isi_gconfig command to identify any values that are not at their default settings. For example, the stock (default) Rice gconfig context will not report any configuration entries:
# isi_gconfig -q -t rice
[root] {version:1}
Thu, 13 Apr 2023 21:29:24 -0000
|Read Time: 0 minutes
In this article, we turn our attention to step 5: Provisioning SupportAssist on the cluster.
As part of this process, we’ll be using the access key and PIN credentials previously obtained from the Dell Support portal in step 2 above.
SupportAssist can be configured from the OneFS 9.5 WebUI by going to Cluster management > General settings > SupportAssist. To initiate the provisioning process on a cluster, click the Connect SupportAssist link, as shown here:
If SupportAssist is not configured, the Remote support page displays the following banner, warning of the future deprecation of SRS:
Similarly, when SupportAssist is not configured, the SupportAssist WebUI page also displays verbiage recommending the adoption of SupportAssist:
There is also a Connect SupportAssist button to begin the provisioning process.
Selecting the Configure SupportAssist button initiates the setup wizard.
1. Telemetry Notice
The first step requires checking and accepting the Infrastructure Telemetry Notice:
2. Support Contract
For the next step, enter the details for the primary support contact, as prompted:
You can also provide the information from the CLI by using the isi supportassist contacts command set. For example:
# isi supportassist contacts modify --primary-first-name=Nick --primary-last-name=Trimbee --primary-email=trimbn@isilon.com
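The configured contact details can then be reviewed. This assumes a view subcommand following the same pattern as the other isi supportassist commands:
# isi supportassist contacts view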
3. Establish Connections
Next, complete the Establish Connections page.
This involves the following steps:
At least one statically allocated IPv4 network subnet and pool are required for provisioning SupportAssist. OneFS 9.5 does not support IPv6 networking for SupportAssist remote connectivity. However, IPv6 support is planned for a future release.
Select one or more network pools or subnets from the options displayed. In this example, we select subnet0:pool0:
Or, from the CLI, select one or more static subnets or pools for outbound communication using the following syntax:
# isi supportassist settings modify --network-pools="subnet0.pool0"
Additionally, if the cluster has the OneFS 9.5 network firewall enabled (“isi network firewall settings”), ensure that outbound traffic is allowed on port 9443.
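For example, the firewall state can be checked from the CLI before provisioning. This is a sketch only; rule and policy names vary by configuration:
# isi network firewall settings view
# isi network firewall rules list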
In this next step, add the secure access key and PIN, which should have been obtained from the following Dell Support site in an earlier step of the provisioning procedure: https://www.dell.com/support/connectivity/product/isilon-onefs.
Alternatively, if configuring SupportAssist from the OneFS CLI, add the key and pin by using the following syntax:
# isi supportassist provision start --access-key <key> --pin <pin>
Or, to configure direct access (the default) from the CLI, ensure that the following parameter is set:
# isi supportassist settings modify --connection-mode direct
# isi supportassist settings view | grep -i "connection mode"
Connection mode: direct
Alternatively, to connect through a gateway, select the Connect via Secure Connect Gateway button:
Complete the Gateway host and Gateway port fields as appropriate for the environment.
Alternatively, to set up a gateway configuration from the CLI, use the isi supportassist settings modify syntax. For example, to use the gateway FQDN secure-connect-gateway.yourdomain.com and the default port 9443:
# isi supportassist settings modify --connection-mode gateway
# isi supportassist settings view | grep -i "connection mode"
Connection mode: gateway
# isi supportassist settings modify --gateway-host secure-connect-gateway.yourdomain.com --gateway-port 9443
When setting up the gateway connectivity option, Secure Connect Gateway v5.0 or later must be deployed within the data center. SupportAssist is incompatible with either ESRS gateway v3.52 or SAE gateway v4. However, Secure Connect Gateway v5.x is backward compatible with PowerScale OneFS ESRS, which allows the gateway to be provisioned and configured ahead of a cluster upgrade to OneFS 9.5.
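Before provisioning, it can also be worth verifying basic TCP reachability from a cluster node to the gateway on port 9443, for example with a generic check such as the following (assuming the nc utility is available and using the hypothetical gateway FQDN from the example above):
# nc -z -w 5 secure-connect-gateway.yourdomain.com 9443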
Finally, configure the support options:
When you have completed the configuration, the WebUI will confirm that SupportAssist is successfully configured and enabled, as follows:
Or from the CLI:
# isi supportassist settings view
Service enabled: Yes
Connection State: enabled
OneFS Software ID: ELMISL0223BJJC
Network Pools: subnet0.pool0, subnet0.testpool1, subnet0.testpool2, subnet0.testpool3, subnet0.testpool4
Connection mode: gateway
Gateway host: eng-sea-scgv5stg3.west.isilon.com
Gateway port: 9443
Backup Gateway host: eng-sea-scgv5stg.west.isilon.com
Backup Gateway port: 9443
Enable Remote Support: Yes
Automatic Case Creation: Yes
Download enabled: Yes
Thu, 13 Apr 2023 20:20:31 -0000
|Read Time: 0 minutes
In OneFS 9.5, several OneFS components now leverage SupportAssist as their secure off-cluster data retrieval and communication channel. These components include:
Component | Details |
---|---|
Events and Alerts | SupportAssist can send CELOG events and attachments through Embedded Service Enabler (ESE) to CLM. |
Diagnostics | Logfile gathers can be uploaded to Dell through SupportAssist. |
License activation | License activation uses SupportAssist for the isi license activation start CLI command. |
Telemetry | Telemetry is sent through SupportAssist to CloudIQ for analytics. |
Health check | Health check definition downloads now leverage SupportAssist. |
Remote Support | Remote Support now uses SupportAssist along with Connectivity Hub. |
For existing clusters, SupportAssist supports the same basic workflows as its predecessor, ESRS, so the transition from old to new is generally pretty seamless.
The overall process for enabling OneFS SupportAssist is as follows:
We’ll go through each of these configuration steps in order:
1. Upgrading the cluster to OneFS 9.5
First, the cluster must be running OneFS 9.5 in order to configure SupportAssist.
There are some additional considerations and caveats to bear in mind when upgrading to OneFS 9.5 and planning to enable SupportAssist.
Also, ensure that the user account that will be used to enable SupportAssist belongs to a role with the ISI_PRIV_REMOTE_SUPPORT read and write privilege:
# isi auth privileges | grep REMOTE
ISI_PRIV_REMOTE_SUPPORT Configure remote support
For example, for an ese user account:
# isi auth roles view SupportAssistRole
Name: SupportAssistRole
Description: -
Members: ese
Privileges
ID: ISI_PRIV_LOGIN_PAPI
Permission: r
ID: ISI_PRIV_REMOTE_SUPPORT
Permission: w
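If a suitable role does not already exist, a hedged sketch for creating one and adding the ese account from the example above might look like the following (exact privilege flags can differ between OneFS releases):
# isi auth roles create SupportAssistRole
# isi auth roles modify SupportAssistRole --add-priv ISI_PRIV_REMOTE_SUPPORT --add-user ese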
2. Obtaining a secure access key and PIN
An access key and PIN are required to provision SupportAssist, and these secure keys are held in key manager under the RICE domain. The access key and PIN can be obtained from the following Dell Support site: https://www.dell.com/support/connectivity/product/isilon-onefs.
In the Quick link navigation bar, select the Generate Access key link:
On the following page, select the appropriate button:
The credentials required to obtain an access key and pin vary, depending on prior cluster configuration. Sites that have previously provisioned ESRS will need their OneFS Software ID (SWID) to obtain their access key and pin.
The isi license list CLI command can be used to determine a cluster’s SWID. For example:
# isi license list | grep "OneFS Software ID"
OneFS Software ID: ELMISL999CKKD
However, customers with new clusters and/or customers who have not previously provisioned ESRS or SupportAssist will require their Site ID to obtain the access key and pin.
Note that any new cluster hardware shipping after January 2023 will already have an integrated key, so this key can be used in place of the Site ID.
For example, if this is the first time registering this cluster and it does not have an integrated key, select Yes, let’s register:
Enter the Site ID, site name, and location information for the cluster:
Choose a 4-digit PIN and save it for future reference. After that, click Create My Access Key:
The access key is then generated.
An automated email containing the pertinent key info is sent from the Dell | Services Connectivity Team. For example:
This access key is valid for one week, after which it automatically expires.
Next, in the cluster’s WebUI, go back to Cluster management > General settings > SupportAssist and enter the access key and PIN information in the appropriate fields. Finally, click Finish Setup to complete the SupportAssist provisioning process:
3. Deciding between direct or gateway topology
A topology decision will need to be made between implementing either direct connectivity or gateway connectivity, depending on the needs of the environment:
SupportAssist uses ports 443 and 8443 by default for bi-directional communication between the cluster and Connectivity Hub. These ports will need to be open across any firewalls or packet filters between the cluster and the corporate network edge to allow connectivity to Dell Support.
Additionally, port 9443 is used for communicating with a gateway (SCG).
# grep -i esrs /etc/services
isi_esrs_d 9443/tcp #EMC Secure Remote Support outbound alerts
4. Installing Secure Connect Gateway (optional)
This step is only required when deploying Dell Secure Connect Gateway (SCG). If a direct connect topology is preferred, go directly to step 5.
When configuring SupportAssist with the gateway connectivity option, Secure Connect Gateway v5.0 or later must be deployed within the data center.
Dell SCG is available for Linux, Windows, Hyper-V, and VMware environments, and, as of this writing, the latest version is 5.14.00.16. The installation binaries can be downloaded from https://www.dell.com/support/home/en-us/product-support/product/secure-connect-gateway/drivers.
Download SCG as follows:
The following steps are required to set up SCG:
Pertinent resources for installing SCG include:
Another useful source of SCG installation, configuration, and troubleshooting information is the Dell Support forum: https://www.dell.com/community/Secure-Connect-Gateway/bd-p/SCG
5. Provisioning SupportAssist on the cluster
At this point, the off-cluster prestaging work should be complete.
In the next article in this series, we turn our attention to the SupportAssist provisioning process on the cluster itself (step 5).
Tue, 04 Apr 2023 17:15:00 -0000
|Read Time: 0 minutes
For enterprises to harness the advantages of advanced storage technologies with Dell PowerScale, a transition from an existing platform is necessary. Enterprises are challenged by how the new architecture will fit into the existing infrastructure. This blog post provides an overview of PowerScale architecture, features, and nomenclature for enterprises migrating from NetApp ONTAP.
The PowerScale OneFS operating system is based on a distributed architecture, built from the ground up as a clustered system. Each PowerScale node provides compute, memory, networking, and storage. The concepts of controllers, HA, active/standby, and disk shelves are not applicable in a pure scale-out architecture. Thus, when a node is added to a cluster, the cluster performance and capacity increase collectively.
Due to the scale-out distributed architecture with a single namespace, single volume, single file system, and one single pane of management, the system management is far simpler than with traditional NAS platforms. In addition, the data protection is software-based rather than RAID-based, eliminating all the associated complexities, including configuration, maintenance, and additional storage utilization. Administrators do not have to be concerned with RAID groups or load distribution.
NetApp’s ONTAP storage operating system has evolved into a clustered system built around controllers. The system includes ONTAP FlexGroups, which are composed of FlexVols residing on aggregates across nodes.
OneFS is a single volume, which makes cluster management simple. As the cluster grows in capacity, the single volume automatically grows. Administrators are no longer required to migrate data between volumes manually. OneFS repopulates and balances data between all nodes when a new node is added, making the node part of the global namespace. All the nodes in a PowerScale cluster are equal in the hierarchy. Drives share data intranode and internode.
PowerScale is easy to deploy, operate, and manage. Most enterprises require only one full-time employee to manage a PowerScale cluster.
For more information about the PowerScale OneFS architecture, see PowerScale OneFS Technical Overview and Dell PowerScale OneFS Operating System.
Figure 1. Dell PowerScale scale-out NAS architecture
The single volume and single namespace of PowerScale OneFS also lead to a unique feature set. Because the entire NAS is a single file system, the concepts of FlexVols, shares, qtrees, and FlexGroups do not apply. Each NetApp volume has specific properties associated with limited storage space. Adding more storage space to NetApp ONTAP could be an onerous process depending on the current architecture. Conversely, on a PowerScale cluster, as soon as a node is added, the cluster is rebalanced automatically, leading to minimal administrator management.
NetApp’s continued dependence on volumes creates potential added complexity for storage administrators. From a software perspective, the intricacies that arise from the concept of volumes span across all the features. Configuring software features requires administrators to base decisions on the volume concept, limiting configuration options. The volume concept is further magnified by the impacts on storage utilization.
The fact that OneFS is a single volume means that many features are not volume dependent but, rather, span the entire cluster. SnapshotIQ, NDMP backups, and SmartQuotas do not have limits based on volumes; instead, they are cluster-specific or directory-specific.
As a single-volume NAS designed for file storage, OneFS provides scalable capacity and ease of management, combined with the features that administrators require. Robust policy-driven features such as SmartConnect, SmartPools, and CloudPools enable maximum utilization of nodes for superior performance and storage efficiency. SmartConnect can be used to configure access zones that map client connections to nodes with specific performance characteristics. SmartPools can tier cold data to nodes with deep archive storage, and CloudPools can store frozen data in the cloud. Regardless of where the data resides, it is presented as a single namespace to the end user.
Storage utilization is the amount of storage available after the NAS system overhead is deducted. The overhead consists of the space required for data protection and the operating system.
For data protection, OneFS uses software-based Reed-Solomon error correction with up to N+4 protection. OneFS offers several custom protection options covering node and drive failures, which vary according to the cluster configuration. Because the protection is software-based, OneFS can protect against more simultaneous hardware failures than traditional RAID while delivering significantly higher storage utilization.
The software-based data protection stripes data across nodes in stripe units, some of which are Forward Error Correction (FEC), or parity, units. The FEC units provide the redundancy needed to reconstruct data in the case of a drive or node failure. Protection can be configured for node loss alone or as hybrid protection against combined node and drive failures.
With software-based data protection, the protection scheme is not per cluster. It has additional granularity that allows for making data protection specific to a file or directory—without creating additional storage volumes or manually migrating data. Instead, OneFS runs a job in the background, moving data as configured.
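For example, the protection applied to an individual file or directory can be inspected with the isi get command (the path here is hypothetical):
# isi get /ifs/data/projects/file1.txt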
Figure 2. OneFS data protection
OneFS protects data stored on failing nodes or drives in a cluster through a process called SmartFail. During the process, OneFS places a device into quarantine and, depending on the severity of the issue, places the data on the device into a read-only state. While a device is quarantined, OneFS reprotects the data on the device by distributing the data to other devices.
NetApp’s data protection is all RAID-based, including NetApp RAID-TEC, NetApp RAID-DP, and RAID 4. NetApp only supports a maximum of triple parity, and simultaneous node failures in an HA pair are not supported.
For more information about SmartFail, see the following blog: OneFS Smartfail. For more information about OneFS data protection, see High Availability and Data Protection with Dell PowerScale Scale-Out NAS.
NetApp requires administrators to manually create space and explicitly define aggregates and flexible volumes. The concepts of FlexVols, shares, and qtrees are nonexistent in OneFS, because the file system is a single volume and namespace spanning the entire cluster.
SMB shares and NFS exports are created through the OneFS WebUI or command-line interface. Either method allows the user to create a share or export within seconds, complete with security options. SmartQuotas is used to manage storage limits cluster-wide, across the entire namespace. Quotas can provide accounting, warning messages, or hard enforcement limits, and they can be applied by directory, user, or group.
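As a simple illustration of the OneFS side, a hedged sketch of creating a directory quota with a hard limit (the path and threshold are hypothetical) might look like:
# isi quota quotas create /ifs/data/projects directory --hard-threshold 500G
# isi quota quotas list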
Conversely, ONTAP quota management is at the volume or FlexGroup level, creating additional administrative overhead because the process is more onerous.
The OneFS snapshot feature is SnapshotIQ, which does not have specified or enforced limits for snapshots per directory or snapshots per cluster. However, the best practice is 1,024 snapshots per directory and 20,000 snapshots per cluster. OneFS also supports writable snapshots. For more information about SnapshotIQ and writable snapshots, see High Availability and Data Protection with Dell PowerScale Scale-Out NAS.
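For example, a one-off snapshot of a directory can be taken and listed from the CLI (the snapshot name and path here are hypothetical):
# isi snapshot snapshots create /ifs/data/projects --name projects-snap1
# isi snapshot snapshots list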
NetApp Snapshot supports 255 snapshots per volume in ONTAP 9.3 and earlier. ONTAP 9.4 and later versions support 1,023 snapshots per volume. By default, NetApp requires a space reservation of 5 percent in the volume when snapshots are used, requiring the space reservation to be monitored and manually increased if space becomes exhausted. Further, the space reservation can also affect volume availability. The space reservation requirement creates additional administration overhead and affects storage efficiency by setting aside space that might or might not be used.
Data replication is required for disaster recovery, RPO, or RTO requirements. OneFS provides data replication through SyncIQ and SmartSync.
SyncIQ provides file-based asynchronous data replication, whereas NetApp’s asynchronous replication, SnapMirror, is block-based. SyncIQ provides options for ensuring that all data is retained during failover and failback from the disaster recovery cluster. SyncIQ is fully configurable with options for execution times and bandwidth management. A SyncIQ target cluster may be configured as a target for several source clusters.
SyncIQ offers a single-button automated process for failover and failback with Superna Eyeglass DR Edition. For more information about Superna Eyeglass DR Edition, see Superna | DR Edition (supernaeyeglass.com).
SyncIQ allows configurable options for replication down to a specific file, directory, or entire cluster. Conversely, NetApp’s SnapMirror replication starts at the volume at a minimum. The volume concept and dependence on volume requirements continue to add management complexity and overhead for administrators while also wasting storage utilization.
To address the requirements of the modern enterprise, OneFS version 9.4.0.0 introduced SmartSync. This feature replicates file-to-file data between PowerScale clusters. SmartSync cloud copy replicates file-to-object data from PowerScale clusters to Dell ECS and cloud providers, and it can also pull the replicated object data from a cloud provider back to a PowerScale cluster in file format. Having multiple target destinations allows administrators to store multiple copies of a dataset across locations, providing further disaster recovery readiness. For more information about SyncIQ, see Dell PowerScale SyncIQ: Architecture, Configuration, and Considerations. For more information about SmartSync, see Dell PowerScale SmartSync.
OneFS SmartQuotas provides configurable options to monitor and enforce storage limits at the user, group, cluster, directory, or subdirectory level. ONTAP quotas are user-, tree-, volume-, or group-based.
For more information about SmartQuotas, see Storage Quota Management and Provisioning with Dell PowerScale SmartQuotas.
Because OneFS is a distributed architecture across a collection of nodes, client connectivity to these nodes requires load balancing. OneFS SmartConnect provides options for balancing the client connections to the nodes within a cluster. Balancing options are round-robin or based on current load. Also, SmartConnect zones can be configured to have clients connect based on group and performance needs. For example, the Engineering group might require high-performance nodes. A zone can be configured, forcing connections to those nodes.
NetApp ONTAP supports multitenancy with Storage Virtual Machines (SVMs), formerly known as vServers, and Logical Interfaces (LIFs). SVMs isolate storage and network resources across a cluster of controller HA pairs. SVMs require managing protocols, shares, and volumes for successful provisioning. Volumes cannot be nondisruptively moved between SVMs. ONTAP supports load balancing using LIFs, but configuration is manual and must be implemented by the storage administrator. Further, it requires continuous monitoring because it is based on the load on the controller.
OneFS provides multitenancy through SmartConnect and access zones. Management is simple because the file system is one volume and access is provided by hostname and directory, rather than by volume. SmartConnect is policy-driven and does not require continuous monitoring. SmartConnect settings may be changed on demand as the requirements change.
SmartConnect zones allow administrators to provision DNS hostnames specific to IP pools, subnets, and network interfaces. If only a single authentication provider is required, all the SmartConnect zones map to a default access zone. However, if directory access and authentication providers vary, multiple access zones are provisioned, mapping to a directory, authentication provider, and SmartConnect zone. As a result, authenticated users of an access zone only have visibility into their respective directory. Conversely, an administrator with complete file system access can migrate data nondisruptively between directories.
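As a simplified, hedged sketch (the zone name, path, pool ID, and DNS name are all hypothetical), an access zone and its associated SmartConnect zone name might be provisioned as follows:
# isi zone zones create eng-zone --path /ifs/eng
# isi network pools modify groupnet0.subnet0.pool1 --access-zone eng-zone --sc-dns-zone eng.cluster.yourdomain.com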
For more information about SmartConnect, see PowerScale: Network Design Considerations.
Both ONTAP and OneFS provide compression. The OneFS deduplication feature is SmartDedupe, which allows deduplication to run at a cluster-wide level, improving the overall Data Reduction Rate (DRR) and storage utilization. With ONTAP, deduplication is enabled at the aggregate level and cannot span nodes.
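For example, a deduplication job can be kicked off and its savings reviewed from the CLI (a sketch; dedupe paths must already be configured in the SmartDedupe settings):
# isi job jobs start Dedupe
# isi dedupe stats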
For more information about OneFS data reduction, see Dell PowerScale OneFS: Data Reduction and Storage Efficiency. For more information about SmartDedupe, see Next-Generation Storage Efficiency with Dell PowerScale SmartDedupe.
OneFS has integrated features to tier data based on the data’s age or file type. NetApp has similar functionality with FabricPools.
OneFS SmartPools uses robust policies to enable data placement and movement across multiple types of storage. SmartPools can be configured to move data to a set of nodes automatically. For example, if a file has not been accessed in the last 90 days, it can be migrated to a node with deeper storage, allowing admins to define the value of storage based on performance.
OneFS CloudPools migrates data to a cloud provider, with only a stub remaining on the PowerScale cluster, based on similar policies. CloudPools not only tiers data to a cloud provider but also recalls the data back to the cluster as demanded. From a user perspective, all the data is still in a single namespace, irrespective of where it resides.
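For example, the tiers and file pool policies currently defined on a cluster can be reviewed from the CLI:
# isi storagepool tiers list
# isi filepool policies list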
Figure 3. OneFS SmartPools and CloudPools
ONTAP tiers to S3 object stores using FabricPools.
For more information about SmartPools, see Storage Tiering with Dell PowerScale SmartPools. For more information about CloudPools, see:
Dell InsightIQ and Dell CloudIQ provide performance monitoring and reporting capabilities. InsightIQ includes advanced analytics to optimize applications, correlate cluster events, and accurately forecast future storage needs. NetApp provides performance monitoring and reporting with Cloud Insights and Active IQ, which are accessible within BlueXP.
For more information about CloudIQ, see CloudIQ: A Detailed Review. For more information about InsightIQ, see InsightIQ on Dell Support.
Similar to ONTAP, the PowerScale OneFS operating system comes with a comprehensive set of integrated security features. These include data at rest and data in flight encryption, virus scanning, SmartLock WORM compliance, an external key manager for data at rest encryption, a STIG-hardened security profile, Common Criteria certification, and support for UEFI Secure Boot across PowerScale platforms. Further, OneFS may be configured for a Zero Trust architecture and PCI-DSS compliance.
Superna exclusively provides the following security-focused applications for PowerScale OneFS:
Figure 4. Superna security
NetApp ONTAP security is limited to the integrated features listed above. Additional applications for further security monitoring, like Superna, are not available for ONTAP.
For more information about Superna security, see supernaeyeglass.com. For more information about PowerScale security, see Dell PowerScale OneFS: Security Considerations.
NetApp and PowerScale OneFS both support several methods for user authentication and access control. OneFS supports UNIX and Windows permissions for data-level access control. OneFS is designed for a mixed environment that allows the configuration of both Windows Access Control Lists (ACLs) and standard UNIX permissions on the cluster file system. In addition, OneFS provides user and identity mapping, permission mapping, and merging between Windows and UNIX environments.
OneFS supports local and remote authentication providers. Anonymous access is supported for protocols that allow it. Concurrent use of multiple authentication provider types, including Active Directory, LDAP, and NIS, is supported. For example, OneFS is often configured to authenticate Windows clients with Active Directory and to authenticate UNIX clients with LDAP.
OneFS supports role-based access control (RBAC), allowing administrative tasks to be configured without a root or administrator account. A role is a collection of OneFS privileges that are limited to an area of administration. Custom roles for security, auditing, storage, or backup tasks may be provisioned with RBACs. Privileges are assigned to roles. As users log in to the cluster through the platform API, the OneFS command-line interface, or the OneFS web administration interface, they are granted privileges based on their role membership.
For more information about OneFS authentication and access control, see PowerScale OneFS Authentication, Identity Management, and Authorization.
To learn more about PowerScale OneFS, see the following resources:
Tue, 21 Mar 2023 18:30:54 -0000
|Read Time: 0 minutes
The previous articles in this series have covered the SmartQoS architecture, configuration, and management. Now, we’ll turn our attention to monitoring and troubleshooting.
You can use the ‘isi statistics workload’ CLI command to monitor the dataset’s performance. The ‘Ops’ column displays the current protocol operations per second. In the following example, Ops stabilize around 9.8, which is just below the configured limit value of 10 Ops.
# isi statistics workload --dataset ds1 & data
Similarly, this next example from the SmartQoS WebUI shows a small NFS workflow performing 497 protocol Ops in a pinned workload with a limit of 500 Ops:
You can pin multiple paths and protocols by selecting the ‘Pin Workload’ option for a given Dataset. Here, four directory path workloads are each configured with different Protocol Ops limits:
When it comes to troubleshooting SmartQoS, there are a few areas that are worth checking right away, including the SmartQoS Ops limit configuration, isi_pp_d and isi_stats_d daemons, and the protocol service(s).
1. For suspected Ops limit configuration issues, first confirm that the SmartQoS limits feature is enabled:
# isi performance settings view
Top N Collections: 1024
Time In Queue Threshold (ms): 10.0
Target read latency in microseconds: 12000.0
Target write latency in microseconds: 12000.0
Protocol Ops Limit Enabled: Yes
Next, verify that the workload level protocols_ops limit is correctly configured:
# isi performance workloads view <workload>
Check whether any errors are reported in the isi_tardis_d configuration log:
# cat /var/log/isi_tardis_d.log
2. To investigate isi_pp_d, first check that the service is enabled:
# isi services -a isi_pp_d
Service 'isi_pp_d' is enabled.
If necessary, you can restart the isi_pp_d service as follows:
# isi services isi_pp_d disable
Service 'isi_pp_d' is disabled.
# isi services isi_pp_d enable
Service 'isi_pp_d' is enabled.
There’s also an isi_pp_d debug tool, which can be helpful in a pinch:
# isi_pp_d -h
Usage: isi_pp_d [-ldhs]
-l Run as a leader process; otherwise, run as a follower. Only one leader process on the cluster will be active.
-d Run in debug mode (do not daemonize).
-s Display pp_leader node (devid and lnn)
-h Display this help.
You can enable debugging on the isi_pp_d log file with the following command syntax:
# isi_ilog -a isi_pp_d -l debug, /var/log/isi_pp_d.log
For example, the following log snippet shows a typical isi_ppd_d.log message communication between isi_pp_d leader and isi_pp_d followers:
/ifs/.ifsvar/modules/pp/comm/SETTINGS
[090500b000000b80,08020000:0000bfddffffffff,09000100:ffbcff7cbb9779de,09000100:d8d2fee9ff9e3bfe,09000100:0000000075f0dfdf]
100,,,,20,1658854839 < in the format of <workload_id, cputime, disk_reads, disk_writes, protocol_ops, timestamp>
Here, the extract from the /var/log/isi_pp_d.log logfiles from nodes 1 and 2 of a cluster illustrate the different stages of Protocol Ops limit enforcement and usage:
3. To investigate isi_stats_d, first confirm that the isi_stats_d service is enabled:
# isi services -a isi_stats_d
Service 'isi_stats_d' is enabled.
If necessary, you can restart the isi_stats_d service as follows:
# isi services isi_stats_d disable
# isi services isi_stats_d enable
You can view the workload level statistics with the following command:
# isi statistics workload list --dataset=<name>
You can enable debugging on the isi_stats_d log file with the following command syntax:
# isi_stats_tool --action set_tracelevel --value debug
# cat /var/log/isi_stats_d.log
4. To investigate protocol issues, the ‘isi services’ and ‘lwsm’ CLI commands can be useful. For example, to check the status of the S3 protocol:
# /usr/likewise/bin/lwsm list | grep -i protocol
hdfs [protocol] stopped
lwswift [protocol] running (lwswift: 8393)
nfs [protocol] running (nfs: 8396)
s3 [protocol] stopped
srv [protocol] running (lwio: 8096)
# /usr/likewise/bin/lwsm status s3
stopped
# /usr/likewise/bin/lwsm info s3
Service: s3
Description: S3 Server
Categories: protocol
Path: /usr/likewise/lib/lw-svcm/s3.so
Arguments:
Dependencies: lsass onefs_s3 AuditEnabled?flt_audit_s3
Container: s3
This CLI output confirms that the S3 protocol is inactive. You can verify that the S3 service is enabled in OneFS as follows:
# isi services -a | grep -i s3
s3 S3 Service Enabled
Similarly, you can restart the S3 service as follows:
# /usr/likewise/bin/lwsm restart s3
Stopping service: s3
Starting service: s3
To investigate further, you can increase the protocol’s log level verbosity. For example, to set the s3 log to ‘debug’:
# isi s3 log-level view
Current logging level is 'info'
# isi s3 log-level modify debug
# isi s3 log-level view
Current logging level is 'debug'
Next, view and monitor the appropriate protocol log. For example, for the S3 protocol:
# cat /var/log/s3.log
# tail -f /var/log/s3.log
Beyond the above, you can monitor /var/log/messages for pertinent errors, because the main partition performance (PP) modules log to this file. You can enable debug level logging for the various PP modules as follows.
Dataset:
# sysctl ilog.ifs.acct.raa.syslog=debug+
ilog.ifs.acct.raa.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug
Workload:
# sysctl ilog.ifs.acct.rat.syslog=debug+
ilog.ifs.acct.rat.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug
Actor work:
# sysctl ilog.ifs.acct.work.syslog=debug+
ilog.ifs.acct.work.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug
When finished, you can restore the default logging levels for the above modules as follows:
# sysctl ilog.ifs.acct.raa.syslog=notice+
# sysctl ilog.ifs.acct.rat.syslog=notice+
# sysctl ilog.ifs.acct.work.syslog=notice+
Author: Nick Trimbee
Tue, 14 Mar 2023 16:06:06 -0000
|Read Time: 0 minutes
In the previous article in this series, we looked at the underlying architecture and management of SmartQoS in OneFS 9.5. Next, we’ll step through an example SmartQoS configuration using the CLI and WebUI.
After an initial set up, configuring a SmartQoS protocol Ops limit comprises four fundamental steps. These are:
Step | Task | Description | Example |
---|---|---|---|
1 | Identify Metrics of interest | Used for tracking, to enforce an Ops limit | Uses ‘path’ and ‘protocol’ for the metrics to identify the workload. |
2 | Create a Dataset | For tracking all of the chosen metric categories | Create the dataset ‘ds1’ with the metrics identified. |
3 | Pin a Workload | To specify exactly which values to track within the chosen metrics | path: /ifs/data/client_exports protocol: nfs3 |
4 | Set a Limit | To limit Ops based on the dataset, metrics (categories), and metric values defined by the workload | Protocol_ops limit: 100 |
Step 1:
First, select a metric of interest. For this example, we’ll use the following:
If not already present, create and verify an NFS export – in this case at /ifs/test/expt_nfs:
# isi nfs exports create /ifs/test/expt_nfs
# isi nfs exports list
ID Zone Paths Description
------------------------------------------------
1 System /ifs/test/expt_nfs
------------------------------------------------
Or from the WebUI, under Protocols > UNIX sharing (NFS) > NFS exports:
Step 2:
The ‘dataset’ designation is used to categorize workload by various identification metrics, including:
ID Metric | Details |
---|---|
Username | UID or SID |
Primary groupname | Primary GID or GSID |
Secondary groupname | Secondary GID or GSID |
Zone name | |
IP address | Local or remote IP address or IP address range |
Path | Except for S3 protocol |
Share | SMB share or NFS export ID |
Protocol | NFSv3, NFSv4, NFSoRDMA, SMB, or S3 |
SmartQoS in OneFS 9.5 only allows protocol Ops as the transient resources used for configuring a limit ceiling.
For example, you can use the following CLI command to create a dataset ‘ds1’, specifying protocol and path as the ID metrics:
# isi performance datasets create --name ds1 protocol path
Created new performance dataset 'ds1' with ID number 1.
Note: Resource usage tracking by the ‘path’ metric is only supported by SMB and NFS.
The following command displays any configured datasets:
# isi performance datasets list
Or, from the WebUI, by navigating to Cluster management > Smart QoS:
Step 3:
After you have created the dataset, you can pin a workload to it by specifying the metric values. For example:
# isi performance workloads pin ds1 protocol:nfs3 path:/ifs/test/expt_nfs
Pinned performance dataset workload with ID number 100.
Or from the WebUI, by browsing to Cluster management > Smart QoS > Pin workload:
After pinning a workload, the entry appears in the ‘Top Workloads’ section of the WebUI page. However, wait at least 30 seconds to start receiving updates.
To list all the pinned workloads from a specified dataset, use the following command:
# isi performance workloads list ds1
The prior command’s output indicates that there are currently no limits set for this workload.
By default, a protocol ops limit exists for each workload; however, it is set to the maximum value of a 64-bit unsigned integer. This is represented in the CLI output by a dash ("-") if a limit has not been explicitly configured:
# isi performance workloads list ds1
ID Name Metric Values Creation Time Cluster Resource Impact Client Impact Limits
--------------------------------------------------------------------------------------
100 - path:/ifs/test/expt_nfs 2023-02-02T12:06:05 - - -
protocol:nfs3
--------------------------------------------------------------------------------------
Total: 1
Step 4:
For a pinned workload in a dataset, you can configure a limit for the protocol ops limit from the CLI, using the following syntax:
# isi performance workloads modify <dataset> <workload ID> --limits protocol_ops:<value>
When configuring SmartQoS, always be aware that it is a powerful performance throttling tool which can be applied to significant areas of a cluster’s data and userbase. For example, protocol Ops limits can be configured for metrics such as ‘path:/ifs’, which would affect the entire /ifs filesystem, or ‘zone_name:System’ which would limit the System access zone and all users within it. While such configurations are entirely valid, they would have a significant, system-wide impact. As such, exercise caution when configuring SmartQoS to avoid any inadvertent, unintended, or unexpected performance constraints.
In the following example, the dataset is ‘ds1’, the workload ID is ‘100’, and the protocol Ops limit is set to the value ‘10’:
# isi performance workloads modify ds1 100 --limits protocol_ops:10 protocol_ops: 18446744073709551615 -> 10
Or from the WebUI, by browsing to Cluster management > Smart QoS > Pin and throttle workload:
You can use the ‘isi performance workloads’ command in ‘list’ mode to show details of the workload ‘ds1’. In this case, ‘Limits’ is set to protocol_ops = 10.
# isi performance workloads list ds1
ID Name Metric Values Creation Time Cluster Resource Impact Client Impact Limits
--------------------------------------------------------------------------------------
100 - path:/ifs/test/expt_nfs 2023-02-02T12:06:05 - - protocol_ops:10
protocol:nfs3
--------------------------------------------------------------------------------------
Total: 1
Or in ‘view’ mode:
# isi performance workloads view ds1 100
ID: 100
Name: -
Metric Values: path:/ifs/test/expt_nfs, protocol:nfs3
Creation Time: 2023-02-02T12:06:05
Cluster Resource Impact: -
Client Impact: -
Limits: protocol_ops:10
Or from the WebUI, by browsing to Cluster management > Smart QoS:
You can easily modify the limit value of a pinned workload with the following CLI syntax. For example, to set the limit to 100 Ops:
# isi performance workloads modify ds1 100 --limits protocol_ops:100
Or from the WebUI, by browsing to Cluster management > Smart QoS > Edit throttle:
Similarly, you can use the following CLI command to easily remove a protocol ops limit for a pinned workload:
# isi performance workloads modify ds1 100 --no-protocol-ops-limit
Or from the WebUI, by browsing to Cluster management > Smart QoS > Remove throttle:
Author: Nick Trimbee
Mon, 13 Mar 2023 23:31:33 -0000
|Read Time: 0 minutes
Among the myriad of new features included in the OneFS 9.5 release is SupportAssist, Dell’s next-gen remote connectivity system. SupportAssist is included with all support plans (features vary based on service level agreement).
Dell SupportAssist rapidly identifies, diagnoses, and resolves cluster issues and provides the following key benefits:
Within OneFS, SupportAssist transmits events, logs, and telemetry from PowerScale to Dell support. As such, it provides a full replacement for the legacy ESRS.
Delivering a consistent remote support experience across the Dell storage portfolio, SupportAssist is intended for all sites that can send telemetry off-cluster to Dell over the Internet. SupportAssist integrates the Dell Embedded Service Enabler (ESE) into PowerScale OneFS along with a suite of daemons to allow its use on a distributed system.
SupportAssist | ESRS |
---|---|
Dell’s next-generation remote connectivity solution | Being phased out of service |
Can either connect directly, or through supporting gateways | Can only use gateways for remote connectivity |
Uses Connectivity Hub to coordinate support | Uses ServiceLink to coordinate support |
Using the Dell Connectivity Hub, SupportAssist can either interact directly or through a Secure Connect gateway.
SupportAssist has a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support and backend services through the Embedded Service Enabler (ESE). These workflows include CELOG events; In-product activation (IPA) information; CloudIQ telemetry data; Isi-Gather-info (IGI) logsets; and provisioning, configuration, and authentication data to ESE and the various backend services.
Workflow | Details |
---|---|
CELOG | In OneFS 9.5, SupportAssist can be configured to send CELOG events and attachments through ESE to CLM. CELOG has a “supportassist” channel that, when active, creates an EVENT task for SupportAssist to propagate. |
License Activation | The isi license activation start command uses SupportAssist to connect. Several pieces of PowerScale and OneFS functionality require licenses, and must communicate with the Dell backend services in order to register and activate those cluster licenses. In OneFS 9.5, SupportAssist is the preferred mechanism to send those license activations through ESE to the Dell backend. License information can be generated through the isi license generate CLI command and then activated with the isi license activation start syntax. |
Provisioning | SupportAssist must register with backend services in a process known as provisioning. This process must be run before the ESE will respond on any of its other available API endpoints. Provisioning can only successfully occur once per installation, and subsequent provisioning tasks will fail. SupportAssist must be configured through the CLI or WebUI before provisioning. The provisioning process uses authentication information that was stored in the key manager upon the first boot. |
Diagnostics | The OneFS isi diagnostics gather and isi_gather_info logfile collation and transmission commands have a --supportassist option. |
Healthchecks | HealthCheck definitions are updated using SupportAssist. |
Telemetry | CloudIQ telemetry data is sent using SupportAssist. |
Remote Support | Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters. |
SupportAssist requires an access key and PIN, or hardware key, to be enabled, with most customers likely using the access key and pin method. Secure keys are held in key manager under the RICE domain.
In addition to the transmission of data from the cluster to Dell, Connectivity Hub also allows inbound remote support sessions to be established for remote cluster troubleshooting.
In the next article in this series, we’ll take a deeper look at the SupportAssist architecture and operation.
Wed, 01 Mar 2023 22:34:30 -0000
|Read Time: 0 minutes
The SmartQoS Protocol Ops limits architecture, introduced in OneFS 9.5, involves three primary capabilities:
Under the hood, the OneFS protocol heads (NFS, SMB, and S3) identify and track how many protocol operations are being processed through a specific export or share. The existing partitioned performance (PP) reporting infrastructure is leveraged for cluster wide resource usage collection, limit calculation and distribution, along with new OneFS 9.5 functionality to support pinned workload protocol Ops limits.
The protocol scheduling module (LwSched) has a built-in throttling capability that allows the execution of individual operations to be delayed by temporarily pausing them, or ‘sleeping’. Additionally, in OneFS 9.5, the partitioned performance kernel modules have also been enhanced to calculate ‘sleep time’ based on operation count resource information (requested, average usage, and so on) – both within the current throttling window, and for a specific workload.
We can characterize the fundamental SmartQoS workflow as follows:
When an admin configures a per-cluster protocol Ops limit, the statistics gathering service, isi_stats_d, begins collecting workload resource information from the partitioned performance (PP) kernel on each node in the cluster (every 30 seconds by default) and notifies the isi_pp_d leader service of this resource information. Next, the leader retrieves the per-cluster protocol Ops limit, plus additional resource consumption metrics from the isi_acct_cpp service within isi_tardis_d (the OneFS cluster configuration service), and calculates the protocol Ops limit for each node for the next throttling window. It then instructs the isi_pp_d follower service on each node to update the kernel with the newly calculated protocol Ops limit, plus a request to reset the throttling window.
When the kernel receives a scheduling request for a work item from the protocol scheduler (LwSched), the kernel calculates the required ‘sleep time’ value, based on the current node protocol Ops limit and resource usage in the current throttling window. If insufficient resources are available, the work item execution thread is put to sleep for a specific interval returned from the PP kernel. If resources are available, or the thread is reactivated from sleeping, it executes the work item and reports the resource usage statistics back to PP, releasing any scheduling resources it may own.
SmartQoS can be configured through either the CLI, platform API, or WebUI, and OneFS 9.5 introduces a new SmartQoS WebUI page to support this. Note that SmartQoS is only available when an upgrade to OneFS 9.5 has been committed, and any attempt to configure or run the feature prior to upgrade commit will fail with the following message:
# isi performance workloads modify DS1 -w WS1 --limits protocol_ops:50000
Setting of protocol ops limits not available until upgrade has been committed
When a cluster is running OneFS 9.5 and the release is committed, the SmartQoS feature is enabled by default. This, and the current configuration, can be confirmed using the following CLI command:
# isi performance settings view
Top N Collections: 1024
Time In Queue Threshold (ms): 10.0
Target read latency in microseconds: 12000.0
Target write latency in microseconds: 12000.0
Protocol Ops Limit Enabled: Yes
In OneFS 9.5, the ‘isi performance settings modify’ CLI command now includes a ‘protocol-ops-limit-enabled’ parameter to allow the feature to be easily disabled (or re-enabled) across the cluster. For example:
# isi performance settings modify --protocol-ops-limit-enabled false
protocol_ops_limit_enabled: True -> False
Similarly, the ‘isi performance settings view’ CLI command has been extended to report the protocol OPs limit state:
# isi performance settings view
Top N Collections: 1024
Protocol Ops Limit Enabled: Yes
In order to set a protocol Ops limit on a workload from the CLI, the ‘isi performance workload pin’ and ‘isi performance workload modify’ commands now accept an optional ‘--limits’ parameter. For example, to create a pinned workload with the ‘protocol_ops’ limit set to 10000:
# isi performance workload pin test protocol:nfs3 --limits protocol_ops:10000
Similarly, to modify an existing workload’s ‘protocol_ops’ limit to 20000:
# isi performance workload modify test 101 --limits protocol_ops:20000
protocol_ops: 10000 -> 20000
When configuring SmartQoS, always be aware that it is a powerful throttling tool that can be applied to significant areas of a cluster’s data and userbase. For example, protocol OPs limits can be configured for metrics such as ‘path:/ifs’, which would affect the entire /ifs filesystem, or ‘zone_name:System’ which would limit the System access zone and all users within it.
While such configurations are entirely valid, they would have a significant, system-wide impact. As such, exercise caution when configuring SmartQoS to avoid any inadvertent, unintended, or unexpected performance constraints.
To clear a protocol Ops limit on a workload, the ‘isi performance workload modify’ CLI command has been extended to accept an optional ‘--no-protocol-ops-limit’ argument. For example:
# isi performance workload modify test 101 --no-protocol-ops-limit
protocol_ops: 20000 -> 18446744073709551615
Note that the value of ‘18446744073709551615’ in the command output above represents ‘NO_LIMIT’ set.
You can view a workload’s protocol Ops limit by using the ‘isi performance workload list’ and ‘isi performance workload view’ CLI commands, which have been modified in OneFS 9.5 to display the limits appropriately. For example:
# isi performance workload list test
ID Name Metric Values Creation Time Impact Limits
---------------------------------------------------------------------
101 - protocol:nfs3 2023-02-02T22:35:02 - protocol_ops:20000
---------------------------------------------------------------------
# isi performance workload view test 101
ID: 101
Name: -
Metric Values: protocol:nfs3
Creation Time: 2023-02-02T22:35:02
Impact: -
Limits: protocol_ops:20000
In the next article in this series, we’ll step through an example SmartQoS configuration and verification from both the CLI and WebUI.
Author: Nick Trimbee
Thu, 23 Feb 2023 22:34:49 -0000
|Read Time: 0 minutes
Built atop the partitioned performance (PP) resource monitoring framework, OneFS 9.5 introduces a new SmartQoS performance management feature. SmartQoS allows a cluster administrator to set limits on the maximum number of protocol operations per second (Protocol Ops) that individual pinned workloads can consume, in order to achieve desired business workload prioritization. Among the benefits of this new QoS functionality are:
This new SmartQoS feature in OneFS 9.5 supports the NFS, SMB and S3 protocols, including mixed traffic to the same workload.
But first, a quick refresher. The partitioned performance resource monitoring framework, which initially debuted in OneFS 8.0.1, enables OneFS to track and report the use of transient system resources (resources that only exist at a given instant), providing insight into who is consuming what resources, and how much of them. Examples include CPU time, network bandwidth, IOPS, disk accesses, cache hits, and so on.
OneFS partitioned performance is an ongoing effort that, as of OneFS 9.5, provides both insight and control. This allows work flowing through the system to be controlled, mission-critical workflows to be prioritized and protected, and clusters running at capacity to be detected.
Because identification of work is highly subjective, OneFS partitioned performance resource monitoring provides significant configuration flexibility, by allowing cluster admins to craft exactly how they want to define, track, and manage workloads. For example, an administrator might want to partition their work based on criteria such as which user is accessing the cluster, the export/share they are using, which IP address they’re coming from – and often a combination of all three.
OneFS has always provided client and protocol statistics, but they were typically front-end only. Similarly, OneFS has provided CPU, cache, and disk statistics, but they did not display who was consuming them. Partitioned performance unites these two realms, tracking the usage of the CPU, drives, and caches, and spanning the initiator/participant barrier.
OneFS collects the resources consumed and groups them into distinct workloads. The aggregation of these workloads comprises a performance dataset.
Item | Description | Example |
Workload | A set of identification metrics and resources used | {username:nick, zone_name:System} consumed {cpu:1.5s, bytes_in:100K, bytes_out:50M, …} |
Performance Dataset | The set of identification metrics by which to aggregate workloads, plus the list of collected workloads that match that specification | {usernames, zone_names} |
Filter | A method for including only workloads that match specific identification metrics | |
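To make the workload and dataset relationship concrete, here is a minimal Python sketch (not OneFS code; the sample fields and values are purely hypothetical) of how per-request resource samples might be aggregated into workloads, keyed by a dataset’s identification metrics:

from collections import defaultdict

# Hypothetical per-request samples; field names and values are illustrative only.
samples = [
    {"username": "nick", "zone_name": "System", "cpu_s": 0.5, "bytes_out": 10000},
    {"username": "nick", "zone_name": "System", "cpu_s": 1.0, "bytes_out": 40000},
    {"username": "jane", "zone_name": "System", "cpu_s": 0.2, "bytes_out": 5000},
]

# The performance dataset definition: the identification metrics to aggregate by.
dataset_metrics = ("username", "zone_name")

workloads = defaultdict(lambda: {"cpu_s": 0.0, "bytes_out": 0})
for s in samples:
    key = tuple(s[m] for m in dataset_metrics)   # identification metrics -> workload key
    workloads[key]["cpu_s"] += s["cpu_s"]
    workloads[key]["bytes_out"] += s["bytes_out"]

for key, used in workloads.items():
    print(dict(zip(dataset_metrics, key)), used)

Each distinct combination of metric values becomes one workload, and the collection of those workloads is the performance dataset.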
The following metrics are tracked by partitioned performance resource monitoring:
Category | Items |
Identification Metrics | |
Transient Resources | |
Performance Statistics | |
Supported Protocols | |
Be aware that, in OneFS 9.5, SmartQoS currently does not support the following Partitioned Performance criteria:
Unsupported Group | Unsupported Items |
Metrics | |
Workloads | |
Protocols | |
When pinning a workload to a dataset, note that the more metrics there are in that dataset, the more parameters need to be defined when pinning to it. For example:
Dataset = zone_name, protocol, username
To set a limit on this dataset, you’d need to pin the workload by also specifying the zone name, protocol, and username.
When using the remote_address and/or local_address metrics, you can also specify a subnet. For example: 10.123.45.0/24
With the exception of the system dataset, you must configure performance datasets before statistics are collected.
For SmartQoS in OneFS 9.5, you can define and configure limits as a maximum number of protocol operations (Protocol Ops) per second for the NFS, SMB, and S3 protocols.
You can apply a Protocol Ops limit to up to four custom datasets. All pinned workloads within a dataset can have a limit configured, up to a maximum of 1024 workloads per dataset. If multiple workloads happen to share a common metric value with overlapping limits, the lowest configured limit is enforced.
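As an illustration of that last point, here is a minimal Python sketch (the workloads and values are hypothetical; this is not how OneFS implements it) showing the lowest overlapping limit winning for workloads that share a metric value:

# NO_LIMIT sentinel, matching the value shown in the earlier CLI output.
NO_LIMIT = 18446744073709551615

# Hypothetical pinned workloads and their configured protocol Ops limits.
pinned_limits = {
    ("zone_name:System", "protocol:nfs3"): 20000,
    ("zone_name:System", "protocol:smb2"): 15000,
}

def effective_limit(shared_metric):
    """Return the enforced ceiling for workloads sharing a common metric value."""
    matching = [limit for metrics, limit in pinned_limits.items() if shared_metric in metrics]
    return min(matching) if matching else NO_LIMIT

print(effective_limit("zone_name:System"))   # -> 15000, the lowest configured limit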
Note that when upgrading to OneFS 9.5, SmartQoS is activated only when the new release has been successfully committed.
In the next article in this series, we’ll take a deeper look at SmartQoS’ underlying architecture and workflow.
Author: Nick Trimbee
Thu, 16 Feb 2023 15:48:08 -0000
|Read Time: 0 minutes
In the first article in this series, we looked at the architecture and considerations of the new SmartPools transfer limits in OneFS 9.5. Now, we turn our attention to the configuration and management of this feature.
From the control plane side, OneFS 9.5 contains several WebUI and CLI enhancements to reflect the new SmartPools transfer limits functionality. Probably the most obvious change is in the Local storage usage status histogram, where tiers and their child node pools have been aggregated for a more logical grouping. Also, blue limit-lines have been added above each of the storage pools, and a red warning status is displayed for any pools that have exceeded the transfer limit.
Similarly, the storage pools status page now includes transfer limit details, with the 90% limit displayed for any storage pools using the default setting.
From the CLI, the isi storagepool nodepools view command reports the transfer limit status and percentage for a pool. The used SSD and HDD bytes percentages in the command output indicate where the pool utilization is relative to the transfer limit.
The storage transfer limit can easily be configured from the CLI for a specific pool, as a default, or disabled, using the new ‘--transfer-limit’ and ‘--default-transfer-limit’ flags.
The following CLI command can be used to set the transfer limit for a specific storage pool:
# isi storagepool nodepools/tier modify --transfer-limit={0-100, default, disabled}
For example, to set a limit of 80% on an A200 nodepool:
# isi storagepool nodepools modify a200_30tb_1.6tb-ssd_96gb --transfer-limit=80
Or to set the default limit of 90% on tier perf1:
# isi storagepool tiers modify perf1 --transfer-limit=default
Note that setting the transfer limit of a tier automatically applies to all its child node pools, regardless of any prior child limit configurations.
The global ‘isi storagepool settings view’ CLI command output shows the default transfer limit, which is 90% but can be configured anywhere between 0 and 100%.
This default limit can be reconfigured from the CLI with the following syntax:
# isi storagepool settings modify --default-transfer-limit={0-100, disabled}
For example, to set a new default transfer limit of 85%:
# isi storagepool settings modify --default-transfer-limit=85
And the same changes can be made from the SmartPools WebUI, by navigating to Storage pools > SmartPools settings:
Once a SmartPools job has been completed in OneFS 9.5, the job report contains a new field, files not moved due to transfer limit exceeded.
# isi job reports view 1056
...
...
Policy/testpolicy/Access changes skipped                          0
Policy/testpolicy/ADS containers matched 'head'                   0
Policy/testpolicy/ADS containers matched 'snapshot'               0
Policy/testpolicy/ADS streams matched 'head'                      0
Policy/testpolicy/ADS streams matched 'snapshot'                  0
Policy/testpolicy/Directories matched 'head'                      0
Policy/testpolicy/Directories matched 'snapshot'                  0
Policy/testpolicy/File creation templates matched                 0
Policy/testpolicy/Files matched 'head'                            0
Policy/testpolicy/Files matched 'snapshot'                        0
Policy/testpolicy/Files not moved due to transfer limit exceeded  0
Policy/testpolicy/Files packed                                    0
Policy/testpolicy/Files repacked                                  0
Policy/testpolicy/Files unpacked                                  0
Policy/testpolicy/Packing changes skipped                         0
Policy/testpolicy/Protection changes skipped                      0
Policy/testpolicy/Skipped files already in containers             0
Policy/testpolicy/Skipped packing non-regular files               0
Policy/testpolicy/Skipped packing regular files                   0
Additionally, the SYS STORAGEPOOL FILL LIMIT EXCEEDED alert is triggered at the Info level when a storage pool’s usage has exceeded its transfer limit. Each hour, CELOG fires off a monitor helper script that measures how full each storage pool is relative to its transfer limit. The usage is gathered by reading from the disk pool database, and the transfer limits are stored in gconfig. If a node pool has a transfer limit of 50% and usage of 75%, the monitor helper would report a measurement of 150%, triggering an alert.
# isi event view 126
          ID: 126
     Started: 11/29 20:32
 Causes Long: storagepool: vonefs_13gb_4.2gb-ssd_6gb:hdd usage: 33.4, transfer limit: 30.0
         Lnn: 0
       Devid: 0
  Last Event: 2022-11-29T20:32:16
      Ignore: No
 Ignore Time: Never
    Resolved: No
Resolve Time: Never
       Ended: --
      Events: 1
    Severity: information
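For reference, the fill measurement that the monitor helper performs reduces to simple arithmetic. The following Python sketch is illustrative only (the helper itself is internal to OneFS):

def fill_vs_limit(used_pct, transfer_limit_pct):
    """Pool usage expressed as a percentage of its transfer limit."""
    return (used_pct / transfer_limit_pct) * 100.0

# The example from above: 75% usage against a 50% transfer limit reports 150%,
# which exceeds 100% and therefore triggers the CELOG alert.
print(fill_vs_limit(75.0, 50.0))   # -> 150.0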
And from the WebUI:
And there you have it: Transfer limits, and the first step in the evolution toward a smarter SmartPools.
Wed, 15 Feb 2023 22:53:09 -0000
|Read Time: 0 minutes
The new OneFS 9.5 release introduces the first phase of engineering’s Smarter SmartPools initiative, and delivers a new feature called SmartPools transfer limits.
The goal of SmartPools Transfer Limits is to address spill over. Previously, when file pool policies were executed, OneFS had no guardrails to protect against overfilling the destination or target storage pool. So if a pool was overfilled, data would unexpectedly spill over into other storage pools.
An overflow would result in storagepool usage exceeding 100%, and cause the SmartPools job itself to do a considerable amount of unnecessary work, trying to send files to a given storagepool. But because the pool was full, it would then have to send those files off to another storage pool that was below capacity. This would result in data going where it wasn’t intended, and the potential for individual files to end up getting split between pools. Also, if the full pool was on the most highly performing storage in the cluster, all subsequent newly created data would now land on slower storage, affecting its throughput and latency. The recovery from a spillover can be fairly cumbersome because it’s tough for the cluster to regain balance, and urgent system administration may be required to free space on the affected tier.
In order to address this, SmartPools Transfer Limits allows a cluster admin to configure a storagepool capacity-usage threshold, expressed as a percentage, and beyond which file pool policies stop moving data to that particular storage pool.
These transfer limits only take effect when running jobs that apply filepool policies, such as SmartPools, SmartPoolsTree, and FilePolicy.
The main benefits of this feature are two-fold: it prevents data from unexpectedly spilling over into other storage pools, and it spares the SmartPools job the unnecessary work of trying to move files to a pool that is already effectively full.
Under the hood, a cluster’s storagepool SSD and HDD usage is calculated using the same algorithm as reported by the ‘isi storagepools list’ CLI command. This means that a pool’s VHS (virtual hot spare) reserved capacity is respected by SmartPools transfer limits. When a SmartPools job is running, there is at least one worker on each node processing a single LIN at any given time. In order to calculate the current HDD and SSD usage per storagepool, the worker must read from the diskpool database. To circumvent this potential bottleneck, the filepool policy algorithm caches the diskpool database contents in memory for up to 10 seconds.
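The caching behavior described above is essentially a short time-to-live cache. The following minimal Python sketch illustrates the pattern (it is not the OneFS implementation, and the diskpool values are hypothetical):

import time

_CACHE_TTL_S = 10.0
_cache = {"data": None, "stamp": 0.0}

def read_diskpool_db():
    # Stand-in for the relatively expensive diskpool database read.
    return {"np1": {"hdd_used_pct": 81.0}, "np2": {"hdd_used_pct": 64.0}}

def cached_diskpool_db():
    """Return cached diskpool contents, refreshing them at most every 10 seconds."""
    now = time.monotonic()
    if _cache["data"] is None or (now - _cache["stamp"]) > _CACHE_TTL_S:
        _cache["data"] = read_diskpool_db()
        _cache["stamp"] = now
    return _cache["data"]

print(cached_diskpool_db())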
Transfer limits are stored in gconfig, and a separate entry is stored within the ‘smartpools.storagepools’ hierarchy for each explicitly defined transfer limit.
Note that in the SmartPools lexicon, ‘storage pool’ is a generic term denoting either a tier or nodepool. Additionally, SmartPools tiers comprise one or more constituent nodepools.
Each gconfig transfer limit entry stores a limit value and the diskpool database identifier of the storagepool to which the transfer limit applies. Additionally, a ‘transfer limit state’ field specifies which of three states the limit is in:
Limit state | Description |
Default | Fallback to the default transfer limit. |
Disabled | Ignore transfer limit. |
Enabled | The corresponding transfer limit value is valid. |
A SmartPools transfer limit does not affect the general ingress, restriping, or reprotection of files, regardless of how full the storage pool is where that file is located. So if you’re creating or modifying a file on the cluster, it will still be written to its target pool, irrespective of that pool’s transfer limit. This continues until the pool reaches 100% capacity, at which point writes spill over to another pool.
The default transfer limit is 90% of a pool’s capacity. This applies to all storage pools where the cluster admin hasn’t explicitly set a threshold. Note also that the default limit doesn’t get set until a cluster upgrade to OneFS 9.5 has been committed. So if you’re running a SmartPools policy job during an upgrade, you’ll have the preexisting behavior, which is to send the file to wherever the file pool policy instructs it to go. It’s also worth noting that, even though the default transfer limit is set on commit, if a job was running over that commit edge, you’d have to pause and resume it for the new limit behavior to take effect. This is because the new configuration is loaded lazily when the job workers are started up, so even though the configuration changes, a pause and resume is needed to pick up those changes.
SmartPools itself needs to be licensed on a cluster in order for transfer limits to work. And limits can be configured at the tier or nodepool level. But if you change the limit of a tier, it automatically applies to all of its child nodepools, regardless of any prior child limit configurations. The transfer limit feature can also be disabled, which results in the same spillover behavior OneFS always displayed, and any configured limits will not be respected.
Note that a filepool policy’s transfer limits algorithm does not consider the size of the file when deciding whether to move it to the policy’s target storagepool, regardless of whether the file is empty, or a large file. Similarly, a target storagepool’s usage must exceed its transfer limit before the filepool policy will stop moving data to that target pool. The assumption here is that any storagepool usage overshoot is insignificant in scale compared to the capacity of a cluster’s storagepool.
A SmartPools file pool policy allows you to send snapshot or HEAD data blocks to different targets, if so desired.
Because the transfer limit applies to the storagepool itself, and not to the file pool policy, it’s important to note that, if you’ve got varying storagepool targets and one file pool policy, you may have a situation where the head data blocks do get moved. But if the snapshot is pointing at a storage pool that has exceeded its transfer limit, its blocks will not be moved.
File pool policies also allow you to specify how a mixed node’s SSDs are used: either as L3 cache, or as an SSD strategy for head and snapshot blocks. If the SSDs in a node are configured for L3, they are not being used for storage, so any transfer limits are irrelevant to them. As an alternative to L3 cache, SmartPools offers three main categories of SSD strategy: metadata read acceleration, metadata read and write acceleration, and data on SSD.
To reflect this, SmartPools transfer limits are slightly nuanced when it comes to SSD strategies. That is, if the storagepool target contains both HDD and SSD, the usage capacity of both mediums needs to be below the transfer limit in order for the file to be moved to that target. For example, take two node pools, NP1 and NP2.
A file pool policy, Pol1, is configured and which matches all files under /ifs/dir1, with an SSD strategy of Metadata Write, and pool NP1 as the target for HEAD’s data blocks. For snapshots, the target is NP2, with an ‘avoid’ SSD strategy, so just writing to hard disk for both snapshot data and metadata.
When a SmartPools job runs and attempts to apply this file pool policy, it sees that SSD usage is above the 85% configured transfer limit for NP1. So, even though the hard disk capacity usage is below the limit, neither HEAD data nor metadata will be sent to NP1.
For the snapshot, the SSD usage is also above the NP2 pool’s transfer limit of 90%.
However, because the SSD strategy is ‘avoid’, and because the hard disk usage is below the limit, the snapshot’s data and metadata get successfully sent to the NP2 HDDs.
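The decision logic in this example can be summarized in a few lines of Python. This is an illustrative sketch only (the usage figures are hypothetical), not OneFS code:

def can_move(ssd_strategy, hdd_used_pct, ssd_used_pct, transfer_limit_pct):
    """A file moves only if every medium its SSD strategy writes to is under the limit."""
    if hdd_used_pct >= transfer_limit_pct:
        return False
    if ssd_strategy != "avoid" and ssd_used_pct >= transfer_limit_pct:
        return False
    return True

# HEAD data, metadata-write strategy, target NP1 (85% limit): blocked by SSD usage.
print(can_move("metadata_write", hdd_used_pct=60.0, ssd_used_pct=88.0, transfer_limit_pct=85.0))
# Snapshot data, 'avoid' SSD strategy, target NP2 (90% limit): HDD is under the limit, so it moves.
print(can_move("avoid", hdd_used_pct=70.0, ssd_used_pct=95.0, transfer_limit_pct=90.0))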
Author: Nick Trimbee
Fri, 28 Apr 2023 19:57:51 -0000
|Read Time: 0 minutes
PowerScale – the world’s most flexible[1] and cyber-secure scale-out NAS solution[2] – is powering up the new year with the launch of the innovative OneFS 9.5 release. With data integrity and protection being top of mind in this era of unprecedented corporate cyber threats, OneFS 9.5 brings an array of new security features and functionality to keep your unstructured data and workloads more secure than ever, as well as delivering significant performance gains on the PowerScale nodes – such as up to 55% higher performance on all-flash F600 and F900 nodes as compared with the previous OneFS release.[3]
OneFS and hardware security features
New PowerScale OneFS 9.5 security enhancements include those that directly satisfy US Federal and DoD mandates, such as FIPS 140-2, Common Criteria, and DISA STIGs – in addition to general enterprise data security requirements. Multi-factor authentication (MFA), single sign-on (SSO) support, data encryption in-flight and at rest, TLS 1.2, USGv6R1 IPv6 support, SED Master Key rekey, plus a new host-based firewall are all part of OneFS 9.5.
15TB and 30TB self-encrypting (SED) SSDs now enable PowerScale platforms running OneFS 9.5 to scale up to 186 PB of encrypted raw capacity per cluster – all within a single volume and filesystem, and before any additional compression and deduplication benefit.
Delivering federal-grade security to protect data under a zero trust model
Security-wise, the United States Government has stringent requirements for infrastructure providers such as Dell Technologies, requiring vendors to certify that products comply with requirements such as USGv6, STIGs, DoDIN APL, Common Criteria, and so on. Activating the OneFS 9.5 cluster hardening option implements a default maximum security configuration with AES and SHA cryptography, which automatically renders a cluster FIPS 140-2 compliant.
OneFS 9.5 introduces SAML-based single sign-on (SSO) from both the command line and WebUI using a redesigned login screen. OneFS SSO is compatible with identity providers (IDPs) such as Active Directory Federation Services, and is also multi-tenant aware, allowing independent configuration for each of a cluster’s Access Zones.
Federal APL requirements mandate that a system must validate all certificates in a chain up to a trusted CA root certificate. To address this, OneFS 9.5 introduces a common Public Key Infrastructure (PKI) library to issue, maintain, and revoke public key certificates. These certificates provide digital signature and encryption capabilities, using public key cryptography to provide identification and authentication, data integrity, and confidentiality. This PKI library is used by all OneFS components that need PKI certificate verification support, such as SecureSMTP, ensuring that they all meet Federal PKI requirements.
This new OneFS 9.5 PKI and certificate authority infrastructure enables multi-factor authentication, allowing users to swipe a CAC or PIV smartcard containing their login credentials to gain access to a cluster, rather than manually entering username and password information. Additional account policy restrictions in OneFS 9.5 automatically disable inactive accounts, provide concurrent administrative session limits, and implement a delay after a failed login.
As part of FIPS 140-2 compliance, OneFS 9.5 introduces a new key manager, providing a secure central repository for secrets such as machine passwords, Kerberos keytabs, and other credentials, with the option of using MCF (modular crypt format) with SHA256 or SHA512 hash types. OneFS protocols and services may be configured to support FIPS 140-2 data-in-flight encryption compliance, while SED clusters and the new Master Key re-key capability provide FIPS 140-2 data-at-rest encryption. Plus, any unused or non-compliant services are easily disabled.
On the network side, the Federal APL has several IPv6 (USGv6) requirements that are focused on allowing granular control of individual components of a cluster’s IPv6 stack, such as duplicate address detection (DAD) and link local IP control. Satisfying both STIG and APL requirements, the new OneFS 9.5 front-end firewall allows security admins to restrict the management interface to a specified subnet and implement port blocking and packet filtering rules from the cluster’s command line or WebUI, in accordance with federal or corporate security policy.
Improving performance for the most demanding workloads
OneFS 9.5 unlocks dramatic performance gains, particularly for the all-flash NVMe platforms, where the PowerScale F900 can now support line-rate streaming reads. SmartCache enhancements allow OneFS 9.5 to deliver streaming read performance gains of up to 55% on the F600 and F900 all-flash nodes, benefiting media and entertainment workloads, plus AI, machine learning, deep learning, and more.
Enhancements to SmartPools in OneFS 9.5 introduce configurable transfer limits. These limits include maximum capacity thresholds, expressed as a percentage, above which SmartPools will not attempt to move files to a particular tier, boosting both reliability and tiering performance.
Granular cluster performance control is enabled with the debut of PowerScale SmartQoS, which allows admins to configure limits on the maximum number of protocol operations that NFS, S3, SMB, or mixed protocol workloads can consume.
Enhancing enterprise-grade supportability and serviceability
OneFS 9.5 enables SupportAssist, Dell’s next generation remote connectivity system for transmitting events, logs, and telemetry from a PowerScale cluster to Dell Support. SupportAssist provides a full replacement for ESRS, as well as enabling Dell Support to perform remote diagnosis and remediation of cluster issues.
Upgrading to OneFS 9.5
The new OneFS 9.5 code is available on the Dell Technologies Support site, as both an upgrade and reimage file, allowing both installation and upgrade of this new release.
Author: Nick Trimbee
[1] Based on Dell analysis, August 2021.
[2] Based on Dell analysis comparing cybersecurity software capabilities offered for Dell PowerScale vs. competitive products, September 2022.
[3] Based on Dell internal testing, January 2023. Actual results will vary.
Sun, 18 Dec 2022 19:43:36 -0000
|Read Time: 0 minutes
In addition to the /usr/bin/isi_gather_info tool, OneFS also provides both a GUI and a common ‘isi’ CLI version of the tool – albeit with slightly reduced functionality. This means that a OneFS log gather can be initiated either from the WebUI, or by using the ‘isi diagnostics’ CLI command set with the following syntax:
# isi diagnostics gather start
The diagnostics gather status can also be queried as follows:
# isi diagnostics gather status
Gather is running.
When the command has completed, the gather tarfile can be found under /ifs/data/Isilon_Support.
You can also view and modify the ‘isi diagnostics’ configuration as follows:
# isi diagnostics gather settings view
                 Upload: Yes
                   ESRS: Yes
          Supportassist: Yes
            Gather Mode: full
   HTTP Insecure Upload: No
       HTTP Upload Host:
       HTTP Upload Path:
      HTTP Upload Proxy:
 HTTP Upload Proxy Port: -
             Ftp Upload: Yes
        Ftp Upload Host: ftp.isilon.com
        Ftp Upload Path: /incoming
       Ftp Upload Proxy:
  Ftp Upload Proxy Port: -
        Ftp Upload User: anonymous
    Ftp Upload Ssl Cert:
    Ftp Upload Insecure: No
The configuration options for the ‘isi diagnostics gather’ CLI command include:
Option | Description |
--upload <boolean> | Enable gather upload. |
--esrs <boolean> | Use ESRS for gather upload. |
--gather-mode (incremental | full) | Type of gather: incremental, or full. |
--http-insecure-upload <boolean> | Enable insecure HTTP upload on completed gather. |
--http-upload-host <string> | HTTP host to use for HTTP upload. |
--http-upload-path <string> | Path on HTTP server to use for HTTP upload. |
--http-upload-proxy <string> | Proxy server to use for HTTP upload. |
--http-upload-proxy-port <integer> | Proxy server port to use for HTTP upload. |
--clear-http-upload-proxy-port | Clear proxy server port to use for HTTP upload. |
--ftp-upload <boolean> | Enable FTP upload on completed gather. |
--ftp-upload-host <string> | FTP host to use for FTP upload. |
--ftp-upload-path <string> | Path on FTP server to use for FTP upload. |
--ftp-upload-proxy <string> | Proxy server to use for FTP upload. |
--ftp-upload-proxy-port <integer> | Proxy server port to use for FTP upload. |
--clear-ftp-upload-proxy-port | Clear proxy server port to use for FTP upload. |
--ftp-upload-user <string> | FTP user to use for FTP upload. |
--ftp-upload-ssl-cert <string> | Specifies the SSL certificate to use in FTPS connection. |
--ftp-upload-insecure <boolean> | Whether to attempt a plain text FTP upload. |
--ftp-upload-pass <string> | Password for the FTP upload user. |
--set-ftp-upload-pass | Specify the FTP upload password interactively. |
As mentioned above, ‘isi diagnostics gather’ does not present quite as broad an array of features as the isi_gather_info utility. This is primarily for security purposes, because ‘isi diagnostics’ does not require root privileges to run. Instead, a user account with the ‘ISI_PRIV_SYS_SUPPORT’ RBAC privilege is needed in order to run a gather from either the WebUI or ‘isi diagnostics gather’ CLI interface.
When a gather is running, a second instance cannot be started from any other node until that instance finishes. Typically, a warning similar to the following appears:
"It appears that another instance of gather is running on the cluster somewhere. If you would like to force gather to run anyways, use the --force-multiple-igi flag. If you believe this message is in error, you may delete the lock file here: /ifs/.ifsvar/run/gather.node."
You can remove this lock as follows:
# rm -f /ifs/.ifsvar/run/gather.node
You can also initiate a log gather from the OneFS WebUI by navigating to Cluster management > Diagnostics > Gather:
The WebUI also uses the ‘isi diagnostics’ platform API handler and so, like the CLI command, also offers a subset of the full isi_gather_info functionality.
A limited menu of configuration options is also available in the WebUI, under Cluster management > Diagnostics > Gather settings:
Also contained within the OneFS diagnostics command set is the ‘isi diagnostics netlogger’ utility. Netlogger captures IP traffic over a period of time for network and protocol analysis.
Under the hood, netlogger is a Python wrapper around the ubiquitous tcpdump utility, and can be run either from the OneFS command line or WebUI.
For example, from the WebUI, browse to Cluster management > Diagnostics > Netlogger:
Alternatively, from the OneFS CLI, the ‘isi diagnostics netlogger’ command captures traffic on the specified interfaces (‘--interfaces’) over a timeout period of minutes (‘--duration’), and stores a specified number of log files (‘--count’).
Here’s the basic syntax of the CLI utility:
# isi diagnostics netlogger start [--interfaces <str>] [--count <integer>] [--duration <duration>] [--snaplength <integer>] [--nodelist <str>] [--clients <str>] [--ports <str>] [--protocols (ip | ip6 | arp | tcp | udp)] [{--help | -h}]
Note that using the ‘-b’ bpf buffer size option will temporarily change the default buffer size while netlogger is running.
To display help text for netlogger command options, specify 'isi diagnostics netlogger start -h'. The command options include:
Netlogger Option | Description |
--interfaces <str> | Limit packet collection to specified network interfaces. |
--count <integer> | The number of packet capture files to keep after they reach the duration limit. Defaults to the latest 3 files. 0 is infinite. |
--duration <duration> | How long to run the capture before rotating the capture file. Default is 10 minutes. |
--snaplength <integer> | The maximum amount of data for each packet that is captured. Default is 320 bytes. Valid range is 64 to 9100 bytes. |
--nodelist <str> | List of nodes specified by LNN on which to run the capture. |
--clients <str> | Limit packet collection to specified client hostnames / IP addresses. |
--ports <str> | Limit packet collection to specified TCP or UDP ports. |
--protocols (ip | ip6 | arp | tcp | udp) | Limit packet collection to specified protocols. |
Netlogger’s log files are stored by default under /ifs/netlog/<node_name>.
You can also use the WebUI to configure the netlogger parameters under Cluster management > Diagnostics > Netlogger settings:
Be aware that running ‘isi diagnostics netlogger’ can consume significant cluster resources, so consider its effect on the system when running the tool on a production cluster.
When the command has completed, the capture file(s) are stored under:
# /ifs/netlog/[nodename]
You can also use the following command to incorporate netlogger output files into a gather_info bundle:
# isi_gather_info -n [node#] -f /ifs/netlog
To capture on multiple nodes of the cluster, you can prefix the netlogger command by the versatile isi_for_array utility. For example:
# isi_for_array -s 'isi diagnostics netlogger start --nodelist 2,3 --duration 5 --snaplength 256'
This command syntax creates five-minute incremental capture files on nodes 2 and 3, using a snaplength of 256 bytes, which captures the first 256 bytes of each packet. These five-minute logs are kept for about three days. The naming convention is of the form netlog-<node_name>-<date>-<time>.pcap. For example:
# ls /ifs/netlog/tme_h700-1 netlog-tme_h700-1.2022-09-02_10.31.28.pcap
When using netlogger, set the ‘--snaplength’ option appropriately, depending on the protocol, in order to capture the right amount of detail in the packet headers and/or payload. Or, if you want the entire contents of every packet, use a value of zero (‘--snaplength 0’).
The default snaplength for netlogger is to capture 320 bytes per packet, which is typically sufficient for most protocols.
However, for SMB, a snaplength of 512 is sometimes required. Note that depending on a node’s traffic quantity, a snaplength of 0 (that is: capture whole packet) can potentially overwhelm the network interface driver.
All the output gets written to files under the /ifs/netlog directory, and the default capture time is ten minutes (‘--duration 10’).
You can apply filters to constrain traffic to/from certain hosts or protocols. For example, to limit output to traffic between client 10.10.10.1 and the cluster node:
# isi diagnostics netlogger --duration 5 --snaplength 256 --clients 10.10.10.1
Or to capture only NFS traffic, filter on port 2049:
# isi diagnostics netlogger --ports 2049
Author: Nick Trimbee
Sun, 18 Dec 2022 19:11:11 -0000
|Read Time: 0 minutes
The previous blog outlining the investigation and troubleshooting of OneFS deadlocks and hang-dumps generated several questions about OneFS logfile gathering. So it seemed like a germane topic to explore in an article.
The OneFS ‘isi_gather_info’ utility has long been a cluster staple for collecting and collating context and configuration that primarily aids support in the identification and resolution of bugs and issues. As such, it is arguably OneFS’ primary support tool and, in terms of actual functionality, it performs the following roles: gathering logs, command output, and configuration from the cluster and each of its nodes; packaging and compressing the results into a single tarfile; and optionally uploading that tarfile to Dell Support.
By default, a log gather tarfile is written to the /ifs/data/Isilon_Support/pkg/ directory. It can also be uploaded to Dell using the following means:
Transport Mechanism | Description | TCP Port |
ESRS | Uses Dell EMC Secure Remote Support (ESRS) for gather upload. | 443/8443 |
FTP | Use FTP to upload completed gather. | 21 |
HTTP | Use HTTP to upload gather. | 80/443 |
More specifically, the ‘isi_gather_info’ CLI command syntax includes the following options:
Option | Description |
--upload <boolean> | Enable gather upload. |
--esrs <boolean> | Use ESRS for gather upload. |
--gather-mode (incremental | full) | Type of gather: incremental, or full. |
--http-insecure-upload <boolean> | Enable insecure HTTP upload on completed gather. |
--http-upload-host <string> | HTTP host to use for HTTP upload. |
--http-upload-path <string> | Path on HTTP server to use for HTTP upload. |
--http-upload-proxy <string> | Proxy server to use for HTTP upload. |
--http-upload-proxy-port <integer> | Proxy server port to use for HTTP upload. |
--clear-http-upload-proxy-port | Clear proxy server port to use for HTTP upload. |
--ftp-upload <boolean> | Enable FTP upload on completed gather. |
--ftp-upload-host <string> | FTP host to use for FTP upload. |
--ftp-upload-path <string> | Path on FTP server to use for FTP upload. |
--ftp-upload-proxy <string> | Proxy server to use for FTP upload. |
--ftp-upload-proxy-port <integer> | Proxy server port to use for FTP upload. |
--clear-ftp-upload-proxy-port | Clear proxy server port to use for FTP upload. |
--ftp-upload-user <string> | FTP user to use for FTP upload. |
--ftp-upload-ssl-cert <string> | Specifies the SSL certificate to use in FTPS connection. |
--ftp-upload-insecure <boolean> | Whether to attempt a plain text FTP upload. |
--ftp-upload-pass <string> | Password for the FTP upload user. |
--set-ftp-upload-pass | Specify the FTP upload password interactively. |
When the gather arrives at Dell, it is automatically unpacked by a support process and analyzed using the ‘logviewer’ tool.
Under the hood, there are two principal components responsible for running a gather. These are:
Component | Description |
Overlord | The manager process, triggered by the user, which oversees all the isi_gather_info tasks that are executed on a single node. |
Minion | The worker process, which runs a series of commands (specified by the overlord) on a specific node. |
The ‘isi_gather_info’ utility is primarily written in Python, with its configuration under the purview of MCP, and RPC services provided by the isi_rpc_d daemon.
For example:
# isi_gather_info&
# ps -auxw | grep -i gather
root  91620  4.4  0.1  125024  79028  1  I+  16:23  0:02.12 python /usr/bin/isi_gather_info (python3.8)
root  91629  3.2  0.0   91020  39728  -  S   16:23  0:01.89 isi_rpc_d: isi.gather.minion.minion.GatherManager (isi_rpc_d)
root  93231  0.0  0.0   11148   2692  0  D+  16:23  0:00.01 grep -i gather
The overlord uses isi_rdo (the OneFS remote command execution daemon) to start up the minion processes and informs them of the commands to be executed by an ephemeral XML file, typically stored at /ifs/.ifsvar/run/<uuid>-gather_commands.xml. The minion then spins up an executor and a command for each entry in the XML file.
The parallel process executor (the default) acts as a pool, triggering commands until a specified number are running concurrently. The commands themselves take care of the running and processing of results, checking frequently to ensure that the timeout threshold has not been passed.
The executor also keeps track of which commands are currently running, and how many are complete, and writes them to a file so that the overlord process can display useful information. When this is complete, the executor returns the runtime information to the minion, which records the benchmark file. The executor will also safely shut itself down if the isi_gather_info lock file disappears, such as if the isi_gather_info process is killed.
During a gather, the minion returns nothing to the overlord process, because the output of its work is written to disk.
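To illustrate the executor pattern described above (a pool of commands run concurrently, each bounded by a timeout), here is a minimal Python sketch. It is not the isi_gather_info source; the commands and limits are placeholders:

import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

COMMANDS = ["uname -a", "uptime", "df -h"]   # placeholder commands
PARALLELISM = 2                              # maximum commands running at once
TIMEOUT_S = 60                               # per-command timeout

def run(cmd):
    proc = subprocess.run(cmd, shell=True, capture_output=True, timeout=TIMEOUT_S)
    return cmd, proc.returncode

with ThreadPoolExecutor(max_workers=PARALLELISM) as pool:
    futures = [pool.submit(run, c) for c in COMMANDS]
    for done, fut in enumerate(as_completed(futures), start=1):
        cmd, rc = fut.result()
        print(f"[{done}/{len(COMMANDS)}] {cmd!r} exited {rc}")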
Architecturally, the ‘gather’ process comprises an eight-phase workflow:
The details of each phase are as follows:
Phase | Description |
1. Setup | Reads from the arguments passed in, and from any config files on disk, and sets up the config dictionary, which will be used throughout the rest of the codebase. Most of the code for this step is contained in isilon/lib/python/gather/igi_config/configuration.py. This is also the step where the program is most likely to exit, if some config arguments end up being invalid. |
2. Run local | Executes all the cluster commands, which are run on the same node that is starting the gather. All these commands run in parallel (up to the current parallelism value). This is typically the second longest running phase. |
3. Run nodes | Executes the node commands across all of the cluster’s nodes. This runs on each node, and while these commands run in parallel (up to the current parallelism value), they do not run in parallel with the local step. |
4. Collect | Ensures that all results end up on the overlord node (the node that started gather). If gather is using /ifs, it is very fast, but if it’s not, it needs to SCP all the node results to a single node. |
5. Generate Extra Files | Generates nodes_info and package_info.xml. These two files are present in every single gather, and tell us some important metadata about the cluster. |
6. Packing | Packs (tars and gzips) all the results. This is typically the longest running phase, often by an order of magnitude. |
7. Upload | Transports the tarfile package to its specified destination. Depending on the geographic location, this phase might also be lengthy. |
8. Cleanup | Cleans up any intermediary files that were created on cluster. This phase will run even if gather fails or is interrupted. |
Because the isi_gather_info tool is primarily intended for troubleshooting clusters with issues, it runs as root (or compadmin in compliance mode), because it needs to be able to execute under degraded conditions (that is, without GMP, during upgrade, and under cluster splits, and so on). Given these atypical requirements, isi_gather_info is built as a stand-alone utility, rather than using the platform API for data collection.
The time it takes to complete a gather is typically determined by cluster configuration, rather than size. For example, a gather on a small cluster with a large number of NFS shares will take significantly longer than on a large cluster with a similar NFS configuration. Incremental gathers are not recommended, because the base that’s required to check against in the log store may be deleted. By default, gathers only persist for two weeks in the log processor.
On completion of a gather, a tar’d and zipped logset is generated and placed under the cluster’s /ifs/data/Isilon_Support/pkg directory by default. A standard gather tarfile unpacks to the following top-level structure:
# du -sh *
536M    IsilonLogs-powerscale-f900-cl1-20220816-172533-3983fba9-3fdc-446c-8d4b-21392d2c425d.tgz
320K    benchmark
 24K    celog_events.xml
 24K    command_line
128K    complete
449M    local
 24K    local.log
 24K    nodes_info
 24K    overlord.log
 83M    powerscale-f900-cl1-1
 24K    powerscale-f900-cl1-1.log
119M    powerscale-f900-cl1-2
 24K    powerscale-f900-cl1-2.log
134M    powerscale-f900-cl1-3
 24K    powerscale-f900-cl1-3.log
In this case, for a three node F900 cluster, the compressed tarfile is 536 MB in size. The bulk of the data, which is primarily CLI command output, logs, and sysctl output, is contained in the ‘local’ and individual node directories (powerscale-f900-cl1-*). Each node directory contains a tarfile, varlog.tar, containing all the pertinent logfiles for that node.
The root directory of the tarfile file includes the following:
Item | Description |
benchmark | Runtimes for all commands executed by the gather. |
celog_events.xml | Cluster/node names, node serial numbers, configuration ID, OneFS version info, and events. |
complete | Lists of complete commands run across the cluster and on individual nodes. |
local | Output of the cluster-wide commands executed on the node running the gather. |
nodes_info | Node metadata generated for every gather (see the ‘Generate Extra Files’ phase above). |
overlord.log | Gather execution and issue log. |
package_info.xml | Cluster version details, GUID, S/N, and customer info (name, phone, email, and so on). |
command_line | |
Notable contents of the ‘local’ directory (all the cluster-wide commands that are executed on the node running the gather) include:
Local Contents Item | Description |
isi_alerts_history | History of alerts and events raised on the cluster. |
isi_job_list | List of Job Engine jobs. |
isi_job_schedule | Job Engine job schedule configuration. |
isi_license | Cluster licensing status. |
isi_network_interfaces | State and configuration of all the cluster’s network interfaces. |
isi_nfs_exports | Configuration detail for all the cluster’s NFS exports. |
isi_services | Listing of all the OneFS services and whether they are enabled or disabled. More detailed configuration for each service is contained in separate files (for example, for SnapshotIQ). |
isi_smb | Detailed configuration info for all the cluster’s SMB shares. |
isi_stat | Overall status of the cluster, including networks, drives, and so on. |
isi_statistics | CPU, protocol, and disk IO stats. |
Contents of the directory for the ‘node’ directory include:
Node Contents Item | Description |
df | Output of the df command |
du | Output of the du command. |
isi_alerts | Contains a list of outstanding alerts on the node. |
ps and ps_full | Lists of all running processes at the time that isi_gather_info was executed. |
As the isi_gather_info command runs, status is provided in the interactive CLI session:
# isi_gather_info
Configuring                                       COMPLETE
running local commands                            IN PROGRESS
\ Progress of local
[########################################################                ] 147/152 files written
\ Some active commands are: ifsvar_modules_jobengine_cp, isi_statistics_heat, ifsvar_modules
When the gather has completed, the location of the tarfile on the cluster itself is reported as follows:
# isi_gather_info
Configuring                                       COMPLETE
running local commands                            COMPLETE
running node commands                             COMPLETE
collecting files                                  COMPLETE
generating package_info.xml                       COMPLETE
tarring gather                                    COMPLETE
uploading gather                                  COMPLETE
The path to the tar-ed gather is:
/ifs/data/Isilon_Support/pkg/IsilonLogs-h5001-20220830-122839-23af1154-779c-41e9-b0bd-d10a026c9214.tgz
If the gather upload services are unavailable, errors are displayed on the console, as shown here:
…
uploading gather                                  FAILED
ESRS failed - ESRS has not been provisioned
FTP failed - pycurl error: (28, 'Failed to connect to ftp.isilon.com port 21 after 81630 ms: Operation timed out')
Author: Nick Trimbee
Wed, 07 Dec 2022 20:54:43 -0000
|Read Time: 0 minutes
As we’ve seen in prior articles in this series, OneFS and the PowerScale platforms support a variety of Ethernet speeds, cable and connector styles, and network interface counts, depending on the node type selected. However, unlike the back-end network, Dell Technologies does not specify particular front-end switch models, allowing PowerScale clusters to seamlessly integrate into the data link layer (layer 2) of an organization’s existing Ethernet IP network infrastructure. For example:
A layer 2 looped topology, as shown here, extends VLANs between the distribution/aggregation switches, with spanning tree protocol (STP) preventing network loops by shutting down redundant paths. The access layer uplinks can be used to load balance VLANs. This distributed architecture allows the cluster’s external network to connect to multiple access switches, affording each node similar levels of availability, performance, and management properties.
Link aggregation can be used to combine multiple Ethernet interfaces into a single link-layer interface, and is implemented between a single switch and a PowerScale node, where transparent failover or switch port redundancy is required. Link aggregation assumes that all links are full duplex, point to point, and at the same data rate, providing graceful recovery from link failures. If a link fails, traffic is automatically sent to the next available link without disruption.
Quality of service (QoS) can be implemented through differentiated services code point (DSCP), by specifying a value in the packet header that maps to an ‘effort level’ for traffic. Because OneFS does not provide an option for tagging packets with a specified DSCP marking, the recommended practice is to configure the first hop ports to insert DSCP values on the access switches connected to the PowerScale nodes. OneFS does however retain headers for packets that already have a specified DSCP value.
When designing a cluster, the recommendation is that each node have at least one front-end interface configured, preferably in at least one static SmartConnect zone. Although a cluster can be run in a ‘not all nodes on the network’ (NANON) configuration, where feasible, the recommendation is to connect all nodes to the front-end network(s). Additionally, cluster services such as SNMP, ESRS, ICAP, and auth providers (AD, LDAP, NIS, and so on) prefer that each node have an address that can reach the external servers.
In contrast with scale-up NAS platforms that use separate network interfaces for out-of-band management and configuration, OneFS traditionally performs all cluster network management in-band. However, PowerScale nodes typically contain a dedicated 1Gb Ethernet port that can be configured as an out-of-band management network using IPMI or iDRAC, simplifying administration of a large cluster. OneFS also supports using a node’s serial port as an RS-232 out-of-band management interface, a practice that is highly recommended for large clusters. Serial connectivity can provide reliable BIOS-level command line access for on-site or remote service staff to perform maintenance, troubleshooting, and installation operations.
SmartConnect provides a configurable allocation method for each IP address pool:
Allocation Method | Attributes |
Static | • One IP address per interface is assigned; this will likely require fewer IPs to meet minimum requirements. • No failover of IPs to other interfaces. |
Dynamic | • Multiple IP addresses per interface are assigned; this will require more IPs to meet minimum requirements. • Failover of IPs to other interfaces is supported; failback policies are needed. |
The default ‘static’ allocation assigns a single persistent IP address to each interface selected in the pool, leaving additional pool IP addresses unassigned if the number of addresses exceeds the total interfaces.
The lowest IP address of the pool is assigned to the lowest Logical Node Number (LNN) from the selected interfaces. The same is true for the second-lowest IP address and LNN, and so on. If a node or interface becomes unavailable, this IP address does not move to another node or interface. Also, when the node or interface becomes unavailable, it is removed from the SmartConnect zone, and new connections will not be assigned to the node. When the node is available again, SmartConnect automatically adds it back into the zone and assigns new connections.
By contrast, ‘dynamic’ allocation divides all available IP addresses in the pool across all selected interfaces. OneFS attempts to assign the IP addresses as evenly as possible. However, if the interface-to-IP address ratio is not an integer value, a single interface might have more IP addresses than another. As such, wherever possible, ensure that all the interfaces have the same number of IP addresses.
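One simple way to visualize this ‘as even as possible’ division is a round-robin assignment, sketched below in Python. This is illustrative only: SmartConnect’s actual placement logic is internal to OneFS, and the pool values here are made up.

from collections import defaultdict
import ipaddress

def distribute(pool_cidr, first_host, count, interfaces):
    """Divide 'count' pool addresses across the member interfaces round-robin."""
    net = ipaddress.ip_network(pool_cidr)
    ips = [str(net[i]) for i in range(first_host, first_host + count)]
    assignment = defaultdict(list)
    for idx, ip in enumerate(ips):
        assignment[interfaces[idx % len(interfaces)]].append(ip)
    return dict(assignment)

# 10 addresses across 4 interfaces: two interfaces get 3 IPs and two get 2,
# because the IP-to-interface ratio is not an integer.
print(distribute("192.168.10.0/24", 10, 10, ["1:ext-1", "2:ext-1", "3:ext-1", "4:ext-1"]))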
In concert with dynamic allocation, dynamic failover provides high availability by transparently migrating IP addresses to another node when an interface is not available. If a node becomes unavailable, all the IP addresses it was hosting are reallocated across the new set of available nodes in accordance with the configured failover load-balancing policy. The default IP address failover policy is round robin, which evenly distributes IP addresses from the unavailable node across available nodes. Because the IP address remains consistent, irrespective of the node on which it resides, failover to the client is transparent, so high availability is seamless.
The other available IP address failover policies are the same as the initial client connection balancing policies, that is, connection count, throughput, or CPU usage. In most scenarios, round robin is not only the best option but also the most common. However, the other failover policies are available for specific workflows.
The decision on whether to implement dynamic failover depends on the protocol(s) being used, general workflow attributes, and any high-availability design requirements:
Protocol | State | Suggested Allocation Strategy |
NFSv3 | Stateless | Dynamic |
NFSv4 | Stateful | Dynamic or Static, depending on mount daemon, OneFS version, and Kerberos. |
SMB | Stateful | Dynamic or Static |
SMB Multi-channel | Stateful | Dynamic or Static |
S3 | Stateless | Dynamic or Static |
HDFS | Stateful | Dynamic or Static. HDFS uses separate name-node and data-node connections. Allocation strategy depends on the need for data locality and/or multi-protocol, that is:
HDFS + NFSv3 : Dynamic Pool
HDFS + SMB : Static Pool |
HTTP | Stateless | Static |
FTP | Stateful | Static |
SyncIQ | Stateful | Static required |
Assigning each workload or data store to a unique IP address enables OneFS SmartConnect to move each workload to one of the other interfaces. This minimizes the additional work that a remaining node in the SmartConnect pool must absorb and ensures that the workload is evenly distributed across all the other nodes in the pool.
Static IP pools require one IP address for each logical interface within the pool. Because each node provides two interfaces for external networking, if link aggregation is not configured, this would require 2*N IP addresses for a static pool.
Determining the number of IP addresses within a dynamic allocation pool varies depending on the workflow, node count, and the estimated number of clients that would be in a failover event. While dynamic pools need, at a minimum, the number of IP addresses to match a pool’s node count, the ‘N * (N – 1)’ formula can often prove useful for calculating the required number of IP addresses for smaller pools. In this equation, N is the number of nodes that will participate in the pool.
For example, using the ‘N * (N – 1)’ model, a SmartConnect pool with four node interfaces will result in three unique IP addresses being allocated to each node. A failure on one node interface will cause each of that interface’s three IP addresses to fail over to a different node in the pool. This ensures that each of the three active interfaces remaining in the pool receives one IP address from the failed node interface. If client connections to that node are evenly balanced across its three IP addresses, SmartConnect will evenly distribute the workloads to the remaining pool members. For larger clusters, this formula may not be feasible due to the sheer number of IP addresses required.
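The sizing arithmetic above is easy to capture in a couple of helper functions. The following Python sketch assumes two external interfaces per node for the static case, per the earlier description; the values are for illustration only:

def static_pool_ips(node_count, interfaces_per_node=2):
    """Static pools need one IP address per logical interface in the pool."""
    return node_count * interfaces_per_node

def dynamic_pool_ips(node_count):
    """The 'N * (N - 1)' guideline for smaller dynamic pools, where N is the node count."""
    return node_count * (node_count - 1)

print(static_pool_ips(4))    # -> 8 addresses for a four-node static pool
print(dynamic_pool_ips(4))   # -> 12 addresses, i.e. three unique IPs per member of a four-node pool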
Enabling jumbo frames (Maximum Transmission Unit set to 9000 bytes) typically yields improved throughput performance with slightly reduced CPU usage than when using standard frames, where the MTU is set to 1500 bytes. For example, with 40 Gb Ethernet connections, jumbo frames provide about five percent better throughput and about one percent less CPU usage.
OneFS provides the ability to optimize storage performance by designating zones to support specific workloads or subsets of clients. Different network traffic types can be segregated on separate subnets using SmartConnect pools.
For large clusters, partitioning the cluster’s networking resources and allocating bandwidth to each workload can help minimize the likelihood that heavy traffic from one workload will affect network throughput for another. This is particularly true for SyncIQ replication and NDMP backup traffic, which can frequently benefit from its own set of interfaces, separate from user and client IO load.
The ‘groupnet’ networking object is part of OneFS’ support for multi-tenancy. Groupnets sit above subnets and pools and allow separate Access Zones to contain distinct DNS settings.
The management and data network(s) can then be incorporated into different Access Zones, each with their own DNS, directory access services, and routing, as appropriate.
Author: Nick Trimbee
Wed, 07 Dec 2022 20:42:17 -0000
|Read Time: 0 minutes
A key decision for performance, particularly in a large cluster environment, is the type and quantity of nodes deployed. Heterogeneous clusters can be architected with a wide variety of node styles and capacities, to meet the needs of a varied data set and a wide spectrum of workloads. These node styles encompass several hardware generations, and fall loosely into three main categories or tiers. While heterogeneous clusters can easily include many hardware classes and configurations, the best practice of simplicity for building clusters holds true here too.
Consider the physical cluster layout and environmental factors, particularly when designing and planning a large cluster installation. These factors include rack space, floor loading (node and chassis weight), power draw, cooling and airflow, and network cable runs.
The following table details the physical dimensions, weight, power draw, and thermal properties for the range of PowerScale F-series all-flash nodes:
Model | Tier | Height | Width | Depth | RU | Weight | Max Watts | Normal Watts | Max BTU | Normal BTU |
F900 | All-flash NVMe performance | 2U | 17.8 IN / | 31.8 IN / 85.9 cm | 2RU | 73 lbs | 1297 | 859 | 4425 | 2931 |
F600 | All-flash NVMe Performance | 1U (1.75IN) | 17.8 IN / | 31.8 IN / 85.9 cm | 1RU | 43 lbs | 467 | 718 | 2450 | 1594 |
F200 | All-flash | 1U (1.75IN) | 17.8 IN / | 31.8 IN / 85.9 cm | 1RU | 47 lbs | 395 | 239 | 1346 | 816 |
Note that the table above represents individual nodes. A minimum of three similar nodes are required for a node pool.
Similarly, the following table details the physical dimensions, weight, power draw, and thermal properties for the range of PowerScale chassis-based platforms:
Model | Tier | Height | Width | Depth | RU | Weight | Max Watts | Normal Watts | Max BTU | Normal BTU |
F800/ | All-flash performance | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 169 lbs (77 kg) | 1764 | 1300 | 6019 | 4436 |
H700 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 261lbs (100 kg) | 1920 | 1528 | 6551 | 5214 |
H7000 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 39 IN / | 4RU | 312 lbs (129 kg) | 2080 | 1688 | 7087 | 5760 |
H600 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 213 lbs (97 kg) | 1990 | 1704 | 6790 | 5816 |
H5600 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 39 IN / | 4RU | 285 lbs (129 kg) | 1906 | 1312 | 6504 | 4476 |
H500 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 248 lbs (112 kg) | 1906 | 1312 | 6504 | 4476 |
H400 |
Hybrid/Utility | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 242 lbs (110 kg) | 1558 | 1112 | 5316 | 3788 |
A300 |
Archive | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 252 lbs (100 kg) | 1460 | 1070 | 4982 | 3651 |
A3000 |
Archive | 4U (4×1.75IN) | 17.6 IN / 45 cm | 39 IN / | 4RU | 303 lbs (129 kg) | 1620 | 1230 | 5528 | 4197 |
A200 |
Archive | 4U (4×1.75IN) | 17.6 IN / 45 cm | 35 IN / | 4RU | 219 lbs (100 kg) | 1460 | 1052 | 4982 | 3584 |
A2000 |
Archive | 4U (4×1.75IN) | 17.6 IN / 45 cm | 39 IN / | 4RU | 285 lbs (129 kg) | 1520 | 1110 | 5186 | 3788 |
Note that this table represents 4RU chassis, each of which contains four PowerScale platform nodes (the minimum node pool size).
The following figure shows the locations of both the front-end (ext-1 & ext-2) and back-end (int-1 & int-2) network interfaces on the PowerScale stand-alone F-series and chassis-based nodes:
A PowerScale cluster’s back-end network is analogous to a distributed systems bus. Each node has two back-end interfaces for redundancy that run in an active/passive configuration (int-1 and int-2 above). The primary interface is connected to the primary switch; the secondary interface is connected to a separate switch.
For nodes using 40/100 Gb or 25/10 Gb Ethernet or InfiniBand connected with multimode fiber, the maximum cable length is 150 meters. This allows a cluster to span multiple rack rows, floors, and even buildings, if necessary. While this can solve floor space challenges, in order to perform any physical administration activity on nodes, you must know where the equipment is located.
The following table shows the various PowerScale node types and their respective back-end network support. While Ethernet is the preferred medium – particularly for large PowerScale clusters – InfiniBand is also supported for compatibility with legacy Isilon clusters.
Node Models | Details |
F200, F600, F900 | F200: nodes support a 10 GbE or 25 GbE connection to the access switch using the same NIC. A breakout cable can connect up to four nodes to a single switch port.
F600: nodes support a 40 GbE or 100 GbE connection to the access switch using the same NIC.
F900: nodes support a 40 GbE or 100 GbE connection to the access switch using the same NIC. |
H700, H7000, A300, A3000 | Supports 40 GbE or 100 GbE connection to the access switch using the same NIC.
OR
Supports 25 GbE or 10 GbE connection to the leaf using the same NIC. A breakout cable can connect a 40 GbE switch port to four 10 GbE nodes or a 100 GbE switch port to four 25 GbE nodes. |
F810, F800, H600, H500, H5600 | Performance nodes support a 40 GbE connection to the access switch. |
A200, A2000, H400 | Archive nodes support a 10GbE connection to the access switch using a breakout cable. A breakout cable can connect a 40 GbE switch port to four 10 GbE nodes or a 100 GbE switch port to four 10 GbE nodes. |
Currently only Dell Technologies approved switches are supported for back-end Ethernet and IB cluster interconnection. These include:
Switch | Ports | Port Speed | Height (RU) | Role | Notes |
Dell S4112 | 24 | 10GbE | ½ | ToR | 10 GbE only. |
Dell 4148 | 48 | 10GbE | 1 | ToR | 10 GbE only. |
Dell S5232 | 32 | 100GbE | 1 | Leaf or Spine | Supports 4x10GbE or 4x25GbE breakout cables.
Total of 124 10GbE or 25GbE nodes as top-of-rack back-end switch.
Port 32 does not support breakout. |
Dell Z9100 | 32 | 100GbE | 1 | Leaf or Spine | Supports 4x10GbE or 4x25GbE breakout cables.
Total of 128 10GbE or 25GbE nodes as top-of-rack back-end switch. |
Dell Z9264 | 64 | 100GbE | 2 | Leaf or Spine | Supports 4x10GbE or 4x25GbE breakout cables.
Total of 128 10GbE or 25GbE nodes as top-of-rack back-end switch. |
Arista 7304 | 128 | 40GbE | 8 | Enterprise core | 40GbE or 10GbE line cards. |
Arista 7308 | 256 | 40GbE | 13 | Enterprise/ large cluster | 40GbE or 10GbE line cards. |
Mellanox Neptune MSX6790 | 36 | QDR | 1 | IB fabric | 32Gb/s quad data rate InfiniBand. |
Be aware that the use of patch panels is not supported for PowerScale cluster back-end connections, regardless of overall cable lengths. All connections must be a single link, single cable directly between the node and back-end switch. Also, Ethernet and InfiniBand switches must not be reconfigured or used for any traffic beyond a single cluster.
Support for leaf spine back-end Ethernet network topologies was first introduced in OneFS 8.2. In a leaf-spine network switch architecture, the PowerScale nodes connect to leaf switches at the access, or leaf, layer of the network. At the next level, the aggregation and core network layers are condensed into a single spine layer. Each leaf switch connects to each spine switch to ensure that all leaf switches are no more than one hop away from one another. For example:
Leaf-to-spine switch connections require even distribution, to ensure the same number of spine connections from each leaf switch. This helps minimize latency and reduces the likelihood of bottlenecks in the back-end network. By design, a leaf spine network architecture is both highly scalable and redundant.
Leaf spine network deployments can have a minimum of two leaf switches and one spine switch. For small to medium clusters in a single rack, the back-end network typically uses two redundant top-of-rack (ToR) switches, rather than implementing a more complex leaf-spine topology.
Author: Nick Trimbee
Wed, 07 Dec 2022 20:29:30 -0000
|Read Time: 0 minutes
When it comes to physically installing PowerScale nodes, most use a 35 inch depth chassis and will fit in a standard depth data center cabinet. Nodes can be secured to standard storage racks with their sliding rail kits, included in all node packaging and compatible with racks using either 3/8 inch square holes, 9/32 inch round holes, or 10-32 / 12-24 / M5X.8 / M6X1 pre-threaded holes. These supplied rail kit mounting brackets are adjustable in length, from 24 inches to 36 inches, to accommodate different rack depths. When selecting an enclosure for PowerScale nodes, ensure that the rack supports the minimum and maximum rail kit sizes.
Rack Component | Description |
a | Distance between front surface of the rack and the front NEMA rail |
b | Distance between NEMA rails, minimum=24in (609.6mm), max=34in (863.6mm) |
c | Distance between the rear of the chassis to the rear of the rack, min=2.3in (58.42mm) |
d | Distance between inner front of the front door and the NEMA rail, min=2.5in (63.5mm) |
e | Distance between the inside of the rear post and the rear vertical edge of the chassis and rails, min=2.5in (63.5mm) |
f | Width of the rear rack post |
g | 19in (482.6mm)+(2e), min=24in (609.6mm) |
h | 19in (482.6mm) NEMA+(2e)+(2f). Note: Width of the PDU+0.5in (13mm) <= e+f. If j=i+c+PDU depth+3in (76.2mm), then h=min 23.6in (600mm), assuming the PDU is mounted beyond i+c. |
i | Chassis depth: normal chassis=35.80in (909mm); deep chassis=40.40in (1026mm). Switch depth (measured from the front NEMA rail): the inner rail is fixed at 36.25in (921mm). Allow up to 6in (155mm) for cable bend radius when routing up to 32 cables to one side of the rack. Use the greater depth of the installed equipment. |
j | Minimum rack depth=i+c |
k | Front |
l | Rear |
m | Front door |
n | Rear door |
p | Rack post |
q | PDU |
r | NEMA |
s | NEMA 19 inch |
t | Rack top view |
u | Distance from front NEMA to chassis face: Dell PowerScale deep and normal chassis = 0in |
However, the high-capacity models, such as the F800/810, H7000, H5600, A3000 and A2000, have 40 inch depth chassis and require extended depth cabinets, such as the APC 3350 or Dell Titan-HD rack.
Additional room must be provided for opening the FRU service trays at the rear of the nodes and, in the chassis-based 4RU platforms, the disk sleds at the front of the chassis. Except for the 2RU F900, the stand-alone PowerScale all-flash nodes are 1RU in height (including the 1RU diskless P100 accelerator and B100 backup accelerator nodes).
Power-wise, each cabinet typically requires between two and six independent single or three-phase power sources. To determine the specific requirements, use the published technical specifications and device rating labels for the devices to calculate the total current draw for each rack.
Specification | North American 3 wire connection (2 L and 1 G) | International 3 wire connection (1 L, 1 N, and 1 G) |
Input nominal voltage | 200–240 V ac +/- 10% L – L nom | 220–240 V ac +/- 10% L – L nom |
Frequency | 50–60 Hz | 50–60 Hz |
Circuit breakers | 30 A | 32 A |
Power zones | Two | Two |
Power requirements at site (minimum to maximum) | Single-phase: six 30A drops, two per zone; Three-phase Delta: two 50A drops, one per zone; Three-phase Wye: two 32A drops, one per zone | Single-phase: six 30A drops, two per zone; Three-phase Delta: two 50A drops, one per zone; Three-phase Wye: two 32A drops, one per zone |
Additionally, the recommended environmental conditions to support optimal PowerScale cluster operation are as follows:
Attribute | Details |
Temperature | Operate >=90 percent of the time between 10 degrees Celsius and 35 degrees Celsius, and <=10 percent of the time between 5 degrees Celsius and 40 degrees Celsius. |
Humidity | 40 to 55 percent relative humidity |
Weight | A fully configured cabinet must sit on at least two floor tiles, and can weigh approximately 1588 kilograms (3500 pounds). |
Altitude | 0 meters to 2439 meters (0 to 8,000 ft) above sea level operating altitude. |
Weight is a critical factor to keep in mind, particularly with the chassis-based nodes. Individual 4RU chassis can weigh up to around 300 lbs each, and the maximum floor tile capacity for each individual cabinet or rack must be kept in mind. For the deep node styles (H7000, H5600, A3000 and A2000), the considerable node weight may prevent racks from being fully populated with PowerScale equipment. If the cluster uses a variety of node types, installing the larger, heavier nodes at the bottom of each rack and the lighter chassis at the top can help distribute weight evenly across the cluster racks’ floor tiles.
Note that there are no lift handles on the PowerScale 4RU chassis. However, the drive sleds can be removed to provide handling points if no lift is available. With all the drive sleds removed, but leaving the rear compute modules inserted, the chassis weight drops to a more manageable 115 lbs or so. It is strongly recommended to use a lift for installation of 4RU chassis.
Cluster back-end switches ship with the appropriate rails (or tray) for proper installation of the switch in the rack. These rail kits are adjustable to fit NEMA front rail to rear rail spacing ranging from 22 in to 34 in.
Note that some manufacturers’ Ethernet switch rails are designed to overhang the rear NEMA rails, helping to align the switch with the PowerScale chassis at the rear of the rack. These require a minimum clearance of 36 in from the front NEMA rail to the rear of the rack, in order to ensure that the rack door can be closed.
Consider the following large cluster topology, for example:
This contiguous rack architecture is designed to scale up to the current maximum PowerScale cluster size of 252 nodes, in 63 4RU chassis, across nine racks as the environment grows – while still keeping cable management relatively simple. Note that this configuration assumes 1RU per node. If you are using F900 nodes, which are 2RU in size, be sure to budget for additional rack capacity.
Successful large cluster infrastructures depend on the proficiency of the installer and their optimizations for maintenance and future expansion. Some good data center design practices include:
For Hadoop workloads, PowerScale clusters are compatible with the rack awareness feature of HDFS to provide balancing in the placement of data. Rack locality keeps the data flow internal to the rack.
Excess cabling can be neatly stored in 12” service coils on a cable tray above the rack, if available, or at the side of the rack as illustrated below.
The use of intelligent power distribution units (PDUs) within each rack can facilitate the remote power cycling of nodes, if desired.
For deep nodes such as the H7000 and A3000 hardware, where chassis depth can be a limiting factor, horizontally mounted PDUs within the rack can be used in place of vertical PDUs, if necessary. If front-mounted, partial depth Ethernet switches are deployed, you can install horizontal PDUs in the rear of the rack directly behind the switches to maximize available rack capacity.
With copper cables (such as SFP+, QSFP, CX4), the maximum cable length is typically limited to 10 meters or less. After factoring in for dressing the cables to maintain some level of organization and proximity within the racks and cable trays, all the racks with PowerScale nodes need to be near each other – either in the same rack row or close by in an adjacent row – or adopt a leaf-spine topology, with leaf switches in each rack.
If greater physical distance between nodes is required, support for multimode fiber (QSFP+, MPO, LC, etc) extends the cable length limitation to 150 meters. This allows nodes to be housed on separate floors or on the far side of a floor in a datacenter if necessary. While solving the floor space problem, this does have the potential to introduce new administrative and management challenges.
The following table lists the various cable types, form factors, and supported lengths available for PowerScale nodes:
Cable Form Factor | Medium | Speed (Gb/s) | Max Length |
QSFP28 | Optical | 100Gb | 30M |
MPO | Optical | 100/40Gb | 150M |
QSFP28 | Copper | 100Gb | 5M |
QSFP+ | Optical | 40Gb | 10M |
LC | Optical | 25/10Gb | 150M |
QSFP+ | Copper | 40Gb | 5M |
SFP28 | Copper | 25Gb | 5M |
SFP+ | Copper | 10Gb | 7M |
CX4 | Copper | IB QDR/DDR | 10M |
The connector types for the cables above can be identified as follows:
As for the nodes themselves, the following rear views indicate the locations of the various network interfaces:
Note that int-a and int-b indicate the primary and secondary back-end networks, whereas Ext-1 and Ext-2 are the front-end client network interfaces.
Be aware that damage to the InfiniBand or Ethernet cables (copper or optical fiber) can negatively affect cluster performance. Never bend cables beyond the recommended bend radius, which is typically 10–12 times the diameter of the cable. For example, if a cable is 1.6 inches in diameter, round up to 2 inches and multiply by 10 for an acceptable bend radius of 20 inches.
Cables differ, so follow the explicit recommendations of the cable manufacturer.
The most important design attribute for bend radius consideration is the minimum mated cable clearance (Mmcc). Mmcc is the distance from the bulkhead of the chassis through the mated connectors/strain relief including the depth of the associated 90 degree bend. Multimode fiber has many modes of light (fiber optic) traveling through the core. As each of these modes moves closer to the edge of the core, light and the signal are more likely to be reduced, especially if the cable is bent. In a traditional multimode cable, as the bend radius is decreased, the amount of light that leaks out of the core increases, and the signal decreases. Best practices for data cabling include:
Note that the effects of gravity can also decrease the bend radius and result in degradation of signal power and quality.
Cables, particularly when bundled, can also obstruct the movement of conditioned air around the cluster, and cables should be secured away from fans. Flooring seals and grommets can be useful to keep conditioned air from escaping through cable holes. Also ensure that smaller Ethernet switches are drawing cool air from the front of the rack, not from inside the cabinet. This can be achieved either with switch placement or by using rack shelving.
Author: Nick Trimbee
Wed, 07 Dec 2022 17:28:21 -0000
|Read Time: 0 minutes
In this article, we turn our attention to some of the environmental and logistical aspects of cluster design, installation, and management.
In addition to available rack space and physical proximity of nodes, provision needs to be made for adequate power and cooling as the cluster expands. New generations of drives and nodes typically deliver increased storage density, which often magnifies the power draw and cooling requirements per rack unit.
The recommendation is for a large cluster’s power supply to be fully redundant and backed up with a battery UPS and/or power generator. In the worst case, if a cluster does lose power, the nodes are protected internally by filesystem journals which preserve any in-flight uncommitted writes. However, the time to restore power and bring up a large cluster from an unclean shutdown can be considerable.
Like most data center equipment, the cooling fans in PowerScale nodes and switches pull air from the front to the back of the chassis. To complement this, data centers often employ a hot aisle/cold aisle rack configuration, where cool, low humidity air is supplied in the aisle at the front of each rack or cabinet, either at floor or ceiling level, and warm exhaust air is returned at ceiling level in the aisle to the rear of each rack.
Given the significant power draw, heat density, and weight of cluster hardware, some datacenters are limited in the number of nodes each rack can support. For partially filled racks, the use of blank panels to cover the front and rear of any unfilled rack units can help to efficiently direct airflow through the equipment.
The table below shows the various front and back-end network speeds and connector form factors across the PowerScale storage node portfolio.
Speed (Gb/s) | Form Factor | Front-end / Back-end | Node Models |
100/40 | QSFP28 | Back-end | F900, F600, H700, H7000, A300, A3000, P100, B100 |
40 QDR | QSFP+ | Back-end | F800, F810, H600, H5600, H500, H400, A200, A2000 |
25/10 | SFP28 | Back-end | F900, F600, F200, H700, H7000, A300, A3000, P100, B100 |
10 QDR | QSFP+ | Back-end | H400, A200, A2000 |
100/40 | QSFP28 | Front-end | F900, F600, H700, H7000, A300, A3000, P100, B100 |
40 QDR | QSFP+ | Front-end | F800, F810, H600, H5600, H500, H400, A200, A2000 |
25/10 | SFP28 | Front-end | F900, F600, F200, H700, H7000, A300, A3000, P100, B100 |
25/10 | SFP+ | Front-end | F800, F810, H600, H5600, H500, H400, A200, A2000 |
10 QDR | SFP+ | Front-end | F800, F810, H600, H5600, H500, H400, A200, A2000 |
With large clusters, especially when the nodes may not be racked in a contiguous manner, it is highly advised to have all the nodes and switches connected to serial console concentrators and remote power controllers. However, to perform any physical administration or break/fix activity on nodes, you must know where the equipment is located and have administrative resources available to access and service all locations.
As such, the following best practices are recommended:
Disciplined cable management and labeling for ease of identification is particularly important in larger PowerScale clusters, where density of cabling is high. Each chassis can require up to 28 cables, as shown in the following table:
Cabling Component | Medium | Cable Quantity per Chassis |
Back-end network | Ethernet or Infiniband | 8 |
Front-end network | Ethernet | 8 |
Management interface | 1Gb Ethernet | 4 |
Serial console | DB9 RS 232 | 4 |
Power cord | 110V or 220V AC power | 4 |
Total | | 28 |
The recommendations for cabling a PowerScale chassis are:
The stand-alone F-series all-flash nodes, in particular the 1RU F600 and F200 nodes, have a similar density of cabling per rack unit:
Cabling Component | Medium | Cable Quantity per Node |
Back-end network | 10 or 40 Gb Ethernet or QDR Infiniband | 2 |
Front-end network | 10 or 40Gb Ethernet | 2 |
Management interface | 1Gb Ethernet | 1 |
Serial console | DB9 RS 232 | 1 |
Power cord | 110V or 220V AC power | 2 |
Total | | 8 |
Consistent and meticulous cable labeling and management is particularly important in large clusters. PowerScale chassis that employ both front and back-end Ethernet networks can include up to 20 Ethernet connections per 4RU chassis.
In each node’s compute module, there are two PCI slots for the Ethernet cards (NICs). Viewed from the rear of the chassis, in each node the right-hand slot (HBA Slot 0) houses the NIC for the front-end network, and the left-hand slot (HBA Slot 1) houses the NIC for the back-end network. There is also a separate built-in 1Gb Ethernet port on each node for cluster management traffic.
While there is no requirement that node 1 aligns with port 1 on each of the back-end switches, it can certainly make cluster and switch management and troubleshooting considerably simpler. Even if exact port alignment is not possible, with large clusters, ensure that the cables are clearly labeled and connected to similar port regions on the back-end switches.
PowerScale nodes and the drives they contain have identifying LED lights to indicate when a component has failed and to allow proactive identification of resources. You can use the ‘isi led’ CLI command to illuminate specific node and drive indicator lights, as needed, to aid in identification.
Drive repair times depend on a variety of factors:
A useful method to estimate future FlexProtect runtime is to use old repair runtimes as a guide, if available.
The drives in the PowerScale chassis-based platforms have a bay-grid nomenclature, where A-E indicates each of the sleds and 0-6 would point to the drive position in the sled. The drive closest to the front is 0, whereas the drive closest to the back is 2/3/5, depending on the drive sled type.
When it comes to updating and refreshing hardware in a large cluster, swapping nodes can be a lengthy process of somewhat unpredictable duration. Data has to be evacuated from each old node during the Smartfail process prior to its removal, and restriped and balanced across the new hardware’s drives. During this time there will also be potentially impactful group changes as new nodes are added and the old ones removed.
However, if replacing an entire node-pool as part of a tech refresh, a SmartPools filepool policy can be crafted to migrate the data to another nodepool across the back-end network. When complete, the nodes can then be Smartfailed out, which should progress swiftly because they are now empty.
If multiple nodes are Smartfailed simultaneously, the node removal at the final stage of the process is serialized, with around a 60-second pause between each. The Smartfail job places the selected nodes in read-only mode while it copies the protection stripes to the cluster’s free space. Using SmartPools to evacuate data from a node or set of nodes in preparation for removing them is generally a good idea, and is usually a relatively fast process.
Another efficient approach can often be to swap drives out into new chassis. In addition to being considerably faster, the drive swapping process concentrates the disruption into a single whole-cluster-down event. Estimating the time to complete a drive swap, or ‘disk tango’, is simpler and more accurate, and the process can typically be completed in a single maintenance window.
With PowerScale chassis-based platforms, such as the H700 and A300, the available hardware ‘tango’ options are expanded and simplified. Given the modular design of these platforms, the compute and chassis tango strategies typically replace the disk tango:
Replacement Strategy | Component | PowerScale F-series | Chassis-based Nodes | Description |
Disk tango | Drives / drive sleds | x | x | Swapping out data drives or drive sleds |
Compute tango | Chassis compute modules | | x | Rather than swapping out the twenty drive sleds in a chassis, it’s usually cleaner to exchange the four compute modules |
Chassis tango | 4RU Chassis | | x | Typically only required if there’s an issue with the chassis mid-plane. |
Note that any of the above ‘tango’ procedures should only be executed under the recommendation and supervision of Dell support.
Author: Nick Trimbee
Mon, 03 Oct 2022 16:39:01 -0000
|Read Time: 0 minutes
Dell PowerScale, the world’s most secure NAS storage array[1], continues to evolve its already rich security capabilities with the recent introduction of External Key Manager for Data-at-Rest-Encryption, enhancements to the STIG security profile, and support for UEFI Secure Boot across PowerScale platforms.
Our next release of PowerScale OneFS adds new security features that include software-based firewall functionality, multi-factor authentication with support for CAC/PIV, SSO for administrative WebUI, and FIPS-compliant data in flight.
As the PowerScale security feature set continues to advance, meeting the highest levels of federal compliance is paramount to support industry and federal security standards. We are excited to announce that our scheduled verification by the Defense Information Systems Agency (DISA) for inclusion on the Department of Defense Information Network (DoDIN) Approved Products List will begin in March 2023. For more information, see the DISA schedule here.
Moreover, OneFS will embrace the move to IPv6-only networks with support for USGv6-r1, a critical network standard applicable to hundreds of federal agencies and to the most security-conscious enterprises, including the DoD. Refreshed Common Criteria certification activities are underway and will provide a highly regarded international and enterprise-focused complement to other standards being supported.
We believe that implementing the zero trust model is the best foundation for building a robust security framework for PowerScale. This model and its principles are discussed below.
In the age of digital transformation, multiple cloud providers, and remote employees, the confines of the traditional data center are not enough to provide the highest levels of security. Traditionally, security meant placing devices inside an imaginary “bubble”: as long as devices were within that protected bubble, security was assumed to be taken care of by firewalls at the perimeter. However, the age-old concept of an organization’s security depending solely on the perimeter firewall is no longer adequate, and that perimeter is now the easiest point for a malicious party to attack.
Now that the data center is not confined to an area, the security framework must evolve, transform, and adapt. For example, although firewalls are still critical to network infrastructure, security must surpass just a firewall and security devices.
Although it seems like an easy question, it’s essential to understand the value of what is being protected. Traditionally, an organization’s most valuable assets were its infrastructure, including a building and the assets required to produce its goods. However, in the age of digital transformation, organizations have realized that their most critical asset is their data.
Because data is an organization’s most valuable asset, protecting the data is paramount. And how do we protect this data in the modern environment without data center confines? Enter the zero trust model!
Although Forrester Research first defined zero trust architecture in 2010, it has recently received more attention with the ever-changing security environment leading to a focus on cybersecurity. The zero trust architecture is a general model and must be refined for a specific implementation. For example, in September 2019, the National Institute of Standards and Technology (NIST) introduced its concept of Zero Trust Architecture. As a result, the White House has also published an Executive Order on Improving the Nation’s Cybersecurity, including zero trust initiatives.
In a zero trust architecture, all devices must be validated and authenticated. The concept applies to all devices and hosts, ensuring that none are trusted until proven otherwise. In essence, the model adheres to a “never trust, always verify” policy for all devices.
NIST Special Publication 800-207 Zero Trust Architecture states that a zero trust model is architected with the following design tenets:
The PowerScale family of scale-out NAS solutions includes all-flash, hybrid, and archive storage nodes that can be deployed across the entire enterprise – from the edge, to core, and the cloud, to handle the most demanding file-based workloads. PowerScale OneFS combines the three layers of storage architecture—file system, volume manager, and data protection—into a scale-out NAS cluster. Dell Technologies follows the NIST Cybersecurity Framework to apply zero trust principles on a PowerScale cluster. The NIST Framework identifies five principles: identify, protect, detect, respond, and recover. Combining the framework from the NIST CSF and the data model provides the basis for the PowerScale zero trust architecture in five key stages, as shown in the following figure.
Let’s look at each of these stages and what Dell Technologies tools can be used to implement them.
To secure an asset, the first step is to identify it. In our case, that asset is data. To secure a dataset effectively, it must first be located, sorted, and tagged. This can be an onerous process depending on the number of datasets and their size. We recommend using the Superna Eyeglass Search and Recover feature to understand your unstructured data and to provide insights through a single pane of glass, as shown in the following image. For more information, see the Eyeglass Search and Recover Product Overview.
Once we know the data we are securing, the next step is to associate roles to the indexed data. The role-specific administrators and users only have access to a subset of the data necessary for their responsibilities. PowerScale OneFS allows system access to be limited to an administrative role through Role-Based Access Control (RBAC). As a best practice, assign only the minimum required privileges to each administrator as a baseline. In the future, more privileges can be added as needed. For more information, see PowerScale OneFS Authentication, Identity Management, and Authorization.
For the next step in deploying the zero trust model, use encryption to protect the data from theft and man-in-the-middle attacks.
PowerScale OneFS provides Data at Rest Encryption (D@RE) using self-encrypting drives (SEDs), allowing data to be encrypted during writes and decrypted during reads with a 256-bit AES encryption key, referred to as the data encryption key (DEK). OneFS wraps the DEK for each SED in an authentication key (AK), and the AKs for each drive are placed in a key manager (KM) that is stored securely in an encrypted database, the key manager database (KMDB). The KMDB is in turn encrypted with a 256-bit master key (MK), which is stored external to the PowerScale cluster on a key management interoperability protocol (KMIP)-compliant key manager server, as shown in the following figure. For more information, see PowerScale Data at Rest Encryption.
Data in flight is encrypted using SMB3 and NFS v4.1 protocols. SMB encryption can be used by clients that support SMB3 encryption, including Windows Server 2012, 2012 R2, 2016, Windows 10, and 11. Although SMB supports encryption natively, NFS requires additional Kerberos authentication to encrypt data in flight. OneFS Release 9.3.0.0 supports NFS v4.1, allowing Kerberos support to encrypt traffic between the client and the PowerScale cluster.
Once the protocol access is encrypted, the next step is encrypting data replication. OneFS supports over-the-wire, end-to-end encryption for SyncIQ data replication, protecting and securing in-flight data between clusters. For more information about these features, see the following:
In an environment of ever-increasing cyber threats, cyber protection must be part of any security model. Superna Eyeglass Ransomware Defender for PowerScale provides cyber resiliency. It protects a PowerScale cluster by detecting attack events in real-time and recovering from cyber-attacks. Event triggers create an automated response with real-time access auditing, as shown in the following figure.
The Enterprise AirGap capability creates an isolated data copy in a cyber vault that is network isolated from the production environment, as shown in the following figure. For more about PowerScale Cyber Protection Solution, check out this comprehensive eBook.
Monitoring is a critical component of applying a zero trust model. A PowerScale cluster should constantly be monitored through several tools for insights into cluster performance and tracking anomalies. Monitoring options for a PowerScale cluster include the following:
This blog introduces implementing the zero trust model on a PowerScale cluster. For additional details and applying a complete zero trust implementation, see the PowerScale Zero Trust Architecture section in the Dell PowerScale OneFS: Security Considerations white paper. You can also explore the other sections in this paper to learn more about all PowerScale security considerations.
Author: Aqib Kazi
[1] Based on Dell analysis comparing cybersecurity software capabilities offered for Dell PowerScale vs competitive products, September 2022.
Sat, 01 Oct 2022 23:21:56 -0000
|Read Time: 0 minutes
As a security best practice, a quarterly security review is recommended. Forming an aggressive security posture for a PowerScale cluster is composed of different facets that may not be applicable to every organization. An organization’s industry, clients, business, and IT administrative requirements determine what is applicable. To ensure an aggressive security posture for a PowerScale cluster, use the checklist in the following table as a baseline for security.
This table serves as a security baseline and must be adapted to specific organizational requirements. See the Dell PowerScale OneFS: Security Considerations white paper for a comprehensive explanation of the concepts in the table below.
Further, cluster security is not a single event but an ongoing process. Monitor this blog for updates; as new updates become available, this post will be updated. Consider implementing an organizational security review on a quarterly basis.
The items listed in the following checklist are not in order of importance or hierarchy but rather form an aggressive security posture as more features are implemented.
Table 1. PowerScale security baseline checklist
Security Feature | Configuration | Links | Complete (Y/N) | Notes |
Data at Rest Encryption | Implement external key manager with SEDs | PowerScale Data at Rest Encryption | | |
Data in flight encryption | Encrypt protocol communication and data replication | PowerScale: Solution Design and Considerations for SMB Environments; PowerScale OneFS NFS Design Considerations and Best Practices; PowerScale SyncIQ: Architecture, Configuration, and Considerations | | |
Role-based access control (RBACs) | Assign the lowest possible access required for each role | Dell PowerScale OneFS: Authentication, Identity Management, and Authorization | | |
Multi-factor authentication | | Dell PowerScale OneFS: Authentication, Identity Management, and Authorization; Disabling the WebUI and other non-essential services | | |
Cybersecurity | | | | |
Monitoring | Monitor cluster activity | Dell CloudIQ - AIOps for Intelligent IT Infrastructure Insights | | |
Secure Boot | Configure PowerScale Secure Boot | See PowerScale Secure Boot section | | |
Auditing | Configure auditing | File System Auditing with Dell PowerScale and Dell Common Event Enabler | | |
Custom applications | Create a custom application for cluster monitoring | | | |
Perform a quarterly security review | Review all organizational security requirements and current implementation. Check this paper and checklist for updates. | Monitor security advisories for PowerScale: https://www.dell.com/support/security/en-us | | |
General cluster security best practices | | See the Security best practices section in the Security Configuration Guide for the relevant release at OneFS Info Hubs | | |
Login, authentication, and privileges best practices | | | | |
SNMP security best practices | | | | |
SSH security best practices | | | | |
Data-access protocols best practices | | | | |
Web interface security best practices | | | | |
Anti-Virus | | | | |
Author: Aqib Kazi
Tue, 06 Sep 2022 20:46:32 -0000
|Read Time: 0 minutes
Content creation workflows are increasingly distributed between multiple sites and cloud providers. Data orchestration has long been a key component in these workflows. With the extra complexity (and functionality) of multiple on-premises and cloud infrastructures, automated data orchestration is more crucial than ever.
There has been a subtle but significant shift in how media companies store and manage data. In the old way, file storage formed the “core” and data was eventually archived off to tape or object storage for long-term retention. The new way of managing data flips this paradigm. Object storage has become the new “core” with performant file storage at edge locations used for data processing and manipulation.
Various factors have influenced this shift. These factors include the ever-increasing volume of data involved in modern productions, the expanding role of public cloud providers (for whom object storage is the default), and media application support.
Figure 1. Global storage environment
With this shift in roles, new techniques for data orchestration become necessary. Data management vendors are reacting to these requirements for data movement and global file system solutions.
However, many of these solutions require data to be ingested and accessed through dedicated proprietary gateways. Often this gateway approach means that the data is now inaccessible using the native S3 API.
PowerScale OneFS and Superna Golden Copy provide a way of orchestrating data between file and object that retains the best qualities of both types of storage. Data is available to be accessed on both the performant edge (PowerScale) and the object core (ECS or public cloud) with no lock-in at either end.
Further, Superna Golden Copy is directly integrated with the PowerScale OneFS API. The OneFS snapshot change list is used for immediate incremental data moves. Filesystem metadata is preserved in S3 tags.
Golden Copy and OneFS are a solution built for seamless movement of data between locations, file system, and object storage. File structure and metadata are preserved.
Data that originates on object storage needs to be accessible natively by systems that can speak object APIs. Also, some subset of data needs to be moved to file storage for further processing. Production data that originates on file storage similarly needs native access. That file data will need to be moved to object storage for long-term retention and to make it accessible to globally distributed resources.
Content creation workflows are spread across multiple teams working in many locations. Multisite productions require distributed storage ecosystems that can span geographies. This architecture is well suited to a core of object storage as the “central source of truth”. Pools of highly performant file storage serve teams in their various global locations.
The Golden Copy GraphQL API allows external systems to control, configure, and monitor Golden Copy jobs. This type of API-based data orchestration is essential to the complex global pipelines of content creators. Manually moving large amounts of data is untenable. Schedule-based movement of data aligns well with some content creation workflows; other workflows require more ad hoc data movement.
Figure 2. Object Core with GoldenCopy and PowerScale
A large ecosystem of production management tools, such as Autodesk Shotgrid, exist for managing global teams. These tools are excellent for managing projects, but do not typically include dedicated data movers. Data movement can be particularly challenging when large amounts of media need to be shifted between object and file.
Production asset management can trigger data moves with Golden Copy based on metadata changes to a production or scene. This kind of API and metadata driven data orchestration fits in the MovieLabs 2030 vision for software-defined workflows for content creation. This topic is covered in some detail for tiering within a OneFS file system in the paper: A Metadata Driven Approach to On Demand Tiering.
For more information about using PowerScale OneFS together with Superna GoldenCopy, see my full white paper PowerScale OneFS: Distributed Media Workflows.
Author: Gregory Shiff
Tue, 06 Sep 2022 18:14:53 -0000
|Read Time: 0 minutes
AI has been a hot topic in recent years. A common question from our customers is: ‘How can AI help with the day-to-day operation and management of PowerScale?’ It’s an interesting question, because although AI opens up many possibilities, there are still relatively few implementations of it in IT infrastructure.
But we finally have something very inspiring. Here is what we have achieved as a proof of concept in our lab, with the support of AI Dynamics, a professional AI platform company.
With the increase in complexity of IT infrastructure comes an increase in the amount of data produced by these systems. Real-time performance logs, usage reports, audits, and other metadata can add up to gigabytes or terabytes a day. It is a big challenge for IT departments to analyze this data and to derive proactive predictions from it, such as forecasts of IT infrastructure performance issues and their bottlenecks.
AIOps is the methodology to address these challenges. The term ‘AIOps’ refers to the use of artificial intelligence (AI), specifically machine learning (ML) techniques, to ingest, analyze, and learn from large volumes of data from every corner of the IT environment. The goal of AIOps is to allow IT departments to manage their assets and tackle performance challenges proactively, in real-time (or better), before they become system-wide issues.
In this solution, we identify NFS latency as the PowerScale performance indicator that customers would like to see predictive reporting about. The goal of the AI model is to study historical system activity and predict the NFS latency at five-minute intervals for four hours in the future. A traditional software system can use these predictions to alert users of a potential performance bottleneck based on the user’s specified latency threshold level and spike duration. In the future, AI models can be built that help diagnose the source of these issues so that both an alert and a best-recommended solution can be reported to the user.
The whole training process involves the following three steps (I’ll explain the details in the following sections):
The raw performance data is collected through Dell Secure Remote Services (SRS) from 12 different all-flash PowerScale clusters from an electronic design automation (EDA) customer each week. We identify and extract 26 performance key metrics from the raw data, most of which are logged and updated every five minutes. AI Dynamics NeoPulse is used to extract some additional fields (such as the day of the week and time of day from the UNIX timestamp fields) to allow the model to make better predictions. Each week new data was collected from the PowerScale cluster to increase the size of the training dataset and to improve the AI model. During every training run, we also withheld 10% of the data, which we used to test the AI model in the testing phase. This is separate from the 10% of training data withheld for validation.
Figure 1. Data preparation process
Over a period of two months, more than 50 different AI models were trained using a variety of different time series architectures, varying model architecture parameters, hyperparameters, and data engineering techniques to maximize performance, without overfitting to existing data. When these training pipelines were created in NeoPulse, they could easily be reused as new data arrived from the client each week, to rerun training and testing in order to quantify the performance of the model.
At the end of the two-month period, we had built a model that could predict whether this one performance metric (NFS3 latency) would be above a threshold of 10ms, correctly for 70% of each one of the next 48 five-minute intervals (four hours total).
In the data preparation phase, we withheld 10% of the total data set to be used for AI model validation and testing. With the current AI model, end users can easily configure the latency threshold as they want. In this case, we validated the model at 10 ms and 15 ms latency thresholds. The model can correctly identify over 70% of 10 ms latency spikes and 60% of 15 ms latency spikes over the entire ensuing four-hour period.
Figure 2. Model Validation
In this solution, we used NFS latency from PowerScale as the indicator to be predicted. The AI model uses the performance data from the previous four hours to predict the trends and spikes of NFS latency in the next four hours. If the software identifies a five-minute period when a >10ms latency spike would occur more than 70% of the time, it will trigger a configurable alert to the user.
The following diagram shows an example. At 8:55 a.m., the AI model predicts the NFS latency from 8:55 a.m. to 12:55 p.m., based on the input of performance data from 4:55 a.m. to 8:55 a.m. The AI model makes predictions for each five-minute period over the prediction duration. The model predicts a few isolated spikes in latency, with a large consecutive cluster of high latency between around 12 p.m. and 12:55 p.m. A software system can use this prediction to alert the user about the expected increase in latency, giving them over three hours to get ahead of the problem and reduce the server load. In the graph, the dotted line shows the AI model’s prediction, whereas the solid line shows actual performance.
Figure 3. Dell PowerScale NFS Latency Forecasting
To sum up, the solution achieved the following:
AIOps introduces the intelligence needed to manage the complexity of modern IT environments. The NeoPulse platform from AI Dynamics makes AIOps easy to implement. In an all-flash configuration of Dell PowerScale clusters, performance is one of the key considerations. Hundreds or thousands of performance logs are generated every day, and it is easy for AIOps to consume these existing logs and provide insight into potential performance bottlenecks. Dell servers with GPUs are great platforms for performing training and inference, not just for this model but for any other new AI challenge the company wishes to tackle, across dozens of problem types.
For additional details about our testing, see the white paper Key Performance Prediction using Artificial Intelligence for IT operations (AIOps).
Author: Vincent Shen
Tue, 23 Aug 2022 17:00:45 -0000
|Read Time: 0 minutes
Network connectivity is an essential part of any infrastructure architecture. When it comes to how Kubernetes connects to PowerScale, there are several options to configure the Container Storage Interface (CSI). In this post, we will cover the concepts and configuration you can implement.
The story starts with CSI plugin architecture.
Like all other Dell storage CSI drivers, PowerScale CSI follows the Kubernetes CSI standard by implementing its functions in two components.
The CSI controller plugin is deployed as a Kubernetes Deployment, typically with two or three replicas for high availability, with only one instance acting as the leader. The controller is responsible for communicating with PowerScale through the Platform API to manage volumes (on PowerScale, that means creating and deleting directories, NFS exports, and quotas), for updating the NFS client list when a Pod moves, and so on.
The CSI node plugin is a Kubernetes DaemonSet, running on all nodes by default. It is responsible for mounting the NFS export from PowerScale and mapping the NFS mount path into a Pod as persistent storage, so that applications and users in the Pod can access the data on PowerScale.
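To see this split on a running cluster, a quick kubectl check is usually enough. The following is a minimal sketch, assuming the driver was installed into a namespace named isilon (the namespace and object names vary with how the driver was deployed):
kubectl get deployment,daemonset -n isilon
kubectl get pods -n isilon -o wide
The Deployment corresponds to the controller plugin (with one replica acting as leader), while the DaemonSet pods are the node plugin instances running on each worker node.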
Because CSI needs to access both PAPI (the PowerScale Platform API) and NFS data, a single user role typically isn’t secure enough: the role used for PAPI access needs more privileges than a normal user should have.
According to the PowerScale CSI manual, CSI requires a user that has the following privileges to perform all CSI functions:
Privilege | Type |
ISI_PRIV_LOGIN_PAPI | Read Only |
ISI_PRIV_NFS | Read Write |
ISI_PRIV_QUOTA | Read Write |
ISI_PRIV_SNAPSHOT | Read Write |
ISI_PRIV_IFS_RESTORE | Read Only |
ISI_PRIV_NS_IFS_ACCESS | Read Only |
ISI_PRIV_IFS_BACKUP | Read Only |
Among these privileges, ISI_PRIV_SNAPSHOT and ISI_PRIV_QUOTA are only available in the System zone. And this complicates things a bit. To fully utilize these CSI features, such as volume snapshot, volume clone, and volume capacity management, you have to allow the CSI to be able to access the PowerScale System zone. If you enable the CSM for replication, the user needs the ISI_PRIV_SYNCIQ privilege, which is a System-zone privilege too.
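As a rough sketch only, a dedicated role carrying these privileges could be created from the OneFS CLI along the following lines. The role and user names (CSIDriverRole, csi_user) are hypothetical, and the exact option names can vary between OneFS releases, so verify the syntax against the CSI installation guide for your version:
isi auth roles create CSIDriverRole
isi auth roles modify CSIDriverRole --add-priv ISI_PRIV_NFS
isi auth roles modify CSIDriverRole --add-priv-ro ISI_PRIV_LOGIN_PAPI
isi auth roles modify CSIDriverRole --add-user csi_user
Repeat the --add-priv and --add-priv-ro options for the remaining read-write and read-only privileges in the table above, keeping in mind that the System zone is required if the quota, snapshot, or replication features are enabled.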
By contrast, there isn’t any specific role requirement for applications/users in Kubernetes to access data: the data is shared by the normal NFS protocol. As long as they have the right ACL to access the files, they are good. For this data accessing requirement, a non-system zone is suitable and recommended.
These two access zones are defined in different places in the CSI configuration files.
If an admin really cannot expose their System zone to the Kubernetes cluster, they have to disable the snapshot and quota features in the CSI installation configuration file (values.yaml). In this way, the PAPI access zone can be a non-System access zone.
The following diagram shows how the Kubernetes cluster connects to PowerScale access zones.
Normally a Kubernetes cluster comes with many networks: a pod inter-communication network, a cluster service network, and so on. Luckily, the PowerScale network doesn’t have to join any of them. The CSI pods can access a host’s network directly, without going through the Kubernetes internal network. This also has the advantage of providing a dedicated high-performance network for data transfer.
For example, on a Kubernetes host, there are two NICs: IP 192.168.1.x and 172.24.1.x. NIC 192.168.1.x is used for Kubernetes, and is aligned with its hostname. NIC 172.24.1.x isn’t managed by Kubernetes. In this case, we can use NIC 172.24.1.x for data transfer between Kubernetes hosts and PowerScale.
By default, the CSI driver uses the IP address that is aligned with the host’s hostname. To have CSI recognize the second NIC (172.24.1.x), we have to explicitly set the IP range in “allowedNetworks” in the values.yaml file during the CSI driver installation. For example:
allowedNetworks: [172.24.1.0/24]
Also, in this network configuration, it’s unlikely that the Kubernetes internal DNS can resolve the PowerScale FQDN. So, we also have to make sure the “dnsPolicy” has been set to “ClusterFirstWithHostNet” in the values.yaml file. With this dnsPolicy, the CSI pods will reach the DNS server in /etc/resolv.conf in the host OS, not the internal DNS server of Kubernetes.
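Putting these two settings together, the relevant fragment of values.yaml looks something like the following minimal sketch (the surrounding keys and their exact placement depend on the CSI driver release):
# Use the non-Kubernetes NIC for NFS data traffic between the hosts and PowerScale
allowedNetworks: [172.24.1.0/24]
# Resolve the PowerScale FQDN through the host's /etc/resolv.conf rather than the cluster DNS
dnsPolicy: "ClusterFirstWithHostNet"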
The following diagram shows the configuration mentioned above:
Please note that the “allowedNetworks” setting only affects the data access zone, and not the PAPI access zone. In fact, CSI just uses this parameter to decide which host IP should be set as the NFS client IP on the PowerScale side.
Regarding the network routing, CSI simply follows the OS route configuration. Because of that, if we want the PAPI access zone to go through the primary NIC (192.168.1.x), and have the data access zone to go through the second NIC (172.24.1.x), we have to change the route configuration of the Kubernetes host, not this parameter.
Hopefully this blog helps you understand the network configuration for PowerScale CSI better. Stay tuned for more information on Dell Containers & Storage!
Authors: Sean Zhan, Florian Coulombel
Mon, 25 Jul 2022 13:43:38 -0000
|Read Time: 0 minutes
In today's security environment, organizations must adhere to governance security requirements, including disabling specific HTTP services.
OneFS release 9.4.0.0 introduces an option to disable non-essential cluster services selectively, rather than disabling all HTTP services. Disabling services selectively allows administrators to determine which services are necessary, while the other essential services on the cluster continue to run. The non-essential services that can be disabled are PowerScaleUI, Platform-API-External, RAN, and RemoteService.
Each of these services can be disabled independently and has no impact on other HTTP-based data services. The services can be disabled through the CLI or API with the ISI_PRIV_HTTP privilege. To manage the non-essential services from the CLI, use the isi http services list command to list the services. Use the isi http services view and isi http services modify commands to view and modify the services. The impact of disabling each of the services is listed in the following table.
HTTP services impacts
Service | Impacts |
PowerScaleUI | The WebUI is entirely disabled. Attempting to access the WebUI displays the message “Service Unavailable. Please contact Administrator.” |
Platform-API-External | Disabling the Platform-API-External service does not impact the Platform-API-Internal service of the cluster. The Platform-API-Internal services continue to function, even if the Platform-API-External service is disabled. However, if the Platform-API-External service is disabled, the WebUI is also disabled at that time, because the WebUI uses the Platform-API-External service. |
RAN (Remote Access to Namespace) | If RAN is disabled, use of the Remote File Browser UI component is restricted in the Remote File Browser and the File System Explorer. |
RemoteService | If RemoteService is disabled, the remote support UI and the InProduct Activation UI components are restricted. |
To disable the WebUI, use the following command:
isi http services modify --service-id=PowerScaleUI --enabled=false
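To verify the change, or to restore the service later, the same command family applies. For example, to list the current state of the non-essential services and then re-enable the WebUI:
isi http services list
isi http services modify --service-id=PowerScaleUI --enabled=true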
Author: Aqib Kazi
Fri, 22 Jul 2022 17:58:28 -0000
|Read Time: 0 minutes
PowerScale for Google Cloud provides the native-cloud experience of file services with high performance. It is a scalable file service that provides high-speed file access over multiple protocols, including SMB, NFS, and HDFS. PowerScale for Google Cloud enables customers to run their cloud workloads on the PowerScale scale-out NAS storage system. The following figure shows the architecture of PowerScale for Google Cloud. The three main parts are the Dell Technologies partner data center, the Dell Technologies Google Cloud organization (isiloncloud.com), and the customer’s Google Cloud organization (for example, customer-a.com and customer-b.com).
We proudly released a new version of PowerScale for Google Cloud on July 8, 2022. It provides the following key features and enhancements:
In the previous version of PowerScale for Google Cloud, only a few pre-defined node tiers were available. With the latest version, you can purchase all PowerScale node types to fit your business needs and accelerate your native-cloud file service experience.
In the previous version, the supported regions include North America and APJ (Australia and Singapore). We are now adding the EMEA region, which includes London, Frankfurt, Paris, and Warsaw.
PowerScale for Google Cloud is now certified to support GCVE. GCVE guest VMs can connect to PowerScale for Google Cloud file services to fully leverage PowerScale cluster storage. We’ll be taking a deeper look at the details in blog articles in the next few weeks.
Want to know more about the powerful cloud file service solution? Just click these links:
Author: Lieven Lin
Fri, 01 Jul 2022 14:15:16 -0000
|Read Time: 0 minutes
In the previous blog, we introduced the OneFS file permission basics, including:
1. OneFS file permission is only in one of the following states:
2. No matter the OneFS file permission state, the on-disk identity for a file is always a UID, a GID, or an SID. The name of a user or group is for display only.
3. When OneFS receives a user access request, it generates an access token for the user and compares the token to the file permissions based on UID/GID/SID.
Therefore, in this blog, we will explain what UIDs, GIDs, and SIDs are, and what a OneFS access token is. Let’s start by looking at UIDs, GIDs, and SIDs.
In daily life, we usually refer to a user or a group by its username or group name. In a NAS system, however, a user or group is identified by a UID, GID, or SID, and the NAS system resolves that identifier into the related username or group name.
The UID or GID is a positive integer that is usually used in a UNIX environment to identify a user or group. UIDs and GIDs are typically provided by the local operating system or an LDAP server.
The SID is usually used in a Windows environment to identify users and groups. SIDs are typically provided by the local operating system or Active Directory (AD). The SID is written in the format:
(SID)-(revision level)-(identifier-authority)-(subauthority1)-(subauthority2)-(etc)
for example:
S-1-5-21-1004336348-1177238915-682003330-512
For more information about SIDs, see the Microsoft article: What Are Security Identifiers?.
In OneFS, information about users and groups, including UID/GID and SID information and group membership information, is managed and stored in different authentication providers. OneFS supports multiple types of authentication provider, including LDAP, NIS, Active Directory (AD), the file provider, and the local provider.
OneFS retrieves a user’s identity (UID/GID/SID) and group memberships from the above authentication providers. Assuming that we have a user named Joe, OneFS tries to resolve Joe’s UID/GID and group memberships from LDAP, NIS, file provider, and Local provider. Meanwhile, it also tries to resolve Joe’s SID and group memberships from AD, file provider, or local provider.
It is not always the case that OneFS needs to resolve a user from username to UID/GID/SID. It is also possible that OneFS needs to resolve a user in reverse: that is, resolve a UID to its related username. This usually occurs when using NFSv3. When OneFS gets all UID/GID/SID information for a user, it will maintain the identity relationship in a local database, which records the UID <--> SID and GID <-->SID mapping, also known as the ID mapping function in OneFS.
Now, you should have an overall idea about how OneFS maintains the important UID/GID/SID information, and how to retrieve this information as needed.
Next, let’s see how OneFS determines whether different usernames in different authentication providers actually belong to the same real user. For example, how can we tell whether the Joe in AD and the joe_f in LDAP are the same person? If they are, OneFS needs to ensure that they have the same access to the same file, even when using different protocols.
That is the magic of the OneFS user mapping function. The default user mapping rule maps users together that have the same usernames in different authentication providers. For example, the Joe in AD and the Joe in LDAP will be considered the same user. You must create user mapping rules if a real user has different names in different authentication providers. The user mapping rule can have different operators, to provide more flexible management between different usernames in different authentication providers. The operators could be Append, Insert, Replace, Remove Groups, Join. See OneFS user mapping operators for more details. We just need to remember that the user mapping is just a function to determine if the user information in an authentication provider should be used when generating an access token.
Although it is easy to confuse user mapping with ID mapping, user mapping is the process of identifying users across authentication providers for the purpose of token generation. After the token is generated, the mappings of SID<-->UID are placed in the ID mapping database.
Finally, OneFS must choose an authoritative identity (that is, an On-Disk Identity) from the collected or generated UID/GID/SID values for the user. This identity is stored on disk and is used when the file is created or when ownership of the file changes, which impacts the file permissions.
In a single protocol environment, determining the On-Disk Identity is simple because Windows uses SIDs and Linux uses UIDs. However, in a multi-protocol environment, only one identity is stored, and the challenge is determining which one is stored. By default, the policy configured for on-disk identity is Native mode. Native mode is the best option for most environments. OneFS selects the real value between the SID and UID/GID. If both the SID and UID/GID are real values, OneFS selects UID/GID. Please note that this blog series is based on the default policy setting.
Now you should have an overall understanding of user mapping, ID mapping, and on-disk identity. These are the key concepts when understanding user access tokens and doing troubleshooting. Finally, let’s see what an access token contains.
You can view a user’s access token by using the command isi auth mapping token <username> in OneFS. Here is an example of Joe’s access token:
vonefs-aima-1# isi auth mapping token Joe
User Name: Joe
UID: 2001
SID: S-1-5-21-1137111906-3057660394-507681705-1002
On Disk: 2001
ZID: 1
Zone: System
Privileges: -
Primary Group
  Name: Market
  GID: 2003
  SID: S-1-5-21-1137111906-3057660394-507681705-1006
  On Disk: 2003
Supplemental Identities
  Name: Authenticated Users
  SID: S-1-5-11
From the above output, we can see that an access token contains the following information:
Remember the file created and owned by Joe in the previous blog? Here are its file permissions:
vonefs-aima-1# ls -le acl-file.txt
-rwxrwxr-x +  1 Joe  Market  69 May 28 01:08 acl-file.txt
 OWNER: user:Joe
 GROUP: group:Market
 0: user:Joe allow file_gen_all
 1: group:Market allow file_gen_read,file_gen_execute
 2: user:Bob allow file_gen_all
 3: everyone allow file_gen_read,file_gen_execute
The ls -le command here shows usernames only. We already emphasized that the on-disk identity is always a UID/GID or an SID, so let's use the ls -len command to show the on-disk identities. In the following output, we see that Joe's on-disk identity is his UID 2001 and his GID 2003. When Joe wants to access the file, OneFS compares Joe's access token with the file permissions below, finds that the UID 2001 in his token matches, and grants him access to the file.
vonefs-aima-1# ls -len acl-file.txt
-rwxrwxr-x +  1 2001  2003  69 May 28 01:08 acl-file.txt
 OWNER: user:2001
 GROUP: group:2003
 0: user:2001 allow file_gen_all
 1: group:2003 allow file_gen_read,file_gen_execute
 2: user:2002 allow file_gen_all
 3: SID:S-1-1-0 allow file_gen_read,file_gen_execute
The above Joe is a OneFS local user from a local provider. Next, we will see what the access token looks like if a user’s SID is from AD and UID/GID is from LDAP.
Let's assume that user John has an account named John_AD in AD and also has an account named John_LDAP in the LDAP server. This means that OneFS has to ensure that the two usernames have consistent access permissions on a file. To achieve that, we need to create a user mapping rule to join them together, so that the final access token contains the SID information from AD and the UID/GID information from LDAP. The access token for John_AD looks like this:
vonefs-aima-1# isi auth mapping token vlab\\John_AD
User Name: VLAB\john_ad
UID: 1000019
SID: S-1-5-21-2529895029-2434557131-462378659-1110
On Disk: S-1-5-21-2529895029-2434557131-462378659-1110
ZID: 1
Zone: System
Privileges: -
Primary Group
  Name: VLAB\domain users
  GID: 1000041
  SID: S-1-5-21-2529895029-2434557131-462378659-513
  On Disk: S-1-5-21-2529895029-2434557131-462378659-513
Supplemental Identities
  Name: Users
  GID: 1545
  SID: S-1-5-32-545
  Name: Authenticated Users
  SID: S-1-5-11
  Name: John_LDAP
  UID: 19421
  SID: S-1-22-1-19421
  Name: ldap_users
  GID: 32084
  SID: S-1-22-2-32084
Assume that a file owned by and accessible only to John_LDAP has the file permissions shown in the following output. Because John_AD and John_LDAP are joined together with a user mapping rule, the John_LDAP identity (UID) is also present in the John_AD access token, so John_AD can access the file as well.
vonefs-aima-1# ls -le john_ldap.txt
-rwx------  1 John_LDAP  ldap_users  19 Jun 15 07:36 john_ldap.txt
 OWNER: user:John_LDAP
 GROUP: group:ldap_users
 SYNTHETIC ACL
 0: user:John_LDAP allow file_gen_read,file_gen_write,file_gen_execute,std_write_dac
 1: group:ldap_users allow std_read_dac,std_synchronize,file_read_attr
You should now have an understanding of OneFS access tokens, and how they are used to determine a user’s authorized operation on data, through file permission checking.
In my next blog, we will see what happens when different protocols access OneFS data.
Author: Lieven Lin
Mon, 27 Jun 2022 21:03:17 -0000
|Read Time: 0 minutes
OneFS protects data stored on failing nodes or drives in a cluster through a process called smartfail. During the process, OneFS places a device into quarantine and, depending on the severity of the issue, the data on it into a read-only state. While a device is quarantined, OneFS reprotects the data on the device by distributing the data to other devices.
After all data eviction or reconstruction is complete, OneFS logically removes the device from the cluster, and the node or drive can be physically replaced. OneFS only automatically smartfails devices as a last resort. Nodes and/or drives can also be manually smartfailed. However, it is strongly recommended to first consult Dell Technical Support.
Occasionally a device might fail before OneFS detects a problem. If a drive fails without being smartfailed, OneFS automatically starts rebuilding the data to available free space on the cluster. However, because a node might recover from a transient issue, if a node fails, OneFS does not start rebuilding data unless it is logically removed from the cluster.
A node that is unavailable and reported by isi status as ‘D’, or down, can be smartfailed. If the node is hard down, likely with a significant hardware issue, the smartfail process will take longer because data has to be recalculated from the FEC protection parity blocks. That said, it’s well worth attempting to bring the node up if at all possible – especially if the cluster, and/or node pools, is at the default +2D:1N protection. The concern here is that, with a node down, there is a risk of data loss if a drive or other component goes bad during the smartfail process.
If possible, and assuming the disk content is still intact, it can often be quicker to have the node hardware repaired. In this case, the entire node’s chassis (or compute module in the case of Gen 6 hardware) could be replaced and the old disks inserted with original content on them. This should only be performed at the recommendation and under the supervision of Dell Technical Support. If the node is down because of a journal inconsistency, it will have to be smartfailed out. In this case, engage Dell Technical Support to determine an appropriate action plan.
The recommended procedure for smartfailing a node is as follows. In this example, we’ll assume that node 4 is down:
From the CLI of any node except node 4, run the following command to smartfail out the node:
# isi devices node smartfail --node-lnn 4
Verify that the node is removed from the cluster.
# isi status -q
(An '-S-' will appear in node 4's 'Health' column to indicate that it has been smartfailed.)
Monitor the successful completion of the job engine’s MultiScan, FlexProtect/FlexProtectLIN jobs:
# isi job status
Un-cable and remove the node from the rack for disposal.
As mentioned previously, there are two primary Job Engine jobs that run as a result of a smartfail:
MultiScan performs the work of both the AutoBalance and Collect jobs simultaneously, and it is triggered after every group change. The reason is that new file layouts and file deletions that happen during a disruption to the cluster might be imperfectly balanced or, in the case of deletions, simply lost.
The Collect job reclaims free space from previously unavailable nodes or drives. A mark and sweep garbage collector, it identifies everything valid on the filesystem in the first phase. In the second phase, the Collect job scans the drives, freeing anything that isn’t marked valid.
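For readers unfamiliar with the mark-and-sweep pattern that the Collect job follows, here is a generic Python illustration of the idea; it is purely conceptual and does not reflect how OneFS tracks blocks internally.

# Generic mark-and-sweep illustration: phase 1 marks every block that is
# still referenced by a live file, phase 2 frees anything left unmarked.

def collect(all_blocks, live_files):
    marked = set()
    for blocks in live_files.values():     # phase 1: mark valid blocks
        marked.update(blocks)
    return [b for b in all_blocks if b not in marked]   # phase 2: sweep

blocks_on_drive = {"b1", "b2", "b3", "b4"}
live_files = {"/ifs/a.txt": {"b1", "b2"}, "/ifs/b.txt": {"b3"}}
print(collect(blocks_on_drive, live_files))   # ['b4'] is reclaimed as free space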
When node and drive usage across the cluster are out of balance, the AutoBalance job scans through all the drives looking for files to re-layout, to make use of the less filled devices.
The purpose of the FlexProtect job is to scan the file system after a device failure to ensure that all files remain protected. Incomplete protection levels are fixed, in addition to missing data or parity blocks caused by drive or node failures. This job is started automatically after smartfailing a drive or node. If a smartfailed device was the reason the job started, the device is marked gone (completely removed from the cluster) at the end of the job.
Although a new node can be added to a cluster at any time, it's best to avoid major group changes during a smartfail operation. This helps to avoid any unnecessary interruptions of a critical job engine data reprotection job. However, because a node is down, there is a window of risk while the cluster is rebuilding the data from that node. Under pressing circumstances, the smartfail operation can be paused, the new node added, and the smartfail resumed once the new node has successfully joined the cluster.
Be aware that if the node you are adding is the same node that was smartfailed, the cluster maintains a record of that node and may prevent the re-introduction of that node until the smartfail is complete. To mitigate risk, Dell Technical Support should definitely be involved to ensure data integrity.
The time for a smartfail to complete is hard to predict with any accuracy, and depends on:
Attribute | Description |
OneFS release | Determines OneFS job engine version and how efficiently it operates. |
System hardware | Drive types, CPU, RAM, and so on. |
File system | Quantity and type of data (that is, small vs. large files), protection, tunables, and so on. |
Cluster load | Processor and CPU utilization, capacity utilization, and so on. |
Typical smartfail runtimes range from minutes (for fairly empty, idle nodes with SSD and SAS drives) to days (for nodes with large SATA drives and a high capacity utilization). The FlexProtect job already runs at the highest job engine priority (value=1) and medium impact by default. As such, there isn’t much that can be done to speed up this job, beyond reducing cluster load.
Smartfail is also a valuable tool for proactive cluster node replacement, such as during a hardware refresh. Provided that the cluster quorum is not broken, a smartfail can be initiated on multiple nodes concurrently – but never more than n/2 – 1 nodes (rounded up)!
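Reading that rule of thumb as "more than half the nodes must stay up to preserve quorum", a quick back-of-the-envelope helper might look like the following Python sketch; the interpretation of the formula is my assumption, so always confirm limits with Dell Technical Support before smartfailing multiple nodes.

import math

def max_concurrent_smartfails(node_count):
    # Never smartfail more than n/2 - 1 nodes (rounded up), so that more
    # than half of the cluster stays up and quorum is preserved.
    return math.ceil(node_count / 2 - 1)

for n in (4, 7, 12):
    print(n, "nodes -> at most", max_concurrent_smartfails(n), "concurrent smartfails")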
If replacing an entire node pool as part of a tech refresh, a SmartPools filepool policy can be crafted to migrate the data to another node pool across the backend network. When complete, the nodes can then be smartfailed out, which should progress swiftly because they are now empty.
If multiple nodes are smartfailed simultaneously, at the final stage of the process the node removals are serialized, with roughly a 60 second pause between each. The smartfail job places the selected nodes in read-only mode while it copies the protection stripes to the cluster's free space. Using SmartPools to evacuate data from a node or set of nodes, in preparation to remove them, is generally a good idea, and usually a relatively fast process.
SmartPools’ Virtual Hot Spare (VHS) functionality helps ensure that node pools maintain enough free space to successfully re-protect data in the event of a smartfail. Though configured globally, VHS actually operates at the node pool level so that nodes with different size drives reserve the appropriate VHS space. This helps ensure that while data may move from one disk pool to another during repair, it remains on the same class of storage. VHS reservations are cluster wide and configurable, as either a percentage of total storage (0-20%), or as a number of virtual drives (1-4), with the default being 10%.
Note: a smartfail is not guaranteed to remove all data on a node. Any pool in a cluster that’s flagged with the ‘System’ flag can store /ifs/.ifsvar data. A filepool policy to move the regular data won’t address this data. Also, because SmartPools ‘spillover’ may have occurred at some point, there is no guarantee that an ‘empty’ node is completely devoid of data. For this reason, OneFS still has to search the tree for files that may have blocks residing on the node.
Nodes can be easily smartfailed from the OneFS WebUI by navigating to Cluster Management > Hardware Configuration and selecting ‘Actions > More > Smartfail Node’ for the desired node(s):
Similarly, the following CLI commands first initiate and then halt the node smartfail process, respectively. First, the ‘isi devices node smartfail’ command kicks off the smartfail process on a node and removes it from the cluster.
# isi devices node smartfail -h
Syntax:
# isi devices node smartfail
    [--node-lnn <integer>]
    [--force | -f]
    [--verbose | -v]
If necessary, the ‘isi devices node stopfail’ command can be used to discontinue the smartfail process on a node.
# isi devices node stopfail -h
Syntax:
isi devices node stopfail
    [--node-lnn <integer>]
    [--force | -f]
    [--verbose | -v]
Similarly, individual drives within a node can be smartfailed with the ‘isi devices drive smartfail’ CLI command.
# isi devices drive smartfail
    { <bay> | --lnum <integer> | --sled <string> }
    [--node-lnn <integer>]
    [{--force | -f}]
    [{--verbose | -v}]
    [{--help | -h}]
Author: Nick Trimbee
Fri, 24 Jun 2022 18:22:15 -0000
|Read Time: 0 minutes
Traditionally, OneFS has used the SmartPools jobs to apply its file pool policies. To accomplish this, the SmartPools job visits every file, and the SmartPoolsTree job visits a tree of files. However, the scanning portion of these jobs can place a significant random-access load on the cluster and result in lengthy execution times, particularly in the case of the SmartPools job. To address this, OneFS also provides the FilePolicy job, which offers a faster, lower impact method for applying file pool policies than the full-blown SmartPools job.
But first, a quick Job Engine refresher…
As we know, the Job Engine is OneFS’ parallel task scheduling framework, and is responsible for the distribution, execution, and impact management of critical jobs and operations across the entire cluster.
The OneFS Job Engine schedules and manages all data protection and background cluster tasks: creating jobs for each task, prioritizing them, and ensuring that inter-node communication and cluster wide capacity utilization and performance are balanced and optimized. Job Engine ensures that core cluster functions have priority over less important work and gives applications integrated with OneFS – Isilon add-on software or applications integrating to OneFS by means of the OneFS API – the ability to control the priority of their various functions to ensure the best resource utilization.
Each job, such as the SmartPools job, has an "Impact Profile" comprising a configurable Impact Policy and an Impact Schedule, which together characterize how much of the system's resources the job will consume. The amount of work a job has to do is fixed, but the resources dedicated to that work can be tuned to minimize the impact on other cluster functions, such as serving client data.
Here’s a list of the specific jobs that are directly associated with OneFS SmartPools:
Job | Description |
SmartPools | Job that runs and moves data between the tiers of nodes within the same cluster. Also executes the CloudPools functionality if licensed and configured. |
SmartPoolsTree | Enforces SmartPools file policies on a subtree. |
FilePolicy | Efficient changelist-based SmartPools file pool policy job. |
IndexUpdate | Creates and updates an efficient file system index for FilePolicy job. |
SetProtectPlus | Applies the default file policy. This job is disabled if SmartPools is activated on the cluster. |
In conjunction with the IndexUpdate job, FilePolicy improves job scan performance by using a ‘file system index’, or changelist, to find files needing policy changes, rather than a full tree scan.
Avoiding a full treewalk dramatically decreases the amount of locking and metadata scanning work the job is required to perform, reducing impact on CPU and disk – albeit at the expense of not doing everything that SmartPools does. The FilePolicy job enforces just the SmartPools file pool policies, as opposed to the storage pool settings. For example, FilePolicy does not handle changes to storage pools or storage pool settings.
However, most of the time, SmartPools and FilePolicy perform the same work. Disabled by default, FilePolicy supports the full range of file pool policy features, reports the same information, and provides the same configuration options as the SmartPools job. Because FilePolicy is a changelist-based job, it performs best when run frequently – once or multiple times a day, depending on the configured file pool policies, data size, and rate of change.
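The difference between a full treewalk and a changelist-driven pass can be sketched in a few lines of Python; this is purely illustrative, and the real OneFS index and changelist are far more sophisticated on-disk structures.

def full_treewalk(all_files, matches):
    # A full treewalk must visit every file to find policy candidates.
    return [f for f in all_files if matches(f)]

def changelist_pass(changed_files, matches):
    # A changelist-based pass only visits files that changed since the
    # last IndexUpdate run.
    return [f for f in changed_files if matches(f)]

def policy_matches(path):
    # Stand-in for a file pool policy match (e.g. path or attribute based).
    return path.endswith("0")

all_files = ["/ifs/data/file%d" % i for i in range(100_000)]
changed = all_files[:250]          # only a small fraction changed since last run
print(len(full_treewalk(all_files, policy_matches)),
      len(changelist_pass(changed, policy_matches)))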
Job schedules can easily be configured from the OneFS WebUI by navigating to Cluster Management > Job Operations, highlighting the desired job, and selecting 'View/Edit'. For example, you could configure the IndexUpdate job to run every six hours at a LOW impact level with a priority value of 5.
When enabling and using the FilePolicy and IndexUpdate jobs, the recommendation is to continue running the SmartPools job as well, but at a reduced frequency (monthly).
In addition to running on a configured schedule, the FilePolicy job can also be executed manually.
FilePolicy requires access to a current index. If the IndexUpdate job has not yet been run, attempting to start the FilePolicy job will fail with an error that prompts you to run the IndexUpdate job first. When the index has been created, the FilePolicy job will run successfully. The IndexUpdate job can be run several times daily (for example, every six hours) to keep the index current and to prevent the snapshots from getting large.
Consider using the FilePolicy job with the job schedules below for workflows and datasets with the following characteristics:
For clusters without the characteristics described above, the recommendation is to continue running the SmartPools job as usual and not to activate the FilePolicy job.
The following table provides a suggested job schedule when deploying FilePolicy:
Job | Schedule | Impact | Priority |
FilePolicy | Every day at 22:00 | LOW | 6 |
IndexUpdate | Every six hours, every day | LOW | 5 |
SmartPools | Monthly – Sunday at 23:00 | LOW | 6 |
Because no two clusters are the same, this suggested job schedule may require additional tuning to meet the needs of a specific environment.
Note that when clusters running older OneFS versions and the FSAnalyze job are upgraded to OneFS 8.2.x or later, the legacy FSAnalyze index and snapshots are removed and replaced by new snapshots the first time that IndexUpdate is run. The new index stores considerably more file and snapshot attributes than the old FSA index. Until the IndexUpdate job effects this change, FSA keeps running on the old index and snapshots.
Author: Nick Trimbee
Thu, 23 Jun 2022 15:51:46 -0000
|Read Time: 0 minutes
CloudPools 2.0, released along with OneFS 8.2.0, brings many improvements. It is therefore valuable to upgrade OneFS from 8.x to 8.2.x or later and leverage the data management benefits of CloudPools 2.0.
This blog describes the preparations for upgrading a CloudPools environment. The purpose is to avoid potential issues when upgrading OneFS from 8.x to 8.2.x or later (that is, from CloudPools 1.0 to CloudPools 2.0).
For the recommended procedure for upgrading a CloudPools environment, see the document PowerScale CloudPools: Upgrading 8.x to 8.2.2.x or later.
For the best practices and considerations for CloudPools upgrades, see the white paper Dell PowerScale: CloudPools and ECS.
This blog covers the preparations both on cloud providers and on PowerScale clusters.
CloudPools is a OneFS feature that allows customers to archive or tier data from a PowerScale cluster to cloud storage, including public cloud providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, Alibaba Cloud, or a private cloud based on Dell ECS.
Important: Run the isi cloud accounts list command to verify which cloud providers are used for CloudPools. Different cloud providers use different authentication mechanisms with CloudPools, which can cause issues when upgrading a CloudPools environment.
AWS signature authentication is used for AWS, Dell ECS, and Google Cloud. In OneFS releases prior to 8.2, only AWS SigV2 is supported for CloudPools. Starting with OneFS 8.2, AWS SigV4 support is added, which provides an extra level of security through its enhanced signing algorithm. For more information about V4, see Authenticating Requests: AWS Signature V4. AWS SigV4 is used automatically for CloudPools in OneFS 8.2.x or later if the CloudPools and cloud provider configurations are correct. Note that a different authentication mechanism is used for Azure and Alibaba Cloud.
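For reference, the heart of AWS Signature V4 is a chained HMAC key derivation that scopes the signing key to a date, region, and service. The following short Python sketch shows that standard derivation; it is a generic SigV4 illustration, not OneFS or ECS code, and the credential values are made up.

import hashlib, hmac

def sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    # Chained HMACs scope the key to a date, region, and service,
    # which is part of what makes SigV4 stronger than SigV2.
    k_date = sign(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = sign(k_date, region)
    k_service = sign(k_region, service)
    return sign(k_service, "aws4_request")

# Hypothetical values, for illustration only.
key = sigv4_signing_key("wJalrXUtEXAMPLEKEY", "20220621", "us-east-1", "s3")
signature = hmac.new(key, b"<string-to-sign>", hashlib.sha256).hexdigest()
print(signature)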
If public cloud providers are used in a customer’s environment, there should be no issues because all configurations are already created by public cloud providers.
If Dell ECS is used in a customer's environment, the ECS configurations are implemented separately, and you need to make sure that the configurations are correct on ECS, including the load balancer and Domain Name System (DNS).
This section only covers the preparations for CloudPools and Dell ECS before upgrading OneFS from 8.x to 8.2.x or later.
In general, CloudPools may already have archived a lot of data from a PowerScale (Isilon) cluster to ECS before an upgrade from OneFS 8.x to 8.2.x or later, which means that most of the CloudPools configurations should already exist. For more information about CloudPools and ECS, see the white paper Dell PowerScale: CloudPools and ECS.
This section covers the following configurations for ECS before a OneFS upgrade from 8.x to 8.2.x or later.
A load balancer balances traffic to the various ECS nodes from the PowerScale cluster, and can provide much better performance and throughput for CloudPools. A load balancer is strongly recommended for CloudPools 2.0 and ECS. The following white papers provide information about how to implement a load balancer with ECS:
AWS always has a wildcard DNS record configured. See the document Virtual hosting of buckets, which introduces path-style access and virtual hosted-style access for a bucket. It also shows how to associate a hostname with an Amazon S3 bucket using CNAMEs for virtual hosted-style access.
Meanwhile, the path-style URL will be deprecated on September 23, 2022. Buckets created after that date must be referenced using the virtual-hosted model. For the reasons behind moving to the virtual-hosted model, see the document Amazon S3 Path Deprecation Plan – The Rest of the Story.
ECS supports Amazon S3 compatible applications that use virtual host-style and path-style addressing schemes. (For more information, see document Bucket and namespace addressing.) And, to help ensure the proper DNS configuration for ECS, see the document DNS configuration.
The procedure for configuring DNS depends on your DNS server or DNS provider.
For example, assume that DNS is set up on a Windows server. The following two tables show the DNS entries created; customers must create their own DNS entries for their environment.
Name | Record Type | FQDN | IP Address | Comment |
ecs | A | ecs.demo.local | 192.168.1.40 | The FQDN of the load balancer will be ecs.demo.local. |
Name | Record Type | FQDN | FQDN for | Comment |
cloudpools_uri | CNAME | cloudpools_uri.demo.local | ecs.demo.local | If you create an SSL certificate for the ECS S3 service, the certificate must include both the wildcard and non-wildcard versions of this name as Subject Alternative Names. |
*.cloudpools_uri | CNAME | *.cloudpools_uri.demo.local | ecs.demo.local | Used for virtual host addressing for a bucket. |
In CloudPools 2.0 and ECS, a base URL must be created on ECS. For details about creating a Base URL on ECS, see the section Appendix A Base URL in the white paper Dell PowerScale: CloudPools and ECS.
When creating a new Base URL, keep the default setting (No) for Use with Namespace. Make sure that the Base URL is the FQDN alias of the load balancer virtual IP.
If SyncIQ is configured for CloudPools, run the following commands on the source and target PowerScale clusters to check and record the CloudPools configurations, including CloudPools storage accounts, cloud pools, file pool policies, and SyncIQ policies.
# isi cloud accounts list -v
# isi cloud pools list -v
# isi filepool policies list -v
# isi sync policies list -v
For CloudPools and ECS, make sure that the URI is the FQDN alias of the load balancer virtual IP.
Important: It is strongly recommended that no job (such as for CloudPools/SmartPools, SyncIQ, and NDMP) be running before upgrading.
In a SyncIQ environment, upgrade the SyncIQ target cluster before upgrading the source cluster. OneFS allows SyncIQ to send CP1.0 formatted SmartLink files to the target, where they will be converted into CP2.0 formatted SmartLink files. (If the source cluster is upgraded first, Sync operations will fail until both are upgraded; the only known resolution is to reconfigure the Sync policy to "Deep Copy".)
The customer may also have active (read and write) CloudPools accounts on both the source and target PowerScale clusters, replicating the SmartLink files of active CloudPools accounts bidirectionally. That means that the source is also a target. In this case, you need to reconfigure the Sync policy to "Deep Copy" on one of the PowerScale clusters. After that, the target with the replicated SmartLink files should be upgraded first.
This blog covered what you need to check, on cloud providers and PowerScale clusters, before upgrading OneFS from 8.x to 8.2.x or later (that is, from CloudPools 1.0 to CloudPools 2.0). My hope is that it can help you avoid potential CloudPools issues when upgrading a CloudPools environment.
Author: Jason He, Principal Engineering Technologist
Tue, 21 Jun 2022 19:55:15 -0000
|Read Time: 0 minutes
Dell PowerScale OneFS 9.3.0.0 first introduced support for Secure Boot on the Dell Isilon A2000 platform. Now, OneFS 9.4.0.0 expands that support across the PowerScale A300, A3000, B100, F200, F600, F900, H700, H7000, and P100 platforms.
Secure Boot was introduced by the Unified Extensible Firmware Interface (UEFI) Forum as part of the UEFI 2.3.1 specification. The goal of Secure Boot is to ensure device security in the preboot environment by allowing only authorized EFI binaries to be loaded during the boot process.
The operating system boot loaders are signed with a digital signature. PowerScale Secure Boot takes the UEFI framework further by including the OneFS kernel and modules. The UEFI infrastructure is responsible for the EFI signature validation and binary loading within UEFI Secure Boot. Also, the FreeBSD veriexec function can perform signature validation for the boot loader and kernel. The PowerScale Secure Boot feature runs during the nodes’ bootup process only, using public-key cryptography to verify the signed code and ensure that only trusted code is loaded on the node.
PowerScale Secure Boot is available on the following platforms:
Platform | NFP version | OneFS release |
Isilon A2000 | 11.4 or later | 9.3.0.0 or later |
PowerScale A300, A3000, B100, F200, F600, F900, H700, H7000, P100 | 11.4 or later | 9.4.0.0 or later |
Before configuring the PowerScale Secure Boot feature, consider the following:
For more information about configuring the PowerScale Secure Boot feature, see the document Dell PowerScale OneFS Secure Boot.
Author: Aqib Kazi
Tue, 21 Jun 2022 19:44:06 -0000
|Read Time: 0 minutes
There have been a couple of recent inquiries from the field about the SnapRevert job.
For context, SnapRevert is one of three main methods for restoring data from a OneFS snapshot. The options are shown here:
Method | Description |
Copy | Copying specific files and directories directly from the snapshot |
Clone | Cloning a file from the snapshot |
Revert | Reverting the entire snapshot using the SnapRevert job |
However, the most efficient of these approaches is the SnapRevert job, which automates the restoration of an entire snapshot to its top-level directory. This allows for quickly reverting to a previous, known-good recovery point (for example, if there is a virus outbreak). The SnapRevert job can be run from the Job Engine WebUI, and requires adding the desired snapshot ID.
There are two main components to SnapRevert: the SnapRevert domain that is associated with a directory, and the SnapRevert job itself.
So, what exactly is a SnapRevert domain? At a high level, a domain defines a set of behaviors for a collection of files under a specified directory tree. The SnapRevert domain is described as a restricted writer domain, in OneFS parlance. Essentially, this is a piece of extra filesystem metadata and associated locking that prevents a domain’s files from being written to while restoring a last known good snapshot.
Because the SnapRevert domain is essentially just a metadata attribute placed onto a file/directory, a best practice is to create the domain before there is data. This avoids having to wait for DomainMark (the aptly named job that marks a domain’s files) to walk the entire tree, setting that attribute on every file and directory within it.
The SnapRevert job itself actually uses a local SyncIQ policy to copy data out of the snapshot, discarding any changes to the original directory. When the SnapRevert job completes, the original data is left in the directory tree. In other words, after the job completes, the file system (HEAD) is exactly as it was at the point in time that the snapshot was taken. The LINs for the files or directories do not change because what is there is not a copy.
To manually run SnapRevert, go to the OneFS WebUI > Cluster Management > Job Operations > Job Types > SnapRevert, and click the Start Job button.
Also, you can adjust the job’s impact policy and relative priority, if desired.
Before a snapshot is reverted, SnapshotIQ creates a point-in-time copy of the data that is being replaced. This enables the snapshot revert to be undone later, if necessary.
Individual files, rather than entire snapshots, can also be restored in place using the isi_file_revert command-line utility.
# isi_file_revert
usage:
    isi_file_revert -l lin -s snapid
    isi_file_revert -p path -s snapid
    -d (debug output)
    -f (force, no confirmation)
This can help drastically simplify virtual machine management and recovery, for example.
Before creating snapshots, it is worth considering that reverting a snapshot requires that a SnapRevert domain exist for the directory that is being restored. As such, we recommend that you create SnapRevert domains for those directories while the directories are empty. Creating a domain for an empty (or sparsely populated) directory takes considerably less time.
Files may belong to multiple domains. Each file stores, in its inode's extended attribute table, a set of domain IDs indicating which domains it belongs to. Files inherit this set of domain IDs from their parent directories when they are created or moved. The domain IDs refer to the domain settings themselves, which are stored in a separate system B-tree. These B-tree entries describe the type of the domain (flags) and various other attributes.
As mentioned, a Restricted-Write domain prevents writes to any files except by threads that are granted permission to do so. A SnapRevert domain that does not currently enforce Restricted-Write shows up as (Writable) in the CLI domain listing.
Occasionally, a domain will be marked as (Incomplete). This means that the domain will not enforce its specified behavior. Domains created by the job engine are incomplete if not all files that are part of the domain are marked as being members of that domain. Since each file contains a list of domains of which it is a member, that list must be kept up to date for each file. The domain is incomplete until each file’s domain list is correct.
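Conceptually, domain membership behaves like each file carrying a list of domain IDs that must be kept in sync, which is what DomainMark does. The following Python sketch models that idea and the "(Incomplete)" state; the structures are illustrative only and not the actual on-disk format.

# Illustration of domain membership and the "(Incomplete)" state: a domain
# only enforces its behavior once every file under its root carries the
# domain ID in its (modeled) extended-attribute list.

SNAPREVERT_DOMAIN_ID = 42

files = {
    "/ifs/data/marketing/a.txt": set(),
    "/ifs/data/marketing/b.txt": set(),
}

def domain_is_complete(files, domain_id):
    return all(domain_id in domains for domains in files.values())

def domainmark(files, domain_id):
    # What the DomainMark job does: walk the tree and tag every file.
    for domains in files.values():
        domains.add(domain_id)

print(domain_is_complete(files, SNAPREVERT_DOMAIN_ID))   # False -> (Incomplete)
domainmark(files, SNAPREVERT_DOMAIN_ID)
print(domain_is_complete(files, SNAPREVERT_DOMAIN_ID))   # True  -> enforceable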
Besides SnapRevert, OneFS also uses domains for SyncIQ replication and SnapLock immutable archiving.
A SnapRevert domain must be created on a directory before it can be reverted to a particular point in time snapshot. As mentioned before, we recommend creating SnapRevert domains for a directory while the directory is empty.
The root path of the SnapRevert domain must be the same root path of the snapshot. For instance, a domain with a root path of /ifs/data/marketing cannot be used to revert a snapshot with a root path of /ifs/data/marketing/archive.
For example, for the snapshot DailyBackup_04-27-2021_12:00, which is rooted at /ifs/data/marketing, you would perform the following:
1. Set the SnapRevert domain by running the DomainMark job (which marks all files).
# isi job jobs start domainmark --root /ifs/data/marketing --dm-type SnapRevert
2. Verify that the domain has been created.
# isi_classic domain list -l
To restore a directory back to the state it was in at the point in time when a snapshot was taken, do the following:
1. Identify the ID of the snapshot you want to revert by running the isi snapshot snapshots view command and picking your point in time (PIT).
For example:
# isi snapshot snapshots view DailyBackup_04-27-2021_12:00
          ID: 38
        Name: DailyBackup_04-27-2021_12:00
        Path: /ifs/data/marketing
   Has Locks: No
    Schedule: daily
       Alias: -
     Created: 2021-04-27T12:00:05
     Expires: 2021-08-26T12:00:00
        Size: 0b
Shadow Bytes: 0b
   % Reserve: 0.00%
% Filesystem: 0.00%
       State: active
2. Revert to a snapshot by running the isi job jobs start command. The following command reverts to snapshot ID 38 named DailyBackup_04-27-2021_12:00.
# isi job jobs start snaprevert --snapid 38
You can also perform this action from the WebUI. Go to Cluster Management > Job Operations > Job Types > SnapRevert, and click the Start Job button.
OneFS automatically creates a snapshot before the SnapRevert process reverts the specified directory tree. The naming convention for these snapshots is of the form: <snapshot_name>.pre_revert.*
# isi snap snap list | grep pre_revert
39 DailyBackup_04-27-2021_12:00.pre_revert.1655328160    /ifs/data/marketing
This allows for an easy rollback of a SnapRevert if the desired results are not achieved.
If a domain is currently preventing the modification or deletion of a file, a protection domain cannot be created on a directory that contains that file. For example, if files under /ifs/data/smartlock are set to a WORM state by a SmartLock domain, OneFS will not allow a SnapRevert domain to be created on /ifs/data/.
If desired or required, SnapRevert domains can also be deleted using the job engine CLI. For example, to delete the SnapRevert domain at /ifs/data/marketing:
# isi job jobs start domainmark --root /ifs/data/marketing --dm-type SnapRevert --delete
Author: Nick Trimbee
Thu, 16 Jun 2022 20:29:24 -0000
|Read Time: 0 minutes
Have you ever been confused about PowerScale OneFS file system multi-protocol data access? If so, this blog series will help you out. We’ll try to demystify OneFS multi-protocol data access. Different Network Attached Storage vendors have different designs for implementing multi-protocol data access. In OneFS multi-protocol data access, you can access the same set of data consistently with different operating systems and protocols.
To make it simple, the overall data access process in OneFS starts with authenticating the user through the configured authentication providers and then generating an access token for that user.
Finally, OneFS enforces the permissions on the target data for the user. This process evaluates the file permissions based on the user's access token and file share level permissions.
Does it sound simple, even if some details are still confusing? For example: what exactly are UIDs, GIDs, and SIDs? What's an access token? How does OneFS evaluate file permissions? Don't worry if you are not familiar with these concepts. Keep reading and we'll explain!
To make it easier, we will start with OneFS file permissions, and then introduce OneFS access tokens. Finally, we will see how data access depends on the protocol you use.
In this blog series, we'll cover the following topics: OneFS file permissions, OneFS access tokens, and how data access works with the different protocols.
Now let's have a look at OneFS file permissions. In a multi-protocol environment, the OneFS operating system is designed to support basic POSIX mode bits and Access Control Lists (ACLs). Therefore, two file permission states are designated: POSIX mode bits (with a synthetic ACL) and the OneFS real ACL. A file can be in only one of these states at a time.
POSIX mode bits only define three specific permissions: read(r), write(w), and execute(x). Meanwhile, there are three classes to which you can assign permissions: Owner, Group, and Others.
The ls -le command displays a file's permissions, and the ls -led command displays a directory's permissions. If a file has these permissions:
-rw-rw-r--
then:
the first triad (rw-) means that the owner has read and write permissions
the second triad (rw-) means that the group has read and write permissions
the last triad (r--) means that all others have only read permission
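If it helps, the following small Python sketch decodes a mode string into those three classes; it is just an illustration of the triads, not how OneFS stores or evaluates permissions internally.

def decode_mode(mode_string):
    # Split a POSIX mode string such as '-rw-rw-r--' into its three triads.
    perms = mode_string[1:]                   # drop the file-type character
    result = {}
    for i, who in enumerate(("owner", "group", "others")):
        triad = perms[i * 3:(i + 1) * 3]      # e.g. 'rw-'
        result[who] = {
            "read": triad[0] == "r",
            "write": triad[1] == "w",
            "execute": triad[2] == "x",
        }
    return result

print(decode_mode("-rw-rw-r--"))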
In the following example for the file posix-file.txt, the file owner Joe has read and write access permissions, the file group Market has read and write access permissions, and all others only have read access permissions.
Also displayed here is the synthetic ACL (shown beneath the SYNTHETIC ACL flag), which indicates that the file is in the POSIX mode bits file permission state. Three Access Control Entries (ACEs) are created for the synthetic ACL, which are simply another way of representing the file's POSIX mode bits permissions.
vonefs-aima-1# ls -le posix-file.txt
-rw-rw-r--  1 Joe  Market  65 May 28 02:08 posix-file.txt
 OWNER: user:Joe
 GROUP: group:Market
 SYNTHETIC ACL
 0: user:Joe allow file_gen_read,file_gen_write,std_write_dac
 1: group:Market allow file_gen_read,file_gen_write
 2: everyone allow file_gen_read
When OneFS receives a user access request, it generates an access token for the user and compares the token to the file permissions – in this case, the POSIX mode bits.
In contrast to POSIX mode bits, OneFS ACLs support more expressive permissions. (For all available permissions, which are listed in Table 1 through Table 3 of the documentation, see Access Control Lists on Dell EMC PowerScale OneFS.) A OneFS ACL consists of one or more Access Control Entries (ACEs). A OneFS ACE contains an index, an identity (a user or group), an ACE type (allow or deny), and the set of permissions being allowed or denied.
For example, the ACE "0: group:Engineer allow file_gen_read,file_gen_execute" indicates that its index is 0, and allows the group called Engineer to have file_gen_read and file_gen_execute access permissions.
The following example shows a full ACL for a file. Although there is no SYNTHETIC ACL flag, there is a "+" character following the POSIX mode bits that indicates that the file is in the OneFS real ACL state. The file’s OneFS ACL grants full permission to users Joe and Bob. It also grants file_gen_read and file_gen_execute permissions to the group Market and to everyone. In this case, the POSIX mode bits are for representation only: you cannot tell the accurate file permissions from the approximate POSIX mode bits. You should therefore always rely on the OneFS ACL to check file permissions.
vonefs-aima-1# ls -le acl-file.txt
-rwxrwxr-x +  1 Joe  Market  69 May 28 01:08 acl-file.txt
 OWNER: user:Joe
 GROUP: group:Market
 0: user:Joe allow file_gen_all
 1: group:Market allow file_gen_read,file_gen_execute
 2: user:Bob allow file_gen_all
 3: everyone allow file_gen_read,file_gen_execute
No matter the OneFS file permission state, the on-disk identity for a file is always a UID, a GID, or an SID. So, for the above two files, file permissions stored on disk are:
vonefs-aima-1# ls -len posix-file.txt
-rw-rw-r--  1 2001  2003  65 May 28 02:08 posix-file.txt
 OWNER: user:2001
 GROUP: group:2003
 SYNTHETIC ACL
 0: user:2001 allow file_gen_read,file_gen_write,std_write_dac
 1: group:2003 allow file_gen_read,file_gen_write
 2: SID:S-1-1-0 allow file_gen_read

vonefs-aima-1# ls -len acl-file.txt
-rwxrwxr-x +  1 2001  2003  69 May 28 01:08 acl-file.txt
 OWNER: user:2001
 GROUP: group:2003
 0: user:2001 allow file_gen_all
 1: group:2003 allow file_gen_read,file_gen_execute
 2: user:2002 allow file_gen_all
 3: SID:S-1-1-0 allow file_gen_read,file_gen_execute
When OneFS receives a user access request, it generates an access token for the user and compares the token to the file permissions. OneFS grants access when the file permissions include an ACE that allows the identity in the token to access the file, and does not include an ACE that denies the identity access.
When evaluating the file permissions against a user's access token, OneFS checks the ACEs one by one in index order, and stops checking as soon as an ACE explicitly allows, or explicitly denies, the requested permissions for an identity in the token.
Let's say we have a file named acl-file01.txt that has the file permissions shown below. When user Bob tries to read the data of the file, OneFS checks the ACEs from index 0 to index 3. When checking ACE index 1, it finds that Bob is explicitly denied read permission. ACL evaluation then stops, and read access is denied.
vonefs-aima-1# ls -le acl-file01.txt
-rwxrw-r-- +  1 Joe  Market  12 May 28 06:19 acl-file01.txt
 OWNER: user:Joe
 GROUP: group:Market
 0: user:Joe allow file_gen_all
 1: user:Bob deny file_gen_read
 2: user:Bob allow file_gen_read,file_gen_write
 3: everyone allow file_gen_read
Now let's say that we still have the file named acl-file01.txt, but the file permissions are now a little different, as shown below. When user Bob tries to read the data of the file, OneFS checks the ACEs from index 0 to index 3. When checking ACE index 1, it finds that Bob is explicitly allowed read permission. ACL evaluation therefore ends, and read access is granted. This is why it is recommended to put all "deny" ACEs before "allow" ACEs if you want to explicitly deny specific permissions to specific users or groups.
vonefs-aima-1# ls -le acl-file01.txt
-rwxrw-r-- +  1 Joe  Market  12 May 28 06:19 acl-file01.txt
 OWNER: user:Joe
 GROUP: group:Market
 0: user:Joe allow file_gen_all
 1: user:Bob allow file_gen_read,file_gen_write
 2: user:Bob deny file_gen_read
 3: everyone allow file_gen_read
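The first-match behavior described above can be modeled with a short Python sketch. It is a simplification of real ACL semantics (it checks one requested permission at a time), but it reproduces why the ordering of Bob's deny and allow ACEs changes the outcome.

# Simplified first-match ACE evaluation: walk ACEs in index order and stop
# at the first ACE that explicitly allows or denies the requested permission
# for an identity present in the user's access token.

def evaluate(acl, token_identities, requested):
    for ace in acl:                                   # index order matters
        if ace["who"] in token_identities and requested in ace["perms"]:
            return ace["type"] == "allow"
    return False                                      # no match -> implicit deny

bob = {"user:Bob", "everyone"}

deny_first = [
    {"who": "user:Joe", "type": "allow", "perms": {"read", "write"}},
    {"who": "user:Bob", "type": "deny",  "perms": {"read"}},
    {"who": "user:Bob", "type": "allow", "perms": {"read", "write"}},
    {"who": "everyone", "type": "allow", "perms": {"read"}},
]
allow_first = [deny_first[0], deny_first[2], deny_first[1], deny_first[3]]

print(evaluate(deny_first, bob, "read"))    # False: the deny ACE is hit first
print(evaluate(allow_first, bob, "read"))   # True: the allow ACE is hit first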
As mentioned before, a file can only be in one state at a time; however, the file permission state can be flipped. A file in the POSIX state can be flipped to the ACL state by modifying its permissions from SMB/NFSv4 clients or by using the chmod command in OneFS. A file in the ACL state can be flipped back to the POSIX state by using the OneFS CLI command chmod -b XXX <filename>, where XXX specifies the new POSIX permissions. For more examples, see File permission state changes.
Now, you should be able to check a file’s permission on OneFS with the command ls -len filename, and check a directory’s permissions on OneFS with the command ls -lend directory_name.
In my next blog, we will cover what an access token is and how to check a user’s access token!
Author: Lieven Lin
Thu, 12 May 2022 14:22:45 -0000
|Read Time: 0 minutes
Recently a customer contacted us because he thought there was an error in the output of the OneFS CLI command 'isi_cstats'. Starting with OneFS 9.3, the 'isi_cstats' command includes the accounted number of inlined files within /ifs. It also contains a statistic called "Total inlined data savings".
The customer expected the 'Total inlined data savings' number to be simply 'Total inlined files' multiplied by 8KB, and thought the reported value was wrong because it does not take the protection level into account.
In OneFS, at the 2d:1n protection level, each file smaller than 128KB is stored as 3X mirrors. Take a cluster reporting 379,948,336 inlined files as an example. If we do the calculation:
379,948,336 * 8KB = 3,039,586,688KiB = 2898.78GiB
we can see that the 2,899GiB from the command output is calculated as one block per inlined file. So, in our example, the customer would think that ‘Total inlined data savings’ should report 2898.78 GiB * 3, because of the 2d:1n protection level.
Well, this statistic is not the actual savings; it is really the logical on-disk cost for all inlined files. We can't accurately report the physical savings because the savings depend on what the non-inlined protection overhead would have been, and that overhead varies with the protection level (for example, 3X mirroring for small files at 2d:1n versus other mirroring or FEC levels).
One more thing to consider: if a file is smaller than 8KB after compression, it will be inlined into an inode as well. Therefore, this statistic doesn't represent logical savings either, because it doesn't take compression into account. To report the logical savings, the total logical size of all inlined files would have to be tracked.
To avoid any confusion, we plan to rename this statistic to “Total inline data” in the next version of OneFS. We also plan to show more useful information about total logical data of inlined files, in addition to “Total inline data”.
For more information about the reporting of data reduction features, see the white paper PowerScale OneFS: Data Reduction and Storage Efficiency on the Info Hub.
Author: Yunlong Zhang, Principal Engineering Technologist
Wed, 04 May 2022 14:36:26 -0000
|Read Time: 0 minutes
Among the objectives of OneFS reduction and efficiency reporting is to provide ‘industry standard’ statistics, allowing easier comprehension of cluster efficiency. It’s an ongoing process, and prior to OneFS 9.2 there was limited tracking of certain filesystem statistics – particularly application physical and filesystem logical – which meant that data reduction and storage efficiency ratios had to be estimated. This is no longer the case, and OneFS 9.2 and later provides accurate data reduction and efficiency metrics at a per-file, quota, and cluster-wide granularity.
The following table provides descriptions for the various OneFS reporting metrics, while also attempting to rationalize their naming conventions with other general industry terminology:
OneFS Metric | Also Known As | Description |
Protected logical | Application logical | Data size including sparse data, zero block eliminated data, and CloudPools data stubbed to a cloud tier. |
Logical data | Effective; Filesystem logical | Data size excluding protection overhead and sparse data, and including data efficiency savings (compression and deduplication). |
Zero-removal saved | | Capacity savings from zero removal. |
Dedupe saved | | Capacity savings from deduplication. |
Compression saved | | Capacity savings from in-line compression. |
Preprotected physical | Usable; Application physical | Data size excluding protection overhead and including storage efficiency savings. |
Protection overhead | | Size of erasure coding used to protect data. |
Protected physical | Raw; Filesystem physical | Total footprint of data including protection overhead (FEC erasure coding) and excluding data efficiency savings (compression and deduplication). |
Dedupe ratio | | Deduplication ratio. Will be displayed as 1.0:1 if there are no deduplicated blocks on the cluster. |
Compression ratio | | Usable reduction ratio from compression, calculated by dividing 'logical data' by 'preprotected physical' and expressed as x:1. |
Inlined data ratio | | Efficiency ratio from storing small files' data within their inodes, thereby not requiring any data or protection blocks for their storage. |
Data reduction ratio | Effective to Usable | Usable efficiency ratio from compression and deduplication. Will display the same value as the compression ratio if there is no deduplication on the cluster. |
Efficiency ratio | Effective to Raw | Overall raw efficiency ratio expressed as x:1 |
So let’s take these metrics and look at what they represent and how they’re calculated.
(Note that filesystem logical was not accurately tracked in releases prior to OneFS 9.2, so metrics prior to this were somewhat estimated.)
With the enhanced data reduction reporting in OneFS 9.2 and later, the actual statistics themselves are largely the same, just calculated more accurately.
The storage efficiency data was available in releases prior to OneFS 9.2, albeit somewhat estimated, but the data reduction metrics were introduced with OneFS 9.2.
The following tools are available to query these reduction and efficiency metrics at file, quota, and cluster-wide granularity:
Realm | OneFS Command | OneFS Platform API |
File | isi get -D | |
Quota | isi quota list -v | 12/quota/quotas |
Cluster-wide | isi statistics data-reduction | 1/statistics/current?key=cluster.data.reduce.* |
Detailed Cluster-wide | isi_cstats | 1/statistics/current?key=cluster.cstats.* |
Note that the ‘isi_cstats’ CLI command provides some additional, behind-the-scenes details. The interface goes through platform API to fetch these stats.
The ‘isi statistics data-reduction’ CLI command is the most comprehensive of the data reduction reporting CLI utilities. For example:
# isi statistics data-reduction
                       Recent Writes   Cluster Data Reduction
                       (5 mins)
---------------------  --------------  ----------------------
Logical data           6.18M           6.02T
Zero-removal saved     0               -
Deduplication saved    56.00k          3.65T
Compression saved      4.16M           1.96G
Preprotected physical  1.96M           2.37T
Protection overhead    5.86M           910.76G
Protected physical     7.82M           3.40T
Zero removal ratio     1.00 : 1        -
Deduplication ratio    1.01 : 1        2.54 : 1
Compression ratio      3.12 : 1        1.02 : 1
Data reduction ratio   3.15 : 1        2.54 : 1
Inlined data ratio     1.04 : 1        1.00 : 1
Efficiency ratio       0.79 : 1        1.77 : 1
The ‘recent writes’ data in the first column provides precise statistics for the five-minute period prior to running the command. By contrast, the ‘cluster data reduction’ metrics in the second column are slightly less real-time but reflect the overall data and efficiencies across the cluster. Be aware that, in OneFS 9.1 and earlier, the right-hand column metrics are designated by the ‘Est’ prefix, denoting an estimated value. However, in OneFS 9.2 and later, the ‘logical data’ and ‘preprotected physical’ metrics are tracked and reported accurately, rather than estimated.
The ratio data in each column is calculated from the values above it. For instance, to calculate the data reduction ratio, the ‘logical data’ (effective) is divided by the ‘preprotected physical’ (usable) value. From the output above, this would be:
6.02 / 2.37 = 2.54, or a data reduction ratio of 2.54:1
Similarly, the ‘efficiency ratio’ is calculated by dividing the ‘logical data’ (effective) by the ‘protected physical’ (raw) value. From the output above, this yields:
6.02 / 3.40 = 1.77, or an efficiency ratio of 1.77:1
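A small Python helper makes those relationships explicit; the figures are simply the sample values from the output above.

def ratios(logical, preprotected_physical, protected_physical):
    # Derive the data reduction and efficiency ratios from the three sizes (TB).
    data_reduction = logical / preprotected_physical   # effective / usable
    efficiency = logical / protected_physical          # effective / raw
    return round(data_reduction, 2), round(efficiency, 2)

print(ratios(6.02, 2.37, 3.40))   # (2.54, 1.77) -> 2.54:1 and 1.77:1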
OneFS SmartQuotas reports the capacity saving from in-line data reduction as a storage efficiency ratio. SmartQuotas reports efficiency as a ratio across the desired data set as specified in the quota path field. The efficiency ratio is for the full quota directory and its contents, including any overhead, and reflects the net efficiency of compression and deduplication. On a cluster with licensed and configured SmartQuotas, this efficiency ratio can be easily viewed from the WebUI by navigating to File System > SmartQuotas > Quotas and Usage. In OneFS 9.2 and later, in addition to the storage efficiency ratio, the data reduction ratio is also displayed.
Similarly, the same data can be accessed from the OneFS command line by using the ‘isi quota quotas list’ CLI command. For example:
# isi quota quotas list
Type       AppliesTo  Path  Snap  Hard  Soft  Adv  Used   Reduction  Efficiency
----------------------------------------------------------------------------
directory  DEFAULT    /ifs  No    -     -     -    6.02T  2.54 : 1   1.77 : 1
----------------------------------------------------------------------------
Total: 1
More detail, including both the physical (raw) and logical (effective) data capacities, is also available by using the ‘isi quota quotas view <path> <type>’ CLI command. For example:
# isi quota quotas view /ifs directory
                        Path: /ifs
                        Type: directory
                   Snapshots: No
                    Enforced: No
                   Container: No
                      Linked: No
                       Usage
                           Files: 5759676
         Physical(With Overhead): 6.93T
        FSPhysical(Deduplicated): 3.41T
         FSLogical(W/O Overhead): 6.02T
        AppLogical(ApparentSize): 6.01T
                   ShadowLogical: -
                    PhysicalData: 2.01T
                      Protection: 781.34G
        Reduction(Logical/Data): 2.54 : 1
    Efficiency(Logical/Physical): 1.77 : 1
To configure SmartQuotas for in-line data efficiency reporting, create a directory quota at the top-level file system directory of interest, for example /ifs. Creating and configuring a directory quota is a simple procedure and can be performed from the WebUI by navigating to File System > SmartQuotas > Quotas and Usage and selecting Create a Quota. In the Create a quota dialog, set the Quota type to ‘Directory quota’, add the preferred top-level path to report on, select ’Application logical size’ for Quota Accounting, and set the Quota Limits to ‘Track storage without specifying a storage limit’. Finally, click the ‘Create Quota’ button to confirm the configuration and activate the new directory quota.
The efficiency ratio is a single, point-in-time efficiency metric that is calculated per quota directory and includes the sum of in-line compression, zero block removal, in-line dedupe, and SmartDedupe. This is in contrast to a history of stats over time, as reported in the 'isi statistics data-reduction' CLI command output described above. As such, the efficiency ratio for the entire quota directory reflects what is actually there.
Author: Nick Trimbee
Thu, 12 May 2022 14:48:01 -0000
|Read Time: 0 minutes
Among the features and functionality delivered in the new OneFS 9.4 release is the promotion of in-line dedupe to enabled by default, further enhancing PowerScale’s dollar-per-TB economics, rack density and value.
Part of the OneFS data reduction suite, in-line dedupe initially debuted in OneFS 8.2.1. However, it had to be enabled manually, so many customers simply didn't use it. With this enhancement, new clusters running OneFS 9.4 now have in-line dedupe enabled by default.
Cluster configuration | In-line dedupe | In-line compression |
New cluster running OneFS 9.4 | Enabled | Enabled |
New cluster running OneFS 9.3 or earlier | Disabled | Enabled |
Cluster with in-line dedupe enabled that is upgraded to OneFS 9.4 | Enabled | Enabled |
Cluster with in-line dedupe disabled that is upgraded to OneFS 9.4 | Disabled | Enabled |
That said, any clusters that upgrade to 9.4 will not see any change to their current in-line dedupe config during the upgrade. There is also no change to the behavior of in-line compression, which remains enabled by default in all OneFS versions from 8.1.3 onward.
But before we examine the-under-the-hood changes in OneFS 9.4, let’s have a quick dedupe refresher.
Currently, OneFS in-line data reduction, which encompasses compression, dedupe, and zero block removal, is supported on the F900, F600, and F200 all-flash nodes, plus the F810, H5600, H700/7000, and A300/3000 Gen6.x chassis.
Within the OneFS data reduction pipeline, zero block removal is performed first, followed by dedupe, and then compression. This order allows each phase to reduce the scope of work for each subsequent phase.
Unlike SmartDedupe, which performs deduplication once data has been written to disk (post-process), in-line dedupe acts in real time, deduplicating data as it is ingested into the cluster. Storage efficiency is achieved by scanning the data for identical blocks as it is received and then eliminating the duplicates.
When in-line dedupe discovers a duplicate block, it moves a single copy of the block to a special set of files known as shadow stores. These are file-system containers that allow data to be stored in a sharable manner. As such, files stored under OneFS can contain both physical data and pointers, or references, to shared blocks in shadow stores.
Shadow stores are similar to regular files but are hidden from the file system namespace, so they cannot be accessed through a pathname. A shadow store typically grows to a maximum size of 2 GB, which is around 256 K blocks, and each block can be referenced by 32,000 files. If the reference count limit is reached, a new block is allocated, which may or may not be in the same shadow store. Also, shadow stores do not reference other shadow stores. And snapshots of shadow stores are not permitted because the data contained in shadow stores cannot be overwritten.
When a client writes a file to a node pool configured for in-line dedupe on a cluster, the write operation is divided up into whole 8 KB blocks. Each block is hashed, and its fingerprint is compared against an in-memory index for a match. If the fingerprint matches an existing index entry, the blocks are verified and shared through a shadow store; if there is no match, the block is written normally and its fingerprint is recorded in the index.
For in-line dedupe to perform on a write operation, the following conditions need to be true:
OneFS in-line dedupe uses the 128-bit CityHash algorithm, which is extremely fast, although (unlike the SHA-1 hashing used by the post-process SmartDedupe) it is not a cryptographic hash; this is one reason matching blocks are verified byte-by-byte before they are shared.
Each node in a cluster with in-line dedupe enabled has its own in-memory hash index that it compares block fingerprints against. The index lives in system RAM and is allocated using physically contiguous pages and is accessed directly with physical addresses. This avoids the need to traverse virtual memory mappings and does not incur the cost of translation lookaside buffer (TLB) misses, minimizing dedupe performance impact.
The maximum size of the hash index is governed by a pair of sysctl settings, one of which caps the size at 16 GB, and the other which limits the maximum size to 10% of total RAM. The strictest of these two constraints applies. While these settings are configurable, the recommended best practice is to use the default configuration. Any changes to these settings should only be performed under the supervision of Dell support.
Since in-line dedupe and SmartDedupe use different hashing algorithms, the indexes for each are not shared directly. However, the work performed by each dedupe solution can be leveraged by the other. For instance, if SmartDedupe writes data to a shadow store, when those blocks are read, the read-hashing component of in-line dedupe sees those blocks and indexes them.
When a match is found, in-line dedupe performs a byte-by-byte comparison of each block to be shared to avoid the potential for a hash collision. Data is prefetched before the byte-by-byte check and is compared against the L1 cache buffer directly, avoiding unnecessary data copies and adding minimal overhead. Once the matching blocks are compared and verified as identical, they are shared by writing the matching data to a common shadow store and creating references from the original files to this shadow store.
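Putting the pieces together, here is a heavily simplified Python sketch of the write-path flow just described (hash, index lookup, byte-by-byte verification, then sharing). It is a conceptual model rather than OneFS code, and it substitutes SHA-256 for CityHash purely for convenience.

import hashlib

BLOCK = 8192                      # OneFS works on whole 8 KB blocks
index = {}                        # fingerprint -> previously seen block data
shadow_store = {}                 # fingerprint -> single shared copy

def ingest_block(data: bytes):
    fp = hashlib.sha256(data).digest()        # stand-in for 128-bit CityHash
    seen = index.get(fp)
    if seen is not None and seen == data:     # byte-by-byte check: no collisions
        shadow_store.setdefault(fp, data)     # share one copy in a shadow store
        return ("shared", fp)
    index[fp] = data                          # new fingerprint: remember it
    return ("written", fp)

blk = b"A" * BLOCK
print(ingest_block(blk)[0])   # 'written' -> first copy stored normally
print(ingest_block(blk)[0])   # 'shared'  -> duplicate deduplicated in-line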
In-line dedupe samples every whole block that is written and handles each block independently, so it can aggressively locate block duplicity. If a contiguous run of matching blocks is detected, in-line dedupe merges the results into regions and processes them efficiently.
In-line dedupe also detects dedupe opportunities from the read path, and blocks are hashed as they are read into L1 cache and inserted into the index. If an existing entry exists for that hash, in-line dedupe knows there is a block-sharing opportunity between the block it just read and the one previously indexed. It combines that information and queues a request to an asynchronous dedupe worker thread. As such, it is possible to deduplicate a data set purely by reading it all. To help mitigate the performance impact, the hashing is performed out-of-band in the prefetch path, rather than in the latency-sensitive read path.
The original in-line dedupe control path had its limitations, since it provided no gconfig control settings for the default-disabled in-line dedupe. In OneFS 9.4, there are now two separate features that interact to distinguish between a new cluster and an upgrade of an existing cluster configuration:
For the first feature, upon upgrade to 9.4 on an existing cluster, if there is no in-line dedupe config present, the upgrade explicitly sets it to disabled in gconfig. This has no effect on an existing cluster since it’s already disabled. Similarly, if the upgrading cluster already has an existing in-line dedupe setting in gconfig, OneFS takes no action.
For the other half of the functionality, when booting OneFS 9.4, a node looks in gconfig to see if there’s an in-line dedupe setting. If no config is present, OneFS enables it by default. Therefore, new OneFS 9.4 clusters automatically enable dedupe, and existing clusters retain their legacy setting upon upgrade.
Since the in-line dedupe configuration is binary (either on or off across a cluster), you can easily control it manually through the OneFS command line interface (CLI). As such, the isi dedupe inline settings modify CLI command can either enable or disable dedupe at will—before, during, or after the upgrade. It doesn’t matter.
For example, you can globally disable in-line dedupe and verify it using the following CLI command:
# isi dedupe inline settings view
Mode: enabled
# isi dedupe inline settings modify --mode disabled
# isi dedupe inline settings view
Mode: disabled
Similarly, the following syntax enables in-line dedupe:
# isi dedupe inline settings view
Mode: disabled
# isi dedupe inline settings modify --mode enabled
# isi dedupe inline settings view