
OneFS Snapshot Security

Fri, 21 Apr 2023 17:11:00 -0000

In this era of elevated cyber-crime and data security threats, there is increasing demand for immutable, tamper-proof snapshots. Often this need arises as part of a broader security mandate, ideally proactively, but oftentimes as a response to a security incident. OneFS addresses this requirement in the following ways:

On-cluster	Off-cluster
Read-only snapshots	SyncIQ snapshot replication
Snapshot locks	Cyber-vaulting
Role-based administration

Read-only snapshots

At its core, OneFS SnapshotIQ generates read-only, point-in-time, space-efficient copies of a defined subset of a cluster’s data.

Only the changed blocks of a file are stored when updating OneFS snapshots, ensuring efficient storage utilization. They are also highly scalable and typically take less than a second to create, while generating little performance overhead. As such, the RPO (recovery point objective) and RTO (recovery time objective) of a OneFS snapshot can be very small and highly flexible, with the use of rich policies and schedules.

OneFS Snapshots are created manually, on a schedule, or automatically generated by OneFS to facilitate system operations. But whatever the generation method, when a snapshot has been taken, its contents cannot be manually altered.

Snapshot Locks

In addition to snapshot contents immutability, for an enhanced level of tamper-proofing, SnapshotIQ also provides the ability to lock snapshots with the ‘isi snapshot locks’ CLI syntax. This prevents snapshots from being accidentally or unintentionally deleted.

For example, a manual snapshot, ‘snaploc1’ is taken of /ifs/test:

# isi snapshot snapshots create /ifs/test --name snaploc1
# isi snapshot snapshots list | grep snaploc1
79188 snaploc1                                     /ifs/test

A lock is then placed on it (in this case lock ID=1):

# isi snapshot locks create snaploc1
# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1

Attempts to delete the snapshot fail because the lock prevents its removal:

# isi snapshot snapshots delete snaploc1
Are you sure? (yes/[no]): yes
Snapshot "snaploc1" can't be deleted because it is locked

The CLI command ‘isi snapshot locks delete <lock_ID>’ can be used to clear existing snapshot locks, if desired. For example, to remove the only lock (ID=1) from snapshot ‘snaploc1’:

# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1
# isi snapshot locks delete snaploc1 1
Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes
# isi snap locks view snaploc1 1
No such lock

When the lock is removed, the snapshot can then be deleted:

# isi snapshot snapshots delete snaploc1
Are you sure? (yes/[no]): yes
# isi snapshot snapshots list| grep -i snaploc1 | wc -l
       0

Note that a snapshot can have up to a maximum of sixteen locks on it at any time. Also, lock numbers are continually incremented and not recycled upon deletion.

Like snapshot expiration, snapshot locks can also have an expiration time configured. For example, to set a lock on snapshot ‘snaploc1’ that expires at 1am on April 1st, 2024:

# isi snap lock create snaploc1 --expires '2024-04-01T01:00:00'
# isi snap lock list snaploc1
ID
----
36
----
Total: 1
# isi snap lock view snaploc1 36
     ID: 36
Comment:
Expires: 2024-04-01T01:00:00
  Count: 1

Note that if the duration period of a particular snapshot lock expires but others remain, OneFS will not delete that snapshot until all the locks on it have been deleted or expired.
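The lock rules described above (at most sixteen locks per snapshot, lock IDs that only ever increase, and deletion blocked until every lock has been removed or has expired) can be illustrated with a small in-memory model. This is a minimal sketch, not OneFS code: the `SnapshotModel` class and its method names are hypothetical, and the model makes the simplifying assumption that expired locks still count toward the sixteen-lock limit until they are removed.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional

MAX_LOCKS_PER_SNAPSHOT = 16  # limit stated in the article


@dataclass
class SnapshotModel:
    """Toy model of one snapshot and its locks (hypothetical, not a OneFS API)."""
    name: str
    next_lock_id: int = 1  # IDs are incremented and never recycled
    locks: Dict[int, Optional[datetime]] = field(default_factory=dict)

    def add_lock(self, expires: Optional[datetime] = None) -> int:
        # Assumption: expired locks still occupy a slot until removed.
        if len(self.locks) >= MAX_LOCKS_PER_SNAPSHOT:
            raise RuntimeError("snapshot already holds the maximum number of locks")
        lock_id = self.next_lock_id
        self.next_lock_id += 1
        self.locks[lock_id] = expires
        return lock_id

    def remove_lock(self, lock_id: int) -> None:
        del self.locks[lock_id]

    def active_locks(self, now: datetime) -> List[int]:
        # A lock without an expiry stays active until it is removed.
        return [i for i, exp in self.locks.items() if exp is None or exp > now]

    def can_delete(self, now: datetime) -> bool:
        # Deletion is only possible once no active lock remains.
        return not self.active_locks(now)


snap = SnapshotModel("snaploc1")
permanent = snap.add_lock()
timed = snap.add_lock(expires=datetime(2024, 4, 1, 1, 0, 0))
after_expiry = datetime(2024, 4, 2)
print(snap.can_delete(after_expiry))  # False: the non-expiring lock remains
snap.remove_lock(permanent)
print(snap.can_delete(after_expiry))  # True: the only other lock has expired
```

Because `next_lock_id` is never decremented, a lock created after the removal above receives ID 3 rather than reusing ID 1, which mirrors the non-recycling behavior noted earlier.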

The following table provides an example snapshot expiration schedule, with monthly locked snapshots to prevent deletion:

Snapshot Frequency	Snapshot Time	Snapshot Expiration	Max Retained Snapshots
Every other hour	Start at 12:00AM End at 11:59AM	1 day	27
Every day	At 12:00AM	1 week
Every week	Saturday at 12:00AM	1 month
Every month	First Saturday of month at 12:00AM	Locked

Role-based Access Control

Read-only snapshots plus locks present physically secure snapshots on a cluster. However, if you can log in to the cluster and have the required elevated administrator privileges, you can still remove locks and/or delete snapshots.

Because data security threats come from inside an environment as well as out, such as from a disgruntled IT employee or other internal bad actor, another key to a robust security profile is to constrain the use of all-powerful ‘root’, ‘administrator’, and ‘sudo’ accounts as much as possible. Instead of granting cluster admins full rights, a preferred security best practice is to leverage the comprehensive authentication, authorization, and accounting framework that OneFS natively provides.

OneFS role-based access control (RBAC) can be used to explicitly limit who has access to manage and delete snapshots. This granular control allows you to craft administrative roles that can create and manage snapshot schedules, but prevent their unlocking and/or deletion. Similarly, lock removal and snapshot deletion can be isolated to a specific security role (or to root only).

A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members.

For example, from the WebUI under Access > Membership and roles > Roles:

When these members access the cluster through the WebUI, PlatformAPI, or CLI, they inherit their assigned privileges.

The specific privileges that can be used to segment OneFS snapshot management include:

Privilege	Description
ISI_PRIV_SNAPSHOT_ALIAS	Aliasing for snapshots
ISI_PRIV_SNAPSHOT_LOCKS	Locking of snapshots from deletion
ISI_PRIV_SNAPSHOT_PENDING	Upcoming snapshot based on schedules
ISI_PRIV_SNAPSHOT_RESTORE	Restoring directory to a particular snapshot
ISI_PRIV_SNAPSHOT_SCHEDULES	Scheduling for periodic snapshots
ISI_PRIV_SNAPSHOT_SETTING	Service and access settings
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT	Manual snapshots and locks
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY	Snapshot summary and usage details

Each privilege can be assigned one of four permission levels for a role, including:

Permission Indicator	Description
–	No permission
R	Read-only permission
X	Execute permission
W	Write permission

The ability for a user to delete a snapshot is governed by the ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ privilege. Similarly, the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege governs lock creation and removal.

In the following example, the ‘snap’ role has ‘read’ rights for the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege, allowing a user associated with this role to view snapshot locks:

# isi auth roles view snap | grep -i -A 1 locks
             ID: ISI_PRIV_SNAPSHOT_LOCKS
     Permission: r
--
# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1

However, attempts to remove the lock ‘ID 1’ from the ‘snaploc1’ snapshot fail without write privileges:

# isi snapshot locks delete snaploc1 1
Privilege check failed. The following write privilege is required: Snapshot locks (ISI_PRIV_SNAPSHOT_LOCKS)

Write privileges are added to ‘ISI_PRIV_SNAPSHOT_LOCKS’ in the ‘snap’ role:

# isi auth roles modify snap --add-priv-write ISI_PRIV_SNAPSHOT_LOCKS
# isi auth roles view snap | grep -i -A 1 locks
             ID: ISI_PRIV_SNAPSHOT_LOCKS
     Permission: w
--

This allows the lock ‘ID 1’ to be successfully deleted from the ‘snaploc1’ snapshot:

# isi snapshot locks delete snaploc1 1
Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes
# isi snap locks view snaploc1 1
No such lock

Using OneFS RBAC, an enhanced security approach for a site could be to create three OneFS roles on a cluster, each with an increasing realm of trust:

1. First, an IT ops/helpdesk role with ‘read’ access to the snapshot attributes would permit monitoring and troubleshooting, but no changes:

Snapshot Privilege	Permission
ISI_PRIV_SNAPSHOT_ALIAS	Read
ISI_PRIV_SNAPSHOT_LOCKS	Read
ISI_PRIV_SNAPSHOT_PENDING	Read
ISI_PRIV_SNAPSHOT_RESTORE	Read
ISI_PRIV_SNAPSHOT_SCHEDULES	Read
ISI_PRIV_SNAPSHOT_SETTING	Read
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT	Read
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY	Read

2. Next, a cluster admin role, with ‘read’ privileges for ‘ISI_PRIV_SNAPSHOT_LOCKS’ and ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ would prevent snapshot and lock deletion, but provide ‘write’ access for schedule configuration, restores, and so on.

Snapshot Privilege	Permission
ISI_PRIV_SNAPSHOT_ALIAS	Write
ISI_PRIV_SNAPSHOT_LOCKS	Read
ISI_PRIV_SNAPSHOT_PENDING	Write
ISI_PRIV_SNAPSHOT_RESTORE	Write
ISI_PRIV_SNAPSHOT_SCHEDULES	Write
ISI_PRIV_SNAPSHOT_SETTING	Write
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT	Read
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY	Write

3. Finally, a cluster security admin role (root equivalence) would provide full snapshot configuration and management, lock control, and deletion rights:

Snapshot Privilege	Permission
ISI_PRIV_SNAPSHOT_ALIAS	Write
ISI_PRIV_SNAPSHOT_LOCKS	Write
ISI_PRIV_SNAPSHOT_PENDING	Write
ISI_PRIV_SNAPSHOT_RESTORE	Write
ISI_PRIV_SNAPSHOT_SCHEDULES	Write
ISI_PRIV_SNAPSHOT_SETTING	Write
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT	Write
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY	Write
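The three-tier layout above can be expressed as data and checked mechanically. The following is a minimal sketch for reasoning about the design, not an interface to OneFS: the role variables (`helpdesk`, `cluster_admin`, `security_admin`) and helper functions are hypothetical names, and the sketch assumes that only the ‘w’ level satisfies a write requirement, as the privilege check message shown earlier suggests.

```python
from typing import Dict

# Snapshot privileges listed in the article.
SNAPSHOT_PRIVS = [
    "ISI_PRIV_SNAPSHOT_ALIAS",
    "ISI_PRIV_SNAPSHOT_LOCKS",
    "ISI_PRIV_SNAPSHOT_PENDING",
    "ISI_PRIV_SNAPSHOT_RESTORE",
    "ISI_PRIV_SNAPSHOT_SCHEDULES",
    "ISI_PRIV_SNAPSHOT_SETTING",
    "ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT",
    "ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY",
]

WRITE_OK = {"w"}  # assumption: only 'w' satisfies a write requirement


def make_role(default: str, **overrides: str) -> Dict[str, str]:
    """Build a privilege map; overrides use the suffix after ISI_PRIV_SNAPSHOT_."""
    role = {priv: default for priv in SNAPSHOT_PRIVS}
    for suffix, level in overrides.items():
        role["ISI_PRIV_SNAPSHOT_" + suffix] = level
    return role


# Hypothetical role definitions mirroring the three tables above.
helpdesk = make_role("r")
cluster_admin = make_role("w", LOCKS="r", SNAPSHOTMANAGEMENT="r")
security_admin = make_role("w")


def can_remove_lock(role: Dict[str, str]) -> bool:
    return role.get("ISI_PRIV_SNAPSHOT_LOCKS", "-") in WRITE_OK


def can_delete_snapshot(role: Dict[str, str]) -> bool:
    return role.get("ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT", "-") in WRITE_OK


def can_edit_schedules(role: Dict[str, str]) -> bool:
    return role.get("ISI_PRIV_SNAPSHOT_SCHEDULES", "-") in WRITE_OK


for name, role in [("helpdesk", helpdesk),
                   ("cluster_admin", cluster_admin),
                   ("security_admin", security_admin)]:
    print(name, can_edit_schedules(role), can_remove_lock(role), can_delete_snapshot(role))
```

Under these assumptions, only the security admin role can remove locks or delete snapshots, while the cluster admin role retains schedule management.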

Note that when configuring OneFS RBAC, remember to remove the ‘ISI_PRIV_AUTH’ and ‘ISI_PRIV_ROLE’ privileges from all but the most trusted administrators.

Additionally, enterprise security management tools such as CyberArk can be incorporated to manage authentication and access control holistically across an environment. These can be configured to change passwords on trusted accounts frequently (for example, every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.

While this article focuses exclusively on OneFS snapshots, the expanded use of granular RBAC privileges for enhanced security is germane to most key areas of cluster management and data protection, such as SyncIQ replication.

Snapshot replication

In addition to using snapshots for its own checkpointing system, SyncIQ, the OneFS data replication engine, supports snapshot replication to a target cluster.

OneFS SyncIQ replication policies contain an option for triggering a replication policy when a snapshot of the source directory is completed. Additionally, at the onset of a new policy configuration, when the “Whenever a Snapshot of the Source Directory is Taken” option is selected, a checkbox appears to enable any existing snapshots in the source directory to be replicated. More information is available in this SyncIQ paper.

Cyber-vaulting

File data is arguably the most difficult to protect, because:

It is the only type of data where potentially all employees have a direct connection to the storage (with other types of storage, access is through an application).
File data is linked (or mounted) to the operating system of the client. This means that it’s sufficient to gain file access to the OS to get access to potentially critical data.
Users are the largest breach points.

The Cyber Security Framework (CSF) from the National Institute of Standards and Technology (NIST) categorizes the threat through recovery process:

Within the ‘Protect’ phase, there are two core aspects:

Applying all the core protection features available on the OneFS platform, namely:

Feature	Description
Access control	Where the core data protection functions are being executed. Assess who actually needs write access.
Immutability	Having immutable snapshots, replica versions, and so on. Augmenting backup strategy with an archiving strategy with SmartLock WORM.
Encryption	Encrypting both data in-flight and data at rest.
Anti-virus	Integrating with anti-virus/anti-malware protection that does content inspection.
Security advisories	Dell Security Advisories (DSA) inform customers about fixes to common vulnerabilities and exposures.

Data isolation provides a last resort copy of business critical data, and can be achieved by using an air gap to isolate the cyber vault copy of the data. The vault copy is logically separated from the production copy of the data. Data syncing happens only intermittently by closing the air gap after ensuring that there are no known issues.

The combination of OneFS snapshots and SyncIQ replication allows for granular data recovery. This means that only the affected files are recovered, while the most recent changes are preserved for the unaffected data. While an on-prem air-gapped cyber vault can still provide secure network isolation, in the event of an attack, the ability to failover to a fully operational ‘clean slate’ remote site provides additional security and peace of mind.

We’ll explore PowerScale cyber protection and recovery in more depth in a future article.

Author: Nick Trimbee

Related Blog Posts

security PowerScale OneFS HTTP

OneFS and HTTP Security

Mon, 22 Apr 2024 20:35:30 -0000

To enable granular HTTP security configuration, OneFS provides an option to disable nonessential HTTP components selectively. This can help reduce the overall attack surface of your infrastructure. Disabling a specific component’s service still allows other essential services on the cluster to continue to run unimpeded. In OneFS 9.4 and later, you can disable the following nonessential HTTP services:

Service	Description
PowerScaleUI	The OneFS WebUI configuration interface.
Platform-API-External	External access to the OneFS platform API endpoints.
Rest Access to Namespace (RAN)	REST-ful access by HTTP to a cluster’s /ifs namespace.
RemoteService	Remote Support and In-Product Activation.
SWIFT (deprecated)	Deprecated object access to the cluster using the SWIFT protocol. This has been replaced by the S3 protocol in OneFS.

You can enable or disable each of these services independently, using the CLI or platform API, if you have a user account with the ISI_PRIV_HTTP RBAC privilege.

You can use the isi http services CLI command set to view and modify the nonessential HTTP services:

# isi http services list
ID                     Enabled
------------------------------
Platform-API-External Yes
PowerScaleUI          Yes
RAN                   Yes
RemoteService         Yes
SWIFT                 No
------------------------------
Total: 5

For example, you can easily disable remote HTTP access to the OneFS /ifs namespace as follows:

# isi http services modify RAN --enabled=0

You are about to modify the service RAN. Are you sure? (yes/[no]): yes

Similarly, you can also use the WebUI to view and edit a subset of the HTTP configuration settings, by navigating to Protocols > HTTP settings:

That said, the implications and impact of disabling each of the services are as follows:

Service	Disabling impacts
WebUI	The WebUI is completely disabled, and access attempts (default TCP port 8080) are denied with the warning Service Unavailable. Please contact Administrator. If the WebUI is re-enabled, the external platform API service (Platform-API-External) is also started if it is not running. Note that disabling the WebUI does not affect the PlatformAPI service.
Platform API	External API requests to the cluster are denied, and the WebUI is disabled, because it uses the Platform-API-External service. Note that the Platform-API-Internal service is not impacted if/when the Platform-API-External is disabled, and internal pAPI services continue to function as expected. If the Platform-API-External service is re-enabled, the WebUI will remain inactive until the PowerScaleUI service is also enabled.
RAN	If RAN is disabled, the WebUI components for File System Explorer and File Browser are also automatically disabled. From the WebUI, attempts to access the OneFS file system explorer (File System > File System Explorer) fail with the warning message Browse is disabled as RAN service is not running. Contact your administrator to enable the service. This same warning also appears when attempting to access any other WebUI components that require directory selection.
RemoteService	If RemoteService is disabled, the WebUI components for Remote Support and In-Product Activation are disabled. In the WebUI, going to Cluster Management > General Settings and selecting the Remote Support tab displays the message The service required for the feature is disabled. Contact your administrator to enable the service. In the WebUI, going to Cluster Management > Licensing and scrolling to the License Activation section displays the message The service required for the feature is disabled. Contact your administrator to enable the service.
SWIFT	Deprecated object protocol and disabled by default.
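The dependencies in the table above can be summarized as a small function that maps the enabled services to the WebUI features that remain usable. This is a sketch of the relationships as described in this article only; the function name `webui_features` and the feature keys are hypothetical and do not correspond to any OneFS API.

```python
from typing import Dict


def webui_features(enabled: Dict[str, bool]) -> Dict[str, bool]:
    """Derive feature availability from the nonessential HTTP service states."""
    ui = bool(enabled.get("PowerScaleUI", False))
    papi = bool(enabled.get("Platform-API-External", False))
    ran = bool(enabled.get("RAN", False))
    remote = bool(enabled.get("RemoteService", False))

    # The WebUI needs its own service and the external platform API service.
    webui = ui and papi
    return {
        "webui": webui,
        # Disabling the WebUI alone does not affect the platform API service.
        "external_platform_api": papi,
        # File System Explorer and directory selection depend on RAN.
        "file_system_explorer": webui and ran,
        # Remote Support and License Activation depend on RemoteService.
        "remote_support_and_license_activation": webui and remote,
    }


# Service states taken from the 'isi http services list' output shown earlier.
defaults = {
    "Platform-API-External": True,
    "PowerScaleUI": True,
    "RAN": True,
    "RemoteService": True,
    "SWIFT": False,
}
print(webui_features({**defaults, "RAN": False}))
```

With RAN disabled, the sketch reports the WebUI and external platform API as available but the file system explorer as unavailable, which matches the behavior described in the RAN row.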

You can use the CLI command isi http settings view to display the OneFS HTTP configuration:

# isi http settings view
            Access Control: No
      Basic Authentication: No
    WebHDFS Ran HTTPS Port: 8443
                        Dav: No
         Enable Access Log: Yes
                      HTTPS: No
 Integrated Authentication: No
               Server Root: /ifs
                    Service: disabled
           Service Timeout: 8m20s
          Inactive Timeout: 15m
           Session Max Age: 4H
Httpd Controlpath Redirect: No

Similarly, you can manage and change the HTTP configuration using the isi http settings modify CLI command.

For example, to reduce the maximum session age from four to two hours:

# isi http settings view | grep -i age
           Session Max Age: 4H
# isi http settings modify --session-max-age=2H
# isi http settings view | grep -i age
           Session Max Age: 2H

The full set of configuration options for isi http settings includes:

Option	Description
--access-control <boolean>	Enable Access Control Authentication for the HTTP service. Access Control Authentication requires at least one type of authentication to be enabled.
--basic-authentication <boolean>	Enable Basic Authentication for the HTTP service.
--webhdfs-ran-https-port <integer>	Configure Data Services Port for the HTTP service.
--revert-webhdfs-ran-https-port	Set value to system default for --webhdfs-ran-https-port.
--dav <boolean>	Comply with Class 1 and 2 of the DAV specification (RFC 2518) for the HTTP service. All DAV clients must go through a single node. DAV compliance is NOT met if you go through SmartConnect, or using 2 or more node IPs.
--enable-access-log <boolean>	Enable writing to a log when the HTTP server is accessed for the HTTP service.
--https <boolean>	Enable the HTTPS transport protocol for the HTTP service.
--integrated-authentication <boolean>	Enable Integrated Authentication for the HTTP service.
--server-root <path>	Document root directory for the HTTP service. Must be within /ifs.
--service (enabled | disabled | redirect | disabled_basicfile)	Enable/disable the HTTP Service or redirect to WebUI or disabled BasicFileAccess.
--service-timeout <duration>	The amount of time (in seconds) that the server will wait for certain events before failing a request. A value of 0 indicates that the service timeout value is the Apache default.
--revert-service-timeout	Set value to system default for --service-timeout.
--inactive-timeout <duration>	Get the HTTP RequestReadTimeout directive from both the WebUI and the HTTP service.
--revert-inactive-timeout	Set value to system default for --inactive-timeout.
--session-max-age <duration>	Get the HTTP SessionMaxAge directive from both WebUI and HTTP service.
--revert-session-max-age	Set value to system default for --session-max-age.
--httpd-controlpath-redirect <boolean>	Enable or disable WebUI redirection to the HTTP service.

Note that while the OneFS S3 service uses HTTP, it is considered a tier-1 protocol, and as such is managed using its own isi s3 CLI command set and corresponding WebUI area. For example, the following CLI command forces the cluster to only accept encrypted HTTPS/SSL traffic on TCP port 9999 (rather than the default TCP port 9021):

# isi s3 settings global modify --https-only 1 --https-port 9999
# isi s3 settings global view
         HTTP Port: 9020
        HTTPS Port: 9999
        HTTPS only: Yes
S3 Service Enabled: Yes

Additionally, you can entirely disable the S3 service with the following CLI command:

# isi services s3 disable
The service 's3' has been disabled.

Or from the WebUI, under Protocols > S3 > Global settings:

Author: Nick Trimbee

security PowerScale OneFS

OneFS Key Manager Rekey Support

Mon, 24 Jul 2023 19:16:34 -0000

The OneFS key manager is a backend service that orchestrates the storage of sensitive information for PowerScale clusters. To satisfy Dell’s Secure Infrastructure Ready requirements and other public and private sector security mandates, the manager provides the ability to replace, or rekey, cryptographic keys.

The quintessential consumer of OneFS key management is data-at-rest encryption (DARE). Protecting sensitive data stored on the cluster with cryptography ensures that it’s guarded against theft, in the event that drives or nodes are removed from a PowerScale cluster. DARE is a requirement for federal and industry regulations, ensuring data is encrypted when it is stored. OneFS has provided DARE solutions for many years through secure encrypted drives (SEDs) and the OneFS key management system.

A 256-bit master key (MK) encrypts the Key Manager Database (KMDB) for SED and cluster domains. In OneFS 9.2 and later, the MK for SEDs can either be stored off-cluster on a KMIP server or locally on a node (the legacy behavior).

However, there are a variety of other consumers of the OneFS key manager, in addition to DARE. These include services and protocols such as:

Service	Description
CELOG	Cluster event log
CloudPools	Cluster tier to cloud service
Email	Electronic mail
FTP	File transfer protocol
IPMI	Intelligent platform management interface for remote cluster console access
JWT	JSON web tokens
NDMP	Network data management protocol for cluster backups and DR
Pstore	Active directory and Kerberos password store
S3	S3 object protocol
SyncIQ	Cluster replication service
SmartSync	OneFS push and pull cluster and cloud replication service
SNMP	Simple network management protocol
SRS	Legacy remote cluster connectivity to Dell Support
SSO	Single sign-on
SupportAssist	Remote cluster connectivity to Dell Support

OneFS 9.5 introduces a number of enhancements to the venerable key manager, including:

The ability to rekey keystores. A rekey operation generates a new MK and re-encrypts all stored entries with the new key.
New CLI commands and WebUI options to perform a rekey operation or schedule key rotation on a time interval.
New commands to monitor the progress and status of a rekey operation.

As such, OneFS 9.5 now provides the ability to rekey the MK, irrespective of where it is stored.

Note that when you are upgrading from an earlier OneFS release, the new rekey functionality is only available once the OneFS 9.5 upgrade has been committed.

Under the hood, each provider store in the key manager consists of secure backend storage and an MK. Entries are kept in a SQLite database or key-value store. A provider datastore uses its MK to encrypt all its entries within the store.

During the rekey process, the old MK is only deleted after a successful re-encryption with the new MK. If for any reason the process fails, the old MK is available and remains as the current MK. The rekey daemon retries the rekey every 15 minutes if the process fails.

The OneFS rekey process is as follows:

A new MK is generated, and internal configuration is updated.
Any entries in the provider store are decrypted with the old MK and re-encrypted with the new MK.
If the prior steps are successful, the previous MK is deleted.

To support the rekey process, the MK in OneFS 9.5 now has an ID associated with it. All entries have a new field referencing the MK ID.

During the rekey operation, there are two MK values with different IDs, and each entry in the database records the ID of the key it is encrypted with.
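The rekey steps above can be sketched as a toy provider store in which every entry carries the ID of the MK that encrypted it. This is an illustration under stated assumptions, not the OneFS implementation: the `ToyProviderStore` class is hypothetical, the XOR keystream is a stand-in that is NOT real cryptography, and re-encrypted entries are staged and swapped in at the end, which is a simplification of whatever the key manager does internally.

```python
import hashlib
import secrets
from typing import Dict, Tuple


def _toy_xor(key: bytes, data: bytes) -> bytes:
    """Reversible toy transform (NOT secure); stands in for real encryption."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyProviderStore:
    """Hypothetical provider store: entries are tagged with the MK ID used."""

    def __init__(self) -> None:
        self.master_keys: Dict[int, bytes] = {1: secrets.token_bytes(32)}  # 256-bit MK
        self.current_mk_id = 1
        self.entries: Dict[str, Tuple[int, bytes]] = {}  # name -> (MK ID, blob)

    def put(self, name: str, secret: bytes) -> None:
        mk_id = self.current_mk_id
        self.entries[name] = (mk_id, _toy_xor(self.master_keys[mk_id], secret))

    def get(self, name: str) -> bytes:
        mk_id, blob = self.entries[name]
        return _toy_xor(self.master_keys[mk_id], blob)

    def rekey(self) -> None:
        old_id = self.current_mk_id
        new_id = old_id + 1
        # Step 1: generate a new MK; two MK IDs now exist side by side.
        self.master_keys[new_id] = secrets.token_bytes(32)
        try:
            # Step 2: decrypt each entry with its recorded MK, re-encrypt with the new MK.
            staged = {}
            for name, (mk_id, blob) in self.entries.items():
                plain = _toy_xor(self.master_keys[mk_id], blob)
                staged[name] = (new_id, _toy_xor(self.master_keys[new_id], plain))
        except Exception:
            # On failure the old MK stays current and the new one is discarded.
            del self.master_keys[new_id]
            raise
        self.entries = staged
        self.current_mk_id = new_id
        # Step 3: delete the previous MK only after re-encryption succeeded.
        del self.master_keys[old_id]


store = ToyProviderStore()
store.put("s3", b"example-secret")
store.rekey()
print(store.current_mk_id, store.get("s3"))
```

The key point the sketch preserves is ordering: the old MK is removed last, so a failure at any earlier step leaves every entry readable with the key that is still current.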

In OneFS 9.5, the rekey configuration and management is split between the cluster keys and the SED keys:

Rekey component	Detail
SED	SED provider keystore is stored locally on each node. SED provider domain already had existing CLI commands for handling KMIP settings in prior releases.
Cluster	Controls all cluster-wide keystore domains. Status shows information of all cluster provider domains.

SED keys rekey

The SED key manager rekey operation can be managed through a DARE cluster’s CLI or WebUI, and it can either be automatically scheduled or run manually on demand. The following CLI syntax can be used to manually initiate a rekey:

# isi keymanager sed rekey start

Alternatively, to schedule a rekey operation, for example, to schedule a key rotation every two months:

# isi keymanager sed rekey modify --key-rotation=2M

The key manager status for SEDs can be viewed as follows:

# isi keymanager sed status
 Node Status  Location   Remote Key ID  Key Creation Date   Error Info(if any)
-----------------------------------------------------------------------------
1   LOCAL   Local                    1970-01-01T00:00:00
-----------------------------------------------------------------------------
Total: 1

Alternatively, from the WebUI, go to Access > Key Management > SED/Cluster Rekey, select Automatic rekey for SED keys, and configure the rekey frequency:

Note that for SED rekey operations, if a migration from local cluster key management to a KMIP server is in progress, the rekey process will begin once the migration is complete.

Cluster keys rekey

As mentioned previously, OneFS 9.5 also supports the rekey of cluster keystore domains. This cluster rekey operation is available through the CLI and the WebUI and may either be scheduled or run on demand. The available cluster domains can be queried by running the following CLI syntax:

# isi keymanager cluster status
Domain     Status  Key Creation Date   Error Info(if any)
----------------------------------------------------------
CELOG      ACTIVE  2023-04-06T09:19:16
CERTSTORE  ACTIVE  2023-04-06T09:19:16
CLOUDPOOLS ACTIVE   2023-04-06T09:19:16
EMAIL      ACTIVE  2023-04-06T09:19:16
FTP        ACTIVE  2023-04-06T09:19:16
IPMI_MGMT  IN_PROGRESS  2023-04-06T09:19:16
JWT        ACTIVE  2023-04-06T09:19:16
LHOTSE     ACTIVE  2023-04-06T09:19:11
NDMP       ACTIVE  2023-04-06T09:19:16
NETWORK    ACTIVE  2023-04-06T09:19:16
PSTORE     ACTIVE  2023-04-06T09:19:16
RICE       ACTIVE  2023-04-06T09:19:16
S3         ACTIVE  2023-04-06T09:19:16
SIQ        ACTIVE  2023-04-06T09:19:16
SNMP       ACTIVE  2023-04-06T09:19:16
SRS        ACTIVE  2023-04-06T09:19:16
SSO        ACTIVE  2023-04-06T09:19:16
----------------------------------------------------------
Total: 17

The rekey process generates a new key and re-encrypts the entries for the domain. The old key is then deleted.

Performance-wise, the rekey process does consume cluster resources (CPU and disk) as a result of the re-encryption phase, which is fairly write-intensive. As such, a good practice is to perform rekey operations outside of core business hours or during scheduled cluster maintenance windows.

During the rekey process, the old MK is only deleted once a successful re-encryption with the new MK has been confirmed. In the event of a rekey process failure, the old MK is available and remains as the current MK.

A rekey may be requested immediately or may be scheduled with a cadence. The rekey operation is available through the CLI and the WebUI. In the WebUI, go to Access > Key Management > SED/Cluster Rekey.

To start a rekey of the cluster domains immediately, from the CLI run the following syntax:

# isi keymanager cluster rekey start 
Are you sure you want to rekey the master passphrase? (yes/[no]):yes

Alternatively, from the WebUI, go to Access > Key Management > SED/Cluster Rekey, and click Rekey Now next to Cluster keys:

A scheduled rekey of the cluster keys (excluding the SED keys) can be configured from the CLI with the following syntax:

# isi keymanager cluster rekey modify --key-rotation [YMWDhms]

Specify the frequency of the Key Rotation field as an integer, using Y for years, M for months, W for weeks, D for days, h for hours, m for minutes, and s for seconds. For example, the following command will schedule the cluster rekey operation to run every six weeks:

# isi keymanager cluster rekey view
  Rekey Time: 1970-01-01T00:00:00
Key Rotation: Never
# isi keymanager cluster rekey modify --key-rotation 6W
# isi keymanager cluster rekey view
  Rekey Time: 2023-04-28T18:38:45
Key Rotation: 6W
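Because the unit letters are case-sensitive (M is months while m is minutes), it can be useful to validate a rotation value before applying it. The following is a minimal sketch of such a check; the `parse_rotation` helper is hypothetical, it accepts only a single integer followed by one unit letter (or the value Never), and it approximates a month as 30 days and a year as 365 days purely for illustration. How OneFS itself interprets months and years is not stated in this article.

```python
import re
from datetime import timedelta
from typing import Optional

# Unit letters from the article; month and year lengths are approximations
# chosen for this sketch only.
_UNIT_TO_DELTA = {
    "Y": timedelta(days=365),
    "M": timedelta(days=30),
    "W": timedelta(weeks=1),
    "D": timedelta(days=1),
    "h": timedelta(hours=1),
    "m": timedelta(minutes=1),
    "s": timedelta(seconds=1),
}

_PATTERN = re.compile(r"(\d+)([YMWDhms])")


def parse_rotation(value: str) -> Optional[timedelta]:
    """Return the approximate interval, or None when rotation is set to Never."""
    if value == "Never":
        return None
    match = _PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unrecognized key rotation value: {value!r}")
    count, unit = match.groups()
    return int(count) * _UNIT_TO_DELTA[unit]


print(parse_rotation("6W"))  # six weeks
print(parse_rotation("2M"))  # roughly two months
print(parse_rotation("2m"))  # two minutes, easy to confuse with the line above
```

A check like this makes the difference between 2M and 2m visible before a schedule is committed.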

The rekey configuration can be easily reverted back to on demand from a schedule as follows:

# isi keymanager cluster rekey modify --key-rotation Never
# isi keymanager cluster rekey view
  Rekey Time: 2023-04-28T18:38:45
Key Rotation: Never

Alternatively, from the WebUI, under the SED/Cluster Rekey tab, select the Automatic rekey for Cluster keys checkbox and specify the rekey frequency. For example:

In the event of a rekey failure, a CELOG KeyManagerRekeyFailed or KeyManagerSedsRekeyFailed event is created. Because SED rekey is a node-local operation, the KeyManagerSedsRekeyFailed event also identifies the node that experienced the failure.

Additionally, current cluster rekey status can also be queried with the following CLI command:

# isi keymanager cluster status
Domain     Status  Key Creation Date   Error Info(if any)
----------------------------------------------------------
CELOG      ACTIVE  2023-04-06T09:19:16
CERTSTORE  ACTIVE  2023-04-06T09:19:16
CLOUDPOOLS ACTIVE   2023-04-06T09:19:16
EMAIL      ACTIVE  2023-04-06T09:19:16
FTP        ACTIVE  2023-04-06T09:19:16
IPMI_MGMT  ACTIVE  2023-04-06T09:19:16
JWT        ACTIVE  2023-04-06T09:19:16
LHOTSE     ACTIVE  2023-04-06T09:19:11
NDMP       ACTIVE  2023-04-06T09:19:16
NETWORK    ACTIVE  2023-04-06T09:19:16
PSTORE     ACTIVE  2023-04-06T09:19:16
RICE       ACTIVE  2023-04-06T09:19:16
S3         ACTIVE  2023-04-06T09:19:16
SIQ        ACTIVE  2023-04-06T09:19:16
SNMP       ACTIVE  2023-04-06T09:19:16
SRS        ACTIVE  2023-04-06T09:19:16
SSO        ACTIVE  2023-04-06T09:19:16
----------------------------------------------------------
Total: 17
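For monitoring, the status listing can be scanned for domains that are not in the ACTIVE state. The sketch below parses text in the layout shown in this article; the sample is an abbreviated, hand-edited copy of that output (including one IN_PROGRESS row from the earlier listing), the function names are hypothetical, and the parser assumes whitespace-separated columns with any error text in a trailing fourth column.

```python
from typing import Dict, List

# Abbreviated sample in the layout shown above (not captured from a cluster).
SAMPLE_STATUS = """\
Domain     Status  Key Creation Date   Error Info(if any)
----------------------------------------------------------
CELOG      ACTIVE  2023-04-06T09:19:16
IPMI_MGMT  IN_PROGRESS  2023-04-06T09:19:16
S3         ACTIVE  2023-04-06T09:19:16
----------------------------------------------------------
Total: 3
"""


def parse_cluster_status(text: str) -> List[Dict[str, str]]:
    """Turn the status table into a list of row dictionaries."""
    rows = []
    for line in text.splitlines():
        # Skip the header, separator, and total lines.
        if line.startswith(("Domain", "-", "Total")):
            continue
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        rows.append({
            "domain": parts[0],
            "status": parts[1],
            "key_created": parts[2],
            "error": parts[3].strip() if len(parts) > 3 else "",
        })
    return rows


def domains_not_active(text: str) -> List[str]:
    """List the domains whose status is anything other than ACTIVE."""
    return [row["domain"] for row in parse_cluster_status(text)
            if row["status"] != "ACTIVE"]


print(domains_not_active(SAMPLE_STATUS))  # ['IPMI_MGMT']
```

An empty result indicates that every listed domain has finished rekeying; a non-empty one points to the domains worth checking in the key manager log.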

Or, for SEDs rekey status:

# isi keymanager sed status
 Node Status  Location   Remote Key ID  Key Creation Date   Error Info(if any)
-----------------------------------------------------------------------------
1   LOCAL   Local                    1970-01-01T00:00:00
2   LOCAL   Local                    1970-01-01T00:00:00
3   LOCAL   Local                    1970-01-01T00:00:00
4   LOCAL   Local                    1970-01-01T00:00:00
-----------------------------------------------------------------------------
Total: 4

The rekey process also outputs to the /var/log/isi_km_d.log file, which is a useful source for additional troubleshooting.

If an error in rekey occurs, the previous MK is not deleted, so entries in the provider store can still be created and read as normal. The key manager daemon will retry the rekey operation in the background every 15 minutes until it succeeds.

Author: Nick Trimbee
