Nick Trimbee
Nick Trimbee is a twenty-five-year veteran of the data storage industry. Nick’s held positions in systems administration, engineering, product management, pre-sales, solutions architecture, and technical marketing. His primary focus lies within the realm of file and object storage, helping customers to find creative solutions for their unstructured data needs.



OneFS HealthCheck Auto-updates

Nick Trimbee

Tue, 21 May 2024 17:11:27 -0000


Prior to OneFS 9.4, Healthchecks were frequently regarded by storage administrators as yet another patch that needed to be installed on a PowerScale cluster. As a result, their adoption was routinely postponed or ignored, potentially jeopardizing a cluster’s well-being. To address this, OneFS HealthCheck auto-updates enable new Healthchecks to be automatically downloaded and non-disruptively installed on a PowerScale cluster without any user intervention.

The automated HealthCheck update framework helps accelerate the adoption of OneFS Healthchecks, by removing the need for manual checks, downloads, and installation. In addition to reducing management overhead, the automated Healthchecks integrate with CloudIQ to update the cluster health score - further improving operational efficiency, while avoiding known issues that affect cluster availability.

This figure shows the OneFS Healthcheck architecture.

Formerly known as Healthcheck patches, or RUPs, these are renamed ‘Healthcheck definitions’ in OneFS 9.4 and later. The Healthcheck framework checks for updates to these definitions using Dell Secure Remote Services (SRS).

An auto-update configuration setting in the OneFS SRS framework controls whether the Healthcheck definitions are automatically downloaded and installed on a cluster. A OneFS platform API endpoint has been added to verify the Healthcheck version, and Healthchecks also optionally support OneFS compliance mode.

Healthcheck auto-update is enabled by default in OneFS 9.4 and later, is available for both existing and new clusters, and can be easily disabled from the CLI. If auto-update is on and SRS is enabled, the Healthcheck definition is downloaded to the desired staging location and then automatically and non-disruptively installed on the cluster. Any automatically downloaded Healthcheck definitions are signed, and are verified before being applied, to ensure their security and integrity.

The Healthcheck auto-update execution process is as follows:

  1. Query the current Healthcheck version.
  2. Check the Healthcheck definition availability.
  3. Compare the versions.
  4. Download the Healthcheck definition package to the cluster.
  5. Unpack and install the package.
  6. Send telemetry data and update the Healthcheck framework with the new version.

On the cluster, the Healthcheck auto-update utility isi_healthcheck_update monitors for a new package once a night, by default. This Python script checks the cluster’s current Healthcheck definition version and the availability of new updates using SRS. Next, it performs a version comparison of the install package, after which the new definition is downloaded and installed. Telemetry data is sent, and the /var/db/healthcheck_version.json file is created if it’s not already present. This JSON file is then updated with the new Healthcheck version info.

To configure and use the Healthcheck auto-update functionality, you must perform the following steps:

  1. Upgrade the cluster to OneFS 9.4 or later and commit the upgrade.
  2. To use the isi_healthcheck script, OneFS needs to be licensed and connected to the ESRS gateway. OneFS 9.4 also introduces a new option for ESRS, ‘SRS Download Enabled’, which must be set to ‘Yes’ (the default value) to allow the isi_healthcheck_update utility to run. To do this, use the following syntax (in this example, using 10.12.15.50 as the primary ESRS gateway):
# isi esrs modify --enabled=yes --primary-esrs-gateway=10.12.15.50 --srs-download-enabled=true

Confirm the ESRS configuration as follows:

# isi esrs view
                                    Enabled: Yes
                       Primary ESRS Gateway: 10.12.15.50
                     Secondary ESRS Gateway: 
                        Alert on Disconnect: Yes
                       Gateway Access Pools: -
          Gateway Connectivity Check Period: 60
License Usage Intelligence Reporting Period: 86400
                           Download Enabled: No
                       SRS Download Enabled: Yes
          ESRS File Download Timeout Period: 50
           ESRS File Download Error Retries: 3
              ESRS File Download Chunk Size: 1000000
             ESRS Download Filesystem Limit: 80
        Offline Telemetry Collection Period: 7200
                Gateway Connectivity Status: Connected
  3. Next, use the CloudIQ web interface to onboard the cluster. This requires creating a site, and then from the Add Product page, configuring the serial number of each node in the cluster, along with the product type ISILON_NODE, the site ID, and then selecting Submit.

This is a CloudIQ WebUI screenshot that shows cluster onboarding.

CloudIQ cluster onboarding typically takes a couple of hours. When complete, the Product Details page shows the CloudIQ Status, ESRS Data, and CloudIQ Data fields as Enabled, as shown here:

This screenshot shows the CloudIQ product onboarding page.

  4. Examine the cluster status to verify that the cluster is available and connected in CloudIQ.

When these prerequisite steps are complete, enable auto-update using the new isi_healthcheck_update CLI command:

# isi_healthcheck_update --enable
2022-05-02 22:21:27,310 - isi_healthcheck.auto_update - INFO - isi_healthcheck_update started
2022-05-02 22:21:27,513 - isi_healthcheck.auto_update - INFO - Enable autoupdate

Similarly, you can also easily disable auto-update:

# isi esrs modify --srs-download-enabled=false

Auto-update also has the following gconfig global config options and default values:

# isi_gconfig -t healthcheck 
Default values:
healthcheck_autoupdate.enabled (bool) = true
healthcheck_autoupdate.compliance_update (bool) = false
healthcheck_autoupdate.alerts (bool) = false
healthcheck_autoupdate.max_download_package_time (int) = 600
healthcheck_autoupdate.max_install_package_time (int) = 3600
healthcheck_autoupdate.number_of_failed_upgrades (int) = 0
healthcheck_autoupdate.last_failed_upgrade_package (char*) =
healthcheck_autoupdate.download_directory (char*) = /ifs/data/auto_upgrade_healthcheck/downloads

The isi_healthcheck_update Python utility is scheduled by cron and executed across all the nodes in the cluster, as follows:

# grep -i healthcheck /etc/crontab
# Nightly Healthcheck update
0       1       *        *       *       root     /usr/bin/isi_healthcheck_update -s

This default /etc/crontab entry executes auto-update once daily at 1am. However, this schedule can be adjusted to meet the needs of the local environment.
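For example, to move the nightly run to 3:30 AM, the crontab entry can be edited accordingly. A minimal sketch, using standard cron field semantics (minute, hour, day, month, weekday):

# Nightly Healthcheck update (rescheduled to 3:30 AM)
30      3       *        *       *       root     /usr/bin/isi_healthcheck_update -s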

Auto-update checks for new package availability, downloads the package, and performs a version comparison of the installed and new packages. The package is then installed, telemetry data is sent, and the healthcheck_version.json file is updated with the new version.

After the Healthcheck update process has completed, you can use the following CLI command to view any automatically downloaded Healthcheck packages. For example:

# isi upgrade patches list
Patch Name               Description                                Status
-----------------------------------------------------------------------------
HealthCheck_9.4.0_32.0.3 [9.4.0 UHC 32.0.3] HealthCheck definition  Installed
-----------------------------------------------------------------------------
Total: 1

Additionally, viewing the JSON version file will also confirm this:

# cat /var/db/healthcheck_version.json
{"version": "32.0.3"}
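For scripting purposes, the installed version can be extracted from this file directly. A minimal sketch, assuming the node’s bundled Python interpreter is on the path:

# python -c 'import json; print(json.load(open("/var/db/healthcheck_version.json"))["version"])'
32.0.3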

In the unlikely event that auto-updates run into issues, the following troubleshooting steps can be useful:

  1. Confirm that Healthcheck auto-update is actually enabled:

Check the ESRS global config settings and verify they are set to ‘True’.

# isi_gconfig -t esrs esrs.enabled
esrs.enabled (bool) = true
# isi_gconfig -t esrs esrs.srs_download_enabled
esrs.srs_download_enabled (bool) = true

If not, run:

# isi_gconfig -t esrs esrs.enabled=true 
# isi_gconfig -t esrs esrs.srs_download_enabled=true 
  2. If an auto-update patch installation is not completed within 60 minutes, OneFS increments the unsuccessful installations counter for the current patch, and re-attempts installation the following day.
  3. If the unsuccessful installations counter exceeds five attempts, the installation will be aborted. However, you can reset the following auto-update gconfig values, as follows, to re-enable the installation:
# isi_gconfig -t healthcheck healthcheck_autoupdate.number_of_failed_upgrades=0
# isi_gconfig -t healthcheck healthcheck_autoupdate.last_failed_upgrade_package=""
  4. If a patch installation status is reported as ‘failed’, the recommendation is to contact Dell Support to diagnose and resolve the issue:
# isi upgrade patches list
Patch Name               Description                                Status
-----------------------------------------------------------------------------
HealthCheck_9.4.0_32.0.3 [9.4.0 UHC 32.0.3] HealthCheck definition  Failed
-----------------------------------------------------------------------------
Total: 1

However, the following CLI command can be used, with caution, to repair the patch system by attempting to abort the most recent failed action:

# isi upgrade patches abort 

The isi upgrade archive --clear command stops the current upgrade and prevents it from being resumed:

# isi upgrade archive --clear

When the upgrade status is reported as ‘unknown’, run:

# isi upgrade patches uninstall
  5. The /var/log/isi_healthcheck.log file is also a great source of detailed auto-upgrade information (see the sketch below).
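For a quick triage pass, standard log tools are sufficient. A minimal sketch:

# tail -50 /var/log/isi_healthcheck.log
# grep -iE 'error|fail' /var/log/isi_healthcheck.log | tail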

Author: Nick Trimbee


OneFS SyncIQ and Windows File Create Date

Nick Trimbee

Fri, 17 May 2024 20:27:18 -0000


In the POSIX world, files typically possess three fundamental timestamps:

Timestamp   Alias   Description
---------------------------------------------------------------------------
Access      atime   Access timestamp of the last read.
Change      ctime   Status change timestamp of the last update to the file’s metadata.
Modify      mtime   Modification timestamp of the last write.

These timestamps can be easily viewed from a variety of file system tools and utilities. For example, in this case running ‘stat’ from the OneFS CLI:

# stat -x tstr
  File: "tstr"
  Size: 0            FileType: Regular File
  Mode: (0600/-rw-------)         Uid: (    0/     root)  Gid: (    0/    wheel)
Device: 18446744073709551611,18446744072690335895   Inode: 5103485107    Links: 1
Access: Mon Sep 11 23:12:47 2023
Modify: Mon Sep 11 23:12:47 2023
Change: Mon Sep 11 23:12:47 2023

A typical instance of a change, or “ctime”, timestamp update occurs when a file’s access permissions are altered. Since modifying the permissions doesn’t physically open the file (i.e., access the file’s data), its “atime” field is not updated. Similarly, since no modification is made to the file’s contents, the “mtime” also remains unchanged. However, the file’s metadata has been changed, and the ctime field is used to record this event. As such, the “ctime” stamp allows a workflow such as a backup application to know to make a fresh copy of the file, including its updated permission values. Similarly, a file rename is another operation that modifies its “ctime” entry without affecting the other timestamps.
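This behavior is easy to demonstrate from the OneFS CLI. A minimal sketch, using the ‘tstr’ file from the earlier example; after the permissions change, only the Change timestamp should move:

# stat -x tstr | egrep 'Access|Modify|Change'
# chmod 640 tstr
# stat -x tstr | egrep 'Access|Modify|Change'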

Certain other file systems also include a fourth timestamp: namely the “birthtime” of when the file was created. Birthtime (by definition) should never change. It’s also an attribute which organizations and their storage administrators may or may not care about.

Within the Windows file system realm, this “birthtime” timestamp is affectionately known as “create date”. The create date of a file is essentially the date and time when its inode is “born”.

Note that this is not a recognized POSIX attribute like ctime or mtime; rather, it was introduced as part of Windows compatibility requirements. And because it’s a birthtime, linking operations do not affect it unless a new inode is created.

As shown below, this create, or birth, date can differ from a file’s modified or accessed dates, because the creation date is when that file’s inode version originated. So, for instance, if a file is copied, the new file’s create date will be set to the current time, since it has a new inode. This can be seen in the following example, where a file is copied from a flash drive mounted on a Windows client’s file system under drive “E:”, to a cluster’s SMB share mounted at drive “Z:”.

This screenshot shows the Windows file properties dialog for the copied file, with its created, modified, and accessed dates.

The “Date created” value above is later than both the “accessed” and “modified” dates, because the latter two were merely inherited from the source file, whereas the create date was set when the copy was made.
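The same effect can be reproduced locally on the cluster. In this sketch, ‘cp -p’ preserves the source file’s mtime, while the copy still receives a fresh inode and hence a new birth time. This assumes the FreeBSD-style stat on OneFS supports the %SB (birth time) format specifier:

# stat -f "birth: %SB  mtime: %Sm" TEST.txt
# cp -p TEST.txt TEST_copy.txt
# stat -f "birth: %SB  mtime: %Sm" TEST_copy.txt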

The corresponding “date”, “stat”, and “isi get” CLI output from the cluster confirms this:

# stat TEST.txt
18446744072690400151 5103485107 -rw------- 1 root wheel 18446744073709551615 0 "Sep 11 23:12:47 2023" "Sep 11 23:12:47 2023" "Sep 11 23:12:47 2023" "Sep 11 23:12:47 2023" 8192 48 0xe0 tstr
# isi get -Dd TEST.txt
POLICY   W   LEVEL PERFORMANCE   COAL  ENCODING       FILE              IADDRS
default      16+2/2 concurrency   on    UTF-8         tstr              <34,12,58813849600:8192>, <35,3,58981457920:8192>, <69,12,57897025536:8192> ct: 1694473967 rt: 0
*************************************************
* IFS inode: [ 34,12,58813849600:8192, 35,3,58981457920:8192, 69,12,57897025536:8192 ]
*************************************************
*
*  Inode Version:      8
*  Dir Version:        2
*  Inode Revision:     1
*  Inode Mirror Count: 3
*  Recovered Flag:     0
*  Restripe State:     0
*  Link Count:         1
*  Size:               0
*  Mode:               0100600
*  Flags:              0xe0
*  SmartLinked:        False
*  Physical Blocks:    0
*  Phys. Data Blocks:  0
*  Protection Blocks:  0
*  LIN:                1:3031:00b3
*  Logical Size:       0
*  Shadow refs:        0
*  Do not dedupe:      0
*  In CST stats:       False
*  Last Modified:      1694473967.071973000
*  Last Inode Change:  1694473967.071973000
*  Create Time:        1694473967.071973000
*  Rename Time:        0
<snip>

In releases before OneFS 9.5, when a file is replicated, its create date is set to the time the file was copied from the source cluster: that is, when the replication job ran, or, more specifically, when the individual job worker thread got around to processing that specific file.

By way of contrast, OneFS 9.5 and later releases ensure that SyncIQ replicates the full array of metadata, preserving all values, including the birth time / create date.

The primary consideration for the new create date functionality is that it requires both source and target clusters in a replication set to be running OneFS 9.5 or later.

If either the source or the target is running pre-9.5 code, this time field retains its old behavior of being set to the time of replication (that is, the actual file creation on the target) rather than the correct value from the source file.

 

In OneFS 9.5 and later releases, create date timestamping works exactly the same way as SyncIQ replication of other metadata (such as “mtime”, etc), occurring automatically as part of every file replication. Plus, no additional configuration is necessary beyond upgrading both clusters to OneFS 9.5 or later.

One other significant thing to note about this feature is that SyncIQ is changelist-based, using OneFS snapshots under the hood for its checkpointing and delta comparisons. This means that if a replication relationship was configured prior to an upgrade to OneFS 9.5 or later, the source cluster will have valid birthtime data, but the target’s birthtime data will reflect the local creation time of the files it has copied.

Note that, upon upgrading both sides to OneFS 9.5 or later and running a SyncIQ job, nothing will change. This is because SyncIQ will perform its snapshot comparison, determine that no changes were made to the dataset, and so will not perform any replication work. However, if a source file is “touched” so that its mtime changes (or any other action is performed that causes a copy-on-write, or CoW), the file will show up in the snapshot diff and therefore be replicated. As part of replicating that file, the correct birth time will be written on the target.

Note that a full replication (re)sync is not triggered by upgrading a replication cluster pair to OneFS 9.5 or later and thereby enabling this functionality. Instead, any create date timestamp resolution happens opportunistically and in the background, as files get touched or modified and thereby replicated. Be aware that ‘touching’ a file changes its modification time, in addition to updating the create date, which may be undesirable.

Author: Nick Trimbee


OneFS Metadata

Nick Trimbee

Fri, 17 May 2024 20:07:06 -0000


OneFS uses two principal data structures to enable information about each object, or metadata, within the file system to be searched, managed, and stored efficiently and reliably. These structures are:

  • Inodes
  • B-trees

OneFS uses inodes to store file attributes and pointers to file data locations on disk. Each file, directory, link, and so on, is represented by an inode.

Within OneFS, inodes come in two sizes - either 512B or 8KB. The size that OneFS uses is determined primarily by the physical and logical block formatting of the drives in a diskpool.

All OneFS inodes have both static and dynamic sections. The static section space is limited and valuable because it can be accessed in a single I/O, and does not require a distributed lock to access it. It holds fixed-width, commonly used attributes such as POSIX mode bits, owner, and size.

Graphic illustrating the composition of a OneFS inode.

In contrast, the dynamic portion of an inode allows new attributes to be added, if necessary, without requiring an inode format update. This can be done by simply adding a new type value with code to serialize and deserialize it. Dynamic attributes are stored in the stream-style type-length-value (TLV) format, and include protection policies, OneFS ACLs, embedded b-tree roots, domain membership info, and so on.

If necessary, OneFS can also use extension blocks, which are 8KB blocks to store any attributes that cannot fully fit into the inode itself. OneFS data services such as SnapshotIQ also commonly leverage inode extension blocks.

Graphic illustrating a OneFS inode with extension blocks.

Inodes are dynamically created and stored in locations across all drives in the cluster; OneFS uses b-trees (actually B+ trees) for their indexing and rapid retrieval. The general structure of a OneFS b-tree includes a top-level block, known as the ‘root’. B-tree blocks that reference other b-tree blocks are referred to as ‘inner blocks’, and the blocks at the bottom of the tree are called ‘leaf blocks’.

 

Graphic depicting the general structure of a OneFS b-tree.

Only the leaf blocks actually contain metadata, whereas the root and inner blocks provide a balanced index of addresses allowing rapid identification of and access to the leaf blocks and their metadata.

A LIN, or logical inode, is accessed every time a file, directory, or b-tree is accessed. The function of the LIN Tree is to store the mapping between a unique LIN number and its inode mirror addresses.

The LIN is represented as a 64-bit hexadecimal number. Each file is assigned a single LIN and, because LINs are never reused, it is unique for the cluster’s lifespan. For example, the file /ifs/data/test/f1 has the following LIN:

# isi get -D /ifs/data/test/f1 | grep LIN:
*   LIN:                1:2d29:4204

Similarly, its parent directory, /ifs/data/test, has:

# isi get -D /ifs/data/test | grep LIN:
*   LIN:                1:0353:bb59
*   LIN:                1:0009:0004
*   LIN:                1:2d29:4204

The LIN tree entry for the file above includes the mapping between the LIN and its three mirrored inode disk addresses.

# isi get -D /ifs/data/test/f1 | grep "inode"
* IFS inode: [ 92,14,524557565440:512, 93,19,399535074304:512, 95,19,610321964032:512 ]

Taking the first of these inode addresses, 92,14,524557565440:512, the following can be inferred, reading from left to right (a quick decoding sketch follows this list):

  • It’s on node 92.
  • Stored on drive lnum 14.
  • At block address 524557565440.
  • And is a 512-byte inode.
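A quick way to decode such an address at the CLI is with standard awk. A minimal sketch:

# echo "92,14,524557565440:512" | awk -F'[,:]' '{printf "node %s, drive lnum %s, block %s, %s-byte inode\n", $1, $2, $3, $4}'
node 92, drive lnum 14, block 524557565440, 512-byte inode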

The file’s parent LIN can also be easily determined:

# isi get -D /ifs/data/test/f1 | grep -i "Parent Lin"
*  Parent Lin          1:0353:bb59

In addition to the LIN tree, OneFS also uses b-trees to support file and directory access, plus the management of several other data services. That said, the three principal b-trees that OneFS employs are:

  • Files (Metatree or Inode Format Manager, IFM b-tree): Stores the mapping of Logical Block Number (LBN) to protection group, and is responsible for storing the physical location of file blocks on disk.
  • Directories (Directory Format Manager, DFM b-tree): Stores directory entries (file names and directories/sub-directories), encompassing the full /ifs namespace and everything under it.
  • System (System B-tree, SBT): A standardized B+ tree implementation that stores records for OneFS internal use, typically related to a particular feature, including Diskpool DB, IFS Domains, WORM, and Idmap. Quota (QDB) and Snapshot Tracking Files (STF) are separate, unique B+ tree implementations.

OneFS also relies heavily on several other metadata structures, including:

  • Shadow Store - Dedupe/clone metadata structures including SINs
  • QDB – Quota Database structures
  • System B+ Tree Files
  • STF – Snapshot Tracking Files
  • WORM
  • IFM Indirect
  • Idmap
  • System Directories
  • Delta Blocks
  • Logstore Files

Both inodes and b-tree blocks are mirrored on disk. Mirror-based protection is used exclusively for all OneFS metadata because it is simple and lightweight, thereby avoiding the additional processing of erasure coding. Because metadata typically only consumes around 2% of the overall cluster’s capacity, the mirroring overhead for metadata is minimal.

The number of inode mirrors (minimum 2x up to 8x) is determined by the nodepool’s achieved protection policy and the metadata type. The following is a mapping of the default number of mirrors for all metadata types.

Protection Level   Metadata Type                 Number of Mirrors
--------------------------------------------------------------------------
+1n                File inode                    2 inodes per file
+2d:1n             File inode                    3 inodes per file
+2n                File inode                    3 inodes per file
+3d:1n             File inode                    4 inodes per file
+3d:1n1d           File inode                    4 inodes per file
+3n                File inode                    4 inodes per file
+4d:1n             File inode                    5 inodes per file
+4d:2n             File inode                    5 inodes per file
+4n                File inode                    5 inodes per file
2x->8x             File inode                    Same as protection level (2x == 2 inode mirrors)
+1n                Directory inode               3 inodes per file
+2d:1n             Directory inode               4 inodes per file
+2n                Directory inode               4 inodes per file
+3d:1n             Directory inode               5 inodes per file
+3d:1n1d           Directory inode               5 inodes per file
+3n                Directory inode               5 inodes per file
+4d:1n             Directory inode               6 inodes per file
+4d:2n             Directory inode               6 inodes per file
+4n                Directory inode               6 inodes per file
2x->8x             Directory inode               +1 protection level (2x == 3 inode mirrors)
-                  LIN root/master               8x
-                  LIN inner/leaf                Variable (per-entry protection)
-                  IFM/DFM b-tree                Variable (per-entry protection)
-                  Quota database b-tree (QDB)   8x
-                  System b-tree (SBT)           Variable (per-entry protection)
-                  Snapshot tracking files (STF) 8x

Note that, by default, directory inodes are mirrored at one level higher than the achieved protection policy, because directories are more critical and make up the OneFS single namespace. The root of the LIN Tree is the most critical metadata type and is always mirrored at 8x.
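These defaults are easy to confirm with isi get. A sketch, assuming a +2d:1n node pool and the paths from the earlier example; per the table above, the file inode should report three mirrors and its parent directory four:

# isi get -D /ifs/data/test/f1 | grep -i "inode mirror"
*  Inode Mirror Count: 3
# isi get -D /ifs/data/test | grep -i "inode mirror"
*  Inode Mirror Count: 4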

OneFS SSD strategy governs where and how much metadata is placed on SSD or HDD. There are five SSD Strategies, and these can be configured using OneFS’ file pool policies:

  • L3 Cache: All SSDs in a node pool are used as a read-only eviction cache from L2 cache. Currently used data and metadata fill the entire capacity of the SSDs in this mode. Note that L3 mode does not guarantee that all metadata will be on SSD, so this may not be the most performant mode for metadata-intensive workflows.
  • Metadata Read: One metadata mirror is placed on SSD. All other mirrors are on HDD for hybrid and archive models. This mode can boost read performance for metadata-intensive workflows.
  • Metadata Write: All metadata mirrors are placed on SSD. This mode can boost both read and write performance when there is significant demand on metadata I/O. Note that it is important to understand the SSD capacity requirements needed to support metadata strategies.
  • Data: Place data on SSD. This is not a widely used strategy, because hybrid and archive nodes have limited SSD capacities, and metadata should take priority on SSD for best performance.
  • Avoid: Avoid using SSD for a specific path. This is not a widely used strategy, but it can be handy for archive workflows that do not require SSD, dedicating SSD space to other, more important paths and workflows.

Fundamentally, OneFS metadata placement is determined by the following attributes:

  • The model of the nodes in each node pool (F-series, H-series, A-series)
  • The current SSD Strategy on the node pool configured using the default filepool policy and custom administrator-created filepool policies
  • The cluster’s global storage pool settings

You can use the following CLI commands to verify the current SSD strategy and metadata placement details on a cluster. For example, to check whether L3 Mode is enabled on a specific node pool:

# isi storagepool nodepool list
ID     Name                       Nodes  Node Type IDs   Protection Policy  Manual
----------------------------------------------------------------------------------
1      h500_30tb_3.2tb-ssd_128gb  1      1               +2d:1n             No

In this output, there is a single H500 node pool reported with an ID of 1. To display the details of this pool, use the following command:

# isi storagepool nodepool view 1
                 ID: 1
               Name: h500_30tb_3.2tb-ssd_128gb
              Nodes: 1, 2, 3, 4, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
      Node Type IDs: 1
  Protection Policy: +2d:1n
             Manual: No
         L3 Enabled: Yes
L3 Migration Status: l3
               Tier: -
              Usage
                Avail Bytes: 321.91T
            Avail SSD Bytes: 0.00
                   Balanced: No
                 Free Bytes: 329.77T
             Free SSD Bytes: 0.00
                Total Bytes: 643.13T
            Total SSD Bytes: 0.00
    Virtual Hot Spare Bytes: 7.86T

Note that if, as in this case, L3 is enabled on a node pool, any changes to this pool’s SSD strategy configuration using file pool policies, and so on, will not be honored until L3 cache has been disabled and the SSDs have been reformatted for use as metadata mirrors.

You can use the following command to check the cluster’s default file pool policy configuration:

# isi filepool default-policy view
          Set Requested Protection: default
               Data Access Pattern: concurrency
                  Enable Coalescer: Yes
                    Enable Packing: No
               Data Storage Target: anywhere
                 Data SSD Strategy: metadata
           Snapshot Storage Target: anywhere
             Snapshot SSD Strategy: metadata
                        Cloud Pool: -
         Cloud Compression Enabled: -
          Cloud Encryption Enabled: -
              Cloud Data Retention: -
Cloud Incremental Backup Retention: -
       Cloud Full Backup Retention: -
               Cloud Accessibility: -
                  Cloud Read Ahead: -
            Cloud Cache Expiration: -
         Cloud Writeback Frequency: -
      Cloud Archive Snapshot Files: -
                                ID: -

To list all file pool policies configured on a cluster:

# isi filepool policies list

To view a specific file pool policy:

# isi filepool policies view <Policy Name>

OneFS also provides global storagepool configuration settings that control additional metadata placement. For example: 

# isi storagepool settings view
     Automatically Manage Protection: files_at_default
Automatically Manage Io Optimization: files_at_default
Protect Directories One Level Higher: Yes
       Global Namespace Acceleration: disabled
       Virtual Hot Spare Deny Writes: Yes
        Virtual Hot Spare Hide Spare: Yes
      Virtual Hot Spare Limit Drives: 2
     Virtual Hot Spare Limit Percent: 0
             Global Spillover Target: anywhere
                   Spillover Enabled: Yes
        SSD L3 Cache Default Enabled: Yes
                     SSD Qab Mirrors: one
            SSD System Btree Mirrors: one
            SSD System Delta Mirrors: one

The CLI output below includes descriptions of the relevant metadata options available.  

# isi storagepool settings modify -h | egrep -i options -A 30
Options:
    --automatically-manage-protection (all | files_at_default | none)
        Set whether SmartPools manages files' protection settings.
    --automatically-manage-io-optimization (all | files_at_default | none)
        Set whether SmartPools manages files' I/O optimization settings.
    --protect-directories-one-level-higher <boolean>
        Protect directories at one level higher.
    --global-namespace-acceleration-enabled <boolean>
        Global namespace acceleration enabled.
    --virtual-hot-spare-deny-writes <boolean>
        Virtual hot spare: deny new data writes.
    --virtual-hot-spare-hide-spare <boolean>
        Virtual hot spare: reduce amount of available space.
    --virtual-hot-spare-limit-drives <integer>
        Virtual hot spare: number of virtual drives.
    --virtual-hot-spare-limit-percent <integer>
        Virtual hot spare: percent of total storage.
    --spillover-target <str>
        Spillover target.
    --spillover-anywhere
        Set global spillover to anywhere.
    --spillover-enabled <boolean>
        Spill writes into pools within spillover_target as needed.
    --ssd-l3-cache-default-enabled <boolean>
        Default setting for enabling L3 on new Node Pools.
    --ssd-qab-mirrors (one | all)
        Controls number of mirrors of QAB blocks to place on SSDs.
    --ssd-system-btree-mirrors (one | all)
        Controls number of mirrors of system B-tree blocks to place on SSDs.
    --ssd-system-delta-mirrors (one | all)
        Controls number of mirrors of system delta blocks to place on SSDs.

OneFS defaults to protecting directories one level higher than the configured protection policy and retaining one mirror of system b-trees on SSD. For optimal performance on hybrid platform nodes, the recommendation is to place all metadata mirrors on SSD, assuming that the capacity is available. Be aware, however, that the metadata SSD mirroring options only become active if L3 Mode is disabled.
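For example, to move both the data and snapshot SSD strategies on the default file pool policy to ‘metadata-write’. A sketch: the flag names below mirror the fields in the default-policy output shown earlier, but should be verified against your release’s CLI help:

# isi filepool default-policy modify --data-ssd-strategy metadata-write --snapshot-ssd-strategy metadata-write
# isi filepool default-policy view | grep -i "ssd strategy"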

Additionally, global namespace acceleration (GNA) is a legacy option that allows nodes without SSD to place their metadata on nodes with SSD. All currently shipping PowerScale node models include at least one SSD drive.

Author: Nick Trimbee


OneFS Signed Upgrades

Nick Trimbee

Fri, 17 May 2024 16:42:45 -0000


Introduced as part of the OneFS security enhancements, signed upgrades help maintain system integrity by preventing a cluster from being compromised by the installation of maliciously modified upgrade packages. This is required by several industry security compliance mandates, such as the DoD Network Device Management Security Requirements Guide, which stipulates “The network device must prevent the installation of patches, service packs, or application components without verification the software component has been digitally signed using a certificate that is recognized and approved by the organization”.

With this signed upgrade functionality, all packages must be cryptographically signed before they can be installed. This applies to all upgrade types, including core OneFS, patches, cluster firmware, and drive firmware. The underlying components that comprise this feature include an updated .isi format for all package types, plus a new OneFS Catalog to store the verified packages. In OneFS 9.4 and later, the actual upgrades themselves are still performed using either the CLI or WebUI, and are very similar to previous versions.

Under the hood, the signed upgrade process works as follows:

This image depicts the OneFS signed upgrade process.

Everything goes through the catalog, which comprises four basic components. There’s a small SQLite database that tracks metadata, a library which has the basic logic for the catalog, the signature library based around OpenSSL which handles all of the verification, and a couple of directories in which to store the verified packages.

With signed upgrades, there’s a single file to download that contains the upgrade package, README text, and all signature data. No file unpacking is required.

The .isi file format is as follows:

This graphic illustrates the .isi package file format.

In the second region of the package file, you can directly incorporate a ‘readme’ text file that provides instructions, version compatibility requirements, and so on.

The first region, which contains the main package data, is also compatible with previous OneFS versions that don’t support the .isi format. This allows a signed firmware or DSP package to be installed on OneFS 9.3 and earlier.

The new OneFS catalog provides a secure place to store verified .isi packages, and only the root account has direct access. The catalog itself is stored at /ifs/.ifsvar/catalog, and all maintenance and interaction is performed using the isi upgrade catalog CLI command set. The contents, or artifacts, of the catalog each have an ID that corresponds to the SHA256 hash of the file.

Any user account with the ISI_PRIV_SYS_UPGRADE privilege can perform the following catalog-related actions, expressed as flags to the isi upgrade catalog command:

Action   Description
--------------------------------------------------------------------------
Clean    Remove catalog artifact files that have no associated metadata in the database
Export   Save a catalog item to a user-specified file location
Import   Verify and add a new .isi package file into the catalog
List     List packages in the catalog
Readme   Display the README text from a catalog item or .isi package file
Remove   Manually remove a package from the catalog
Repair   Re-verify all catalog packages and rebuild the database
Verify   Verify the signature of a catalog item or .isi package file

Package verification leverages the OneFS OpenSSL library, which enables a SHA256 hash of the manifest to be verified against the certificate. As part of this process, the chain-of-trust for the included certificate is compared with the contents of the /etc/ssl/certs directory, and the distinguished name on the manifest is checked against the /etc/upgrade/identities file. Finally, the SHA256 hash of the data regions is compared against values from the manifest.

To check the signature, use the isi upgrade catalog verify command. For example:

# isi upgrade catalog verify --file /ifs/install.isi
Item             Verified
--------------------------
/ifs/install.isi True
--------------------------
Total: 1

To display additional install image details, use the isi_packager view command:

# isi_packager view --package /ifs/install.isi
== Region 1 ==
Type: OneFS Install Image
Name: OneFS_Install_0x90500B000000AC8_B_MAIN_2760(RELEASE)
Hash: ef7926cfe2255d7a620eb4557a17f7650314ce1788c623046929516d2d672304
Size: 397666098
 
== Footer Details ==
Format Version: 1
 Manifest Size: 296
Signature Size: 2838
Timestamp Size: 1495
 Manifest Hash: 066f5d6e6b12081d3643060f33d1a25fe3c13c1d13807f49f51475a9fc9fd191
Signature Hash: 5be88d23ac249e6a07c2c169219f4f663220d4985e58b16be793936053a563a3
Timestamp Hash: eca62a3c7c3f503ca38b5daf67d6be9d57c4fadbfd04dbc7c5d7f1ff80f9d948
 
== Signature Details ==
Fingerprint:     33fba394a5a0ebb11e8224a30627d3cd91985ccd
Issuer:          ISLN
Subject:         US / WA / Sea / Isln OneFS.
Organization:    Isln Powerscale OneFS
Expiration:      2022-09-07 22:00:22
Ext Key Usage:   codesigning

You can list the packages in the catalog, as follows:

# isi upgrade catalog list
ID    Type  Description                                              README
-----------------------------------------------------------------------------
cdb88 OneFS OneFS 9.4.0.0_build(2797)style(11) / B_MAIN_2797(RELEASE) -
3a145 DSP   Drive_Support_v1.39.1                                    Included 
840b8 Patch HealthCheck_9.2.1_2021-09                                Included 
aa19b Patch 9.3.0.2_GA-RUP_2021-12_PSP-1643                          Included
-----------------------------------------------------------------------------
Total: 4

Note that the package ID comprises the first few characters of the package’s SHA256 hash.
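This is straightforward to confirm against a package file. A minimal sketch, assuming the FreeBSD sha256 utility on the node; the first characters of the resulting digest should match the catalog ID (3a145 in the listing above):

# sha256 -q /ifs/packages/Drive_Support_v1.39.1.isi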

Packages are automatically imported when used, and verified upon import. You can also perform verification and import manually, if desired:

# isi upgrade catalog verify --file Drive_Support_v1.39.1.isi 
Item                                      Verified 
------------------------------------------------- 
/ifs/packages/Drive_Support_v1.39.1.isi True 
------------------------------------------------- 
# isi upgrade catalog import Drive_Support_v1.39.1.isi

You can also export packages from the catalog and copy them to another cluster, for example. Generally, exported packages can be re-imported, too.

# isi upgrade catalog list 
ID    Type  Description                                               README 
----------------------------------------------------------------------------- 
00b9c OneFS OneFS 9.5.0.0_build(2625)style(11) / B_MAIN_2625(RELEASE) – 
3a145 DSP Drive_Support_v1.39.1 Included 
----------------------------------------------------------------------------- 
Total: 5 
# isi upgrade catalog export --id 3a145 --file /ifs/Drive_Support_v1.39.1.isi

However, auto-generated OneFS images cannot be reimported.

The README column of the isi upgrade catalog list output indicates whether release notes are included for a .isi file or catalog item. If available, you can view them as follows:

# isi upgrade catalog readme --file HealthCheck_9.2.1_2021-09.isi | less
Updated: September 02, 2021
*****************************************************************************
HealthCheck_9.2.1_2021-09: Patch for OneFS 9.2.1.x. 
This patch contains the 2021-09 RUP for the Isilon HealthCheck System 
***************************************************************************** 
This patch can be installed on clusters running the following OneFS version: 
* 9.2.1.x 
:

Within a readme file, details typically include a short description of the artifact, and which minimum OneFS version the cluster is required to be running for installation.

Cleanup of patches and OneFS images is performed automatically upon commit. Any installed packages require the artifact to be present in the catalog for a successful uninstall. Similarly, the committed OneFS image is required when removing a patch or when expanding the cluster by adding a node.

You can remove artifacts manually, as follows:

# isi upgrade catalog remove --id 840b8 
This will remove the specified artifact and all related metadata. 
Are you sure? (yes/[no]): yes

However, always use caution if attempting to manually remove a package.

When it comes to catalog housekeeping, the ‘clean’ function removes any catalog artifact files without database entries, although normally this happens automatically when an item is removed.

# isi upgrade catalog clean 
This will remove any artifacts that do not have associated metadata in the database. 
Are you sure? (yes/[no]): yes

Additionally, the catalog ‘repair’ function rebuilds the database, re-imports all valid items, and re-verifies their signatures:

# isi upgrade catalog repair 
This will attempt to repair the catalog directory. This will result in all stored artifacts being re-verified. Artifacts that fail to be verified will be deleted. Additionally, a new catalog directory will be initialized with the remaining artifacts. 
Are you sure? (yes/[no]): yes

When installing a signed upgrade, patch, firmware, or drive support package (DSP) on a cluster running OneFS 9.4 or later, the command syntax used is fundamentally the same as in prior OneFS versions, with only the file extension itself having changed. The actual install file will have the ‘.isi’ extension, and the file containing the hash value for download verification will have a ‘.isi.sha256’ suffix. For example, take the following OneFS install files, which can be checked as shown in the sketch after this list:

  • OneFS_v9.5.0.0_Install.isi
  • OneFS_v9.5.0.0_Install.isi.sha256
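Before installing, the downloaded image can be checked against its published hash. A minimal sketch, assuming the FreeBSD sha256 utility; the two values should match:

# sha256 -q OneFS_v9.5.0.0_Install.isi
# cat OneFS_v9.5.0.0_Install.isi.sha256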

You can use the following syntax to initiate a parallel OneFS signed upgrade:

# isi upgrade start --install-image-path /ifs/install.isi --parallel

Or, if the desired upgrade image package is already in the catalog, you can instead use the --install-image-id flag to install it:

# isi upgrade start --install-image-id 00b9c --parallel

Or to upgrade a cluster’s firmware:

# isi upgrade firmware start --fw-pkg /ifs/IsiFw_Package_v10.3.7.isi --rolling

To upgrade a cluster’s firmware using the ID of a package that’s in the catalog: 

# isi upgrade firmware start --fw-pkg-id cf01b --rolling

To initiate a simultaneous upgrade of a patch:

# isi upgrade patches install --patch /ifs/patch.isi --simultaneous

And finally, to initiate a simultaneous upgrade of a drive firmware package:

# isi_dsp_install Drive_Support_v1.39.1.isi

Note that patches and drive support firmware cannot currently be installed by package ID.

A committed upgrade image from the previous OneFS upgrade is automatically saved in the catalog, and also created automatically when a new cluster is configured. This image is required for new node joins, as well as when uninstalling patches. However, it’s worth noting that auto-created images will not have a signature and, although you can export them, they cannot be re-imported back into the catalog.

If the committed upgrade image is somehow missing, CELOG events are generated and the isi upgrade catalog repair command output displays an error. Additionally, when it comes to troubleshooting the signed upgrade process, it can pay to check the /var/log/messages and /var/log/isi_papi_d.log files, and the OneFS upgrade logs.

Author: Nick Trimbee


OneFS and HTTP Security

Nick Trimbee

Mon, 22 Apr 2024 20:35:30 -0000


To enable granular HTTP security configuration, OneFS provides an option to disable nonessential HTTP components selectively. This can help reduce the overall attack surface of your infrastructure. Disabling a specific component’s service still allows other essential services on the cluster to continue to run unimpeded. In OneFS 9.4 and later, you can disable the following nonessential HTTP services:

Service                             Description
---------------------------------------------------------------------------
PowerScaleUI                        The OneFS WebUI configuration interface.
Platform-API-External               External access to the OneFS platform API endpoints.
RESTful Access to Namespace (RAN)   RESTful access by HTTP to a cluster’s /ifs namespace.
RemoteService                       Remote Support and In-Product Activation.
SWIFT (deprecated)                  Deprecated object access to the cluster using the SWIFT protocol, which has been replaced by the S3 protocol in OneFS.

You can enable or disable each of these services independently, using the CLI or platform API, if you have a user account with the ISI_PRIV_HTTP RBAC privilege.
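To confirm that an account holds this privilege, the RBAC CLI can help. A sketch; verify the exact syntax against your release:

# isi auth privileges | grep -i ISI_PRIV_HTTP
# isi auth roles list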

You can use the isi http services CLI command set to view and modify the nonessential HTTP services:

# isi http services list
ID                     Enabled
------------------------------
Platform-API-External Yes
PowerScaleUI          Yes
RAN                   Yes
RemoteService         Yes
SWIFT                 No
------------------------------
Total: 5

For example, you can easily disable remote HTTP access to the OneFS /ifs namespace as follows:

# isi http services modify RAN --enabled=0

You are about to modify the service RAN. Are you sure? (yes/[no]): yes
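Re-enabling the service later follows the same pattern; a sketch, by analogy with the disable command above:

# isi http services modify RAN --enabled=1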

Similarly, you can also use the WebUI to view and edit a subset of the HTTP configuration settings, by navigating to Protocols > HTTP settings:

WebUI screenshot showing HTTP configuration settings.

That said, the implications and impact of disabling each of the services are as follows:

  • WebUI: The WebUI is completely disabled, and access attempts (default TCP port 8080) are denied with the warning "Service Unavailable. Please contact Administrator." If the WebUI is re-enabled, the external platform API service (Platform-API-External) is also started if it is not running. Note that disabling the WebUI does not affect the platform API service.
  • Platform API: External API requests to the cluster are denied, and the WebUI is disabled, because it uses the Platform-API-External service. Note that the Platform-API-Internal service is not impacted when Platform-API-External is disabled, and internal pAPI services continue to function as expected. If the Platform-API-External service is re-enabled, the WebUI remains inactive until the PowerScaleUI service is also enabled.
  • RAN: If RAN is disabled, the WebUI components for File System Explorer and File Browser are also automatically disabled. From the WebUI, attempts to access the OneFS file system explorer (File System > File System Explorer) fail with the warning "Browse is disabled as RAN service is not running. Contact your administrator to enable the service." This same warning also appears when attempting to access any other WebUI components that require directory selection.
  • RemoteService: If RemoteService is disabled, the WebUI components for Remote Support and In-Product Activation are disabled. In the WebUI, going to Cluster Management > General Settings and selecting the Remote Support tab displays the message "The service required for the feature is disabled. Contact your administrator to enable the service." Going to Cluster Management > Licensing and scrolling to the License Activation section displays the same message.
  • SWIFT: Deprecated object protocol, disabled by default.

You can use the CLI command isi http settings view to display the OneFS HTTP configuration:

# isi http settings view
            Access Control: No
      Basic Authentication: No
    WebHDFS Ran HTTPS Port: 8443
                        Dav: No
         Enable Access Log: Yes
                      HTTPS: No
 Integrated Authentication: No
               Server Root: /ifs
                    Service: disabled
           Service Timeout: 8m20s
          Inactive Timeout: 15m
           Session Max Age: 4H
Httpd Controlpath Redirect: No

Similarly, you can manage and change the HTTP configuration using the isi http settings modify CLI command.

For example, to reduce the maximum session age from four to two hours:

# isi http settings view | grep -i age
           Session Max Age: 4H
# isi http settings modify --session-max-age=2H
# isi http settings view | grep -i age
           Session Max Age: 2H

The full set of configuration options for isi http settings includes:

--access-control <boolean>
    Enable Access Control Authentication for the HTTP service. Access Control Authentication requires at least one type of authentication to be enabled.
--basic-authentication <boolean>
    Enable Basic Authentication for the HTTP service.
--webhdfs-ran-https-port <integer>
    Configure the data services port for the HTTP service.
--revert-webhdfs-ran-https-port
    Set value to system default for --webhdfs-ran-https-port.
--dav <boolean>
    Comply with Class 1 and 2 of the DAV specification (RFC 2518) for the HTTP service. All DAV clients must go through a single node; DAV compliance is not met when going through SmartConnect or using two or more node IPs.
--enable-access-log <boolean>
    Enable writing to a log when the HTTP server is accessed.
--https <boolean>
    Enable the HTTPS transport protocol for the HTTP service.
--integrated-authentication <boolean>
    Enable Integrated Authentication for the HTTP service.
--server-root <path>
    Document root directory for the HTTP service. Must be within /ifs.
--service (enabled | disabled | redirect | disabled_basicfile)
    Enable or disable the HTTP service, redirect to the WebUI, or allow only basic file access.
--service-timeout <duration>
    The amount of time (in seconds) that the server will wait for certain events before failing a request. A value of 0 indicates that the service timeout value is the Apache default.
--revert-service-timeout
    Set value to system default for --service-timeout.
--inactive-timeout <duration>
    The HTTP RequestReadTimeout directive, applied to both the WebUI and the HTTP service.
--revert-inactive-timeout
    Set value to system default for --inactive-timeout.
--session-max-age <duration>
    The HTTP SessionMaxAge directive, applied to both the WebUI and the HTTP service.
--revert-session-max-age
    Set value to system default for --session-max-age.
--httpd-controlpath-redirect <boolean>
    Enable or disable WebUI redirection to the HTTP service.

Note that while the OneFS S3 service uses HTTP, it is considered a tier-1 protocol, and as such is managed using its own isi s3 CLI command set and corresponding WebUI area. For example, the following CLI command forces the cluster to only accept encrypted HTTPS/SSL traffic on TCP port 9999 (rather than the default TCP port 9021):

# isi s3 settings global modify --https-only 1 --https-port 9999
# isi s3 settings global view
         HTTP Port: 9020
        HTTPS Port: 9999
        HTTPS only: Yes
S3 Service Enabled: Yes

Additionally, you can entirely disable the S3 service with the following CLI command:

# isi services s3 disable
The service 's3' has been disabled.

Or from the WebUI, under Protocols > S3 > Global settings:

WebUI Screenshot showing the S3 global configuration settings.
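To re-enable the S3 service later from the CLI, a sketch mirroring the isi services syntax above:

# isi services s3 enable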

Author: Nick Trimbee




OneFS and PowerScale F-series Management Ports

Nick Trimbee

Mon, 22 Apr 2024 20:12:20 -0000


Another security enhancement that OneFS 9.5 and later releases bring to the table is the ability to configure 1GbE NIC ports dedicated to cluster management on the PowerScale F900, F710, F600, F210, and F200 all-flash storage nodes and P100 and B100 accelerators. Since these platforms were released, customers have been requesting the ability to activate the 1GbE NIC ports so that node management activity and front-end protocol traffic can be separated on physically distinct interfaces.

For background, since their introduction, the F600 and F900 have shipped with a quad-port 1GbE rNDC (rack Network Daughter Card) adapter. However, these 1GbE ports were non-functional and unsupported in OneFS releases prior to 9.5. As such, node management and front-end traffic were co-mingled on the front-end interface.

In OneFS 9.5 and later, 1GbE network ports are now supported on all of the PowerScale PowerEdge based platforms for the purposes of node management, and are physically separate from the other network interfaces. Specifically, this enhancement applies to the F900, F600, F200 all-flash nodes, and P100 and B100 accelerators.

Under the hood, OneFS has been updated to recognize the 1GbE rNDC NIC ports as usable for a management interface. Note that the focus of this enhancement is on factory enablement and support for existing F600 customers that have the unused 1GbE rNDC hardware. This functionality has also been back-ported to OneFS 9.4.0.3 and later RUPs. Since the introduction of this feature, there have been several requests raised about field upgrades, but that use case is separate and will be addressed in a later release through scripts, updates of node receipts, procedures, and so on.

Architecturally, aside from some device driver and accounting work, no substantial changes were required to the underlying OneFS or platform architecture to implement this feature. This means that in addition to activating the rNDC, OneFS now supports the relocated front-end NIC in PCI slots 2 or 3 for the F200, B100, and P100.

OneFS 9.5 and later recognizes the 1GbE rNDC as usable for the management interface in the OneFS Wizard, in the same way it always has for the H-series and A-series chassis-based nodes.

All four ports in the 1GbE NIC are active, and for the Broadcom board, the interfaces are initialized and reported as bge0, bge1, bge2, and bge3.

The pciconf CLI utility can be used to determine whether the rNDC NIC is present in a node. If it is, a variety of identification and configuration details are displayed. For example, let’s look at the following output from a Broadcom rNDC NIC in an F200 node:

# pciconf -lvV pci0:24:0:0
bge2@pci0:24:0:0: class=0x020000 card=0x1f5b1028 chip=0x165f14e4 rev=0x00 hdr=0x00
      class       = network
      subclass    = ethernet
      VPD ident   = 'Broadcom NetXtreme Gigabit Ethernet'
      VPD ro PN   = 'BCM95720'
      VPD ro MN   = '1028'
      VPD ro V0   = 'FFV7.2.14'
      VPD ro V1   = 'DSV1028VPDR.VER1.0'
      VPD ro V2   = 'NPY2'
      VPD ro V3   = 'PMT1'
      VPD ro V4   = 'NMVBroadcom Corp'
      VPD ro V5   = 'DTINIC'
      VPD ro V6   = 'DCM1001008d452101000d45'

We can use the ifconfig CLI utility to determine the specific IP/interface mapping on the Broadcom rNDC interface. For example:

# ifconfig bge0
 TME-1: bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
 TME-1:      ether 00:60:16:9e:X:X
 TME-1:      inet 10.11.12.13 netmask 0xffffff00 broadcast 10.11.12.255 zone 1
 TME-1:      inet 10.11.12.13 netmask 0xffffff00 broadcast 10.11.12.255 zone 0
 TME-1:      media: Ethernet autoselect (1000baseT <full-duplex>)
 TME-1:      status: active

In this output, the first IP address of the management interface’s pool is bound to bge0, which is the first port on the Broadcom rNDC NIC.

We can use the isi network pools CLI command to determine the corresponding interface. Within the system zone, the management interface is allocated an address from the configured IP range within its associated interface pool. For example:

# isi network pools list
ID                      SC Zone                  IP Ranges                   Allocation Method
----------------------------------------------------------------------------------------------
groupnet0.mgt.mgt       cluster_mgt_isln.com     10.11.12.13-10.11.12.20     static
# isi network pools view groupnet0.mgt.mgt | grep -i ifaces
               Ifaces: 1:mgmt-1, 2:mgmt-1, 3:mgmt-1, 4:mgmt-1, 5:mgmt-1

Or from the WebUI, under Network configuration > External network:

WebUI Network configuration screenshot, focusing on the External network tab

Drilling down into the mgt pool details shows the 1GbE management interfaces as the pool interface members:

WebUI screenshot showing 1GbE management interfaces.
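The recognized management interfaces can also be confirmed from the CLI. A sketch; verify the exact syntax against your release:

# isi network interfaces list | grep -i mgmt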

Note that the 1GbE rNDC network ports are solely intended as cluster management interfaces. As such, they are not supported for use with regular front-end data traffic.

The F900 and F600 nodes already ship with a four-port 1GbE rNDC NIC installed. However, the F200, B100, and P100 platform configurations have also been updated to include a quad-port 1GbE rNDC card. These new configurations have been shipping by default since January 2023. This required relocating the front-end network’s 25GbE NIC (Mellanox CX4) to PCI slot 2 on the motherboard. Additionally, the OneFS updates needed for this feature have allowed the F200 platform to be offered with a 100GbE option, which uses a Mellanox CX6 NIC in place of the CX4 in slot 2.

With this 1GbE management interface enhancement, the same quad-port rNDC card (typically the Broadcom 5720) that has shipped in the F900 and F600 since their introduction is now included in the F200, B100, and P100 nodes as well. All four 1GbE rNDC ports are enabled and active under OneFS 9.5 and later.

Node port ordering follows the standard convention, increasing numerically from left to right. However, be aware that the port labels are obscured by the enclosure’s sheet metal and so are not visible externally.

The following back-of-chassis hardware images show the new placements of the NICs in the various F-series and accelerator platforms:

F600

F600 rear view.

F900

F900 rear view.

For both the F600 and F900, the NIC placement remains unchanged, because these nodes have always shipped with the 1GbE quad port in the rNDC slot since their launch.

F200

F200 rear view.

The F200 sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.

B100

B100 rear view.

Because the B100 backup accelerator has a fibre-channel card in slot 2, it sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.

P100

P100 rear view.

Finally, the P100 accelerator sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.

Note that, while there is currently no field hardware upgrade process for adding rNDC cards to legacy F200 nodes or B100 and P100 accelerators, this will be addressed in a future release.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale API OneFS CLI USB ports

OneFS Security and USB Device Control

Nick Trimbee

Fri, 19 Apr 2024 17:34:44 -0000

|

Read Time: 0 minutes

As we’ve seen over the course of the last several articles, OneFS 9.5 delivers a wealth of security-focused features. These span the realms of core file system, protocols, data services, platform, and peripherals. Among these security enhancements is the ability to disable a cluster’s USB ports, either manually from the CLI or platform API, or automatically by activating a security hardening policy.

In support of this functionality, the basic USB port control architecture is as follows:

Graphic depicting basic USB port control architecture.

To facilitate this, OneFS 9.5 and subsequent releases see the addition of a new gconfig variable, ‘usb_ports_disabled’, in ‘security_config’, specifically to track the status of USB Ports on a cluster. On receiving an admin request either from the CLI or the platform API handler to disable the USB port, OneFS modifies the security config parameter in gconfig. For example:

# isi_gconfig -t security_config | grep -i usb
usb_ports_disabled (bool) = true

Under the hood, the MCP (master control process) daemon watches for any changes to the ‘isi_security.gcfg’ security config file on the cluster. If the value of the ‘usb_ports_disabled’ variable in the ‘isi_security.gcfg’ file is updated, MCP executes the ‘isi_config_usb’ utility to enact the desired change. Note that because ‘isi_config_usb’ operates per node while MCP actions are global (executed cluster-wide), isi_config_usb is invoked across each node by a Python script to enable or disable the cluster’s USB ports.
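Since this is a cluster-wide gconfig setting, its value can be confirmed on every node by wrapping the isi_gconfig query shown above in the isi_for_array utility (a quick sketch; the output shown is illustrative):

# isi_for_array -s 'isi_gconfig -t security_config | grep -i usb'
TME-1: usb_ports_disabled (bool) = true
TME-2: usb_ports_disabled (bool) = true
TME-3: usb_ports_disabled (bool) = true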

The USB Ports enable/disable feature is only supported on PowerScale F900, F600, F200, H700/7000, and A300/3000 clusters running OneFS 9.5 and later, and PowerScale F710 and F210 running OneFS 9.7 or later.

In OneFS 9.5 and later, USB port control can be manually configured from either the CLI or platform API.

Graphic showing USB port control manually configuration from either the CLI or platform API.

Note that there is no WebUI option at this time.

The following table lists the CLI and platform API configuration options for USB port control in OneFS 9.5 and later:

Action    CLI Syntax                                                Description
--------------------------------------------------------------------------------------------------------
View      isi security settings view                                Report the state of a cluster's USB ports.
Enable    isi security settings modify --usb-ports-disabled=False   Activate a cluster's USB ports.
Disable   isi security settings modify --usb-ports-disabled=True    Disable a cluster's USB ports.

For example:

# isi security settings view | grep -i usb
      USB Ports Disabled: No
# isi security settings modify --usb-ports-disabled=True
# isi security settings view | grep -i usb
      USB Ports Disabled: Yes

Similarly, to re-enable a cluster’s USB ports:

# isi security settings modify --usb-ports-disabled=False
# isi security settings view | grep -i usb
      USB Ports Disabled: No

Note that a user account with the OneFS ISI_PRIV_CLUSTER RBAC privilege is required to configure USB port changes on a cluster.

In addition to the ‘isi security settings’ CLI command, there is also a node-local CLI utility:

# whereis isi_config_usb
isi_config_usb: /usr/bin/isi_hwtools/isi_config_usb

As mentioned previously, ‘isi security settings’ acts globally on a cluster, using ‘isi_config_usb’ to effect its changes on each node.

Alternatively, cluster USB ports can also be enabled and disabled using the OneFS platform API with the following endpoints:

API                     Method   Argument                                                Output
------------------------------------------------------------------------------------------------------------------------
/16/security/settings   GET      No argument required.                                   JSON object for security settings, including the USB ports setting.
/16/security/settings   PUT      JSON object with boolean value for USB ports setting.   None, or error.

For example:

# curl -k -u <username>:<passwd> https://localhost:8080/platform/security/settings
 
 {
 "settings" :
 {
 "fips_mode_enabled" : false,
 "restricted_shell_enabled" : false,
 "usb_ports_disabled" : true
 }
 }
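The PUT method can be used in the same fashion to change the setting. A minimal sketch, run locally on a node and substituting real credentials:

# curl -k -u <username>:<passwd> -X PUT -d '{"usb_ports_disabled": true}' "https://localhost:8080/platform/16/security/settings"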

In addition to manual configuration, the USB ports are automatically disabled if the STIG security hardening profile is applied to a cluster. 

Graphic depicting the USB ports being automatically disabled if the STIG security hardening profile is applied to a cluster. 

This is governed by the following section of XML code in the isi_hardening configuration file, which can be found at /etc/isi_hardening/profiles/isi_hardening.xml:

<CONFIG_ITEM id ="isi_usb_ports" version = "1">
              <PapiOperation>
                           <DO>
                                        <URI>/security/settings</URI>
                                        <BODY>{"usb_ports_disabled": true}</BODY>
                                        <KEY>settings</KEY>
                           </DO>
                           <UNDO>
                                        <URI>/security/settings</URI>
                                        <BODY>{"usb_ports_disabled": false}</BODY>
                                        <KEY>settings</KEY>
                           </UNDO>
                           <ACTION_SCOPE>CLUSTER</ACTION_SCOPE>
                           <IGNORE>FALSE</IGNORE>
              </PapiOperation>
 </CONFIG_ITEM>
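For reference, hardening profiles themselves are applied with the ‘isi hardening’ CLI command set, which triggers the ‘DO’ stanza above and disables the ports automatically. A rough sketch, assuming the profile is named ‘STIG’ (the exact apply syntax can vary by release):

# isi hardening list
# isi hardening apply STIG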

The ‘isi_config_usb’ CLI utility can be used to display the USB port status on a subset of nodes. For example:

# isi_config_usb --nodes 1-10 --mode display
    Node   |   Current  |  Pending
-----------------------------------
    TME-9  |   UNSUP    | INFO: This platform is not supported to run this script.
    TME-8  |   UNSUP    | INFO: This platform is not supported to run this script.
    TME-1  |     On     |
    TME-3  |     On     |
    TME-2  |     On     |
   TME-10  |     On     |
    TME-7  |   AllOn    |
    TME-5  |   AllOn    |
    TME-6  |   AllOn    |
Unable to connect: TME-4

Note: In addition to port status, the output identifies any nodes that do not support USB port control (nodes 8 and 9 above) or that are unreachable (node 4 above).

When investigating or troubleshooting issues with USB port control, the following log files are the first places to check:

Log file                       Description
-------------------------------------------------------------------------------------------
/var/log/isi_papi_d.log        Logs any requests to enable or disable the USB ports.
/var/log/isi_config_usb.log    Logs activity from isi_config_usb script execution.
/var/log/isi_mcp               Logs activity related to MCP actions when invoking isi_config_usb.
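When triaging an issue, a quick cluster-wide sweep of these logs can be performed with the isi_for_array utility (a sketch; adjust the paths and line counts to taste):

# isi_for_array -s 'tail -n 5 /var/log/isi_config_usb.log'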

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

OneFS System Configuration Auditing – Part 2

Nick Trimbee

Thu, 18 Apr 2024 22:28:35 -0000

|

Read Time: 0 minutes

In the previous article of this series, we looked at the architecture and operation of OneFS configuration auditing. Now, we’ll turn our attention to its management, event viewing, and troubleshooting. 

The CLI command set for configuring ‘isi audit’ is split between two areas: 

Area     Detail                                                            Syntax
------------------------------------------------------------------------------------------------------
Events   Specifies which events get logged, across three categories:      isi audit settings ...
         Audit Failure, Audit Success, and Syslog Audit Events.
Global   Configuration of global audit parameters, including topics,      isi audit settings global ...
         zones, CEE, syslog, purging, retention, and more.



The ‘view’ argument for each command returns the following output: 

  1. Events:

# isi audit settings view
            Audit Failure: create_file, create_directory, open_file_write, open_file_read, close_file_unmodified, close_file_modified, delete_file, delete_directory, rename_file, rename_directory, set_security_file, set_security_directory
            Audit Success: create_file, create_directory, open_file_write, open_file_read, close_file_unmodified, close_file_modified, delete_file, delete_directory, rename_file, rename_directory, set_security_file, set_security_directory
      Syslog Audit Events: create_file, create_directory, open_file_write, open_file_read, close_file_unmodified, close_file_modified, delete_file, delete_directory, rename_file, rename_directory, set_security_file, set_security_directory
Syslog Forwarding Enabled: No

  2. Global:

# isi audit settings global view
     Protocol Auditing Enabled: Yes
                 Audited Zones: -
               CEE Server URIs: -
                      Hostname:
       Config Auditing Enabled: Yes
         Config Syslog Enabled: No
         Config Syslog Servers: -
     Config Syslog TLS Enabled: No
  Config Syslog Certificate ID:
       Protocol Syslog Servers: -
   Protocol Syslog TLS Enabled: No
Protocol Syslog Certificate ID:
         System Syslog Enabled: No
         System Syslog Servers: -
     System Syslog TLS Enabled: No
  System Syslog Certificate ID:
          Auto Purging Enabled: No
              Retention Period: 180
       System Auditing Enabled: No

While configuration auditing is disabled on OneFS by default, the following CLI syntax can be used to enable and verify config auditing across the cluster: 

# isi audit settings global modify --config-auditing-enabled 1
# isi audit settings global view | grep -i 'config audit'
       Config Auditing Enabled: Yes

Similarly, to enable configuration change audit redirection to syslog: 

# isi audit settings global modify --config-auditing-enabled true
# isi audit settings global modify --config-syslog-enabled true
# isi audit settings global view | grep -i 'config audit'
       Config Auditing Enabled: Yes

Or to disable redirection to syslog: 

# isi audit settings global modify --config-syslog-enabled false
# isi audit settings global modify --config-auditing-enabled false

CEE servers can be configured as follows: 

# isi audit settings global modify --add-cee-server-uris='<URL>'

For example:

# isi audit settings global modify --add-cee-server-uris='http://cee1.isilon.com:12228/cee'

Auditing can be constrained by access zone, too: 

# isi audit settings global modify --add-audited-zones=audit_az1

Note that, when auditing is enabled, the System zone is included by default. However, it can be excluded, if desired:

# isi audit settings global modify --remove-audited-zones=System

An access zone’s audit parameters can also be configured via the ‘isi zone’ CLI command set. For example:

# isi zone zones create --all-auth-providers=true --audit-failure=all --audit-success=all --path=/ifs/data --name=audit_az1

Granular audit event type configuration can be specified, if desired, to narrow the scope and reduce the amount of audit logging. 

For example, the following command syntax constrains auditing to read and logon failures and successful writes and deletes under path /ifs/data in the audit_az1 access zone:  

# isi zone zones create --all-auth-providers=true --audit-failure=read,logon --audit-success=write,delete --path=/ifs/data --name=audit_az1
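The resulting zone configuration, including its audit settings, can then be verified with the following (illustrative; the exact fields displayed vary by release):

# isi zone zones view audit_az1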

In addition to the CLI, the OneFS platform API can also be used to configure and manage auditing. For example, to enable configuration auditing on a cluster: 

PUT /platform/1/audit/settings
Authorization: Basic QWxhZGRpbjpvcGVuIHN1c2FtZQ==

{
    "config_auditing_enabled": true
}

The following ‘204’ HTTP response code from the cluster indicates that the request was successful, and that configuration auditing is now enabled on the cluster. No message body is returned for this request. 

204 No Content
Content-type: text/plain,
Allow: 'GET, PUT, HEAD'
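The same request can be issued from a node’s shell with the ‘curl’ utility. A quick sketch, substituting real credentials:

# curl -k -u <username>:<passwd> -X PUT -d '{"config_auditing_enabled": true}' "https://localhost:8080/platform/1/audit/settings"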

Similarly, to modify the config audit topic’s maximum cached messages threshold to a value of ‘1000’ via the API: 

PUT /1/audit/topics/config
Authorization: Basic QWxhZGRpbjpvcGVuIHN1c2FtZQ==

    {
        "max_cached_messages": 1000
    }

Again, no message body is returned from OneFS for this request. 

204 No Content
Content-type: text/plain,
Allow: 'GET, PUT, HEAD'
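To confirm the change, the same endpoint also supports a GET request, which should return the topic’s updated settings (a sketch; substitute real credentials):

# curl -k -u <username>:<passwd> "https://localhost:8080/platform/1/audit/topics/config"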

Note that, in the unlikely event that a cluster experiences an outage during which it loses quorum, auditing will be suspended until it is regained. Events similar to the following will be written to the /var/log/audit_d.log file: 

940b5c700]: Lost quorum! Audit logging will be disabled until /ifs is writeable again. 

2023-08-28T15:37:32.132780+00:00 <1.6> TME-1(id1) isi_audit_d[6495]: [0x345940b5c700]: Regained quorum. Logging resuming. 

When it comes to reading audit events on the cluster, OneFS natively provides the handy ‘isi_audit_viewer’ utility. For example, the following audit viewer output shows the events logged when the cluster admin added the ‘/ifs/tmp’ path to the SmartDedupe configuration, and created a new user named ‘test1’:

# isi_audit_viewer

[0: Tue Aug 29 23:01:16 2023] {"id":"f54a6bec-46bf-11ee-920d-0060486e0a26","timestamp":1693350076315499,"payload":{"user":{"token": {"UID":0, "GID":0, "SID": "SID:S-1-22-1-0", "GSID": "SID:S-1-22-2-0", "GROUPS": ["SID:S-1-5-11", "GID:5", "GID:10", "GID:20", "GID:70"], "protocol": 17, "zone id": 1, "client": "10.135.6.255", "local": "10.219.64.11" }},"uri":"/1/dedupe/settings","method":"PUT","args":{},"body":{"paths":["/ifs/tmp"]}}}
[1: Tue Aug 29 23:01:16 2023] {"id":"f54a6bec-46bf-11ee-920d-0060486e0a26","timestamp":1693350076391422,"payload":{"status":204,"statusmsg":"No Content","body":{}}}
[2: Tue Aug 29 23:03:43 2023] {"id":"4cfce7a5-46c0-11ee-920d-0060486e0a26","timestamp":1693350223446993,"payload":{"user":{"token": {"UID":0, "GID":0, "SID": "SID:S-1-22-1-0", "GSID": "SID:S-1-22-2-0", "GROUPS": ["SID:S-1-5-11", "GID:5", "GID:10", "GID:20", "GID:70"], "protocol": 17, "zone id": 1, "client": "10.135.6.255", "local": "10.219.64.11" }},"uri":"/18/auth/users","method":"POST","args":{},"body":{"name":"test1"}}}
[3: Tue Aug 29 23:03:43 2023] {"id":"4cfce7a5-46c0-11ee-920d-0060486e0a26","timestamp":1693350223507797,"payload":{"status":201,"statusmsg":"Created","body":{"id":"SID:S-1-5-21-593535466-4266055735-3901207217-1000"}}}

The audit log entries, such as those above, typically comprise the following components:

  1. Timestamp (human readable)
  2. Unique entry ID
  3. Timestamp (UNIX epoch time)
  4. Node number
  5. The user token of the user executing the command:
    1. User persona (Unix/Windows)
    2. Primary group persona (Unix/Windows)
    3. Supplemental group personas (Unix/Windows)
    4. RBAC privileges of the user executing the command
  6. Interface used to generate the command:
    1. 10 = platform API / WebUI
    2. 16 = Console CLI
    3. 17 = SSH CLI
  7. Access zone that the command was executed against
  8. Client IP (where the user connected from)
  9. Local node address where the command was executed
  10. Command
  11. Command arguments
  12. Command body


The ‘isi_audit_viewer’ utility reads the ‘config’ log topic by default, but it can also be used to read the ‘protocol’ topic. Its CLI command syntax is as follows:

# isi_audit_viewer -h
Usage: isi_audit_viewer [ -n <nodeid> | -t <topic> | -s <starttime>|
         -e <endtime> | -v ]
         -n <nodeid> : Specify node id to browse (default: local node)
         -t <topic>  : Choose topic to browse.
            Topics are "config" and "protocol" (default: "config")
         -s <start>  : Browse audit logs starting at <starttime>
         -e <end>    : Browse audit logs ending at <endtime>
         -v verbose  : Prints out start / end time range before printing
             records

Note that, on large clusters with heavy audit write activity (in the hundreds of thousands of events), running the isi_audit_viewer utility across the cluster with ‘isi_for_array’ can potentially lead to memory starvation and other issues – especially if outputting to a directory under /ifs. As such, consider directing the output to a non-IFS location, such as /var/tmp. Also, the isi_audit_viewer ‘-s’ (start time) and ‘-e’ (end time) flags can be used to limit a search (for example, to a 1-5 minute window), helping reduce the size of the data returned.
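For example, a time-bounded query of the config topic might look like the following (a sketch; the viewer accepts human-readable timestamps, though the exact accepted formats may vary by release):

# isi_audit_viewer -t config -s '2023-08-29 23:00:00' -e '2023-08-29 23:05:00'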

In addition to reading audit events, the viewer is also a useful tool when troubleshooting auditing issues. Any errors that are encountered while processing audit events, and when delivering them to an external CEE server, are written to the log file ‘/var/log/isi_audit_cee.log’. Additionally, the protocol-specific logs will contain any issues that the audit filter encounters while collecting audit events.

Author: Nick Trimbee


Home > Storage > PowerScale (Isilon) > Blogs

OneFS System Configuration Auditing

Nick Trimbee

Thu, 18 Apr 2024 04:55:18 -0000

|

Read Time: 0 minutes

OneFS auditing can detect potential sources of data loss, fraud, inappropriate entitlements, access attempts that should not occur, and a range of other anomalies that are indicators of risk. This can be especially useful when the audit associates data access with specific user identities. 

In the interests of data security, OneFS provides chain of custody auditing by logging specific activity on the cluster. This includes OneFS configuration changes plus NFS, SMB, and HDFS client protocol activity which are required for organizational IT security compliance, as mandated by regulatory bodies like HIPAA, SOX, FISMA, MPAA, and more. 

OneFS auditing uses Dell’s Common Event Enabler (CEE) to provide compatibility with external audit applications. A cluster can write audit events across up to five CEE servers per node in a parallel, load-balanced configuration. This allows OneFS to deliver an end-to-end, enterprise-grade audit solution which integrates efficiently with third-party solutions like Varonis DatAdvantage.

The following diagram outlines the basic architecture of OneFS audit:

Both system configuration changes and protocol activity can be easily audited on a PowerScale cluster. However, the protocol path is greyed out in the diagram above, since it is outside the focus of this article. More information on OneFS protocol auditing can be found here.

As illustrated above, the OneFS audit framework is centered around three main services:

Service            Description
------------------------------------------------------------------------------------------
isi_audit_cee      Service allowing OneFS to support third-party auditing applications.
                   The main method of accessing protocol audit data from OneFS is through
                   a third-party auditing application.
isi_audit_d        Responsible for per-node audit queues and managing the data store for
                   those queues. It provides a protocol on which clients may produce event
                   payloads within a given context. It establishes a Unix domain socket
                   for queue producers, and handles writing and rotation of log files in
                   /ifs/.ifsvar/audit/logs/node###/{config,protocol}/*.
isi_audit_syslog   Daemon providing forwarding of audit config and protocol events to syslog.

The basic configuration auditing workflow sees a cluster config change request come in via the OneFS CLI, WebUI, or platform API. The API handler infrastructure passes this request to the isi_audit_d service, where a client thread intercepts it and adds it to the audit queue. From there, it is processed by a backend thread and written to the audit log files under /ifs as appropriate.

If audit syslog forwarding has been configured, the event is also passed to the isi_audit_syslog daemon, where a supervisor process instructs a writer thread to send it to syslog, which in turn updates its pertinent /var/log/ logfiles.

Similarly, if Common Event Enabler (CEE) forwarding has been enabled, the event is also passed to the isi_audit_cee service, where a delivery worker thread intercepts it and sends it to the CEE server pool. The isi_audit_cee heartbeat task makes CEE servers available for audit event delivery: every ten seconds, the task wakes up and sends each CEE server in the configuration a heartbeat, and only after a CEE server has received a successful heartbeat will audit events be delivered to it. While CEE servers are available and events are in memory, delivery attempts are made. On shutdown, the audit log position is saved only once all events have been delivered to CEE, since audit should not lose events. It isn’t critical that all events are delivered at shutdown, however, because any unsaved events are resent to CEE on the next start of isi_audit_cee, and CEE handles duplicates.

Within OneFS, all audit data is organized by topic and is securely stored in the file system:

# isi audit topics list
Name     Max Cached Messages
-----------------------------
protocol 2048
config   1024
-----------------------------
Total: 2

Auditing can detect a variety of potential sources of data loss, including unauthorized access attempts, inappropriate entitlements, and a bevy of other fraudulent activities that plague organizations across the gamut of industries. Enterprises are increasingly required to comply with stringent regulatory mandates developed to protect against these sources of data theft and loss.

OneFS system configuration auditing is designed to track and record all configuration events that are handled by the API through the command-line interface (CLI).  

# isi audit topics view config
               Name: config
Max Cached Messages: 1024

Once enabled, system configuration auditing requires no additional configuration, and auditing events are automatically stored in the config audit topic directories. Audit access and management is governed by the ‘ISI_PRIV_AUDIT’ RBAC privilege, and OneFS provides a default ‘AuditAdmin’ role for this purpose.

Audit events are stored in a binary file under /ifs/.ifsvar/audit/logs. The logs automatically roll over to a new file after the size reaches 1 GB. The audit logs are consumable by auditing applications that support the Dell Common Event Enabler (CEE).
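For example, a node’s config audit files can be listed directly from the file system (illustrative; per the isi_audit_d description earlier, the node directories follow the node### naming pattern):

# ls /ifs/.ifsvar/audit/logs/node001/config/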

OneFS audit topics and settings can easily be viewed and modified. For example, to increase the configuration auditing maximum cached messages threshold to 2048 from the CLI:

# isi audit topics modify config --max-cached-messages 2048
# isi audit topics view config
               Name: config
Max Cached Messages: 2048

Audit configuration can also be modified or viewed per access zone and/or topic.

Operation                                              CLI Syntax                      Method and URI
----------------------------------------------------------------------------------------------------------------------------------
Get audit settings                                     isi audit settings view         GET <cluster-ip:port>/platform/3/audit/settings
Modify audit settings                                  isi audit settings modify ...   PUT <cluster-ip:port>/platform/3/audit/settings
View JSON schema for the settings resource,            -                               GET <cluster-ip:port>/platform/3/audit/settings?describe
including query parameters and object properties
View JSON schema for the topics resource,              -                               GET <cluster-ip:port>/platform/1/audit/topics?describe
including query parameters and object properties
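For example, to retrieve the self-describing schema for the audit settings resource with the ‘curl’ utility (a sketch; substitute real credentials and the cluster address):

# curl -k -u <username>:<passwd> "https://localhost:8080/platform/3/audit/settings?describe"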

Configuration auditing can be enabled on a cluster from either the CLI or platform API. The current global audit configuration can be viewed as follows:

# isi audit settings global view
     Protocol Auditing Enabled: No
                 Audited Zones: -
               CEE Server URIs: -
                      Hostname:
       Config Auditing Enabled: No
         Config Syslog Enabled: No
         Config Syslog Servers: -
     Config Syslog TLS Enabled: No
  Config Syslog Certificate ID:
       Protocol Syslog Servers: -
   Protocol Syslog TLS Enabled: No
Protocol Syslog Certificate ID:
         System Syslog Enabled: No
         System Syslog Servers: -
     System Syslog TLS Enabled: No
  System Syslog Certificate ID:
          Auto Purging Enabled: No
              Retention Period: 180
       System Auditing Enabled: No

In this case, configuration auditing is disabled – its default setting. The following CLI syntax will enable (and verify) configuration auditing across the cluster:

# isi audit settings global modify --config-auditing-enabled 1
# isi audit settings global view | grep -i 'config audit'
       Config Auditing Enabled: Yes


To enable configuration change audit redirection to syslog:

# isi audit settings global modify --config-auditing-enabled true
# isi audit settings global modify --config-syslog-enabled true
# isi audit settings global view | grep -i 'config audit'
       Config Auditing Enabled: Yes

Similarly, to disable configuration change audit redirection to syslog:

# isi audit settings global modify --config-syslog-enabled false
# isi audit settings global modify --config-auditing-enabled false

Beyond the basic enable/disable controls above, the remaining audit configuration steps from the CLI are as follows. First, add a CEE server URI for event forwarding:

# isi audit settings global modify --add-cee-server-uris='http://seavee5.west.isilon.com:12228/cee'

Next, add the desired access zone(s) to the audited zones list:

# isi audit settings global modify --add-audited-zones=auditgti

The System zone is audited by default, so if that level of auditing is not required, it can be removed from the list:

# isi audit settings global modify --remove-audited-zones=System

The audited access zone itself can be created with the ‘isi zone’ CLI syntax. For example, to audit all failures and all successes:

# isi zone zones create --all-auth-providers=true --audit-failure=all --audit-success=all --path=/ifs/data --name=auditgti

Or, if auditing that much is not desired, the event types can be narrowed. For example, to capture just read and logon failures, plus successful writes and deletes:

# isi zone zones create --all-auth-providers=true --audit-failure=read,logon --audit-success=write,delete --path=/ifs/data --name=auditgti

A network pool can then be associated with the new access zone, if needed (the System zone is audited by default, in which case this step can be skipped):

# isi network create pool --name=subnet0:auditpool --access-zone=auditgti --iface=<your interface> --range=<your range>

Finally, a couple of other optional settings. The audit hostname, which simply gets inserted into the event payload, can be set with:

# isi audit settings global modify --hostname="<hostname>"

And the CEE log time can be specified as follows:

# isi audit settings global modify --cee-log-time="Protocol@1900-01-01 00:00:01"

The platform API can also be used to configure and manage auditing. For example, to enable configuration auditing on a cluster:

PUT /platform/1/audit/settings

Authorization: Basic QWxhZGRpbjpvcGVuIHN1c2FtZQ==

{
    "config_auditing_enabled": true
}

Response example:

The HTTP ‘204’ response code from the cluster indicates that the request was successful, and that configuration auditing is now enabled on the cluster. No message body is returned for this request.

204 No Content
Content-type: text/plain,
Allow: 'GET, PUT, HEAD'

 

Similarly, to modify the config audit topic’s maximum cached messages threshold to a value of ‘1000’ via the API:

PUT /1/audit/topics/config
Authorization: Basic QWxhZGRpbjpvcGVuIHN1c2FtZQ==

    {
        "max_cached_messages": 1000
    }

Again, no message body is returned from OneFS for this request.

204 No Content
Content-type: text/plain,
Allow: 'GET, PUT, HEAD'

Note that, in the unlikely event that a cluster experiences an outage during which it loses quorum, auditing will be suspended until it is regained. Events similar to the following will be written to the /var/log/audit_d.log file:

940b5c700]: Lost quorum! Audit logging will be disabled until /ifs is writeable again.

2023-08-28T15:37:32.132780+00:00 <1.6> TME-1(id1) isi_audit_d[6495]: [0x345940b5c700]: Regained quorum. Logging resuming.

When it comes to reading audit events on the cluster, OneFS natively provides the handy ‘isi_audit_viewer’ utility. For example, the following audit viewer output shows the events logged when the cluster admin added the ‘/ifs/tmp’ path to the SmartDedupe configuration, and created a new user named ‘test1’:

# isi_audit_viewer

[0: Tue Aug 29 23:01:16 2023] {"id":"f54a6bec-46bf-11ee-920d-0060486e0a26","timestamp":1693350076315499,"payload":{"user":{"token": {"UID":0, "GID":0, "SID": "SID:S-1-22-1-0", "GSID": "SID:S-1-22-2-0", "GROUPS": ["SID:S-1-5-11", "GID:5", "GID:10", "GID:20", "GID:70"], "protocol": 17, "zone id": 1, "client": "10.135.6.255", "local": "10.219.64.11" }},"uri":"/1/dedupe/settings","method":"PUT","args":{},"body":{"paths":["/ifs/tmp"]}}}
[1: Tue Aug 29 23:01:16 2023] {"id":"f54a6bec-46bf-11ee-920d-0060486e0a26","timestamp":1693350076391422,"payload":{"status":204,"statusmsg":"No Content","body":{}}}
[2: Tue Aug 29 23:03:43 2023] {"id":"4cfce7a5-46c0-11ee-920d-0060486e0a26","timestamp":1693350223446993,"payload":{"user":{"token": {"UID":0, "GID":0, "SID": "SID:S-1-22-1-0", "GSID": "SID:S-1-22-2-0", "GROUPS": ["SID:S-1-5-11", "GID:5", "GID:10", "GID:20", "GID:70"], "protocol": 17, "zone id": 1, "client": "10.135.6.255", "local": "10.219.64.11" }},"uri":"/18/auth/users","method":"POST","args":{},"body":{"name":"test1"}}}
[3: Tue Aug 29 23:03:43 2023] {"id":"4cfce7a5-46c0-11ee-920d-0060486e0a26","timestamp":1693350223507797,"payload":{"status":201,"statusmsg":"Created","body":{"id":"SID:S-1-5-21-593535466-4266055735-3901207217-1000"}}}

The audit log entries, such as those above, typically comprise the following components:

  1. Timestamp (Human readable)
  2. Unique Entry ID
  3. Timestamp (Unix Epoch Time)
  4. Node Number
  5. The user tokens of the person executing the command
    1. User persona (Unix/Windows)
    2. Primary group persona (Unix/Windows)
    3. Supplemental group personas (Unix/Windows)
    4. RBAC privileges of the person executing the command
  6. Interface used to generate the command
    1. 10 = PAPI / WebUI
    2. 16 = Console
    3. 17 = SSH
  7. Access Zone that the command was executed against
  8. Where the user connected from
  9. The local node address where the command was executed
  10. Command
  11. Command arguments
  12. Command body

The ‘isi_audit_viewer’ utility reads the ‘config’ log topic by default, but it can also be used to read the ‘protocol’ topic. Its CLI command syntax is as follows:

# isi_audit_viewer -h
Usage: isi_audit_viewer [ -n <nodeid> | -t <topic> | -s <starttime>|
         -e <endtime> | -v ]
         -n <nodeid> : Specify node id to browse (default: local node)
         -t <topic>  : Choose topic to browse.
            Topics are "config" and "protocol" (default: "config")
         -s <start>  : Browse audit logs starting at <starttime>
         -e <end>    : Browse audit logs ending at <endtime>
         -v verbose  : Prints out start / end time range before printing
             records

Note that, on large clusters with heavy audit write activity (up to the hundreds of thousands of events), running the isi_audit_viewer utility across the cluster with ‘isi_for_array’ can potentially lead to memory starvation and other issues – especially if outputting to a directory under /ifs. As such, consider directing the output to a non-IFS location, such as /var/tmp. Also, the isi_audit_viewer ‘-s’ (start time) and ‘-e’ (end time) flags can be used to limit a search (for example, to a 1-5 minute window), helping reduce the size of the data returned.

In addition to reading audit events, the viewer is also a useful tool when troubleshooting auditing issues. Any errors that are encountered while processing audit events, and when delivering them to an external CEE server, are written to the log file ‘/var/log/isi_audit_cee.log’. Additionally, the protocol-specific logs will contain any issues that the audit filter encounters while collecting audit events.

In the next article of this series, we’ll take a closer look at config audit management, event viewing, and troubleshooting.

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS logfiles SupportAssist

OneFS Log Gather Transmission

Nick Trimbee

Wed, 17 Apr 2024 15:45:51 -0000

|

Read Time: 0 minutes

The OneFS isi_gather_info utility is the ubiquitous method for collecting and uploading a PowerScale cluster’s context and configuration to assist with the identification and resolution of bugs and issues. As such, it performs the following roles:

  • Executes many commands, scripts, and utilities on a cluster, and saves their results
  • Collates, or gathers, all these files into a single ‘gzipped’ package
  • Optionally transmits this log gather package back to Dell using a choice of several transport methods
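For example, to run a gather locally without transmitting it to Dell (a sketch based on the ‘--upload <boolean>’ option documented later in this article):

# isi_gather_info --upload false

The resulting package is written to the default location described next.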

By default, a log gather tarfile is written to the /ifs/data/Isilon_Support/pkg/ directory. It can also be uploaded to Dell by the following means:

Upload mechanism       Description                                                TCP port   OneFS release support
-----------------------------------------------------------------------------------------------------------------
SupportAssist / ESRS   Uses Dell Secure Remote Support (SRS) for gather upload.   443/8443   Any
FTP                    Use FTP to upload the completed gather.                    21         Any
FTPS                   Use encrypted FTPS to upload the gather.                   22         Default in OneFS 9.5 and later
HTTP                   Use HTTP to upload the gather.                             80/443     Any

As indicated in this table, OneFS 9.5 and later releases now leverage FTPS as the default option for FTP upload, thereby protecting the upload of cluster configuration and logs with an encrypted transmission session.

Under the hood, the log gather process comprises an eight-phase workflow, with transmission forming the penultimate ‘Upload’ phase:

Graphic depicting log gathering process.

The details of each phase are as follows:

Phase                     Description
--------------------------------------------------------------------------------------------
1. Setup                  Reads from the arguments passed in, and from any config files on
                          disk, and sets up the config dictionary, which is used throughout
                          the rest of the codebase. Most of the code for this step is
                          contained in isilon/lib/python/gather/igi_config/configuration.py.
                          This is also the step in which the program is most likely to exit,
                          if any config arguments prove invalid.
2. Run local              Executes all the cluster commands, which are run on the same node
                          that is starting the gather. All these commands run in parallel
                          (up to the current parallelism value). This is typically the
                          second longest running phase.
3. Run nodes              Executes the node commands across all of the cluster's nodes. This
                          runs on each node, and while these commands run in parallel (up to
                          the current parallelism value), they do not run in parallel with
                          the 'Run local' step.
4. Collect                Ensures that all of the results end up on the overlord node (the
                          node that started the gather). If the gather is using /ifs, this
                          is very fast; if not, it needs to SCP all the node results to a
                          single node.
5. Generate Extra Files   Generates nodes_info.xml and package_info.xml. These two files are
                          present in every gather, and provide important metadata about the
                          cluster.
6. Packing                Packs (tars and gzips) all the results. This is typically the
                          longest running phase, often by an order of magnitude.
7. Upload                 Transports the tarfile package to its specified destination using
                          SupportAssist, ESRS, FTPS, FTP, HTTP, and so on. Depending on the
                          geographic location, this phase might also be lengthy.
8. Cleanup                Cleans up any intermediary files that were created on the cluster.
                          This phase runs even if the gather fails or is interrupted.

Because the isi_gather_info tool is primarily intended for troubleshooting clusters with issues, it runs as root (or compadmin in compliance mode), since it needs to be able to execute under degraded conditions (such as without GMP, during upgrade, under cluster splits, and so on). Given these atypical requirements, isi_gather_info is built as a standalone utility, rather than using the platform API for data collection.

While FTPS is the new default and recommended transport, the legacy plaintext FTP upload method is still available in OneFS 9.5 and later. As such, Dell’s log server, ftp.isilon.com, also supports both encrypted FTPS and plaintext FTP, so will not impact older release FTP log upload behavior.

This OneFS 9.5 FTPS security enhancement encompasses three primary areas where an FTPS option is now supported:

  • Directly executing the /usr/bin/isi_gather_info utility
  • Running the ‘isi diagnostics gather’ CLI command set
  • Creating a diagnostics gather through the OneFS WebUI

For the isi_gather_info utility, two new options are included in OneFS 9.5 and later releases:

New option for isi_gather_info   Description                                            Default value
-------------------------------------------------------------------------------------------------------------------
--ftp-insecure                   Enables the gather to use unencrypted FTP transfer.    False
--ftp-ssl-cert                   Enables the user to specify the location of a          Empty string. Not typically required.
                                 special SSL certificate file.

Similarly, there are two corresponding options in OneFS 9.5 and later for the isi diagnostics CLI command:

New option for isi diagnostics   Description                                            Default value
-------------------------------------------------------------------------------------------------------------------
--ftp-upload-insecure            Enables the gather to use unencrypted FTP transfer.    No
--ftp-upload-ssl-cert            Enables the user to specify the location of a          Empty string. Not typically required.
                                 special SSL certificate file.

Based on these options, the following examples show the command syntax for both FTPS and FTP uploads.

Secure upload (default), to the Dell log server (ftp.isilon.com) using encrypted FTPS:

# isi_gather_info
or
# isi_gather_info --ftp

# isi diagnostics gather start
or
# isi diagnostics gather start --ftp-upload-insecure=no

Secure upload, to an alternative server using encrypted FTPS:

# isi_gather_info --ftp-host <FQDN> --ftp-ssl-cert <SSL_CERT_PATH>

# isi diagnostics gather start --ftp-upload-host=<FQDN> --ftp-ssl-cert=<SSL_CERT_PATH>

Unencrypted upload, to the Dell log server (ftp.isilon.com) using plaintext FTP:

# isi_gather_info --ftp-insecure

# isi diagnostics gather start --ftp-upload-insecure=yes

Unencrypted upload, to an alternative server using plaintext FTP:

# isi_gather_info --ftp-insecure --ftp-host <FQDN>

# isi diagnostics gather start --ftp-upload-host=<FQDN> --ftp-upload-insecure=yes

Note that OneFS 9.5 and later releases provide a warning if the cluster admin elects to continue using non-secure FTP for the isi_gather_info tool. Specifically, if the --ftp-insecure option is configured, the following message is displayed, informing the user that plaintext FTP upload is being used, and that the connection and data stream will not be encrypted:

# isi_gather_info --ftp-insecure
You are performing plain text FTP logs upload.
This feature is deprecated and will be removed
in a future release. Please consider the possibility
of using FTPS for logs upload. For further information,
please contact PowerScale support
...

In addition to the command line, log gathers can also be configured using the OneFS WebUI by navigating to Cluster management > Diagnostics > Gather settings.

WebUI screenshot showing FTP/FTPS upload options.

The Edit gather settings page in OneFS 9.5 and later has been updated to reflect FTPS as the default transport method, plus the addition of radio buttons and text boxes to accommodate the new configuration options.

If plaintext FTP upload is configured, the healthcheck command will display a warning that plaintext upload is used and is no longer a recommended option. For example:

CLI screenshot showing a healthcheck warning that plain-text upload is used and is no longer a recommended option.

For reference, the OneFS 9.5 and later isi_gather_info CLI command syntax includes the following options:

Option                               Description
-----------------------------------------------------------------------------------------
--upload <boolean>                   Enable gather upload.
--esrs <boolean>                     Use ESRS for gather upload.
--noesrs                             Do not attempt to upload using ESRS.
--supportassist                      Attempt SupportAssist upload.
--nosupportassist                    Do not attempt to upload using SupportAssist.
--gather-mode (incremental | full)   Type of gather: incremental or full.
--http-insecure <boolean>            Enable insecure HTTP upload on completed gather.
--http-host <string>                 HTTP host to use for HTTP upload.
--http-path <string>                 Path on HTTP server to use for HTTP upload.
--http-proxy <string>                Proxy server to use for HTTP upload.
--http-proxy-port <integer>          Proxy server port to use for HTTP upload.
--ftp <boolean>                      Enable FTP upload on completed gather.
--noftp                              Do not attempt FTP upload.
--set-ftp-password                   Interactively specify alternate password for FTP.
--ftp-host <string>                  FTP host to use for FTP upload.
--ftp-path <string>                  Path on FTP server to use for FTP upload.
--ftp-port <string>                  Specifies alternate FTP port for upload.
--ftp-proxy <string>                 Proxy server to use for FTP upload.
--ftp-proxy-port <integer>           Proxy server port to use for FTP upload.
--ftp-mode <value>                   Mode of FTP file transfer. Valid values are: both,
                                     active, and passive.
--ftp-user <string>                  FTP user to use for FTP upload.
--ftp-pass <string>                  Specify alternative password for FTP.
--ftp-ssl-cert <string>              Specifies the SSL certificate to use in FTPS
                                     connection.
--ftp-upload-insecure <boolean>      Whether to attempt a plaintext FTP upload.
--ftp-upload-pass <string>           Alternate password to use for FTP upload.
--set-ftp-upload-pass                Specify the FTP upload password interactively.

When a logfile gather arrives at Dell, it is automatically unpacked by a support process and analyzed using the logviewer tool.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

PowerScale OneFS 9.8

Nick Trimbee

Tue, 09 Apr 2024 14:00:00 -0000

|

Read Time: 0 minutes

It’s launch season here at Dell Technologies, and PowerScale is already scaling up spring with the innovative OneFS 9.8 release which shipped today, 9th April 2024. This new 9.8 release has something for everyone, introducing PowerScale innovations in cloud, performance, serviceability, and ease of use.

Figure 1. OneFS 9.8 release features: APEX File Storage for Azure, NFSv4.1 over RDMA, Job Engine SmartThrottling, SmartLog and auto-analysis serviceability enhancements, IPv6 source-based routing, streaming write performance, and the multipath client driver

APEX File Storage for Azure

After the debut of APEX File Storage for AWS last year, OneFS 9.8 amplifies PowerScale’s presence in the public cloud by introducing APEX File Storage for Azure.

Figure 2. OneFS 9.8 APEX File Storage for Azure

In addition to providing the same customer-managed OneFS software platform on-prem and in the cloud, APEX File Storage for Azure in OneFS 9.8 provides linear capacity and performance scaling from four to eighteen SSD nodes and up to 3PB per cluster. This makes it a solid fit for AI, ML, and analytics applications, as well as for traditional file shares and home directories, and for vertical workloads like M&E, healthcare, life sciences, and financial services.

Figure 3. Dell PowerScale scale-out architecture

PowerScale’s scale-out architecture can be deployed on customer-managed AWS and Azure infrastructure, providing the capacity and performance needed to run a variety of unstructured workflows in the public cloud.

Once in the cloud, existing PowerScale investments can be further leveraged by accessing and orchestrating your data through the platform's multi-protocol access and APIs. 

This includes the common OneFS control plane (CLI, WebUI, and platform API) and the same enterprise features, such as Multi-protocol, SnapshotIQ, SmartQuotas, Identity management, and so on.        

Simplicity and efficiency

OneFS 9.8 SmartThrottling is an automated impact control mechanism for the job engine, allowing the cluster to automatically throttle job resource consumption if it exceeds pre-defined thresholds in order to prioritize client workloads. 

OneFS 9.8 also delivers automatic on-cluster core file analysis, while SmartLog provides an efficient, granular log file gathering and transmission framework. Both of these new features help dramatically reduce the time to resolution of cluster issues.

Performance

OneFS 9.8 also adds support for Remote Direct Memory Access (RDMA) over NFSv4.1 for applications and clients. This allows substantially higher throughput performance – especially for single-connection and read-intensive workloads such as machine learning and generative AI model training – while also reducing both cluster and client CPU utilization, and it provides the foundation for interoperability with NVIDIA’s GPUDirect.

RDMA over NFSv4.1 in OneFS 9.8 leverages the RoCEv2 network protocol. OneFS CLI and WebUI configuration options include global enablement, plus IP pool configuration, filtering, and verification of RoCEv2-capable network interfaces. NFS over RDMA is available on all PowerScale platforms containing Mellanox ConnectX network adapters on the front end, with a choice of 25, 40, or 100 Gigabit Ethernet connectivity. The OneFS user interface helps easily identify which of a cluster’s NICs support RDMA.
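As a rough sketch of the enablement step, RDMA support is toggled globally via the NFS global settings (the flag shown follows the pattern introduced for NFSv3 over RDMA in earlier OneFS releases, so treat the exact option name as an assumption):

# isi nfs settings global modify --nfs-rdma-enabled=true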

Under the hood, OneFS 9.8 introduces efficiencies such as lock sharding and parallel thread handling, delivering a substantial performance boost for streaming write-heavy workloads such as generative AI inferencing and model training. Performance scales linearly as compute is increased, keeping GPUs busy and allowing PowerScale to easily support AI and ML workflows both small and large. OneFS 9.8 also includes infrastructure support for future node hardware platform generations.

Multipath Client Driver

The addition of a new Multipath Client Driver helps expand PowerScale’s role in Dell Technologies’ strategic collaboration with NVIDIA, delivering the first and only end-to-end large scale AI system. This is based on the PowerScale F710 platform in conjunction with PowerEdge XE9680 GPU servers and NVIDIA’s Spectrum-X Ethernet switching platform to optimize performance and throughput at scale.

In summary, OneFS 9.8 brings the following new features to the Dell PowerScale ecosystem:

Feature          Info
---------------------------------------------------------------------------------------------
Cloud            • APEX File Storage for Azure
                 • Up to 18 SSD nodes and 3PB per cluster
Simplicity       • Job Engine SmartThrottling
                 • Source-based routing for IPv6 networks
Performance      • NFSv4.1 over RDMA
                 • Streaming write performance enhancements
                 • Infrastructure support for next generation all-flash node hardware platform
Serviceability   • Automatic on-cluster core file analysis
                 • SmartLog efficient, granular log file gathering

We’ll be taking a deeper look at this new functionality in blog articles over the course of the next few weeks. 

Meanwhile, the new OneFS 9.8 code is available on the Dell Online Support site, both as an upgrade and reimage file, allowing installation and upgrade of this new release.

 

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale AWS OneFS

PowerScale OneFS 9.7

Nick Trimbee

Wed, 13 Dec 2023 13:55:00 -0000

|

Read Time: 0 minutes

Dell PowerScale is already powering up the holiday season with the launch of the innovative OneFS 9.7 release, which shipped today (13th December 2023). This new 9.7 release is an all-rounder, introducing PowerScale innovations in Cloud, Performance, Security, and ease of use.

After the debut of APEX File Storage for AWS earlier this year, OneFS 9.7 extends and simplifies the PowerScale in the public cloud offering, delivering more features on more instance types across more regions.

In addition to providing the same OneFS software platform on-prem and in the cloud, customer-managed for full control, APEX File Storage for AWS in OneFS 9.7 sees a 60% capacity increase, providing linear capacity and performance scaling up to six SSD nodes and 1.6 PiB per namespace/cluster, with up to 10GB/s reads and 4GB/s writes per cluster. This makes it a solid fit for traditional file shares and home directories, vertical workloads like M&E, healthcare, life sciences, and finserv, and next-gen AI, ML, and analytics applications.

Enhancements to APEX File Storage for AWS

PowerScale’s scale-out architecture can be deployed on customer-managed AWS EC2 and EBS infrastructure, providing the scale and performance needed to run a variety of unstructured workflows in the public cloud. Plus, OneFS 9.7 provides an ‘easy button’ for streamlined AWS infrastructure provisioning and deployment.

Once in the cloud, you can further leverage existing PowerScale investments by accessing and orchestrating your data through the platform's multi-protocol access and APIs.

This includes the common OneFS control plane (CLI, WebUI, and platform API), and the same enterprise features: Multi-protocol, SnapshotIQ, SmartQuotas, Identity management, and so on.

With OneFS 9.7, APEX File Storage for AWS also sees the addition of support for HDFS and FTP protocols, in addition to NFS, SMB, and S3. Granular performance prioritization and throttling is also enabled with SmartQoS, allowing admins to configure limits on the maximum number of protocol operations that NFS, S3, SMB, or mixed protocol workloads can consume on an APEX File Storage for AWS cluster.

Security

With data integrity and protection being top of mind in this era of unprecedented cyber threats, OneFS 9.7 brings a bevy of new features and functionality to keep your unstructured data and workloads more secure than ever. These new OneFS 9.7 security enhancements help address US Federal and DoD mandates, such as FIPS 140-2 and DISA STIGs – in addition to general enterprise data security requirements. Included in the new OneFS 9.7 release is a simple cluster configuration backup and restore utility, address space layout randomization, and single sign-on (SSO) lookup enhancements.

Data mobility

On the data replication front, SmartSync sees the introduction of GCP as an object storage target in OneFS 9.7, in addition to ECS, AWS and Azure. The SmartSync data mover allows flexible data movement and copying, incremental resyncs, push and pull data transfer, and one-time file to object copy.

Performance improvements

Building on the streaming read performance delivered in a prior release, OneFS 9.7 also unlocks dramatic write performance enhancements, particularly for the all-flash NVMe platforms - plus infrastructure support for future node hardware platform generations. A sizable boost in throughput to a single client helps deliver performance for the most demanding GenAI workloads, particularly for the model training and inferencing phases. Additionally, the scale-out cluster architecture enables performance to scale linearly as GPUs are increased, allowing PowerScale to easily support AI workflows from small to large.

Cluster support for InsightIQ 5.0

The new InsightIQ 5.0 software expands PowerScale monitoring capabilities, including a new user interface, automated email alerts, and added security. InsightIQ 5.0 is available today for all existing and new PowerScale customers at no additional charge. These innovations are designed to simplify management, expand scale and security, and automate operations for PowerScale performance monitoring for AI, GenAI, and all other workloads.

In summary, OneFS 9.7 brings the following new features and functionality to the Dell PowerScale ecosystem:

  • Cloud: APEX File Storage for AWS capacity increase, HDFS and FTP protocol support, and SmartQoS
  • Security: cluster configuration backup and restore, address space layout randomization, and SSO lookup enhancements
  • Data mobility: SmartSync support for GCP as an object storage target
  • Performance: write performance enhancements, particularly for the all-flash NVMe platforms
  • Monitoring: cluster support for InsightIQ 5.0

We’ll be taking a deeper look at these new features and functionality in blog articles over the course of the next few weeks. 

Meanwhile, the new OneFS 9.7 code is available on the Dell Support site, as both an upgrade and reimage file, allowing both installation and upgrade of this new release. 

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

PowerScale Platform Update

Nick Trimbee

Thu, 07 Dec 2023 00:51:33 -0000

|

Read Time: 0 minutes

In this article, we’ll take a quick peek at the new PowerScale Hybrid H700/7000 and Archive A300/3000 hardware platforms that were released last month. So, the current PowerScale platform family hierarchy is as follows:

 

Here’s the lowdown on the new additions to the hardware portfolio: 

Model   Tier             Drives per Chassis         Max Chassis Capacity (16TB HDD)   CPU per Node   Memory per Node   Network
--------------------------------------------------------------------------------------------------------------------------------------------
H700    Hybrid/Utility   Standard: 60 x 3.5” HDD    960TB                             2.9GHz, 16c    384GB             FE: 100GbE; BE: 100GbE or IB
H7000   Hybrid/Utility   Deep: 80 x 3.5” HDD        1280TB                            2.9GHz, 16c    384GB             FE: 100GbE; BE: 100GbE or IB
A300    Archive          Standard: 60 x 3.5” HDD    960TB                             1.9GHz, 6c     96GB              FE: 25GbE; BE: 25GbE or IB
A3000   Archive          Deep: 80 x 3.5” HDD        1280TB                            1.9GHz, 6c     96GB              FE: 25GbE; BE: 25GbE or IB

 

The PowerScale H700 provides performance and value to support demanding file workloads. With up to 960 TB of HDD per chassis, the H700 also includes inline compression and deduplication capabilities to further extend the usable capacity.

The PowerScale H7000 is a versatile, high-performance, high-capacity hybrid platform with up to 1280 TB per chassis. The deep chassis-based H7000 is ideal for consolidating a range of file workloads on a single platform. The H7000 also includes inline compression and deduplication capabilities.

On the active archive side, the PowerScale A300 combines performance, near-primary accessibility, value, and ease of use. The A300 provides from 120 TB to 960 TB per chassis and scales to 60 PB in a single cluster. The A300 also includes inline compression and deduplication capabilities.

The PowerScale A3000 is an ideal solution for high-performance, high-density, deep archive storage that safeguards data efficiently for long-term retention. The A3000 stores up to 1280 TB per chassis and scales to north of 80 PB in a single cluster. The A3000 also includes inline compression and deduplication.

These new H700/7000 and A300/3000 nodes require OneFS 9.2.1, and can be seamlessly added to an existing cluster. They offer the full complement of OneFS data services, including snapshots, replication, quotas, analytics, data reduction, load balancing, and local and cloud tiering, and all contain SSDs.

Unlike the all-flash PowerScale F900, F600, and F200 stand-alone nodes, which require a minimum of three nodes to form a cluster, these chassis-based platforms require a single chassis of four nodes to create a cluster, and offer support for both InfiniBand and Ethernet backend network connectivity.

Each H700/7000 and A300/3000 chassis contains four compute modules (one per node) and five drive containers, or sleds, per node. These sleds occupy bays in the front of each chassis, with a node’s drive sleds stacked vertically:

 

 

The drive sled is a tray that slides into the front of the chassis and contains between three and four 3.5-inch drives in an H700/7000 or A300/3000, depending on the drive size and configuration of the particular node. Both regular hard drives and self-encrypting drives (SEDs) are available, in 2, 4, 8, 12, and 16TB capacities.

 

 

Each drive sled has a white ‘not safe to remove’ LED on its front top left, as well as a blue power/activity LED, and an amber fault LED.

The compute modules for each node are housed in the rear of the chassis and contain the CPU, memory, networking, SSDs, and power supplies. Nodes 1 and 2 form a node pair, as do nodes 3 and 4, and each node pair shares a mirrored journal and two power supplies.

Here’s the detail of an individual compute module, which contains a multi-core Cascade Lake CPU, memory with six DIMM channels, an M.2 flash journal, up to two SSDs for L3 cache, 40/100GbE or 10/25GbE front-end connectivity, 40/100GbE or 10/25GbE Ethernet or InfiniBand back-end connectivity, an Ethernet management interface, and a power supply and cooling fans:

Of particular interest is the ‘journal active’ LED, which is displayed as a white ‘hand icon’. When this is illuminated, it indicates that the mirrored journal is actively vaulting.

A node’s compute module should not be removed from the chassis while this LED is lit!

On the front of each chassis is an LCD front panel control with back-lit buttons and four LED light bar segments, one per node. These LEDs typically display blue for normal operation, or yellow to indicate a node fault. The LCD display is hinged, so it can be swung clear of the drive sleds for non-disruptive HDD replacement, for example.

So, in summary, the new Gen6 hardware delivers:

  • More Power
    • More cores, more memory and more cache 
    • A300/3000 up to 2x faster than previous generation (A200/2000)
  • More Choice
    • 100 GbE, 25 GbE, and InfiniBand options for cluster interconnect
    • Node compatibility for all hybrid and archive nodes
    • 30TB to 320TB per rack unit 
  • More Value
    • Inline data reduction across the PowerScale family
    • Lowest $/GB and most density among comparable solutions

 

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS troubleshooting SSO

OneFS WebUI Single Sign-on Management and Troubleshooting

Nick Trimbee Nick Trimbee

Thu, 16 Nov 2023 20:53:16 -0000

|

Read Time: 0 minutes

Earlier in this series, we took a look at the architecture of the new OneFS WebUI SSO functionality. Now, we move on to its management and troubleshooting.

As we saw in the previous article, once the IdP and SP are configured, a cluster admin can enable SSO per access zone using the OneFS WebUI by navigating to Access > Authentication providers > SSO. From here, select the desired access zone and click the ‘Enable SSO’ toggle:

Or from the OneFS CLI using the following syntax:

# isi auth sso settings modify --sso-enabled 1

Once complete, the SSO configuration can be verified from a client web browser by browsing to the OneFS login screen. If all is operating correctly, redirection to the ADFS login screen will occur. For example:

After successful authentication with ADFS, cluster access is granted and the browser session is redirected back to the OneFS WebUI.

In addition to the new SSO WebUI pages, OneFS 9.5 also adds a subcommand to the ‘isi auth’ command set for configuring SSO from the CLI. This new syntax includes:

  • isi auth sso idps
  • isi auth sso settings  
  • isi auth sso sp

With these, you can use the following procedure to configure and enable SSO using the OneFS command line.

1. Define the ADFS instance in OneFS.

Enter the following command to create the IdP account:

# isi auth ads create <domain_name> <user> --password=<password> ...

where:

<domain_name>: The fully qualified Active Directory domain name that identifies the ADFS server. For example, idp1.isilon.com.

<user>: The user account that has permission to join machines to the given domain.

<password>: The password for <user>.

2. Next, add the IdP to the pertinent OneFS zone. Note that each of a cluster's access zones must have an IdP configured for it. The same IdP can be used for all zones, but each access zone must be configured separately.

# isi zone zones modify --add-auth-providers

For example:

# isi zone zones modify system --add-auth-providers=lsa-activedirectoryprovider:idp1.isilon.com

3. Verify that OneFS can find users in Active Directory.

# isi auth users view idp1.isilon.com\\<username>

In the output, ensure that an email address is displayed. If not, return to Active Directory and assign email addresses to users.

4. Configure the OneFS hostname for SAML SSO.

# isi auth sso sp modify --hostname=<name>

Where <name> is the name that SAML SSO can use to represent the OneFS cluster to ADFS. SAML redirects clients to this hostname.

5. Obtain the ADFS metadata and store it under /ifs on the cluster.

In the following example, an HTTPS GET request is issued using the 'curl' utility to obtain the metadata from the IDP and store it under /ifs on the cluster.

# curl -o /ifs/adfs.xml https://idp1.isilon.com/FederationMetadata/2007-06/FederationMetadata.xml

6. Create the IdP on OneFS using the ‘metadata-location’ path for the xml file in the previous step.

# isi auth sso idps create idp1.isilon.com --metadata-location="/ifs/adfs.xml"

7. Enable SSO:

# isi auth sso settings modify --sso-enabled=yes --zone <zone>

Use the following syntax to view the IdP configuration:

# isi auth sso idps view <idp_ID>

For example:

# isi auth sso idps view idp
ID: idp
Metadata Location: /ifs/adfs.xml
Entity ID: https://dns.isilon.com/adfs/services/trust
Login endpoint of the IDP
URL: https://dns.isilon.com/adfs/ls/
Binding: urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect
Logout endpoint of the IDP
URL: https://dns.isilon.com/adfs/ls/
Binding: urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect
Response URL: -
Type: metadata
Signing Certificate: -
        Path:
        Issuer: CN=ADFS Signing - dns.isilon.com
        Subject: CN=ADFS Signing - dns.isilon.com
        Not Before: 2023-02-02T22:22:00
        Not After: 2024-02-02T22:22:00
        Status: valid
Value and Type
        Value: -----BEGIN CERTIFICATE-----
MITC9DCCAdygAwIBAgIQQQQc55appr1CtfPNj5kv+DANBgkqhk1G9w8BAQsFADA2
<snip>

Troubleshooting

If the IdP and/or SP signing certificate happens to expire, users will be unable to log in to the cluster with SSO, and an error message will be displayed on the login screen.

In this example, the IdP certificate has expired, as described in the alert message. When this occurs, a warning is also displayed on the SSO Authentication page, as shown here:

To correct this, download either a new signing certificate from the identity provider or a new metadata file containing the IdP certificate details. When this is complete, you can then update the cluster’s IdP configuration by uploading the XML file or the new certificate.
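For instance, if the refreshed ADFS metadata has been saved to a new XML file under /ifs, the cluster's IdP definition could be updated along the following lines. Note that this is a sketch: the 'modify' subcommand and the metadata file name are assumptions, mirroring the 'isi auth sso idps create' syntax shown in the previous article:

# isi auth sso idps modify idp1.isilon.com --metadata-location="/ifs/adfs_renewed.xml"

The updated configuration can then be confirmed with 'isi auth sso idps view'.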

Similarly, if the SP certificate has expired, the following notification alert is displayed upon attempted login:

The following error message is also displayed on the WebUI SSO tab, under Access > Authentication providers > SSO, along with a link to regenerate the metadata file:

The expired SP signing key and certificate can also be easily regenerated from the OneFS CLI:

# isi auth sso sp signing-key rekey
This command will delete any existing signing key and certificate and replace them with a newly generated signing key and certificate. Make sure the newly generated certificate is added to the IDP to ensure that the IDP can verify messages sent from the cluster. Are you sure?  (yes/[no]):   yes
# isi auth sso sp signing-key dump
-----BEGIN CERTIFICATE-----
MIIE6TCCAtGgAwIBAgIJAP30nSyYUz/cMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNVBAMMG1Bvd2VyU2NhbGUgU0FNTCBTaWduaWSnIEtleTAeFw0yMjExMTUwMzU0NTFaFw0yMzExMTUwMzU0NTFaMCYxJDAiBgNVBAMMG1Bvd2VyU2NhbGUgU0FNTCBTaWduaWSnIEtleTCCAilwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAMOOmYJ1aUuxvyH0nbUMurMbQubgtdpVBevy12D3qn+x7rgym8/v50da/4xpMmv/zbE0zJ0IVbWHZedibtQhLZ1qRSY/vBlaztU/nA90XQzXMnckzpcunOTG29SMO3x3Ud4*fqcP4sKhV
<snip>

When it is regenerated, either the XML file or certificate can be downloaded, and the cluster configuration updated by either metadata download or manual copy:

Finally, upload the SP details back to the identity provider.

For additional troubleshooting of OneFS SSO and authentication issues, there are some key log files to check. These include:

/var/log/isi_saml_d.log: SAML-specific log messages, logged by isi_saml_d.

/var/log/apache2/webui_httpd_error.log: WebUI error messages, including some SAML errors, logged by the WebUI HTTP server.

/var/log/jwt.log: Errors related to token generation, logged by the JWT service.

/var/log/lsassd.log: General authentication errors logged by the 'lsassd' service, such as failures to look up users by email.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS SSL

OneFS SSL Certificate Renewal – Part 1

Nick Trimbee Nick Trimbee

Thu, 16 Nov 2023 04:57:00 -0000

|

Read Time: 0 minutes

When using either the OneFS WebUI or platform API (pAPI), all communication sessions are encrypted using SSL (Secure Sockets Layer), or more precisely its successor, Transport Layer Security (TLS). In this series, we will look at how to replace or renew the SSL certificate for the OneFS WebUI.

SSL requires a certificate that serves two principal functions: It grants permission to use encrypted communication using Public Key Infrastructure and authenticates the identity of the certificate’s holder.

Architecturally, SSL consists of four fundamental components:

Alert: Reports issues.

Change cipher spec: Implements the negotiated crypto parameters.

Handshake: Negotiates the crypto parameters for the SSL session; can be used for many SSL/TCP connections.

Record: Provides encryption and MAC.

These sit in the stack as follows:

The basic handshake process begins with a client requesting an HTTPS WebUI session to the cluster. OneFS then returns the SSL certificate and public key. The client creates a session key and encrypts it with the public key it received from OneFS; at this point, only the client knows the session key. The client then sends the encrypted session key to the cluster, which decrypts it with its private key, so both the client and OneFS now know the session key. Finally, the session, encrypted with this symmetric session key, can be established. OneFS automatically defaults to the best supported version of SSL/TLS, based on the client request.
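To see what is actually negotiated, the standard openssl utility can be run from any host with OpenSSL 1.1 or later (the '-brief' flag is assumed to be available in that client's OpenSSL build). This prints the negotiated protocol version and cipher suite for a WebUI connection; the output shown is illustrative and will vary with cluster and client configuration:

# echo QUIT | openssl s_client -brief -connect <cluster_name>:8080
CONNECTION ESTABLISHED
Protocol version: TLSv1.2
Ciphersuite: ECDHE-RSA-AES256-GCM-SHA384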

A PowerScale cluster initially contains a self-signed certificate, which can be used as-is or replaced with a certificate issued by a third-party certificate authority (CA). If the self-signed certificate is used, upon expiry it must be replaced with either a third-party (public or private) CA-issued certificate or another self-signed certificate generated on the cluster. The following are the default locations for the server.crt and server.key files.

SSL certificate: /usr/local/apache2/conf/ssl.crt/server.crt

SSL certificate key: /usr/local/apache2/conf/ssl.key/server.key

The ‘isi certificate settings view’ CLI command displays all of the certificate-related configuration options. For example:

# isi certificate settings view

         Certificate Monitor Enabled: Yes

Certificate Pre Expiration Threshold: 4W2D

           Default HTTPS Certificate

                                      ID: default

                                 Subject: C=US, ST=Washington, L=Seattle, O="Isilon", OU=Isilon, CN=Dell, emailAddress=tme@isilon.com

                                  Status: valid

The above 'Certificate Monitor Enabled' and 'Certificate Pre Expiration Threshold' configuration options govern a nightly cron job, which monitors the expiration of each managed certificate and fires a CELOG alert if a certificate is due to expire within the configured threshold. Note that the default threshold is 4W2D (4 weeks plus 2 days, or 30 days). The 'ID: default' configuration option indicates that this certificate is the cluster's default TLS certificate.
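To manually check when a certificate expires, the standard openssl utility can be run against the default certificate location listed above (the output shown is illustrative):

# openssl x509 -enddate -noout -in /usr/local/apache2/conf/ssl.crt/server.crt
notAfter=Oct 11 10:45:52 2025 GMT

The date reported should match the 'Expires' field shown by 'isi certificate server list'.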

The basic certificate renewal or creation flow is as follows:

The steps below include options to complete a self-signed certificate replacement or renewal, or to request an SSL replacement or renewal from a Certificate Authority (CA).

Backing up the existing SSL certificate

The first task is to obtain the list of certificates by running the following CLI command, and identify the appropriate one to renew:

# isi certificate server list

ID      Name    Status  Expires

-------------------------------------------

eb0703b default valid   2025-10-11T10:45:52

-------------------------------------------

It is always prudent to save a backup of the original certificate and key. This can easily be accomplished using the following CLI commands, which, in this case, create the /ifs/data/ssl_bkup directory, set its permissions to root-only access, and copy the original key and certificate to it:

# mkdir -p /ifs/data/ssl_bkup

# chmod 700 /ifs/data/ssl_bkup

# cp /usr/local/apache24/conf/ssl.crt/server.crt /ifs/data/ssl_bkup

# cp /usr/local/apache24/conf/ssl.key/server.key /ifs/data/ssl_bkup

# cd !$

cd /ifs/data/ssl_bkup

# ls

server.crt      server.key

Renewing or creating a certificate

The next step in the process involves either the renewal of an existing certificate or creation of a certificate from scratch. In either case, first, create a temporary directory, for example /ifs/tmp:

# mkdir /ifs/tmp; cd /ifs/tmp

a)       Renew an existing self-signed Certificate.

The following syntax creates a renewal certificate based on the existing ssl.key. The value of the '-days' parameter can be adjusted to generate a certificate with the desired expiration date. For example, the following command will create a one-year certificate:

# cp /usr/local/apache2/conf/ssl.key/server.key ./ ; openssl req -new -days 365 -nodes -x509 -key server.key -out server.crt

Answer the system prompts to complete the self-signed SSL certificate generation process, entering the pertinent location and contact information. For example:

Country Name (2 letter code) [AU]:US
 State or Province Name (full name) [Some-State]:Washington
 Locality Name (eg, city) []:Seattle
 Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon
 Organizational Unit Name (eg, section) []:TME
 Common Name (e.g. server FQDN or YOUR name) []:isilon.com
 Email Address []:tme@isilon.com

When all the information has been successfully entered, the server.crt and server.key files will be present under the /ifs/tmp directory.

Optionally,  the attributes and integrity of the certificate can be verified with the following syntax:

# openssl x509 -text -noout -in server.crt

Next, proceed directly to the ‘Add the certificate to the cluster’ steps in section 4 of this article.

b)      Alternatively, a certificate and key can be generated from scratch, if preferred.

The following CLI command can be used to create a 2048-bit RSA private key:

# openssl genrsa -out server.key 2048

Generating RSA private key, 2048 bit long modulus
............+++++
...........................................................+++++
e is 65537 (0x10001)

Next, create a certificate signing request:

# openssl req -new -nodes -key server.key -out server.csr

Or, for example, to include a Subject Alternative Name (SAN) in the request:

# openssl req -new -nodes -key server.key -out server.csr -reqexts SAN -config <(cat /etc/ssl/openssl.cnf <(printf "[SAN]\nsubjectAltName=DNS:isilon.com"))

You are about to be asked to enter information that will be incorporated

into your certificate request.

What you are about to enter is what is called a Distinguished Name or a DN.

There are quite a few fields but you can leave some blank

For some fields there will be a default value,

If you enter '.', the field will be left blank.

-----

Country Name (2 letter code) [AU]:US

State or Province Name (full name) [Some-State]:WA

Locality Name (eg, city) []:Seattle

Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon

Organizational Unit Name (eg, section) []:TME

Common Name (e.g. server FQDN or YOUR name) []:h7001

Email Address []:tme@isilon.com

Please enter the following 'extra' attributes

to be sent with your certificate request

A challenge password []:1234

An optional company name []:

#

Answer the system prompts to complete the certificate signing request, entering the pertinent location and contact information. Additionally, a 'challenge password' of at least 4 characters must be chosen and entered.

As prompted, enter the information to be incorporated into the certificate request. When completed, the server.csr and server.key files will appear in the /ifs/tmp directory.

If desired, a CSR file for a Certificate Authority that includes Subject Alternative Names (SANs) can be generated, as shown above. Additional host name entries can be added using commas (for example, DNS:isilon.com,DNS:www.isilon.com).
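Before submitting the request to a CA, the attributes and signature of the CSR can optionally be sanity checked with openssl; any SAN entries will appear under the 'Requested Extensions' section of the output:

# openssl req -text -noout -verify -in server.csr
verify OK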

In the next article, we will look at the certificate signing, addition, and verification steps of the process.

 

Home > Storage > PowerScale (Isilon) > Blogs

Isilon PowerScale OneFS NAS Dell Cluster Scale-out

OneFS NFS Locking and Reporting – Part 2

Nick Trimbee Nick Trimbee

Mon, 13 Nov 2023 17:58:49 -0000

|

Read Time: 0 minutes

In the previous article in this series, we took a look at the new NFS locks and waiters reporting CLI command set and API endpoints. Next, we turn our attention to some additional context, caveats, and NFSv3 lock removal.

Before the NFS locking enhancements in OneFS 9.5, the legacy CLI commands were somewhat inefficient. Their output also included other advisory domain locks, such as SMB, which made it more difficult to parse. The table below maps the new 9.5 CLI commands (and corresponding handlers) to the old NLM syntax.

Type       OneFS 9.5 and later        OneFS 9.4 and earlier
Locks      isi nfs locks              isi nfs nlm locks
Sessions   isi nfs nlm sessions       isi nfs nlm sessions
Waiters    isi nfs locks waiters      isi nfs nlm locks waiters

Note that the isi_classic nfs locks and waiters CLI commands have also been deprecated in OneFS 9.5.

When upgrading to OneFS 9.5 or later from a prior release, the legacy platform API handlers continue to function both during and after the upgrade, so any legacy scripts and automation are protected from this lock reporting deprecation. Additionally, while the new platform API handlers will work during a rolling upgrade in mixed mode, they will only return results for the nodes that have already been upgraded.

Be aware that the NFS locking CLI framework does not support partial responses. However, if a node is down or the cluster has a rolling upgrade in progress, the alternative is to query the equivalent platform API endpoint instead.

Performance-wise, on very large, busy clusters, there is the possibility that the lock and waiter CLI commands' output will be sluggish. In such instances, the --timeout flag can be used to increase the command timeout window, and output filtering can be used to reduce the number of locks reported.
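For example, to list just the NFSv3 locks with an extended two-minute timeout (both values here are purely illustrative):

# isi --timeout 120 nfs locks list --version=v3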

When a lock is in a transition state, there is a chance that it may not have, or report, a version. In these instances, the Version field is displayed as '-'. For example:

# isi nfs locks list -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295164422
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:03:52
Version: -
---------------------------------------------------------------
Total: 1

This behavior should be experienced very infrequently. However, if it is encountered, simply execute the CLI command again, and the lock version should be reported correctly.

When it comes to troubleshooting NFSv3/NLM issues, if an NFSv3 client is consistently experiencing NLM_DENIED or other lock management issues, this is often the result of incorrectly configured firewall rules. For example, take the following packet capture (PCAP) excerpt from a Linux NFSv3 client:

   21 08:50:42.173300992  10.22.10.100 → 10.22.10.200 NLM 106    V4 LOCK Reply (Call In 19) NLM_DENIED

Often, the assumption is that only the lockd or statd ports on the server side of the firewall need to be opened, and that the client always initiates the connection. However, this is not the case. Instead, the server will continually respond with a 'let me get back to you', then later connect back to the client. As such, if the firewall blocks access to rpcbind, lockd, or statd on the client, connection failures will likely occur.
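Assuming the standard rpcinfo utility is available on or near the client, it can be used to confirm which ports the client's rpcbind, lockd (nlockmgr), and statd (status) services are registered on. The ports shown here are illustrative, and the nlockmgr and status ports are typically dynamic:

# rpcinfo -p <client_IP>
   program vers proto   port  service
    100000    4   tcp    111  rpcbind
    100021    4   tcp    912  nlockmgr
    100024    1   tcp    891  status

Firewall rules therefore need to account for these dynamic ports, or the services must be pinned to fixed ports on the client.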

Occasionally, it does become necessary to remove NLM locks and waiters from the cluster. Traditionally, the isi_classic nfs clients rm command was used; however, that command has limitations and is fully deprecated in OneFS 9.5 and later. Instead, the preferred method is to use the isi nfs nlm sessions CLI utility, in conjunction with various other ancillary OneFS CLI commands, to clear problematic locks and waiters.

Note that the isi nfs nlm sessions CLI command, available in all current OneFS versions, is zone-aware. The output for the client holding the lock now shows the zone ID number at the beginning of the entry. For example:

 4/tme-linux1/10.22.10.250 

This represents:

Zone ID 4 / Client tme-linux1 / IP address of cluster node holding the connection.

A basic procedure to remove NLM locks and waiters from a cluster is as follows: 
 
1. List the NFS locks and search for the pertinent filename. 

In OneFS 9.5 and later, the locks list can be filtered using the --path argument.

# isi nfs locks list --path=<path> | grep <filename>

Be aware that the full path must be specified, starting with /ifs. There is no partial matching or substitution for paths in this command set.

For OneFS 9.4 and earlier, the following CLI syntax can be used:

#  isi_for_array -sX 'isi nfs nlm locks list | grep <filename>'


2. List the lock waiters associated with the same filename using |grep.

For OneFS 9.5 and later, the waiters list can also be filtered using the --path syntax:

# isi nfs locks waiters --path=<path> | grep <filename>

With OneFS 9.4 and earlier, the following CLI syntax can be used:

# isi_for_array -sX 'isi nfs nlm locks waiters |grep -i <filename>'


3. Confirm the client and logical inode number (LIN) being waited upon. 

This can be accomplished by querying the efs.advlock.failover.lock_waiters sysctl. For example:

# isi_for_array -sX 'sysctl efs.advlock.failover.lock_waiters'

[truncated output]
 ...
 client = { '4/tme-linux1/10.20.10.200', 0x26593d37370041 }
 ...
resource = 2:df86:0218

Note that for sanity checking, the isi get -L CLI utility can be used to confirm the path of a file from its LIN:

# isi get -L <LIN>


4. Remove the unwanted locks which are causing waiters to stack up. 

Keep in mind that the isi nfs nlm sessions command syntax is access zone-aware.

List the access zones by their IDs.

# isi zone zones list -v | grep -iE "Zone ID|name"

Once the desired zone ID has been determined, the isi_run -z CLI utility can be used to specify the appropriate zone in which to run the isi nfs nlm sessions commands: 

# isi_run -z 4 -l root

Next, the isi nfs nlm sessions delete CLI command will remove the specific lock waiter which is causing the issue. The command syntax requires specifying the client hostname and node IP of the node holding the lock. 

# isi nfs nlm sessions delete --zone <AZ_zone_ID> <hostname> <cluster-ip>

For example:

# isi nfs nlm sessions delete --zone 4 tme-linux1 10.20.10.200
 Are you sure you want to delete all NFSv3 locks associated with client tme-linux1 against cluster IP 10.20.10.200? (yes/[no]): yes


5. Repeat the commands in step 1 to confirm that the desired NLM locks and waiters have been successfully culled.
 


BEFORE applying the process....

 # isi_for_array -sX 'isi nfs nlm locks list |grep JUN'
 TME-1: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-1: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-2: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-2: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-3: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-3: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-4: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-4: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-5: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-5: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-6: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_27JUN2017
 TME-6: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 
 
 # isi_for_array -sX 'isi nfs nlm locks waiters |grep -i JUN'
 TME-1: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-1: 4/tme-linux1/192.168.2.214  /ifs/tmp/TME/sequences/mncr_fabjob_seq_file_28JUN2017
 TME-2 exited with status 1
 TME-3 exited with status 1
 TME-4 exited with status 1
 TME-5 exited with status 1
 TME-6 exited with status 1


AFTER...

TME-1# isi nfs nlm sessions delete --hostname=tme-linux1 --cluster-ip=192.168.2.214
 Are you sure you want to delete all NFSv3 locks associated with client tme-linux1 against cluster IP 192.168.2.214? (yes/[no]): yes
 TME-1#
 TME-1#
 TME-1# isi_for_array -sX 'sysctl efs.advlock.failover.locks |grep 2:ce75:0319'
 TME-1 exited with status 1
 TME-2 exited with status 1
 TME-3 exited with status 1
 TME-4 exited with status 1
 TME-5 exited with status 1
 TME-6 exited with status 1
 TME-1#
 TME-1# isi_for_array -sX 'isi nfs nlm locks list |grep -i JUN'
 TME-1 exited with status 1
 TME-2 exited with status 1
 TME-3 exited with status 1
 TME-4 exited with status 1
 TME-5 exited with status 1
 TME-6 exited with status 1
 TME-1#
 TME-1# isi_for_array -sX 'isi nfs nlm locks waiters |grep -i JUN'
 TME-1 exited with status 1
 TME-2 exited with status 1
 TME-3 exited with status 1
 TME-4 exited with status 1
 TME-5 exited with status 1
 TME-6 exited with status 1

 

Author: Nick Trimbee


Home > Storage > PowerScale (Isilon) > Blogs

Isilon PowerScale OneFS NAS Dell Cluster Scale-out

OneFS NFS Locking

Nick Trimbee Nick Trimbee

Mon, 13 Nov 2023 17:56:59 -0000

|

Read Time: 0 minutes

Included among the plethora of OneFS 9.5 enhancements is an updated NFS lock reporting infrastructure, command set, and corresponding platform API endpoints. This new functionality includes enhanced listing and filtering options for both locks and waiters, based on NFS major version, client, LIN, path, creation time, etc. But first, some backstory.

The ubiquitous NFS protocol underwent some fundamental architectural changes between its versions 3 and 4. One of the major differences concerns the area of file locking.

NFSv4 is the most current major version of the protocol, natively incorporating file locking and thereby avoiding the need for any additional (and convoluted) RPC callback mechanisms necessary with prior NFS versions. With NFSv4, locking is built into the main file protocol and supports new lock types, such as range locks, share reservations, and delegations/oplocks, which emulate those found in Windows and SMB.

File lock state is maintained at the server under a lease-based model. A server defines a single lease period for all states held by an NFS client. If the client does not renew its lease within the defined period, all states associated with the client's lease may be released by the server. If released, the client may either explicitly renew its lease or simply issue a read request or other associated operation. Additionally, with NFSv4, a client can elect whether to lock the entire file or a byte range within a file. 

In contrast to NFSv4, the NFSv3 protocol is stateless and does not natively support file locking. Instead, the ancillary Network Lock Manager (NLM) protocol supplies the locking layer. Since file locking is inherently stateful, NLM itself is considered stateful. For example, when an NFSv3 filesystem mounted on an NFS client receives a request to lock a file, it generates an NLM remote procedure call instead of an NFS remote procedure call. 

The NLM protocol itself consists of remote procedure calls that emulate the standard UNIX file control (fcntl) arguments and outputs. Because a process blocks waiting for a lock that conflicts with another lock holder – also known as a ‘blocking lock’ – the NLM protocol has the notion of callbacks from the file server to the NLM client to notify that a lock is available. As such, the NLM client sometimes acts as an RPC server in order to receive delayed results from lock calls. 

State
  NFSv3: Stateless. A client does not technically establish a new session if it has the correct information to ask for files and so on. This allows for simple failover between OneFS nodes using dynamic IP pools.
  NFSv4: Stateful. NFSv4 uses sessions to handle communication, so both client and server must track session state to continue communicating.

Presentation
  NFSv3: User and group info is presented numerically. Client and server communicate user information by numeric identifiers, allowing the same user to appear as different names between client and server.
  NFSv4: User and group info is presented as strings. Both the client and server must resolve the names of the numeric information stored. The server must look up names to present, while the client must remap those to numbers on its end.

Locking
  NFSv3: File locking is out of band, using NLM to perform locks. This requires the client to respond to RPC messages from the server to confirm locks have been granted, and so on.
  NFSv4: File locking is in band. NFSv4 no longer uses a separate protocol for file locking, instead making it a type of call that is usually compounded with OPENs, CREATEs, or WRITEs.

Transport
  NFSv3: Can run over TCP or UDP. This version of the protocol can run over UDP instead of TCP, leaving handling of loss and retransmission to the software instead of the operating system. Using TCP is always recommended.
  NFSv4: Only supports TCP. Version 4 of NFS leaves loss and retransmission up to the underlying operating system. It can batch a series of calls in a single packet, allowing the server to process all of them and reply at the end, which reduces the number of calls involved in common operations.

Since NFSv3 is stateless, it requires more complexity to recover from failures like client and server outages and network partitions. If an NLM server crashes, NLM clients that are holding locks must reestablish them on the server when it restarts. The NLM protocol deals with this by having the status monitor on the server send a notification message to the status monitor of each NLM client that was holding locks. The initial period after a server restart is known as the grace period, during which only requests to reestablish locks are granted. Thus, clients that reestablish locks during the grace period are guaranteed to not lose their locks. 

When an NLM client crashes, ideally any locks it was holding at the time are removed from the pertinent NLM server(s). The NLM protocol handles this by having the status monitor on the client send a message to each server's status monitor once the client reboots. The client reboot indication informs the server that the client no longer requires its locks. However, if the client crashes and fails to reboot, the client's locks will persist indefinitely. This is undesirable for two primary reasons: resources are leaked indefinitely, and eventually another client will want a conflicting lock on at least one of the files the crashed client had locked, and will be postponed indefinitely.

Therefore, having NFS server utilities to swiftly and accurately report on lock and waiter status and utilities to clear NFS lock waiters is highly desirable for administrators – particularly on clustered storage architectures.

Prior to OneFS 9.5, the old NFS locking CLI commands were somewhat inefficient and also showed other advisory domain locks, which rendered the output somewhat confusing. The following table shows the new CLI commands (and corresponding handlers) which replace the older NLM syntax.

Type       OneFS 9.4 and earlier       OneFS 9.5
Locks      isi nfs nlm locks           isi nfs locks
Sessions   isi nfs nlm sessions        isi nfs nlm sessions
Waiters    isi nfs nlm locks waiters   isi nfs locks waiters

In OneFS 9.5 and later, the old API handlers still exist, to avoid breaking existing scripts and automation; however, the old CLI command syntax is deprecated and no longer works.

Be aware that the isi_classic nfs locks and waiters CLI commands have also been disabled in OneFS 9.5. Attempts to run these will yield the following warning message:

# isi_classic nfs locks
This command has been disabled. Please use isi nfs for this functionality.

The new isi nfs locks CLI command output includes the following locks object fields:

Client: The client host name, fully qualified domain name (FQDN), or IP address.

Client_ID: The client ID (internally generated).

Created: The UNIX epoch time at which the lock was created.

ID: The lock ID (needed for platform API sorting; not shown in CLI output).

LIN: The logical inode number (LIN) of the locked resource.

Lock_type: The type of lock (shared, exclusive, or none).

Path: The path of the locked file.

Range: The byte range within the file that is locked.

Version: The NFS major version: v3 or v4.

Note that the ISI_NFS_PRIV RBAC privilege is required in order to view the NFS locks or waiters via the CLI or PAPI. In addition to ‘root’, the cluster’s ‘SystemAdmin’ and ‘SecurityAdmin’ roles contain this privilege by default.
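To confirm that a given role carries this privilege, its definition can be inspected from the CLI. For example, filtering the SystemAdmin role's output for the NFS privilege (this assumes the role view output lists privileges, as in current releases):

# isi auth roles view SystemAdmin | grep -i nfs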

Additionally, the new locks CLI command sets have a default timeout of 60 seconds. If the cluster is very large, the timeout may need to be increased for the CLI command. For example:

# isi --timeout <timeout_value> nfs locks list

 The basic architecture of the enhanced NFS locks reporting framework is as follows:

The new API handlers leverage the platform API proxy, yielding increased performance over the legacy handlers. Additionally, updated syscalls have been implemented to facilitate filtering by NFS service and major version.

Since NFSv3 is stateless, the cluster does not know when a client has lost its state unless it reconnects. For maximum safety, the OneFS locking framework (lk) holds locks forever. The isi nfs nlm sessions CLI command allows administrators to manually free NFSv3 locks in such cases, and this command remains available in OneFS 9.5 as well as prior versions. NFSv3 locks may also be leaked on delete, since a valid inode is required for lock operations. As such, lkf has a lock reaper which periodically checks for locks associated with deleted files.

In OneFS 9.5 and later, current NFS locks can be viewed with the new isi nfs locks list command. This command set also provides a variety of options to limit and format the display output. In its basic form, the command generates a list of the client and path for each lock. For example:

# isi nfs locks list
Client                              Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_2
Linux NFSv4.0 TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv4/10.22.10.250_1
Linux NFSv4.0 TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv4/10.22.10.250_2
-------------------------------------------------------------------
Total: 4

To include more information, the -v flag can be used to generate a verbose locks listing:

 # isi nfs locks list -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295164422
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:03:52
Version: v3
---------------------------------------------------------------
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3
---------------------------------------------------------------
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 429516442
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:19:48
Version: v4
---------------------------------------------------------------
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295426674
Path: /ifs/locks/nfsv3/10.22.10.250_2
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:17:02
Version: v4
---------------------------------------------------------------
Total: 4

The previous syntax returns more detailed information for each lock, including client ID, LIN, path, lock type, range, created date, and NFS version.

The lock listings can also be filtered by client or client-id. Note that the --client option must be the full name in quotes:

# isi nfs locks list --client="full_name_of_client/IP_address" -v

For example:

# isi nfs locks list --client="1/TMECLI1:487722/10.22.10.250" -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3

Additionally, be aware that the CLI does not support partial names, so the full name of the client must be specified.

Filtering by NFS version can be helpful when attempting to narrow down which client has a lock. For example, to show just the NFSv3 locks:

# isi nfs locks list --version=v3 
Client                              Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_2
-------------------------------------------------------------------
Total: 2

Note that the --version flag supports both v3 and nlm as arguments, and will return the same v3 output in either case. For example:

# isi nfs locks list --version=nlm
Client                              Path
-------------------------------------------------------------------
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_1
1/TMECLI1:487722/10.22.10.250       /ifs/locks/nfsv3/10.22.10.250_2
-------------------------------------------------------------------
Total: 2

Filtering by LIN or path is also supported. For example, to filter by LIN:

# isi nfs locks list --lin=42950335042 -v
Client: 1/TMECLI1:487722/10.22.10.250
Client ID: 5175867327774721
LIN: 42950335042
Path: /ifs/locks/nfsv3/10.22.10.250_1
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:10:31
Version: v3

Or by path:

# isi nfs locks list --path=/ifs/locks/nfsv3/10.22.10.250_2 -v
Client: Linux NFSv4.0 TMECLI1:487722/10.22.10.250
Client ID: 487722351064074
LIN: 4295426674
Path: /ifs/locks/nfsv3/10.22.10.250_2
Lock Type: exclusive
Range: 0, 9223372036854775807
Created: 2023-08-18T08:17:02
Version: v4

Be aware that the full path must be specified, starting with /ifs. There is no partial matching or substitution for paths in this command set.

Filtering can also be performed by creation time, for example:

# isi nfs locks list --created=2023-08-17T09:30:00 -v 

Note that when filtering by created, the output will include all locks that were created before or at the time provided.

The --limit argument can be used to curtail the number of results returned, and it can be combined with all the other query options. For example, to limit the output of the NFSv4 locks listing to one lock:

# isi nfs locks list --version=v4 --limit=1


The filter options are mutually exclusive, with the exception of version, which can be combined with any of the other filter options, such as filtering by both created and version.

This can be helpful when troubleshooting and trying to narrow down results.
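For example, to list only the NFSv3 locks created at or before a given time, the version and created filters can be combined as follows (the timestamp is illustrative):

# isi nfs locks list --version=v3 --created=2023-08-18T09:00:00 -v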

In addition to locks, OneFS 9.5 also provides the isi nfs locks waiters CLI command set. Note that waiters are specific to NFSv3 clients, and the CLI reports any v3 locks that are pending and not yet granted.

For example:

# isi nfs locks waiters

The waiters CLI syntax uses a similar range of query arguments as the isi nfs locks list command set.

In addition to the CLI, the platform API can also be used to query both NFS locks and NFSv3 waiters. For example, using curl to view the waiters via the OneFS pAPI:

# curl -k -u <username>:<passwd> "https://localhost:8080/platform/protocols/nfs/waiters"
{
"waiters" :
[
{
"client" : "1/TMECLI1487722/10.22.10.250",
"client_id" : "4894369235106074",
"created" : "1668146840",
"id" : "1 1YUIAEIHVDGghSCHGRFHTiytr3u243567klj212-MANJKJHTTy1u23434yui-ouih23ui4yusdftyuySTDGJSDHVHGDRFhgfu234447g4bZHXhiuhsdm",
"lin" : "4295164422",
"lock_type" : "exclusive",
"path" : "/ifs/locks/nfsv3/10.22.10.250_1",
"range" : [ 0, 9223372036854775807 ],
"version" : "v3"
}
],
"total" : 1
}

Similarly, using the platform API to show locks filtered by client ID:

# curl -k -u <username>:<passwd> "https://<address>:8080/platform/protocols/nfs/locks?client=<client_ID>"

For example:

# curl -k -u <username>:<passwd> "https://localhost:8080/platform/protocols/nfs/locks?client=1/TMECLI1487722/10.22.10.250"
{
"locks" :
[
{
"client" : "1/TMECLI1487722/10.22.10.250",
"client_id" : "487722351064074",
"created" : "1668146840",
"id" : "1 1YUIAEIHVDGghSCHGRFHTiytr3u243567FCUJHBKD34NMDagNLKYGHKHGKjhklj212-MANJKJHTTy1u23434yui-ouih23ui4yusdftyuySTDGJSDHVHGDRFhgfu234447g4bZHXhiuhsdm",
"lin" : "4295164422",
"lock_type" : "exclusive",
"path" : "/ifs/locks/nfsv3/10.22.10.250_1",
"range" : [ 0, 9223372036854775807 ],
"version" : "v3"
}
],
"total" : 1
}

Note that, as with the CLI, the platform API does not support partial name matches, so the full name of the client must be specified.
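When scripting against these endpoints, the JSON output can be post-processed on an administration host with a tool such as jq, which is not part of OneFS and is assumed to be installed separately. For example, to extract just the paths of the currently held locks:

# curl -sk -u <username>:<passwd> "https://localhost:8080/platform/protocols/nfs/locks" | jq -r '.locks[].path'
/ifs/locks/nfsv3/10.22.10.250_1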

 

Author: Nick Trimbee


Home > Storage > PowerScale (Isilon) > Blogs

Isilon PowerScale OneFS NAS Dell Cluster Scale-out

OneFS SSL Certificate Creation and Renewal – Part 2

Nick Trimbee Nick Trimbee

Mon, 13 Nov 2023 17:56:44 -0000

|

Read Time: 0 minutes

In the initial article in this series, we took a look at the OneFS SSL architecture, plus the first two steps in the basic certificate renewal or creation flow detailed below:

Backup existing SSL certificate > Renew/create certificate > Sign SSL certificate > Add certificate to cluster > Verify SSL certificate

The following procedure includes options to complete a self-signed certificate replacement or renewal or to request an SSL replacement or renewal from a Certificate Authority (CA).


Signing the SSL Certificate


At this point, depending on the security requirements of the environment, the certificate can either be self-signed or signed by a Certificate Authority.

Self-Sign the SSL Certificate 

The following CLI syntax can be used to self-sign the certificate with the key, creating a new signed certificate which, in this instance, is valid for 1 year (365 days):     

# openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

To verify that the key matches the certificate, ensure that the output of the following CLI commands return the same md5 checksum value:    

# openssl x509 -noout -modulus -in server.crt | openssl md5           
# openssl rsa -noout -modulus -in server.key | openssl md5

Next, proceed to the Add certificate to cluster section of this article once this step is complete. 

Use a CA to Sign the Certificate

If a CA is signing the certificate, ensure that the new SSL certificate is in x509 format and includes the entire certificate trust chain.

Note that the CA may return the new SSL certificate, the intermediate cert, and the root cert in different files. If this is the case, the PEM formatted certificate will need to be created manually.

Note that the correct ordering is important when creating the PEM-formatted certificate: the SSL cert must be at the top of the file, followed by the intermediate certificate(s), with the root certificate at the bottom. For example:


-----BEGIN CERTIFICATE-----

<Contents of new SSL certificate>

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

<Contents of intermediate certificate>

<Repeat as necessary for every intermediate certificate provided by your CA>

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

<Contents of root certificate file>

-----END CERTIFICATE-----


A simple method for creating the PEM formatted file from the CLI is to cat them in the correct order as follows:

# cat CA_signed.crt intermediate.crt root.crt > onefs_pem_formatted.crt

Copy the onefs_pem_formatted.crt file to /ifs/tmp and rename it to server.crt.

Note that if any of the aforementioned files are generated with a .cer extension, they should be renamed with a .crt extension instead.
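Before concatenating the files, the chain itself can optionally be validated with openssl, using the file names from the example above; a healthy chain returns 'OK':

# openssl verify -CAfile root.crt -untrusted intermediate.crt CA_signed.crt
CA_signed.crt: OK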

The attributes and integrity of the certificate can be sanity checked with the following CLI syntax:       

# openssl x509 -text -noout -in server.crt

         

Adding the certificate to the cluster    


The first step in adding the certificate involves importing the new certificate and key into the cluster:      

# isi certificate server import /ifs/tmp/server.crt /ifs/tmp/server.key

Next, verify that the certificate imported successfully:     

# isi certificate server list -v 

The following CLI command can be used to show the names and corresponding IDs of the certificates:

# isi certificate server list -v | grep -A1 "ID:"

Set the imported certificate as default:      

# isi certificate settings modify --default-https-certificate=<id_of_cert_to_set_as_default>

Confirm that the imported certificate is being used as default by verifying status of Default HTTPS Certificate:     

# isi certificate settings view

If there is an unused or outdated cert, it can be deleted with the following CLI syntax:      

# isi certificate server delete --id=<id_of_cert_to_delete>

Next, view the new imported cert with command:      

# isi certificate server view --id=<id_of_cert>

Note that ports 8081 and 8083 still use the certificate from the local directory for SSL. Follow the steps below to use the new certificate for ports 8081 and 8083:

# isi services -a isi_webui disable
# chmod 640 server.key
# chmod 640 server.crt
# isi_for_array -s 'cp /ifs/tmp/server.key /usr/local/apache2/conf/ssl.key/server.key'
# isi_for_array -s 'cp /ifs/tmp/server.crt /usr/local/apache2/conf/ssl.crt/server.crt'
# isi services -a isi_webui enable


Verifying the SSL certificate


There are two methods for verifying the updated SSL certificate:

  • Via the CLI, using the openssl command as follows:
# echo QUIT | openssl s_client -connect localhost:8080
  • Or via a web browser, using the following URL:

https://<cluster_name>:8080

Here, <cluster_name> is the FQDN or IP address typically used to access the cluster's WebUI. In both cases, the security details for the web page will include the certificate's location and contact info. For example:

Subject: C=US, ST=<yourstate>, L=<yourcity>, O=<yourcompany>, CN=isilon.example.com/emailAddress=tme@isilon.com

Additionally, OneFS provides warning of an impending certificate expiry by sending a CELOG event alert, similar to the following:


SW_CERTIFICATE_EXPIRING: X.509 certificate default is nearing expiration: 
 
Event: 400170001
Certificate 'default' in '**' store is nearing expiration:
 


Note that OneFS does not attempt to automatically renew a certificate. Instead, an expiring cert has to be renewed manually, per the procedure described above.

When adding an additional certificate, the matching cert is used any time you connect to that SmartConnect name via HTTPS. If no matching certificate is found, OneFS will automatically revert to using the default self-signed certificate.

 

Author: Nick Trimbee 


Home > Storage > PowerScale (Isilon) > Blogs

OneFS

SMB Redirector Encryption

Nick Trimbee Nick Trimbee

Fri, 10 Nov 2023 19:37:15 -0000

|

Read Time: 0 minutes

As on-the-wire encryption becomes increasingly commonplace, and often mandated via regulatory compliance security requirements, the policies applied in enterprise networks are rapidly shifting towards fully encrypting all traffic.

The OneFS SMB protocol implementation (lwio) has supported encryption for Windows and other SMB client connections to a PowerScale cluster since OneFS 8.1.1.

 

However, prior to OneFS 9.5, this did not include encrypted communications between the SMB redirector and Active Directory (AD) domain controller (DC). While Microsoft added support for SMB encryption in SMB 3.0, the redirector in OneFS 9.4 and prior releases only supported Microsoft’s earlier SMB 2.002 dialect.

When OneFS connects to Active Directory for tasks requiring remote procedure calls (RPCs), such as joining a domain, NTLM authentication, or resolving usernames and SIDs, these SMB connections are established from OneFS as the client connecting to a domain controller server.

As outlined in the Windows SMB security documentation, by default, and starting with Windows 2012 R2, domain admins can choose to encrypt access to a file share, which can include a domain controller. When encryption is enabled, only SMB3 connections are permitted.

With OneFS 9.5, the OneFS SMB redirector now supports SMB3, thereby allowing the Local Security Authority Subsystem Service (LSASS) daemon to communicate with domain controllers running Windows Server 2012 R2 and later over an encrypted session.

The OneFS redirector, also known as the ‘rdr driver’, is a stripped-down SMB client with minimal functionality, only supporting what is absolutely necessary.

Under the hood, OneFS SMB encryption and decryption use standard OpenSSL functions, and AES-128-CCM encryption is negotiated during the SMB negotiation phase.

Although everything stems from the NTLM authentication requested by the SMB server, the sequence of calls leads to the redirector establishing an SMB connection to the AD domain controller.

With OneFS 9.5, no configuration is required to enable SMB encryption in most situations, and there are no WebUI or CLI configuration settings for the redirector.

With the default OneFS configuration, the redirector supports encryption if negotiated but it does not require it. Similarly, if the Active Directory domain requires encryption, the OneFS redirector will automatically enable and use encryption. However, if the OneFS redirector is explicitly configured to require encryption and the domain controller does not support encryption, the connection will fail.

The OneFS redirector encryption settings include:

Smb3EncryptionEnabled: Boolean; the default is '1' (enabled). Enables or disables SMB3 encryption for the OneFS redirector.

Smb3EncryptionRequired: Boolean; the default is '0' (not required). Determines whether the redirector connection must be encrypted.

MaxSmb2DialectVersion: The default is 'max' (SMB 3.0.2). Sets the maximum SMB dialect that the redirector will support; the maximum is currently SMB 3.0.2.

The above keys and values are stored in the OneFS Likewise SMB registry and can be viewed and configured with the 'lwregshell' utility. For example, to view the SMB redirector encryption config settings:

# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt

    "Smb3EncryptionEnabled"   REG_DWORD       0x00000001 (1)

    "Smb3EncryptionRequired" REG_DWORD       0x00000000 (0)

The following syntax can be used to make encryption mandatory, by setting the 'Smb3EncryptionRequired' parameter to '1':

# /usr/likewise/bin/lwregshell set_value "[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]" "Smb3EncryptionRequired" "0x00000001"

# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt

    "Smb3EncryptionEnabled"   REG_DWORD       0x00000001 (1)

   "Smb3EncryptionRequired" REG_DWORD       0x00000001 (1)

Similarly, to restore the 'Smb3EncryptionRequired' parameter's default value of '0' (that is, not required):

# /usr/likewise/bin/lwregshell set_value "[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]" "Smb3EncryptionRequired" "0x00000000"

Note that, during the upgrade to OneFS 9.5, any nodes still running the old version will not be able to NTLM-authenticate if the DC they have affinity with requires encryption.

While redirector encryption is implemented in user space (in contrast to the SMB server, which is in the kernel), it uses OpenSSL, and the library takes advantage of hardware acceleration on the processor via AES-NI. As such, the performance impact is minimal, even when the number of NTLM authentications to the AD domain is very large.

Also note that redirector encryption currently supports only the AES-128-CCM encryption provided in the SMB 3.0.0 and 3.0.2 dialects. OneFS does not currently use the AES-128-GCM encryption available in the (latest) SMB 3.1.1 dialect.

When it comes to troubleshooting the redirector, the lwregshell tool can be used to verify its configuration settings. For example, to view the redirector encryption settings:

# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i encrypt

    "Smb3EncryptionEnabled"   REG_DWORD       0x00000001 (1)

    "Smb3EncryptionRequired" REG_DWORD       0x00000000 (0)

Similarly, to find the maximum SMB version supported by the redirector:

# /usr/likewise/bin/lwregshell list_values "HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr" | grep -i dialect

    "MaxSmb2DialectVersion"   REG_SZ          "max"

The ‘lwsm’ CLI utility with the following syntax will confirm the status of the various lsass components:

# /usr/likewise/bin/lwsm list | grep lsass

lsass                       [service]     running (lsass: 5164)

netlogon                    [service]     running (lsass: 5164)

rdr                         [driver]      running (lsass: 5164)

It can also be used to show and modify the logging level. For example:

# /usr/likewise/bin/lwsm get-log rdr

<default>: syslog LOG_CIFS at WARNING

# /usr/likewise/bin/lwsm set-log-level rdr - debug

# /usr/likewise/bin/lwsm get-log rdr

<default>: syslog LOG_CIFS at DEBUG

When finished, rdr logging can be returned to its previous log level as follows:

# /usr/likewise/bin/lwsm set-log-level rdr - warning

# /usr/likewise/bin/lwsm get-log rdr

<default>: syslog LOG_CIFS at WARNING

Additionally, the existing ‘lwio-tool’ utility has been modified in OneFS 9.5 to include functionality allowing simple test connections to domain controllers (no NTLM) via the new ‘rdr’ syntax:

# /usr/likewise/bin/lwio-tool rdr openpipe //<domain_controller>/NETLOGON

The ‘lwio-tool’ usage in OneFS 9.5 is as follows:

# /usr/likewise/bin/lwio-tool -h

Usage: lwio-tool <command> [command-args]

   commands:

    iotest rundown

    rdr [openpipe|openfile] username@password://domain/path

    srvtest transport [query|start|stop]

    testfileapi [create|createnp] <path>

 

Author: Nick Trimbee

 

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Password Security Policy

Nick Trimbee Nick Trimbee

Mon, 24 Jul 2023 20:08:49 -0000

|

Read Time: 0 minutes

Among the slew of security enhancements introduced in OneFS 9.5 is the ability to mandate a more stringent password policy. This is required to comply with security requirements such as the U.S. military STIG, which stipulates:

Length: An OS or network device must enforce a minimum 15-character password length.

Percentage: An OS must require the change of at least 50% of the total number of characters when passwords are changed.

Position: A network device must require that, when a password is changed, the characters are changed in at least eight of the positions within the password.

Temporary password: The OS must allow the use of a temporary password for system logons, with an immediate change to a permanent password.

The OneFS password security architecture can be summarized as follows:

Within the OneFS security subsystem, authentication is handled in OneFS by LSASSD, the daemon used to service authentication requests for lwiod.

LSASSD: The local security authority subsystem service (LSASS) handles authentication and identity management as users connect to the cluster.

File provider: Includes users from /etc/passwd and groups from /etc/group.

Local provider: Includes local cluster accounts such as anonymous, guest, and so on.

SSHD: The OpenSSH daemon provides secure encrypted communications between a client and a cluster node over an insecure network.

pAPI: The OneFS platform API provides programmatic interfaces to OneFS configuration and management through a RESTful HTTPS service.

In OneFS AIMA, there are several different kinds of backend providers: Local provider, file provider, AD provider, NIS provider, and so on. Each provider is responsible for the management of users and groups inside the provider. For OneFS password policy enforcement, the local and file providers are the focus.

The local provider is based on a SamDB-style file stored under the /ifs/.ifsvar prefix path, and its provider settings can be viewed with the following CLI syntax: 

# isi auth local view System 

On the other hand, the file provider is based on the FreeBSD spwd.db file, and its configuration can be viewed by the following CLI command: 

# isi auth file view System

Each provider stores and manages its own users. For the local provider, the isi auth users create CLI command will create a user inside the provider by default. However, for the file provider, there is no corresponding command; instead, the OneFS pw CLI command can be used to create a new file provider user.

After the user is created, the isi auth users modify <USER> CLI command can be used to change the attributes of the user for both the file and local providers. However, not all attributes are supported by both providers. For example, the file provider does not support password expiry.
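By way of illustration, here is a minimal sketch of each path. The user names are illustrative, and the pw flags follow standard FreeBSD useradd usage:

# isi auth users create user3 --password <passwd>

And for a file provider user:

# pw useradd -n user4 -s /bin/zsh -m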

The fundamental password policy CLI changes introduced in OneFS 9.5 are as follows:

  • change-password (modified): The old password must now be provided when changing a password, so that OneFS can calculate how many characters, and what percentage, have changed.
  • reset-password (added): Generates a temporary password that meets the current password policy, for the user to log in with.
  • set-password (deprecated): Did not require the old password to be provided.

A user’s password can now be set, changed, and reset by either root or admin. This is supported by the new isi auth users change-password or isi auth users reset-password CLI command syntax. The latter, for example, returns a temporary password and requires the user to change it on next login. After logging in with the temporary (albeit secure) password, OneFS immediately forces the user to change it:

# whoami
admin
# isi auth users reset-password user1
4$_x\d\Q6V9E:sH
# ssh user1@localhost
(user1@localhost) Password:
(user1@localhost) Your password has expired.
You are required to immediately change your password.
Changing password for user1
New password:
(user1@localhost) Re-enter password:
Last login: Wed May 17 08:02:47 from 127.0.0.1
PowerScale OneFS 9.5.0.0
# whoami
user1

Also in OneFS 9.5 and later, the CLI isi auth local view system command sees the addition of four new fields:

  • Password Chars Changed
  • Password Percent Changed
  • Password Hash Type
  • Max Inactivity Days

For example:

# isi auth local view system
                    Name: System
                  Status: active
          Authentication: Yes
    Create Home Directory: Yes
 Home Directory Template: /ifs/home/%U
        Lockout Duration: Now
       Lockout Threshold: 0
          Lockout Window: Now
             Login Shell: /bin/zsh
            Machine Name:
        Min Password Age: Now
        Max Password Age: 4W
      Min Password Length: 0
     Password Prompt Time: 2W
      Password Complexity: -
 Password History Length: 0
   Password Chars Changed: 0
Password Percent Changed: 0
      Password Hash Type: NTHash
      Max Inactivity Days: 0

The following CLI command syntax configures OneFS to require a minimum password length of 15 characters, a 50% or greater change, and 8 or more characters to be altered for a successful password reset:

# isi auth local modify system --min-password-length 15 --password-chars-changed 8 --password-percent-changed 50
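The updated policy can then be confirmed with a simple grep of the provider view; for example, with values reflecting the command above:

# isi auth local view system | grep -i password
        Min Password Age: Now
        Max Password Age: 4W
     Min Password Length: 15
    Password Prompt Time: 2W
     Password Complexity: -
 Password History Length: 0
  Password Chars Changed: 8
Password Percent Changed: 50
      Password Hash Type: NTHash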

Next, a command is issued to create a new user, user2, with a 10-character password:

# isi auth users create user2 --password 0123456789
Failed to add user user2: The specified password does not meet the configured password complexity or history requirements

This attempt fails because the password does not meet the configured password criteria (15 characters minimum, 50% or greater change, 8 or more characters altered).

Instead, the password for the new account, user2, is set to an appropriate value: 0123456789abcdef. Also, the --prompt-password-change flag is used to force the user to change their password on next login.

# isi auth users create user2 --password 0123456789abcdef --prompt-password-change 1

When the user logs in to the user2 account, OneFS immediately prompts for a new password. In the following example, a non-compliant password (012345678zyxw) is entered. 

0123456789abcdef -> 012345678zyxw = Failure

The change attempt fails because the new password does not meet the 15-character minimum:

# su user2
New password:
Re-enter password:
The specified password does not meet the configured password complexity requirements.
Your password must meet the following requirements:
  * Must contain at least 15 characters.
  * Must change at least 8 characters.
  * Must change at least 50% of characters.
New password:

Instead, a compliant password and successful change could be: 

0123456789abcdef -> 0123456zyxwvuts = Success

The following command can also be used to change the password for a user. For example, to update user2’s password:

# isi auth users change-password user2
Current password (hit enter if none):
New password:
Confirm new password:

If a non-compliant password is entered, the following error is returned:

Password change failed: The specified password does not meet the configured password complexity or history requirements

When employed, OneFS hardening automatically enforces security-based configurations. The hardening engine is profile-based, and its STIG security profile is predicated on security mandates specified in the U.S. Department of Defense (DoD) Security Requirements Guides (SRGs) and Security Technical Implementation Guides (STIGs).

On applying the STIG hardening security profile to a cluster (isi hardening apply --profile=STIG), the password policy settings are automatically reconfigured to the following values:

Field                      Normal value   STIG hardened
-------------------------  -------------  ---------------------------------------------
Lockout Duration           Now            Now
Lockout Threshold          0              3
Lockout Window             Now            15m
Min Password Age           Now            1D
Max Password Age           4W             8W4D
Min Password Length        0              15
Password Prompt Time       2W             2W
Password Complexity        -              lowercase, numeric, repeat, symbol, uppercase
Password History Length    0              5
Password Chars Changed     0              8
Password Percent Changed   0              50
Password Hash Type         NTHash         SHA512
Max Inactivity Days        0              35

For example:

# uname -or
Isilon OneFS 9.5.0.0
 
# isi hardening list
Name  Description                       Status
---------------------------------------------------
STIG  Enable all STIG security settings Applied
---------------------------------------------------
Total: 1
 
# isi auth local view system
                    Name: System
                  Status: active
          Authentication: Yes
   Create Home Directory: Yes
 Home Directory Template: /ifs/home/%U
        Lockout Duration: Now
       Lockout Threshold: 3
          Lockout Window: 15m
             Login Shell: /bin/zsh
             Machine Name:
        Min Password Age: 1D
        Max Password Age: 8W4D
     Min Password Length: 15
    Password Prompt Time: 2W
     Password Complexity: lowercase, numeric, repeat, symbol, uppercase
 Password History Length: 5
  Password Chars Changed: 8
Password Percent Changed: 50
      Password Hash Type: SHA512
     Max Inactivity Days: 35

Note that Password Hash Type is changed from the default NTHash to the more secure SHA512 encoding, in addition to setting the various password criteria.
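For example, the new hash type can be quickly confirmed as follows:

# isi auth local view system | grep -i hash
      Password Hash Type: SHA512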

The OneFS 9.5 WebUI also sees several additions and alterations to the Password policy page. These include:

  • Policy page (added): A new Password policy page under Access > Membership and roles.
  • reset-password (added): Generates a random password that meets the current password policy, for the user to log in with.

The most obvious change is the transfer of the policy configuration elements from the local provider page to a new dedicated Password policy page.

Here’s the OneFS 9.4 View a local provider page, under Access > Authentication providers > Local providers > System:

This is replaced and augmented in the OneFS 9.5 WebUI with the following page, located under Access > Membership and roles > Password policy:

New password policy configuration options are included to require uppercase, lowercase, numeric, or special characters and limit the number of contiguous repeats of a character, and so on.

When it comes to changing a password, only the permitted user can make the change. This can be performed from a couple of locations in the WebUI. First, the user options menu on the task bar at the top of each screen now provides a Change password option:

A pop-up warning message will also be displayed by the WebUI, informing the user when password expiration is imminent. This warning provides a Change Password link:

Clicking on the Change Password link displays the following page:

A new password complexity tool-tip message is also displayed, informing the user of safe password selection.

Note that re-login is required after a password change.

On the Users page, under Access > Membership and roles > Users, the Action drop-down list now also contains a Reset Password option:

The successful reset confirmation pop-up offers both show and copy options, while reminding the cluster administrator to share the new password with the user, who must then change it at their next login:  

The Create user page now provides an additional field that requires password confirmation. Additionally, the password complexity tool-tip message is also displayed:

The redesigned Edit user details page no longer provides a field to edit the password directly:

Instead, the Action drop-down list on the Users page now contains a Reset Password option. 


Author: Nick Trimbee

 

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Key Manager Rekey Support

Nick Trimbee Nick Trimbee

Mon, 24 Jul 2023 19:16:34 -0000

|

Read Time: 0 minutes

The OneFS key manager is a backend service that orchestrates the storage of sensitive information for PowerScale clusters. To satisfy Dell’s Secure Infrastructure Ready requirements and other public and private sector security mandates, the manager provides the ability to replace, or rekey, cryptographic keys.

The quintessential consumer of OneFS key management is data-at-rest encryption (DARE). Protecting sensitive data stored on the cluster with cryptography ensures that it’s guarded against theft, in the event that drives or nodes are removed from a PowerScale cluster. DARE is a requirement for federal and industry regulations, ensuring data is encrypted when it is stored. OneFS has provided DARE solutions for many years through secure encrypted drives (SEDs) and the OneFS key management system.

A 256-bit master key (MK) encrypts the key manager database (KMDB) for SED and cluster domains. In OneFS 9.2 and later, the MK for SEDs can either be stored off-cluster on a KMIP server or locally on a node (the legacy behavior).

However, there are a variety of other consumers of the OneFS key manager, in addition to DARE. These include services and protocols such as:

  • CELOG: Cluster event log.
  • CloudPools: Cluster tier-to-cloud service.
  • Email: Electronic mail.
  • FTP: File transfer protocol.
  • IPMI: Intelligent platform management interface, for remote cluster console access.
  • JWT: JSON web tokens.
  • NDMP: Network data management protocol, for cluster backups and DR.
  • Pstore: Active Directory and Kerberos password store.
  • S3: S3 object protocol.
  • SyncIQ: Cluster replication service.
  • SmartSync: OneFS push-and-pull cluster and cloud replication service.
  • SNMP: Simple network management protocol.
  • SRS: Secure Remote Services, the legacy Dell remote cluster support connectivity.
  • SSO: Single sign-on.
  • SupportAssist: Remote cluster connectivity to Dell Support.

 OneFS 9.5 introduces a number of enhancements to the venerable key manager, including:

  • The ability to rekey keystores. A rekey operation generates a new MK and re-encrypts all entries stored with the new key.
  • New CLI commands and WebUI options to perform a rekey operation or schedule key rotation on a time interval.
  • New commands to monitor the progress and status of a rekey operation.

As such, OneFS 9.5 now provides the ability to rekey the MK, irrespective of where it is stored.

Note that when you are upgrading from an earlier OneFS release, the new rekey functionality is only available once the OneFS 9.5 upgrade has been committed.
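If in doubt, the cluster's upgrade state can be checked before attempting a rekey; for example, using the OneFS upgrade CLI (exact output varies by release):

# isi upgrade view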

Under the hood, each provider store in the key manager consists of secure backend storage and an MK. Entries are kept in a SQLite database or key-value store. A provider datastore uses its MK to encrypt all its entries within the store.

During the rekey process, the old MK is only deleted after a successful re-encryption with the new MK. If for any reason the process fails, the old MK is available and remains as the current MK. The rekey daemon retries the rekey every 15 minutes if the process fails.

The OneFS rekey process is as follows:

  1. A new MK is generated, and internal configuration is updated.
  2. Any entries in the provider store are decrypted and encrypted with the new MK.
  3. If the prior steps are successful, the previous MK is deleted.

To support the rekey process, the MK in OneFS 9.5 now has an ID associated with it. All entries have a new field referencing the MK ID.

During the rekey operation, there are two MK values with different IDs, and each entry in the database records which key it is encrypted by.

In OneFS 9.5, the rekey configuration and management is split between the cluster keys and the SED keys:

SED

  • The SED provider keystore is stored locally on each node.
  • The SED provider domain already had existing CLI commands for handling KMIP settings in prior releases.

Cluster

  • Controls all cluster-wide keystore domains.
  • Status shows information for all cluster provider domains.

SED keys rekey

The SED key manager rekey operation can be managed through a DARE cluster’s CLI or WebUI, and it can either be automatically scheduled or run manually on demand. The following CLI syntax can be used to manually initiate a rekey:

# isi keymanager sed rekey start

Alternatively, a rekey operation can be scheduled. For example, to schedule a key rotation every two months:

# isi keymanager sed rekey modify --key-rotation=2M
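The resulting schedule can then be checked, assuming the SED syntax mirrors the cluster rekey view subcommand shown later in this article:

# isi keymanager sed rekey view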

The key manager status for SEDs can be viewed as follows:

# isi keymanager sed status
 Node Status  Location   Remote Key ID  Key Creation Date   Error Info(if any)
-----------------------------------------------------------------------------
1   LOCAL   Local                    1970-01-01T00:00:00
-----------------------------------------------------------------------------
Total: 1

Alternatively, from the WebUI, go to Access > Key Management >  SED/Cluster Rekey, select Automatic rekey for SED keys, and configure the rekey frequency:

Note that for SED rekey operations, if a migration from local cluster key management to a KMIP server is in progress, the rekey process will begin once the migration is complete.

Cluster keys rekey

As mentioned previously, OneFS 9.5 also supports the rekey of cluster keystore domains. This cluster rekey operation is available through the CLI and the WebUI and may either be scheduled or run on demand. The available cluster domains can be queried by running the following CLI syntax:

# isi keymanager cluster status
Domain     Status  Key Creation Date   Error Info(if any)
----------------------------------------------------------
CELOG      ACTIVE  2023-04-06T09:19:16
CERTSTORE  ACTIVE  2023-04-06T09:19:16
CLOUDPOOLS ACTIVE   2023-04-06T09:19:16
EMAIL      ACTIVE  2023-04-06T09:19:16
FTP        ACTIVE  2023-04-06T09:19:16
IPMI_MGMT  IN_PROGRESS  2023-04-06T09:19:16
JWT        ACTIVE  2023-04-06T09:19:16
LHOTSE     ACTIVE  2023-04-06T09:19:11
NDMP       ACTIVE  2023-04-06T09:19:16
NETWORK    ACTIVE  2023-04-06T09:19:16
PSTORE     ACTIVE  2023-04-06T09:19:16
RICE       ACTIVE  2023-04-06T09:19:16
S3         ACTIVE  2023-04-06T09:19:16
SIQ        ACTIVE  2023-04-06T09:19:16
SNMP       ACTIVE  2023-04-06T09:19:16
SRS        ACTIVE  2023-04-06T09:19:16
SSO        ACTIVE  2023-04-06T09:19:16
----------------------------------------------------------
Total: 17

The rekey process generates a new key and re-encrypts the entries for the domain. The old key is then deleted.

Performance-wise, the rekey process does consume cluster resources (CPU and disk) as a result of the re-encryption phase, which is fairly write-intensive. As such, a good practice is to perform rekey operations outside of core business hours or during scheduled cluster maintenance windows.

During the rekey process, the old MK is only deleted once a successful re-encryption with the new MK has been confirmed. In the event of a rekey process failure, the old MK is available and remains as the current MK.

A rekey may be requested immediately or may be scheduled with a cadence. The rekey operation is available through the CLI and the WebUI. In the WebUI, go to Access > Key Management > SED/Cluster Rekey.

To start a rekey of the cluster domains immediately, from the CLI run the following syntax:

# isi keymanager cluster rekey start 
Are you sure you want to rekey the master passphrase? (yes/[no]):yes

Alternatively, from the WebUI, go to Access under the SED/Cluster Rekey tab, and click Rekey Now next to Cluster keys:

A scheduled rekey of the cluster keys (excluding the SED keys) can be configured from the CLI with the following syntax:

# isi keymanager cluster rekey modify --key-rotation [YMWDhms]

Specify the frequency of the Key Rotation field as an integer, using Y for years, M for months, W for weeks, D for days, h for hours, m for minutes, and s for seconds. For example, the following command will schedule the cluster rekey operation to run every six weeks:

# isi keymanager cluster rekey view
 Rekey Time: 1970-01-01T00:00:00
 Key Rotation: Never
 # isi keymanager cluster rekey modify --key-rotation 6W
 # isi keymanager cluster rekey view
 Rekey Time: 2023-04-28T18:38:45
 Key Rotation: 6W

The rekey configuration can easily be reverted from a schedule back to on demand as follows:

# isi keymanager cluster rekey modify --key-rotation Never
 # isi keymanager cluster rekey view
 Rekey Time: 2023-04-28T18:38:45
 Key Rotation: Never

Alternatively, from the WebUI, under the SED/Cluster Rekey tab, select the Automatic rekey for Cluster keys checkbox and specify the rekey frequency. For example:

In the event of a rekey failure, a CELOG KeyManagerRekeyFailed or KeyManagerSedsRekeyFailed event is created. Since SED rekey is a node-local operation, the KeyManagerSedsRekeyFailed event information also includes which node experienced the failure.
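For example, any rekey-related events can be checked for with the following CLI syntax:

# isi event events list | grep -i rekey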

Additionally, current cluster rekey status can also be queried with the following CLI command:

# isi keymanager cluster status
Domain     Status  Key Creation Date   Error Info(if any)
----------------------------------------------------------
CELOG      ACTIVE  2023-04-06T09:19:16
CERTSTORE  ACTIVE  2023-04-06T09:19:16
CLOUDPOOLS ACTIVE   2023-04-06T09:19:16
EMAIL      ACTIVE  2023-04-06T09:19:16
FTP        ACTIVE  2023-04-06T09:19:16
IPMI_MGMT  ACTIVE  2023-04-06T09:19:16
JWT        ACTIVE  2023-04-06T09:19:16
LHOTSE     ACTIVE  2023-04-06T09:19:11
NDMP       ACTIVE  2023-04-06T09:19:16
NETWORK    ACTIVE  2023-04-06T09:19:16
PSTORE     ACTIVE  2023-04-06T09:19:16
RICE       ACTIVE  2023-04-06T09:19:16
S3         ACTIVE  2023-04-06T09:19:16
SIQ        ACTIVE  2023-04-06T09:19:16
SNMP       ACTIVE  2023-04-06T09:19:16
SRS        ACTIVE  2023-04-06T09:19:16
SSO        ACTIVE  2023-04-06T09:19:16
----------------------------------------------------------
Total: 17

Or, for SEDs rekey status:

# isi keymanager sed status
 Node Status  Location   Remote Key ID  Key Creation Date   Error Info(if any)
-----------------------------------------------------------------------------
1   LOCAL   Local                    1970-01-01T00:00:00
2   LOCAL   Local                    1970-01-01T00:00:00
3   LOCAL   Local                    1970-01-01T00:00:00
4   LOCAL   Local                    1970-01-01T00:00:00
-----------------------------------------------------------------------------
Total: 4

The rekey process also outputs to the /var/log/isi_km_d.log file, which is a useful source for additional troubleshooting.
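For example, to follow rekey activity on a node in real time:

# tail -f /var/log/isi_km_d.log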

If an error in rekey occurs, the previous MK is not deleted, so entries in the provider store can still be created and read as normal. The key manager daemon will retry the rekey operation in the background every 15 minutes until it succeeds.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS single sign-on SSO SAML

OneFS WebUI Single Sign-on Configuration and Deployment

Nick Trimbee Nick Trimbee

Thu, 20 Jul 2023 18:27:32 -0000

|

Read Time: 0 minutes

In the first article in this series, we took a look at the architecture of the new OneFS WebUI SSO functionality. Now, we move on to its provisioning and setup.

SSO on PowerScale can be configured through either the OneFS WebUI or CLI. OneFS 9.5 debuts a new dedicated WebUI SSO configuration page under Access > Authentication Providers > SSO. Alternatively, for command line aficionados, the CLI now includes a new isi auth sso command set.

Here is the overall configuration flow:

 

 
1.  Upgrade to OneFS 9.5

First, ensure the cluster is running OneFS 9.5 or a later release. If upgrading from an earlier OneFS version, note that the SSO service requires this upgrade to be committed prior to configuration and use.

Next, configure an SSO administrator. In OneFS, this account requires at least one of the following privileges:

  • ISI_PRIV_LOGIN_PAPI: Required for the admin to use the OneFS WebUI to administer SSO.
  • ISI_PRIV_LOGIN_SSH: Required for the admin to use the OneFS CLI through SSH to administer SSO.
  • ISI_PRIV_LOGIN_CONSOLE: Required for the admin to use the OneFS CLI on the serial console to administer SSO.

The user account used for identity provider management should have an associated email address configured.
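For example, an email address can be associated with the account as follows; the address shown is illustrative, and this assumes the --email option of the isi auth users modify command:

# isi auth users modify admin --email=admin@isilon.com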

2.  Set Up the Identity Provider

OneFS SSO also requires a suitable identity provider (IdP), such as ADFS, to be provisioned and available before SSO is configured on the cluster.

ADFS can be configured through either the Windows GUI or command shell, and detailed information on the deployment and configuration of ADFS can be found in the Microsoft Windows Server documentation.

 
The Windows remote desktop utility (RDP) can be used to provision, connect to, and configure an ADFS server.

  1. When connected to ADFS, configure a rule defining access. For example, the following command line syntax can be used to create a simple rule that permits all users to log in:
    $AuthRules = @"
    @RuleTemplate="AllowAllAuthzRule" => issue(Type = "http://schemas.microsoft.com/authorization/claims/permit", Value="true");
    "@

    or from the ADFS UI:


    Note that more complex rules can be crafted to meet the particular requirements of an organization.
  2. Create a rule parameter to map the Active Directory user email address to the SAML NameID.
    $TransformRules = @"
    @RuleTemplate = "LdapClaims"
    @RuleName = "LDAP mail"
    c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname", Issuer == "AD AUTHORITY"]
          => issue(store = "Active Directory",
               types = ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"),
               query = ";mail;{0}", param = c.Value);
    @RuleTemplate = "MapClaims"
    @RuleName = "NameID"
    c:[Type == "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"]
          => issue(Type = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier",
               Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer,
               Value = c.Value, ValueType = c.ValueType,
               Properties["http://schemas.xmlsoap.org/ws/2005/05/identity/claimproperties/format"] = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress");
    "@
  3. Configure AD to trust the OneFS WebUI certificate.
  4. Create the relying party trust.

    Add-AdfsRelyingPartyTrust -Name <cluster-name> `
         -MetadataUrl "https://<cluster-node-ip>:8080/session/1/saml/metadata" `
         -IssuanceAuthorizationRules $AuthRules `
         -IssuanceTransformRules $TransformRules

or from Windows Server Manager:


3.  Select Access Zone

Because OneFS SSO is zone-aware, the next step involves choosing the access zone to configure. Go to Access > Authentication providers > SSO, select an access zone (in this example, the System zone), and click Add IdP.

Note that each of a cluster's access zones must have an IdP configured for it. The same IdP can be used for all the zones, but each access zone must be configured separately.

4.  Add IdP Configuration

 In OneFS 9.5 and later, the WebUI SSO configuration is a wizard-driven, “guided workflow” process involving the following steps:

 
First, go to Access > Authentication providers > SSO, select an access zone (in this example, the System zone), and then click Add IdP.

 
On the Add Identity Provider page, enter a unique name for the IdP. For example, Isln-IdP1 in this case:

 
When done, click Next, select the default Upload metadata XML option, and browse to the XML file downloaded from the ADFS system:

 
Alternatively, if the preference is to enter the information by hand, select Manual entry and complete the configuration form fields:

 
If the manual entry method is selected, you must have the IdP certificate ready to upload. With the manual entry option, the following information is required:

  • Binding: Select POST or Redirect binding.
  • Entity ID: Unique identifier of the IdP, as configured on the IdP. For example: http://idp1.isilon.com/adfs/services/trust
  • Login URL: The login endpoint for the IdP. For example: http://idp1.isilon.com/adfs/ls/
  • Logout URL: The logout endpoint for the IdP. For example: http://idp1.example.com/adfs/ls/
  • Signing Certificate: The PEM-encoded certificate obtained from the IdP, required to verify messages from the IdP.

Upload the IdP certificate:

 
For example:

Repeat this step for each access zone in which SSO is to be configured.

When complete, click Next to move on to the service provider configuration step.

5.  Configure Service Provider

 On the Service Provider page, confirm that the current access zone is carried over from the previous page.


Select Metadata download or Manual copy, depending on the chosen method of entering OneFS details about this service provider (SP) to the IdP.

 
Provide the hostname or IP address for the SP for the current access zone.

 
Click Generate to create the information (metadata) about OneFS and this access zone for use in configuring the IdP.


This generated information can now be used to configure the IdP (in this case, Windows ADFS) to accept requests from PowerScale as the SP and its configured access zone.

As shown, the WebUI page provides two methods for obtaining the information:

  • Metadata download: Download the XML file that contains the signing certificate, and so on.
  • Manual copy: Select Copy Link in the lower half of the form to copy the information to the IdP.

 
Next, download the Signing Certificate.

 
When completed, click Next to finish the configuration.

6.  Enable SSO and Verify Operation

Once the IdP and SP are configured, a cluster admin can enable SSO per access zone through the OneFS WebUI by going to Access > Authentication providers > SSO. From here, select the access zone and select the toggle to enable SSO:

 Or from the OneFS CLI, use the following syntax:

# isi auth sso settings modify --sso-enabled 1
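The resulting state can then be confirmed; this assumes a corresponding view subcommand within the new isi auth sso command set:

# isi auth sso settings view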

  

Author: Nick Trimbee

 

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS single sign-on SSO SAML

OneFS WebUI Single Sign-on

Nick Trimbee Nick Trimbee

Thu, 20 Jul 2023 16:32:13 -0000

|

Read Time: 0 minutes

The Security Assertion Markup Language (SAML) is an open standard for sharing security information about identity, authentication, and authorization across different systems. SAML is implemented using the Extensible Markup Language (XML) standard for sharing data. The SAML framework enables single sign-on (SSO), which allows users to log in once and reuse that credential to authenticate with and access multiple service providers. SAML defines several entities, including end users, service providers, and identity providers, and is used to manage identity information. For example, Windows Active Directory Federation Services (ADFS) is one of the most ubiquitous identity providers for SAML contexts.

  • End user: Requires authentication prior to being allowed to use an application.
  • Identity provider (IdP): Performs authentication and passes the user's identity and authorization level to the service provider (for example, ADFS).
  • Service provider (SP): Trusts the identity provider and authorizes the given user to access the requested resource. With SAML 2.0, a PowerScale cluster is a service provider.
  • SAML assertion: An XML document that the identity provider sends to the service provider, containing the user authorization.

OneFS 9.5 introduces SAML-based SSO for the WebUI to provide a more convenient authentication method, in addition to meeting the security compliance requirements for federal and enterprise customers. In OneFS 9.5, the WebUI’s initial login page has been redesigned to support SSO and, when enabled, a new Log in with SSO button is displayed on the login page under the traditional username and password text boxes. For example:

 
OneFS SSO is also zone-aware in support of multi-tenant cluster configurations. As such, a separate IdP can be configured independently for each OneFS access zone.


Under the hood, OneFS SSO employs the following high-level architecture:

 

In OneFS 9.5, SSO operates through HTTP Redirect and POST bindings, with the cluster acting as the service provider. 

There are three different types of SAML assertions: authentication, attribute, and authorization decision.

  • Authentication assertions prove the identity of the user and provide the time the user logged in and the method of authentication used (for example, Kerberos, two-factor, and so on).
  • Attribute assertions pass SAML attributes to the service provider. SAML attributes are specific pieces of data that provide information about the user.
  • Authorization decision assertions state whether the user is authorized to use the service, or whether the identity provider denied the request due to a password failure or a lack of rights to the service.

SAML SSO works by transferring the user’s identity from one place (the identity provider) to another (the service provider). This is done through an exchange of digitally signed XML documents.

A SAML Request, also known as an authentication request, is generated by the service provider to “request” an authentication.

A SAML Response is generated by the identity provider and contains the actual assertion of the authenticated user. In addition, a SAML Response may contain additional information, such as user profile information and group/role information, depending on what the service provider can support. Note that the service provider never directly interacts with the identity provider, with a browser acting as the agent facilitating any redirections.

Because SAML authentication is asynchronous, the service provider does not maintain the state of any authentication requests. As such, when the service provider receives a response from an identity provider, the response must contain all the necessary information.

The general flow is as follows:


When OneFS redirects a user to the configured IdP for login, it makes an HTTP GET request (SAMLRequest), instructing the IdP that the cluster is attempting to perform a login (SAMLAuthnRequest). When the user successfully authenticates, the IdP responds back to OneFS with an HTTP POST containing an HTML form (SAMLResponse) that indicates whether the login was successful, who logged in, plus any additional claims configured on the IdP. 

On receiving the SAMLResponse, OneFS verifies the signature using the public key (X.509 certificate) to ensure that the response really came from its trusted IdP and that none of the contents have been tampered with. OneFS then extracts the identity of the user, along with any other pertinent attributes. At this point, the user is redirected back to the OneFS WebUI dashboard (landing page), as if they had logged in to the site manually.

In the next article in this series, we’ll take a detailed look at the procedure to deploy SSO on a PowerScale cluster.

 

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS STIG

OneFS Account Security Policy

Nick Trimbee Nick Trimbee

Thu, 20 Jul 2023 16:23:21 -0000

|

Read Time: 0 minutes

Another of the core security enhancements introduced in OneFS 9.5 is the ability to enforce strict user account security policies. This is required for compliance with both private and public sector security mandates. For example, the account policy restriction requirements expressed within the U.S. military STIG requirements stipulate:

  • Delay: The OS must enforce a delay of at least 4 seconds between logon prompts following a failed logon attempt.
  • Disable: The OS must disable account identifiers (individuals, groups, roles, and devices) after 35 days of inactivity.
  • Limit: The OS must limit the number of concurrent sessions to ten for all accounts and/or account types.

 To directly address these security edicts, OneFS 9.5 adds the following account policy restriction controls:

Delay after failed login

  • After a failed login, OneFS enforces a configurable delay on subsequent logins on the same cluster node.
  • Only applicable to administrative logins (not protocol logins).

Disable inactive accounts

  • Disables an inactive account after a specified number of days.
  • Only applicable to local user accounts.
  • Cluster-wide.

Concurrent session limit

  • Limits the number of active sessions a user can have on a cluster node.
  • Only applicable to administrative logins.
  • Node-specific.

Architecture

OneFS provides a variety of access mechanisms for administering a cluster. These include SSH, serial console, WebUI, and platform API, all of which use different underlying access methods. The serial console and SSH are standard FreeBSD third-party applications and are accounted for per node, whereas the WebUI and pAPI use HTTP module extensions to facilitate access to the system and services and are accounted for cluster-wide. Before OneFS 9.5, there was no common mechanism to represent or account for sessions across these disparate applications.

Under the hood, the OneFS account security policy framework encompasses the following high-level architecture: 

 


With SSH, there’s no explicit or reliable “log-off” event sent to OneFS, beyond actually disconnecting the connection. As such, accounting for active sessions can be problematic and unreliable, especially when connections time out or unexpectedly disconnect. However, OneFS does include an accounting database that stores records of system activities like user login and logout, which can be queried to determine active SSH sessions. Each active SSH connection has an isi_ssh_d process owned by the account associated with it, and this information can be gathered via standard syscalls. OneFS enumerates the number of SSHD processes per account to calculate the total number of active established sessions. This value is then used as part of the total concurrent administrative sessions limit. Since SSH only supports user access through the system zone, there is no need for any zone-aware accounting.

The WebUI and platform API use JSON web tokens (JWTs) for authenticated sessions. OneFS stores the JWTs in the cluster-wide kvstore, and the access policy uses valid session tokens in the kvstore to account for active sessions when a user logs on through the WebUI or pAPI. When the user logs off, the associated token is removed, and a message is sent to the JWT service with an explicit log-off notification. If a session times out or disconnects, the JWT service will not get an event, but the tokens have a limited, short lifespan, and any expired tokens are purged from the list on a scheduled basis in conjunction with the JWT timer. OneFS enumerates the unique session IDs associated with each user's JWT tokens in the kvstore to get the number of active WebUI and pAPI sessions to use as part of the user's session limit check.

For serial console access accounting, the process table will have information when an STTY connection is active, and OneFS extrapolates user data from it to determine the session count, similar to SSH, using a syscall for process data. There is an accounting database that stores records of system activities like user login and logout, which is also queried for active console sessions. Serial console access is only from the system zone, so there is no need for zone-aware accounting.

An API call retrieves user session data from the process table and kvstore to calculate the number of active user sessions. As such, the checking and enforcement of session limits is performed in a similar manner to the verification of user privileges for SSH, serial console, or WebUI access.

Delaying failed login reconnections

OneFS 9.5 provides the ability to enforce a configurable delay period, specified in seconds: after every unsuccessful authentication attempt, the user is denied the ability to reconnect to the cluster until the configured delay period has passed. The login delay period is defined through the FailedLoginDelayTime global attribute and, by default, OneFS is configured for no delay, with a FailedLoginDelayTime value of 0. When a cluster is placed into hardened mode with the STIG policy enacted, the delay value is automatically set to 4 seconds. Note that the delay happens in the lsass client, so the authentication service is not affected.

The configured failed login delay time limit can be viewed with following CLI command:

# isi auth settings global view
                            Send NTLMv2: No
                      Space Replacement:
                              Workgroup: WORKGROUP
               Provider Hostname Lookup: disabled
                          Alloc Retries: 5
                 User Object Cache Size: 47.68M
                       On Disk Identity: native
                         RPC Block Time: Now
                       RPC Max Requests: 64
                            RPC Timeout: 30s
Default LDAP TLS Revocation Check Level: none
                   System GID Threshold: 80
                   System UID Threshold: 80
                         Min Mapped Rid: 2147483648
                              Group UID: 4294967292
                               Null GID: 4294967293
                               Null UID: 4294967293
                            Unknown GID: 4294967294
                            Unknown UID: 4294967294
                Failed Login Delay Time: Now
               Concurrent Session Limit: 0


Similarly, the following syntax will configure the failed login delay time to a value of 4 seconds:

# isi auth settings global modify --failed-login-delay-time 4s
# isi auth settings global view | grep -i delay
                Failed Login Delay Time: 4s

Note that when a cluster is placed into STIG hardening mode, the failed login delay time is automatically enforced; as noted above, the STIG-hardened value is 4 seconds.

The delay time after login failure can also be configured from the WebUI under Access > Settings > Global provider settings:


The valid range of the FailedLoginDelayTime global attribute is from 0 to 65535, and the delay time is limited to the same cluster node.

Note that this failed login delay is only applicable to administrative logins.

Disabling inactive accounts

In OneFS 9.5, any user account that has been inactive for a configurable duration can be automatically disabled. Administrative intervention is required to re-enable a deactivated user account. The last activity time of a user is determined by their previous logon, and a timer runs every midnight, during which all inactive accounts are disabled. If the last logon record for a user is unavailable or stale, the timestamp when the account was enabled is taken as their last activity instead. Similarly, if inactivity tracking is enabled after a user's last logon (or enable) time, the time at which tracking was enabled is used as the start of the inactivity period.
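A user's last recorded logon, which seeds this inactivity calculation, can be checked as follows:

# isi auth users view user1 | grep -i "last logon"
               Last Logon: -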

This feature is disabled by default in OneFS, and all users are exempt from inactivity tracking until it is configured otherwise. Individual accounts can also be exempted from this behavior through the user-specific DisableWhenInactive attribute. For example:

# isi auth user view user1 | grep -i inactive
   Disable When Inactive: Yes
# isi auth user modify user1 --disable-when-inactive 0
# isi auth user view user1 | grep -i inactive
   Disable When Inactive: No

If a cluster is put into STIG hardened mode, the value for the MaxInactivityDays parameter is automatically reconfigured to 35, meaning a user will be disabled after 35 days of inactivity. All the local users are removed from exemption when in STIG hardened mode.

Note that this functionality is limited to only the local provider and does not apply to file providers.

The inactive account disabling configuration can be viewed from the CLI with the following syntax. In this example, the MaxInactivityDays attribute is configured for 35 days:

# isi auth local view system
                    Name: System
                  Status: active
          Authentication: Yes
   Create Home Directory: Yes
 Home Directory Template: /ifs/home/%U
        Lockout Duration: Now
       Lockout Threshold: 0
          Lockout Window: Now
             Login Shell: /bin/zsh
            Machine Name:
        Min Password Age: Now
        Max Password Age: 4W
     Min Password Length: 15
    Password Prompt Time: 2W
     Password Complexity: -
 Password History Length: 0
  Password Chars Changed: 8
Password Percent Changed: 50
      Password Hash Type: NTHash
     Max Inactivity Days: 35

Inactive account disabling can also be configured from the WebUI under Access > Authentication providers > Local provider:


The valid range of the MaxInactivityDays parameter is from 0 to UINT_MAX. For example, the following CLI syntax configures user accounts to be disabled after 10 days of inactivity:

# isi auth local modify system --max-inactivity-days 10
# isi auth local view system | grep -i inactiv
     Max Inactivity Days: 10

Setting this value to 0 days will disable the feature:

# isi auth local modify system --max-inactivity-days 0
# isi auth local view system | grep -i inactiv
     Max Inactivity Days: 0

Inactive account disabling, as well as password expiry, can also be configured granularly, per user account. For example, user1 has the Disable When Inactive attribute set to No by default:

# isi auth users view user1
                    Name: user1
                      DN: CN=user1,CN=Users,DC=GLADOS
              DNS Domain: -
                  Domain: GLADOS
                Provider: lsa-local-provider:System
        Sam Account Name: user1
                     UID: 2000
                     SID: S-1-5-21-1839173366-2940572996-2365153926-1000
                 Enabled: Yes
                 Expired: No
                  Expiry: -
                  Locked: No
                   Email: -
                   GECOS: -
           Generated GID: No
           Generated UID: No
           Generated UPN: Yes
           Primary Group
                          ID: GID:1800
                        Name: Isilon Users
          Home Directory: /ifs/home/user1
        Max Password Age: 4W
        Password Expired: No
         Password Expiry: 2023-06-15T17:45:55
       Password Last Set: 2023-05-18T17:45:55
        Password Expired: No
              Last Logon: -
                   Shell: /bin/zsh
                     UPN: user1@GLADOS
User Can Change Password: Yes
   Disable When Inactive: No


The following CLI command will activate the account inactivity disabling setting and enable password expiry for the user1 account:

# isi auth users modify user1 --disable-when-inactive Yes --password-expires Yes 

Inactive account disabling can also be configured from the WebUI under Access > Membership and roles > Users > Providers:

 

Limiting concurrent sessions

OneFS 9.5 can limit the number of administrative sessions active on a OneFS cluster node, and all WebUI, SSH, pAPI, and serial console sessions are accounted for when calculating the session limit. The SSH and console session count is node-local, whereas WebUI and pAPI sessions are tracked cluster-wide. As such, the formula used to calculate a node’s total active sessions is as follows:

Total active user sessions on a node = Total WebUI and pAPI sessions across the cluster + Total SSH and Console sessions on the node
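As a rough, manual approximation of the node-local SSH portion of this formula (not the mechanism OneFS itself uses), the per-user sshd process count can be tallied from the shell:

# ps -axo user,comm | grep sshd | awk '{print $1}' | sort | uniq -c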

This feature leverages the cluster-wide session management through JWT for calculating the total number of sessions on a cluster’s node. By default, OneFS 9.5 has no configured limit, and the Concurrent Session Limit parameter has a value of 0. For example:

# isi auth settings global view
                            Send NTLMv2: No
                      Space Replacement:
                              Workgroup: WORKGROUP
               Provider Hostname Lookup: disabled
                          Alloc Retries: 5
                 User Object Cache Size: 47.68M
                       On Disk Identity: native
                         RPC Block Time: Now
                       RPC Max Requests: 64
                            RPC Timeout: 30s
Default LDAP TLS Revocation Check Level: none
                   System GID Threshold: 80
                   System UID Threshold: 80
                         Min Mapped Rid: 2147483648
                              Group UID: 4294967292
                               Null GID: 4294967293
                               Null UID: 4294967293
                            Unknown GID: 4294967294
                            Unknown UID: 4294967294
                Failed Login Delay Time: Now
               Concurrent Session Limit: 0

The following CLI syntax will configure Concurrent Session Limit to a value of 5:

# isi auth settings global modify --concurrent-session-limit 5
# isi auth settings global view | grep -i concur
                Concurrent Session Limit: 5

Once the session limit has been exceeded, attempts to connect, in this case as root through SSH, will be met with the following Access denied error message:

login as: root
Keyboard-interactive authentication prompts from server:
| Password:
End of keyboard-interactive prompts from server                      
Access denied
password:

The concurrent sessions limit can also be configured from the WebUI under Access > Settings > Global provider settings:


However, when a cluster is put into STIG hardening mode, the concurrent session limit is automatically set to a maximum of 10 sessions.

Note that this maximum session limit is only applicable to administrative logins.

Performance

Disabling an account after a period of inactivity in OneFS requires a SQLite database update every time a user has successfully logged on to the OneFS cluster. After a successful logon, the time to logon is recorded in the database, which is later used to compute the inactivity period.

Inactivity tracking is disabled by default in OneFS 9.5 but can be easily enabled by configuring the MaxInactivityDays attribute to a non-zero value. In cases where inactivity tracking is enabled and many users are not exempt from it, a large number of logons within a short period of time can generate a significant volume of SQLite database requests. However, OneFS consolidates multiple database updates during user logon into a single commit to minimize the overall load.

Troubleshooting

When it comes to troubleshooting OneFS account security policy configurations, the main log files to check are:

  • /var/log/lsassd.log
  • /var/log/messages
  • /var/log/isi_papi_d.log

For additional reporting detail, debug level logging can be enabled on the lsassd.log file with the following CLI command:

# /usr/likewise/bin/lwsm set-log-level lsass - debug

When finished, logging can be returned to the regular error level:

# /usr/likewise/bin/lwsm set-log-level lsass - error


Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Restricted Shell—Log Viewing and Recovery

Nick Trimbee Nick Trimbee

Tue, 27 Jun 2023 20:37:27 -0000

|

Read Time: 0 minutes

Complementary to the restricted shell itself, which was covered in the previous article in this series, OneFS 9.5 also sees the addition of a new log viewer, plus a recovery shell option.

 

The new isi_log_access CLI utility enables an SSH user to read, page, and query the log files in the /var/log directory. The ability to run this tool is governed by the user’s role being granted the ISI_PRIV_SYS_SUPPORT role-based access control (RBAC) privilege.

OneFS RBAC is used to explicitly limit who has access to the range of cluster configurations and operations. This granular control allows for crafting of administrative roles, which can create and manage the various OneFS core components and data services, isolating each to specific security roles or to admin only, and so on.

In this case, a cluster security administrator selects the access zone, creates a zone-aware role within it, assigns the ISI_PRIV_SYS_SUPPORT privileges for isi_log_access use, and then assigns users to the role.

Note that the integrated OneFS AuditAdmin RBAC role does not contain the ISI_PRIV_SYS_SUPPORT privilege by default. Also, the integrated RBAC roles cannot be reconfigured:

# isi auth roles modify AuditAdmin --add-priv=ISI_PRIV_SYS_SUPPORT
The privileges of built-in role AuditAdmin cannot be modified

Therefore, the ISI_PRIV_SYS_SUPPORT privilege has to be added to a custom role.

For example, the following CLI syntax adds the user usr_admin_restricted to the rl_ssh role and adds the privilege ISI_PRIV_SYS_SUPPORT to the rl_ssh role:

# isi auth roles modify rl_ssh --add-user=usr_admin_restricted
# isi auth roles modify rl_ssh --add-priv=ISI_PRIV_SYS_SUPPORT
# isi auth roles view rl_ssh
        Name: rl_ssh
 Description: -
     Members: usr_ssh_restricted
               usr_admin_restricted
  Privileges
              ID: ISI_PRIV_LOGIN_SSH
      Permission: r
             ID: ISI_PRIV_SYS_SUPPORT
      Permission: r

The usr_admin_restricted user could also be added to the AuditAdmin role:

# isi auth roles modify AuditAdmin --add-user=usr_admin_restricted
# isi auth roles view AuditAdmin | grep -i member
     Members: usr_admin_restricted

The isi_log_access tool supports the following command options and arguments:

  • --grep: Match a pattern against the file and display the result on stdout.
  • --help: Display the command description and usage message.
  • --list: List all the files in the /var/log tree.
  • --less: Display the file on stdout with a pager in secure_mode.
  • --more: Display the file on stdout with a pager in secure_mode.
  • --view: Display the file on stdout.
  • --watch: Display the end of the file and new content as it is written.
  • --zgrep: Match a pattern against the unzipped file contents and display the result on stdout.
  • --zview: Display an unzipped version of the file on stdout.

Here, the u_admin_restricted user logs in via SSH and runs the isi_log_access utility to list the /var/log/messages log file:

# ssh u_admin_restricted@10.246.178.121
 (u_admin_restricted@10.246.178.121) 
 Password:
 Last login: Wed May  3 18:02:18 2023 from 10.246.159.107
 Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved.
 Copyright (c) 1992-2018 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
PowerScale OneFS 9.5.0.0
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
 # isi_log_access --list
 LAST MODIFICATION TIME         SIZE       FILE
 Mon Apr 10 14:22:18 2023       56         alert.log
 Fri May  5 00:30:00 2023       62         all.log
 Fri May  5 00:30:00 2023       99         all.log.0.gz
 Fri May  5 00:00:00 2023       106        all.log.1.gz
 Thu May  4 00:30:00 2023       100        all.log.2.gz
 Thu May  4 00:00:00 2023       107        all.log.3.gz
 Wed May  3 00:30:00 2023       99         all.log.4.gz
 Wed May  3 00:00:00 2023       107        all.log.5.gz
 Tue May  2 00:30:00 2023       100        all.log.6.gz
 Mon Apr 10 14:22:18 2023       56         audit_config.log
 Mon Apr 10 14:22:18 2023       56         audit_protocol.log
 Fri May  5 17:23:53 2023       82064      auth.log
 Sat Apr 22 12:09:31 2023       10750      auth.log.0.gz
 Mon Apr 10 15:31:36 2023       0          bam.log
 Mon Apr 10 14:22:18 2023       56         boxend.log
 Mon Apr 10 14:22:18 2023       56         bwt.log
 Mon Apr 10 14:22:18 2023       56         cloud_interface.log
 Mon Apr 10 14:22:18 2023       56         console.log
 Fri May  5 18:20:32 2023       23769      cron
 Fri May  5 15:30:00 2023       8803       cron.0.gz
 Fri May  5 03:10:00 2023       9013       cron.1.gz
 Thu May  4 15:00:00 2023       8847       cron.2.gz
 Fri May  5 03:01:02 2023       3012       daily.log
 Fri May  5 00:30:00 2023       101        daily.log.0.gz
 Fri May  5 00:00:00 2023       1201       daily.log.1.gz
 Thu May  4 00:30:00 2023       102        daily.log.2.gz
 Thu May  4 00:00:00 2023       1637       daily.log.3.gz
 Wed May  3 00:30:00 2023       101        daily.log.4.gz
 Wed May  3 00:00:00 2023       1200       daily.log.5.gz
 Tue May  2 00:30:00 2023       102        daily.log.6.gz
 Mon Apr 10 14:22:18 2023       56         debug.log
 Tue Apr 11 12:29:37 2023       3694       diskpools.log
 Fri May  5 03:01:00 2023       244566     dmesg.today
 Thu May  4 03:01:00 2023       244662     dmesg.yesterday
 Tue Apr 11 11:49:32 2023       788        drive_purposing.log
 Mon Apr 10 14:22:18 2023       56         ethmixer.log
 Mon Apr 10 14:22:18 2023       56         gssd.log
 Fri May  5 00:00:35 2023       41641      hardening.log
 Mon Apr 10 15:31:05 2023       17996      hardening_engine.log
 Mon Apr 10 14:22:18 2023       56         hdfs.log
 Fri May  5 15:51:28 2023       31359      hw_ata.log
 Fri May  5 15:51:28 2023       56527      hw_da.log
 Mon Apr 10 14:22:18 2023       56         hw_nvd.log
 Mon Apr 10 14:22:18 2023       56         idi.log

In addition to paging through an entire log file with the --more and --less flags, the isi_log_access utility can also be used to watch (that is, tail) a log. For example, the /var/log/messages log file:

% isi_log_access --watch messages
 2023-05-03T18:00:12.233916-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access', 'messages'], which returned 2.
 2023-05-03T18:00:23.759198-04:00 <1.5> h7001-2(id2) limited[68236]: Calling ['/usr/bin/isi_log_access'].
 2023-05-03T18:00:23.797928-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access'], which returned 0.
 2023-05-03T18:00:36.077093-04:00 <1.5> h7001-2(id2) limited[68236]: Calling ['/usr/bin/isi_log_access', '--help'].
 2023-05-03T18:00:36.119688-04:00 <1.5> h7001-2(id2) limited[68236]: Called ['/usr/bin/isi_log_access', '--help'], which returned 0.
 2023-05-03T18:02:14.545070-04:00 <1.5> h7001-2(id2) limited[68236]: Command not in list of allowed commands.
 2023-05-03T18:02:50.384665-04:00 <1.5> h7001-2(id2) limited[68594]: Calling ['/usr/bin/isi_log_access', '--list'].
 2023-05-03T18:02:50.440518-04:00 <1.5> h7001-2(id2) limited[68594]: Called ['/usr/bin/isi_log_access', '--list'], which returned 0.
 2023-05-03T18:03:13.362411-04:00 <1.5> h7001-2(id2) limited[68594]: Command not in list of allowed commands.
 2023-05-03T18:03:52.107538-04:00 <1.5> h7001-2(id2) limited[68738]: Calling ['/usr/bin/isi_log_access', '--watch', 'messages'].

As expected, the last few lines of the messages log file are displayed. These log entries include the command audit entries for the u_admin_restricted user running the isi_log_access utility with the --help, --list, and --watch arguments.

The isi_log_access utility also allows zipped log files to be read (--zview) or searched (--zgrep) without uncompressing them. For example, to find all the usr_admin entries in the zipped vmlog.0.gz file:

# isi_log_access --zgrep usr_admin vmlog.0.gz
0.0 64468 usr_admin_restricted /usr/local/bin/zsh 
    0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8)
    0.0 64468 usr_admin_restricted /usr/local/bin/zsh
    0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8)
    0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd)
    0.0 64331 root               sshd: usr_admin_restricted [priv] (sshd)
    0.0 64468 usr_admin_restricted /usr/local/bin/zsh
    0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8)
    0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd)
    0.0 64331 root               sshd: usr_admin_restricted [priv] (sshd)
    0.0 64468 usr_admin_restricted /usr/local/bin/zsh
    0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8)
    0.0 64342 usr_admin_restricted sshd: usr_admin_restricted@pts/3 (sshd)
    0.0 64331 root               sshd: usr_admin_restricted [priv] (sshd)
    0.0 64468 usr_admin_restricted /usr/local/bin/zsh
    0.0 64346 usr_admin_restricted python /usr/local/restricted_shell/bin/restricted_shell.py (python3.8)
    0.0 64342 usr_admin_restricted sshd: u_admin_restricted@pts/3 (sshd)
    0.0 64331 root               sshd: usr_admin_restricted [priv] (sshd)

OneFS recovery shell

The purpose of the recovery shell is to allow a restricted shell user to access a regular UNIX shell and its associated command set, if needed. As such, the recovery shell is primarily designed and intended for reactive cluster recovery operations and other unforeseen support issues. Note that the isi_recovery_shell CLI command can only be run, and the recovery shell entered, from within the restricted shell.

The ISI_PRIV_RECOVERY_SHELL privilege is required for a user to elevate their shell from restricted to recovery. The following syntax can be used to add this privilege to a role, in this case the rl_ssh role:

% isi auth roles modify rl_ssh --add-priv=ISI_PRIV_RECOVERY_SHELL
% isi auth roles view rl_ssh
        Name: rl_ssh
 Description: -
     Members: usr_ssh_restricted
              usr_admin_restricted
  Privileges
             ID: ISI_PRIV_LOGIN_SSH
     Permission: r
             ID: ISI_PRIV_SYS_SUPPORT
     Permission: r
             ID: ISI_PRIV_RECOVERY_SHELL
     Permission: r

However, note that the --restricted-shell-enabled security parameter must be set to true before a user with the ISI_PRIV_RECOVERY_SHELL privilege can enter the recovery shell. For example:

% isi security settings view | grep -i restr
Restricted shell Enabled: No
% isi security settings modify --restricted-shell-enabled=true
% isi security settings view | grep -i restr
Restricted shell Enabled: Yes

The restricted shell user must enter the cluster’s root password to successfully enter the recovery shell. For example:

% isi_recovery_shell -h
 Description:
         This command is used to enter the Recovery shell i.e. normal zsh shell from the PowerScale Restricted shell. This command is supported only in the PowerScale Restricted shell.
Required Privilege:
         ISI_PRIV_RECOVERY_SHELL
Usage:
         isi_recovery_shell
            [{--help | -h}]

If the root password is entered incorrectly, the following error is displayed:

% isi_recovery_shell
 Enter 'root' credentials to enter the Recovery shell
 Password:
 Invalid credentials.
 isi_recovery_shell: PAM Auth Failed

A successful recovery shell launch is as follows:

$ ssh u_admin_restricted@10.246.178.121
 (u_admin_restricted@10.246.178.121) Password:
 Last login: Thu May  4 17:26:10 2023 from 10.246.159.107
 Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved.
 Copyright (c) 1992-2018 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
PowerScale OneFS 9.5.0.0
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
% isi_recovery_shell
 Enter 'root' credentials to enter the Recovery shell
 Password:
 %

At this point, regular shell/UNIX commands (including the vi editor) are available again:

% whoami
 u_admin_restricted
% pwd
 /ifs/home/u_admin_restricted
 % top | head -n 10
 last pid: 65044;  load averages:  0.12,  0.24,  0.29  up 24+04:17:23    18:38:39
 118 processes: 1 running, 117 sleeping
 CPU:  0.1% user,  0.0% nice,  0.9% system,  0.1% interrupt, 98.9% idle
 Mem: 233M Active, 19G Inact, 2152K Laundry, 137G Wired, 60G Buf, 13G Free
 Swap:
   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
  3955 root          1 -22  r30    50M    14M select  24 142:28   0.54% isi_drive_d
  5715 root         20  20    0   231M    69M kqread   5  55:53   0.15% isi_stats_d
  3864 root         14  20    0    81M    21M kqread  16 133:02   0.10% isi_mcp

The environment specifics of the recovery shell (zsh) for the u_admin_restricted user are reported as follows:

% printenv $SHELL
 _=/usr/bin/printenv
 PAGER=less
 SAVEHIST=2000
 HISTFILE=/ifs/home/u_admin_restricted/.zsh_history
 HISTSIZE=1000
 OLDPWD=/ifs/home/u_admin_restricted
 PWD=/ifs/home/u_admin_restricted
 SHLVL=1
 LOGNAME=u_admin_restricted
 HOME=/ifs/home/u_admin_restricted
 RECOVERY_SHELL=TRUE
 TERM=xterm
 PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin

Shell logic conditions and scripts can be run. For example:

% while true; do uptime; sleep 5; done
  5:47PM  up 24 days,  3:26, 5 users, load averages: 0.44, 0.38, 0.34
  5:47PM  up 24 days,  3:26, 5 users, load averages: 0.41, 0.38, 0.34

ISI commands can be run, and cluster management tasks can be performed.

% isi hardening list
 Name  Description                       Status
 ---------------------------------------------------
 STIG  Enable all STIG security settings Not Applied
 ---------------------------------------------------
 Total: 1

For example, creating and deleting a snapshot:

% isi snap snap list
 ID Name Path
 ------------
 ------------
 Total: 0
% isi snap snap create /ifs/data
% isi snap snap list
 ID   Name  Path
 --------------------
 2    s2    /ifs/data
 --------------------
 Total: 1
% isi snap snap delete 2
 Are you sure? (yes/[no]): yes

Sysctls can be read and managed:

% sysctl efs.gmp.group
efs.gmp.group: <10539754> (4) :{ 1:0-14, 2:0-12,14,17, 3-4:0-14, smb: 1-4, nfs: 1-4, all_enabled_protocols: 1-4, isi_cbind_d: 1-4, lsass: 1-4, external_connectivity: 1-4 }

The restricted shell can be disabled:

% isi security settings modify --restricted-shell-enabled=false
% isi security settings view | grep -i restr
 Restricted shell Enabled: No

However, the isi underscore (isi_*) commands, such as isi_for_array, are still not permitted to run:

% /usr/bin/isi_for_array -s uptime
 zsh: permission denied: /usr/bin/isi_for_array
% isi_gather_info
 zsh: permission denied: isi_gather_info
% isi_cstats
 isi_cstats: Syscall ifs_prefetch_lin() failed: Operation not permitted

When finished, the user can either end the session entirely with the logout command or quit the recovery shell through exit and return to the restricted shell:

% exit
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
 %

 
Author: Nick Trimbee

 

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Restricted Shell

Nick Trimbee Nick Trimbee

Tue, 27 Jun 2023 19:59:59 -0000

|

Read Time: 0 minutes

In contrast to many other storage appliances, PowerScale has always included an extensive, rich, and capable command line, drawing from its FreeBSD heritage. As such, it incorporates a choice of full UNIX shells (such as zsh), the ability to script in a variety of languages (Perl, Python, and so on), full data access, a variety of system and network management and monitoring tools, plus the comprehensive OneFS isi command set. However, what is a bonus for usability can also present a risk from a security point of view.

With this in mind, among the bevy of security features that debuted in OneFS 9.5 release is the addition of a restricted shell for the CLI. This shell heavily curtails access to cluster command line utilities, eliminating areas where commands and scripts could be run and files modified maliciously and unaudited.

The new restricted shell can help both public and private sector organizations to meet a variety of regulatory compliance and audit requirements, in addition to reducing the security threat surface when OneFS is administered.

 

Written in Python, the restricted shell constrains users to a tight subset of the commands available in the regular OneFS command line shells, plus a couple of additional utilities. These include:

CLI utility

Description

ISI commands

The isi or “isi space” commands. These include commands such as isi status, and so on. For the full set of isi commands, run isi --help.

Shell commands

The supported shell commands include clear, exit, logout, and CTRL+D.

Log access

The isi_log_access tool can be used if the user possesses the ISI_PRIV_SYS_SUPPORT privilege.

Recovery shell

The recovery shell isi_recovery_shell can be used if the user possesses the ISI_PRIV_RECOVERY_SHELL privilege and the security setting Restricted shell Enabled is configured to true.

For a OneFS CLI command to be audited, its handler needs to call through the platform API (pAPI). This occurs with the regular isi commands but not necessarily with the “isi underscore” commands such as isi_for_array, and so on. While some of these isi_* commands write to log files, there is no uniform or consistent auditing or logging.

On the data access side, /ifs file system auditing works through the various OneFS protocol heads (NFS, SMB, S3, and so on). So if the CLI is used with an unrestricted shell to directly access and modify /ifs, any access and changes are unrecorded and unaudited.

In OneFS 9.5, the new restricted shell is included in the permitted shells list (/etc/shells):

# grep -i restr /etc/shells
/usr/local/restricted_shell/bin/restricted_shell.py

It can be easily set for a user through the CLI. For example, to configure the admin account to use the restricted shell, instead of its default of ZSH:

# isi auth users view admin | grep -i shell
                   Shell: /usr/local/bin/zsh
# isi auth users modify admin --shell=/usr/local/restricted_shell/bin/restricted_shell.py
# isi auth users view admin | grep -i shell
                   Shell: /usr/local/restricted_shell/bin/restricted_shell.py

OneFS can also be configured to limit non-root users to just the restricted shell:

# isi security settings view | grep -i restr
  Restricted shell Enabled: No
# isi security settings modify --restricted-shell-enabled=true
# isi security settings view | grep -i restr
  Restricted shell Enabled: Yes

The underlying configuration changes to support this include only allowing non-root users with approved shells in /etc/shells to log in through the console or SSH and having just /usr/local/restricted_shell/bin/restricted_shell.py in the /etc/shells config file.

Note that no users’ shells are changed when the configuration commands above are enacted. If users are intended to have shell access, their login shell must be changed before they can log in. Users will also require the privileges ISI_PRIV_LOGIN_SSH and/or ISI_PRIV_LOGIN_CONSOLE to be able to log in through SSH and the console, respectively.
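
For example, a console-access role can be granted in much the same way as the SSH role shown later in this article. This is a minimal sketch; the rl_console role name mirrors the role that appears in the roles listing further down:

# isi auth roles create rl_console
# isi auth roles modify rl_console --add-priv=ISI_PRIV_LOGIN_CONSOLE
# isi auth roles modify rl_console --add-user=usr_admin_restricted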

While the WebUI in OneFS 9.5 does not provide a secure shell configuration page, the restricted shell can be enabled from the platform API, in addition to the CLI. The pAPI security settings now include a restricted_shell_enabled key, which can be enabled by setting to value=1, from its default of 0.
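
As a minimal sketch, the key could be set with curl along the following lines. The endpoint path and version prefix shown here are assumptions, so check the pAPI reference for your release:

# curl -k -u <user> -X PUT -H "Content-Type: application/json" -d '{"restricted_shell_enabled": 1}' https://<cluster_ip>:8080/platform/16/security/settings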

Be aware that, upon configuring a OneFS 9.5 cluster to run in hardened mode with the STIG profile (that is, isi hardening enable STIG), the restricted-shell-enabled security setting is automatically set to true. This means that only root and users with ISI_PRIV_LOGIN_SSH and/or ISI_PRIV_LOGIN_CONSOLE privileges and the restricted shell as their shell will be permitted to log in to the cluster. We will focus on OneFS security hardening in a future article.

So let’s take a look at some examples of the restricted shell’s configuration and operation. 

First, we log in as the admin user and modify the file and local auth provider password hash types to the more secure SHA512 from their default value of NTHash:

# ssh 10.244.34.34 -l admin
# isi auth file view System | grep -i hash
     Password Hash Type: NTHash
# isi auth local view System | grep -i hash
      Password Hash Type: NTHash
# isi auth file modify System --password-hash-type=SHA512
# isi auth local modify System --password-hash-type=SHA512

Note that a cluster’s default user admin uses role-based access control (RBAC), whereas root does not. As such, the root account should ideally be used as infrequently as possible and, ideally, considered solely as the account of last resort.

Next, the admin and root passwords are changed to generate new passwords using the SHA512 hash:

# isi auth users change-password root
# isi auth users change-password admin

An rl_ssh role is created and the SSH access privilege is added to it:

# isi auth roles create rl_ssh
# isi auth roles modify rl_ssh --add-priv=ISI_PRIV_LOGIN_SSH

Then a regular user (usr_ssh_restricted) and an admin user (usr_admin_restricted) are created with restricted shell privileges:

# isi auth users create usr_ssh_restricted --shell=/usr/local/restricted_shell/bin/restricted_shell.py --set-password
# isi auth users create usr_admin_restricted --shell=/usr/local/restricted_shell/bin/restricted_shell.py --set-password

We then assign roles to the new users. For the restricted SSH user, we add to our newly created rl_ssh role:

# isi auth roles modify rl_ssh --add-user=usr_ssh_restricted

The admin user is then added to the security admin and the system admin roles:

# isi auth roles modify SecurityAdmin --add-user=usr_admin_restricted
# isi auth roles modify SystemAdmin --add-user=usr_admin_restricted

Next, we connect to the cluster through SSH and authenticate as the usr_ssh_restricted user:

$ ssh usr_ssh_restricted@10.246.178.121
 (usr_ssh_restricted@10.246.178.121) Password:
 Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved.
 Copyright (c) 1992-2018 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
 PowerScale OneFS 9.5.0.0
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
%

This account has no cluster RBAC privileges beyond SSH access, so it cannot run the various isi commands. For example, attempting to run isi status returns no data and, instead, warns of the need for event, job engine, and statistics privileges:

% isi status
Cluster Name: h7001
 __
 *** Capacity and health information require ***
 ***   the privilege: ISI_PRIV_STATISTICS.   ***
Critical Events:
*** Requires the privilege: ISI_PRIV_EVENT. ***
Cluster Job Status:
 __
*** Requires the privilege: ISI_PRIV_JOB_ENGINE. ***
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
%

Similarly, standard UNIX shell commands, such as pwd and whoami, are also prohibited:

% pwd
Allowed commands are
        clear ...
        isi ...
        isi_recovery_shell ...
        isi_log_access ...
        exit
        logout
% whoami
Allowed commands are
        clear ...
        isi ...
        isi_recovery_shell ...
        isi_log_access ...
        exit
        logout


Indeed, without additional OneFS RBAC privileges, the only commands the usr_ssh_restricted user can actually run in the restricted shell are clear, exit, and logout.

Note that the restricted shell automatically terminates a session after a short period of inactivity.

Next, we log in with the usr_admin_restricted account:

$ ssh usr_admin_restricted@10.246.178.121
(usr_admin_restricted@10.246.178.121) Password:
Copyright (c) 2001-2023 Dell Inc. or its subsidiaries. All Rights Reserved.
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
PowerScale OneFS 9.5.0.0
Allowed commands are
         clear ...
         isi ...
         isi_recovery_shell ...
         isi_log_access ...
         exit
         logout
 %

The isi commands now work because the user has the SecurityAdmin and SystemAdmin roles and privileges:

% isi auth roles list
Name
---------------
AuditAdmin
BackupAdmin
BasicUserRole
SecurityAdmin
StatisticsAdmin
SystemAdmin
VMwareAdmin
rl_console
rl_ssh
---------------
Total: 9
Allowed commands are
        clear ...
        isi ...
        isi_recovery_shell ...
        isi_log_access ...
        exit
        logout
% isi auth users view usr_admin_restricted
                    Name: usr_admin_restricted
                      DN: CN=usr_admin_restricted,CN=Users,DC=H7001
              DNS Domain: -
                  Domain: H7001
                Provider: lsa-local-provider:System
        Sam Account Name: usr_admin_restricted
                     UID: 2003
                     SID: S-1-5-21-3745626141-289409179-1286507423-1003
                 Enabled: Yes
                 Expired: No
                   Expiry: -
                  Locked: No
                   Email: -
                   GECOS: -
           Generated GID: No
           Generated UID: No
           Generated UPN: Yes
           Primary Group
                          ID: GID:1800
                        Name: Isilon Users
          Home Directory: /ifs/home/usr_admin_restricted
        Max Password Age: 4W
        Password Expired: No
         Password Expiry: 2023-05-30T17:16:53
       Password Last Set: 2023-05-02T17:16:53
        Password Expires: Yes
              Last Logon: -
                   Shell: /usr/local/restricted_shell/bin/restricted_shell.py
                     UPN: usr_admin_restricted@H7001
User Can Change Password: Yes
   Disable When Inactive: No
Allowed commands are
        clear ...
        isi ...
        isi_recovery_shell ...
        isi_log_access ...
        exit
        logout
%

However, the OneFS “isi underscore” commands are not supported under the restricted shell. For example, attempting to use the isi_for_array command:

% isi_for_array -s uname -a
Allowed commands are
        clear ...
        isi ...
        isi_recovery_shell ...
        isi_log_access ...
        exit
        logout

Note that, by default, the SecurityAdmin and SystemAdmin roles do not grant the usr_admin_restricted user the privileges needed to run the new isi_log_access and isi_recovery_shell commands.

In the next article in this series, we’ll take a look at these associated isi_log_access and isi_recovery_shell utilities that are also introduced in OneFS 9.5.

Author: Nick Trimbee

 

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS troubleshooting firewall

OneFS Firewall Management and Troubleshooting

Nick Trimbee Nick Trimbee

Thu, 25 May 2023 14:41:59 -0000

|

Read Time: 0 minutes

In the final blog in this series, we’ll focus on step five of the OneFS firewall provisioning process and turn our attention to some of the management and monitoring considerations and troubleshooting tools associated with the firewall.

One can manage and monitor the firewall in OneFS 9.5 using the CLI, platform API, or WebUI. Because data security threats come from inside an environment as well as out, such as from a rogue IT employee, a good practice is to constrain the use of all-powerful ‘root’, ‘administrator’, and ‘sudo’ accounts as much as possible. Instead of granting cluster admins full rights, a preferred approach is to use OneFS’ comprehensive authentication, authorization, and accounting framework.

OneFS role-based access control (RBAC) can be used to explicitly limit who has access to configure and monitor the firewall. A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members. For example, from the WebUI under Access > Membership and roles > Roles:

When these members log in to the cluster from a configuration interface (WebUI, platform API, or CLI), they inherit their assigned privileges.

Accessing the firewall from the WebUI and CLI in OneFS 9.5 requires the new ISI_PRIV_FIREWALL administration privilege.

# isi auth privileges -v | grep -i -A 2 firewall
         ID: ISI_PRIV_FIREWALL
Description: Configure network firewall
       Name: Firewall
   Category: Configuration
 Permission: w

This privilege can be assigned one of four permission levels for a role, including:

Permission Indicator

Description

-

No permission.

R

Read-only permission.

X

Execute permission.

W

Write permission.

By default, the built-in ‘SystemAdmin’ role is granted write privileges to administer the firewall, while the built-in ‘AuditAdmin’ role has read permission to view the firewall configuration and logs.

With OneFS RBAC, an enhanced security approach for a site could be to create two additional roles on a cluster, each with an increasing realm of trust. For example:

1.  An IT ops/helpdesk role with ‘read’ access to the firewall attributes would permit monitoring and troubleshooting the firewall, but no changes:

RBAC Role

Firewall Privilege

Permission

IT_Ops

ISI_PRIV_FIREWALL

Read

For example:

# isi auth roles create IT_Ops
# isi auth roles modify IT_Ops --add-priv-read ISI_PRIV_FIREWALL
# isi auth roles view IT_Ops | grep -A2 -i firewall
             ID: ISI_PRIV_FIREWALL
      Permission: r

2.  A Firewall Admin role would provide full firewall configuration and management rights:

RBAC Role

Firewall Privilege

Permission

FirewallAdmin

ISI_PRIV_FIREWALL

Write

For example:

# isi auth roles create FirewallAdmin
# isi auth roles modify FirewallAdmin --add-priv-write ISI_PRIV_FIREWALL
# isi auth roles view FirewallAdmin | grep -A2 -i firewall
             ID: ISI_PRIV_FIREWALL
     Permission: w

Note that when configuring OneFS RBAC, remember to remove the ‘ISI_PRIV_AUTH’ and ‘ISI_PRIV_ROLE’ privileges from all but the most trusted administrators.
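
For example, a minimal sketch assuming the --remove-priv counterpart to the --add-priv flag used elsewhere in this article, with <role_name> as an illustrative placeholder:

# isi auth roles modify <role_name> --remove-priv ISI_PRIV_AUTH
# isi auth roles modify <role_name> --remove-priv ISI_PRIV_ROLE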

Additionally, enterprise security management tools such as CyberArk can also be incorporated to manage authentication and access control holistically across an environment. These can be configured to change passwords on trusted accounts frequently (every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.

OneFS firewall limits

When working with the OneFS firewall, there are some upper bounds to the configurable attributes to keep in mind. These include:

Name

Value

Description

MAX_INTERFACES

500

Maximum number of L2 interfaces including Ethernet, VLAN, LAGG interfaces on a node.

MAX_SUBNETS

100

Maximum number of subnets within a OneFS cluster

MAX_POOLS

100

Maximum number of network pools within a OneFS cluster

DEFAULT_MAX_RULES

100

Default value of maximum rules within a firewall policy

MAX_RULES

200

Upper limit of maximum rules within a firewall policy

MAX_ACTIVE_RULES

5000

Upper limit of total active rules across the whole cluster

MAX_INACTIVE_POLICIES

200

Maximum number of policies that are not applied to any network subnet or pool. They will not be written into ipfw tables.

Firewall performance

Be aware that, while the OneFS firewall can greatly enhance the network security of a cluster, by nature of its packet inspection and filtering activity, it does come with a slight performance penalty (generally less than 5%).

Firewall and hardening mode

If OneFS STIG hardening (that is, from ‘isi hardening apply’) is applied to a cluster with the OneFS firewall disabled, the firewall will be automatically activated. On the other hand, if the firewall is already enabled, then there will be no change and it will remain active.
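
For example, the firewall state can be confirmed after hardening with the same global settings command used elsewhere in this series:

# isi network firewall settings view
Enabled: True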

Firewall and user-configurable ports

Some OneFS services allow the TCP/UDP ports on which the daemon listens to be changed. These include:

Service

CLI Command

Default Port

NDMP

isi ndmp settings global modify --port

10000

S3

isi s3 settings global modify --https-port

9020, 9021

SSH

isi ssh settings modify --port

22

The default ports for these services are already configured in the associated global policy rules. For example, for the S3 protocol:

# isi network firewall rules list | grep s3
default_pools_policy.rule_s3                  55     Firewall rule on s3 service                                                              allow
# isi network firewall rules view default_pools_policy.rule_s3
          ID: default_pools_policy.rule_s3
        Name: rule_s3
       Index: 55
 Description: Firewall rule on s3 service
    Protocol: TCP
   Dst Ports: 9020, 9021
Src Networks: -
   Src Ports: -
      Action: allow

Note that the global policies, or any custom policies, do not auto-update if these ports are reconfigured. This means that the firewall policies must be manually updated when changing ports. For example, if the NDMP port is changed from 10000 to 10001:

# isi ndmp settings global view
                       Service: False
                           Port: 10000
                            DMA: generic
          Bre Max Num Contexts: 64
MSB Context Retention Duration: 300
MSR Context Retention Duration: 600
        Stub File Open Timeout: 15
             Enable Redirector: False
              Enable Throttler: False
       Throttler CPU Threshold: 50
# isi ndmp settings global modify --port 10001
# isi ndmp settings global view | grep -i port
                           Port: 10001

The firewall’s NDMP rule port configuration must also be reset to 10001:

# isi network firewall rule list | grep ndmp
default_pools_policy.rule_ndmp                44     Firewall rule on ndmp service                                                            allow
# isi network firewall rule modify default_pools_policy.rule_ndmp --dst-ports 10001 --live
# isi network firewall rule view default_pools_policy.rule_ndmp | grep -i dst
   Dst Ports: 10001

Note that the --live flag is specified to enact this port change immediately.

Firewall and source-based routing

Under the hood, OneFS source-based routing (SBR) and the OneFS firewall both leverage ‘ipfw’. As such, SBR and the firewall share the single ipfw table in the kernel. However, the two features use separate ipfw table partitions.

This allows SBR and the firewall to be activated independently of each other. For example, even if the firewall is disabled, SBR can still be enabled and any configured SBR rules displayed as expected (that is, using ipfw set 0 show).
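
For example, to display any configured SBR rules regardless of whether the firewall itself is enabled:

# ipfw set 0 show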

Firewall and IPv6

Note that the firewall’s global default policies have a rule allowing ICMP6 by default. For IPv6 enabled networks, ICMP6 is critical for the functioning of NDP (Neighbor Discovery Protocol). As such, when creating custom firewall policies and rules for IPv6-enabled network subnets/pools, be sure to add a rule allowing ICMP6 to support NDP. As discussed in a previous blog, an alternative (and potentially easier) approach is to clone a global policy to a new one and just customize its ruleset instead.
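
For example, a minimal sketch of the clone approach, using the policy clone syntax covered earlier in this series (the fw_ipv6 policy name and pool are illustrative):

# isi network firewall policies clone default_pools_policy fw_ipv6
# isi network firewall policies modify fw_ipv6 --add-pools <ipv6_pool>

Because the clone is an exact copy of the default policy, it inherits the ICMP6 rule, so NDP continues to function on any IPv6 pools the new policy is applied to.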

Firewall and FTP

The OneFS FTP service can work in two modes: Active and Passive. Passive mode is the default, where FTP data connections are created on top of random ephemeral ports. However, because the OneFS firewall requires fixed ports to operate, it only supports the FTP service in Active mode. Attempts to enable the firewall with FTP running in Passive mode will generate the following warning:

# isi ftp settings view | grep -i active
          Active Mode: No
# isi network firewall settings modify --enabled yes
FTP service is running in Passive mode. Enabling network firewall will lead to FTP clients having their connections blocked. To avoid this, please enable FTP active mode and ensure clients are configured in active mode before retrying. Are you sure you want to proceed and enable network firewall? (yes/[no]):

To activate the OneFS firewall in conjunction with the FTP service, first ensure that the FTP service is running in Active mode before enabling the firewall. For example:

# isi ftp settings view | grep -i enable
  FTP Service Enabled: Yes
# isi ftp settings view | grep -i active
          Active Mode: No
# isi ftp settings modify --active-mode true
# isi ftp settings view | grep -i active
          Active Mode: Yes
# isi network firewall settings modify --enabled yes

Note: Verify FTP active mode support and/or firewall settings on the client side, too.

Firewall monitoring and troubleshooting

When it comes to monitoring the OneFS firewall, the following logfiles and utilities provide a variety of information and are a good source to start investigating an issue:

Utility

Description

/var/log/isi_firewall_d.log

Main OneFS firewall log file, which includes information from firewall daemon.

/var/log/isi_papi_d.log

Logfile for the platform API, including firewall-related handlers.

isi_gconfig -t firewall

CLI command that displays all firewall configuration info.

ipfw show

CLI command that displays the ipfw table residing in the FreeBSD kernel.

Note that the preceding files and command output are automatically included in logsets generated by the ‘isi_gather_info’ data collection tool.

You can run the isi_gconfig command with the ‘-q’ flag to identify any values that are not at their default settings. For example, the stock (default) isi_firewall_d gconfig context will not report any configuration entries:

# isi_gconfig -q -t firewall
[root] {version:1}

The firewall can also be run in the foreground for additional active rule reporting and debug output. For example, first shut down the isi_firewall_d service:

# isi services -a isi_firewall_d disable
The service 'isi_firewall_d' has been disabled.

Next, start up the firewall with the ‘-f’ flag.

# isi_firewall_d -f
Acquiring kevents for flxconfig
Acquiring kevents for nodeinfo
Acquiring kevents for firewall config
Initialize the firewall library
Initialize the ipfw set
ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead.
cmd:/sbin/ipfw set enable 0 normal termination, exit code:0
isi_firewall_d is now running
Loaded master FlexNet config (rev:312)
Update the local firewall with changed files: flx_config, Node info, Firewall config
Start to update the firewall rule...
flx_config version changed!                              latest_flx_config_revision: new:312, orig:0
node_info version changed!                               latest_node_info_revision: new:1, orig:0
firewall gconfig version changed!                                latest_fw_gconfig_revision: new:17, orig:0
Start to update the firewall rule for firewall configuration (gconfig)
Start to handle the firewall configure (gconfig)
Handle the firewall policy default_pools_policy
ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead.
32043 allow tcp from any to any 10000 in
cmd:/sbin/ipfw add 32043 set 8 allow TCP from any  to any 10000 in  normal termination, exit code:0
ipfw: Rule added by ipfw is for temporary use and will be auto flushed soon. Use isi firewall instead.
32044 allow tcp from any to any 389,636 in
cmd:/sbin/ipfw add 32044 set 8 allow TCP from any  to any 389,636 in  normal termination, exit code:0
Snip...

If the OneFS firewall is enabled and some network traffic is blocked, either this or the ipfw show CLI command will often provide the first clues.

Please note that the ipfw command should NEVER be used to modify the OneFS firewall table!

For example, say a rule is added to the default pools policy denying traffic on port 9876 from all source networks (0.0.0.0/0):

# isi network firewall rules create default_pools_policy.rule_9876 --index=100 --dst-ports 9876 --src-networks 0.0.0.0/0 --action deny --live
# isi network firewall rules view default_pools_policy.rule_9876
          ID: default_pools_policy.rule_9876
        Name: rule_9876
       Index: 100
 Description:
    Protocol: ALL
   Dst Ports: 9876
Src Networks: 0.0.0.0/0
   Src Ports: -
      Action: deny

Running ipfw show and grepping for the port will show this new rule:

# ipfw show | grep 9876
32099            0               0 deny ip from any to any 9876 in

The ipfw show command output also reports the statistics of how many IP packets have matched each rule. This can be incredibly useful when investigating firewall issues. For example, a telnet session is initiated to the cluster on port 9876 from a client:

# telnet 10.224.127.8 9876
Trying 10.224.127.8...
telnet: connect to address 10.224.127.8: Operation timed out
telnet: Unable to connect to remote host

The connection attempt will time out because the port 9876 ‘deny’ rule will silently drop the packets. At the same time, the ipfw show command will increment its counter to report on the denied packets. For example:

# ipfw show | grep 9876
32099            9             540 deny ip from any to any 9876 in

If this behavior is not anticipated or desired, you can find the rule name by searching the rules list for the port number, in this case port 9876:

# isi network firewall rules list | grep 9876
default_pools_policy.rule_9876                100                                                                 deny

The offending rule can then be reverted to ‘allow’ traffic on port 9876:

# isi network firewall rules modify default_pools_policy.rule_9876 --action allow --live

Or easily deleted, if preferred:

# isi network firewall rules delete default_pools_policy.rule_9876 --live
Are you sure you want to delete firewall rule default_pools_policy.rule_9876? (yes/[no]): yes

Author: Nick Trimbee




Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Firewall Configuration–Part 2

Nick Trimbee Nick Trimbee

Wed, 17 May 2023 19:13:33 -0000

|

Read Time: 0 minutes

In the previous article in this OneFS firewall series, we reviewed the upgrade, activation, and policy selection components of the firewall provisioning process.

Now, we turn our attention to the firewall rule configuration step of the process.

As stated previously, role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. So, ensure that the user account that will be used to enable and configure the OneFS firewall belongs to a role with the ‘ISI_PRIV_FIREWALL’ write privilege.

4. Configuring Firewall Rules

When the desired policy is created, the next step is to configure the rules. Clearly, the first step here is to decide which ports and services need securing or opening, beyond the defaults.

The following CLI syntax returns a list of all the firewall’s default services, plus their respective ports, protocols, and aliases, sorted by ascending port number:

# isi network firewall services list
Service Name     Port  Protocol   Aliases
---------------------------------------------
ftp-data         20    TCP        -
ftp              21    TCP        -
ssh              22    TCP        -
smtp             25    TCP        -
dns              53    TCP        domain
                       UDP
http             80    TCP        www
                                  www-http
kerberos         88    TCP        kerberos-sec
                       UDP
rpcbind          111   TCP        portmapper
                       UDP        sunrpc
                                  rpc.bind
ntp              123   UDP        -
dcerpc           135   TCP        epmap
                       UDP        loc-srv
netbios-ns       137   UDP        -
netbios-dgm      138   UDP        -
netbios-ssn      139   UDP        -
snmp             161   UDP        -
snmptrap         162   UDP        snmp-trap
mountd           300   TCP        nfsmountd
                       UDP
statd            302   TCP        nfsstatd
                       UDP
lockd            304   TCP        nfslockd
                       UDP
nfsrquotad       305   TCP        -
                       UDP
nfsmgmtd         306   TCP        -
                       UDP
ldap             389   TCP        -
                       UDP
https            443   TCP        -
smb              445   TCP        microsoft-ds
hdfs-datanode    585   TCP        -
asf-rmcp         623   TCP        -
                       UDP
ldaps            636   TCP        sldap
asf-secure-rmcp  664   TCP        -
                       UDP
ftps-data        989   TCP        -
ftps             990   TCP        -
nfs              2049  TCP        nfsd
                       UDP
tcp-2097         2097  TCP        -
tcp-2098         2098  TCP        -
tcp-3148         3148  TCP        -
tcp-3149         3149  TCP        -
tcp-3268         3268  TCP        -
tcp-3269         3269  TCP        -
tcp-5667         5667  TCP        -
tcp-5668         5668  TCP        -
isi_ph_rpcd      6557  TCP        -
isi_dm_d         7722  TCP        -
hdfs-namenode    8020  TCP        -
isi_webui        8080  TCP        apache2
webhdfs          8082  TCP        -
tcp-8083         8083  TCP        -
ambari-handshake 8440  TCP        -
ambari-heartbeat 8441  TCP        -
tcp-8443         8443  TCP        -
tcp-8470         8470  TCP        -
s3-http          9020  TCP        -
s3-https         9021  TCP        -
isi_esrs_d       9443  TCP        -
ndmp             10000 TCP        -
cee              12228 TCP        -
nfsrdma          20049 TCP        -
                       UDP
tcp-28080        28080 TCP        -
---------------------------------------------
Total: 55

Similarly, the following CLI command generates a list of existing rules and their associated policies, sorted in alphabetical order. For example, to show the first five rules:

# isi network firewall rules list --limit 5
ID                                            Index  Description                                                   Action
--------------------------------------------------------------------------------------------------------------------------
default_pools_policy.rule_ambari_handshake    41     Firewall rule on ambari-handshake service                     allow
default_pools_policy.rule_ambari_heartbeat    42     Firewall rule on ambari-heartbeat service                     allow
default_pools_policy.rule_catalog_search_req  50     Firewall rule on service for global catalog search requests   allow
default_pools_policy.rule_cee                 52     Firewall rule on cee service                                  allow
default_pools_policy.rule_dcerpc_tcp          18     Firewall rule on dcerpc(TCP) service                          allow
--------------------------------------------------------------------------------------------------------------------------
Total: 5

Both the ‘isi network firewall rules list’ and the ‘isi network firewall services list’ commands also have a ‘-v’ verbose option, and can return their output in csv, list, table, or json formats with the ‘--format’ flag.
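
For example, a quick sketch assuming the --format flag behaves here as it does for other isi list commands:

# isi network firewall rules list --format json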

To view the detailed info for a given firewall rule, in this case the default SMB rule, use the following CLI syntax:

# isi network firewall rules view default_pools_policy.rule_smb
          ID: default_pools_policy.rule_smb
        Name: rule_smb
       Index: 3
 Description: Firewall rule on smb service
    Protocol: TCP
   Dst Ports: smb
Src Networks: -
   Src Ports: -
      Action: allow

Existing rules can be modified and new rules created and added into an existing firewall policy with the ‘isi network firewall rules create’ CLI syntax. Command options include:

Option

Description

--action

Allow, which means pass packets. Deny, which means silently drop packets. Reject, which means reply with an ICMP error code.

id

Specifies the ID of the new rule to create. The rule must be added to an existing policy. The ID can be up to 32 alphanumeric characters long and can include underscores or hyphens, but cannot include spaces or other punctuation. Specify the rule ID in the format <policy_name>.<rule_name>. The rule name must be unique in the policy.

--index

The rule index in the policy. The valid value is between 1 and 99. The lower value has the higher priority. If not specified, the rule is automatically assigned the next available index (before default rule 100).

--live

The live option must only be used when issuing a command to create, modify, or delete a rule in an active policy. Such changes take effect immediately on all network subnets and pools associated with this policy. Using the live option on a rule in an inactive policy is rejected, and an error message is returned.

--protocol

Specifies the protocol matched for the inbound packets. Available values are tcp, udp, icmp, and all. If not configured, the default protocol all is used.

--dst-ports

Specifies the network ports/services provided in the storage system, identified by destination port(s). The protocol specified by --protocol is applied on these destination ports.

--src-networks

Specifies one or more IP addresses with corresponding netmasks that are to be allowed by this firewall policy. The correct format for this parameter is address/netmask, similar to “192.0.2.128/25”. Separate multiple address/netmask pairs with commas. Use the value 0.0.0.0/0 for “any”.

--src-ports

Specifies the network ports/services provided in the storage system, identified by source port(s). The protocol specified by --protocol is applied on these source ports.

Note that, unlike for firewall policies, there is no provision for cloning individual rules.

The following CLI syntax can be used to create new firewall rules. For example, to add ‘allow’ rules for the HTTP and SSH protocols, plus a ‘deny’ rule for port TCP 9876, into firewall policy fw_test1:

# isi network firewall rules create  fw_test1.rule_http  --index 1 --dst-ports http --src-networks 10.20.30.0/24,20.30.40.0/24 --action allow
# isi network firewall rules create  fw_test1.rule_ssh  --index 2 --dst-ports ssh --src-networks 10.20.30.0/24,20.30.40.0/16 --action allow
# isi network firewall rules create fw_test1.rule_tcp_9876 --index 3 --protocol tcp --dst-ports 9876 --src-networks 10.20.30.0/24,20.30.40.0/24 --action deny

When a new rule is created in a policy, if the index value is not specified, it will automatically inherit the next available number in the series (such as index=4 in this case).

# isi network firewall rules create fw_test1.rule_2049 --protocol udp --dst-ports 2049 --src-networks 30.1.0.0/16 --action deny

For a more draconian approach, a ‘deny’ rule could be created using the match-everything ‘*’ wildcard for destination ports and a 0.0.0.0/0 network and mask, which would silently drop all traffic:

# isi network firewall rules create fw_test1.rule_1234 --index=100 --dst-ports * --src-networks 0.0.0.0/0 --action deny

When modifying existing firewall rules, use the following CLI syntax, in this case to change the source network of an HTTP allow rule (index 1) in firewall policy fw_test1:

# isi network firewall rules modify fw_test1.rule_http --index 1 --protocol all --dst-ports http --src-networks 10.1.0.0/16 --action allow

Or to modify an SSH rule (index 2) in firewall policy fw_test1, changing the action from ‘allow’ to ‘deny’:

# isi network firewall rules modify fw_test1.rule_ssh --index 2 --protocol tcp --dst-ports ssh --src-networks 10.1.0.0/16,20.2.0.0/16 --action deny

Also, to re-order the custom TCP 9876 rule from the earlier example, from index 3 to index 7 in firewall policy fw_test1:

# isi network firewall rules modify fw_test1.rule_tcp_9876 --index 7

Note that all rules at or after index 7 will have their index values incremented by one.
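
The result can be verified with the rule view syntax shown earlier (output abbreviated):

# isi network firewall rules view fw_test1.rule_tcp_9876 | grep -i index
       Index: 7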

When deleting a rule from a firewall policy, any rule reordering is handled automatically. If the policy has been applied to a network pool, the ‘--live’ option can be used to force the change to take effect immediately. For example, to delete the HTTP rule from the firewall policy ‘fw_test1’:

# isi network firewall rules delete fw_test1.rule_http --live

Firewall rules can also be created, modified, and deleted within a policy from the WebUI by navigating to Cluster management > Firewall Configuration > Firewall Policies. For example, to create a rule that permits SupportAssist and Secure Gateway traffic on the 10.219.0.0/16 network:

Once saved, the new rule is then displayed in the Firewall Configuration page:

5. Firewall management and monitoring.

In the next and final article in this series, we’ll turn our attention to managing, monitoring, and troubleshooting the OneFS firewall (Step 5).

Author: Nick Trimbee



Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Firewall Configuration—Part 1

Nick Trimbee Nick Trimbee

Tue, 02 May 2023 17:21:12 -0000

|

Read Time: 0 minutes

The new firewall in OneFS 9.5 enhances the security of the cluster and helps prevent unauthorized access to the storage system. When enabled, the default firewall configuration allows remote systems access to a specific set of default services for data, management, and inter-cluster interfaces (network pools).

The basic OneFS firewall provisioning process is as follows:

 

Note that role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. In addition to the ubiquitous root, the cluster’s built-in SystemAdmin role has write privileges to configure and administer the firewall.

1.  Upgrade cluster to OneFS 9.5.

First, to provision the firewall, the cluster must be running OneFS 9.5.

If you are upgrading from an earlier release, the OneFS 9.5 upgrade must be committed before enabling the firewall.

Also, be aware that configuration and management of the firewall in OneFS 9.5 requires the new ISI_PRIV_FIREWALL administration privilege. 

# isi auth privilege | grep -i firewall
ISI_PRIV_FIREWALL                   Configure network firewall

This privilege can be granted to a role with either read-only or read/write permissions. By default, the built-in SystemAdmin role is granted write privileges to administer the firewall:

# isi auth roles view SystemAdmin | grep -A2 -i firewall
             ID: ISI_PRIV_FIREWALL
     Permission: w

Additionally, the built-in AuditAdmin role has read permission to view the firewall configuration and logs, and so on:

# isi auth roles view AuditAdmin | grep -A2 -i firewall
             ID: ISI_PRIV_FIREWALL
     Permission: r

Ensure that the user account that will be used to enable and configure the OneFS firewall belongs to a role with the ISI_PRIV_FIREWALL write privilege.

2.  Activate firewall.

The OneFS firewall can be either enabled or disabled, with the latter as the default state. 

The following CLI syntax will display the firewall’s global status (in this case disabled, the default):

# isi network firewall settings view
Enabled: False

Firewall activation can be easily performed from the CLI as follows:

# isi network firewall settings modify --enabled true
# isi network firewall settings view
Enabled: True

Or from the WebUI under Cluster management > Firewall Configuration > Settings:

Note that the firewall is automatically enabled when STIG hardening is applied to a cluster.

3.  Select policies.

A cluster’s existing firewall policies can be easily viewed from the CLI with the following command:

# isi network firewall policies list
ID        Pools                    Subnets                   Rules
 -----------------------------------------------------------------------------
 fw_test1  groupnet0.subnet0.pool0  groupnet0.subnet1         test_rule1
 -----------------------------------------------------------------------------
 Total: 1

Or from the WebUI under Cluster management > Firewall Configuration > Firewall Policies:

The OneFS firewall offers four main strategies when it comes to selecting a firewall policy: 

  1. Retaining the default policy
  2. Reconfiguring the default policy
  3. Cloning the default policy and reconfiguring
  4. Creating a custom firewall policy

We’ll consider each of these strategies in order:

a.  Retaining the default policy

In many cases, the default OneFS firewall policy value provides acceptable protection for a security-conscious organization. In these instances, once the OneFS firewall has been enabled on a cluster, no further configuration is required, and the cluster administrators can move on to the management and monitoring phase.

By default, all front-end cluster interfaces (network pools) use the default firewall policy. While the default policy can be modified, be aware that it is global: any change to it will affect all network pools using the default policy.

The following table describes the default firewall policies that are assigned to each interface:

Policy

Description

Default pools policy

Contains rules for the inbound default ports for TCP and UDP services in OneFS

Default subnets policy

Contains rules for:

  • DNS port 53
  • ICMP
  • ICMP6

These can be viewed from the CLI as follows:

# isi network firewall policies view default_pools_policy
            ID: default_pools_policy
          Name: default_pools_policy
    Description: Default Firewall Pools Policy
Default Action: deny
      Max Rules: 100
          Pools: groupnet0.subnet0.pool0, groupnet0.subnet0.testpool1, groupnet0.subnet0.testpool2, groupnet0.subnet0.testpool3, groupnet0.subnet0.testpool4, groupnet0.subnet0.poolcava
        Subnets: -
          Rules: rule_ldap_tcp, rule_ldap_udp, rule_reserved_for_hw_tcp, rule_reserved_for_hw_udp, rule_isi_SyncIQ, rule_catalog_search_req, rule_lwswift, rule_session_transfer, rule_s3, rule_nfs_tcp, rule_nfs_udp, rule_smb, rule_hdfs_datanode, rule_nfsrdma_tcp, rule_nfsrdma_udp, rule_ftp_data, rule_ftps_data, rule_ftp, rule_ssh, rule_smtp, rule_http, rule_kerberos_tcp, rule_kerberos_udp, rule_rpcbind_tcp, rule_rpcbind_udp, rule_ntp, rule_dcerpc_tcp, rule_dcerpc_udp, rule_netbios_ns, rule_netbios_dgm, rule_netbios_ssn, rule_snmp, rule_snmptrap, rule_mountd_tcp, rule_mountd_udp, rule_statd_tcp, rule_statd_udp, rule_lockd_tcp, rule_lockd_udp, rule_nfsrquotad_tcp, rule_nfsrquotad_udp, rule_nfsmgmtd_tcp, rule_nfsmgmtd_udp, rule_https, rule_ldaps, rule_ftps, rule_hdfs_namenode, rule_isi_webui, rule_webhdfs, rule_ambari_handshake, rule_ambari_heartbeat, rule_isi_esrs_d, rule_ndmp, rule_isi_ph_rpcd, rule_cee, rule_icmp, rule_icmp6, rule_isi_dm_d
 # isi network firewall policies view default_subnets_policy
            ID: default_subnets_policy
          Name: default_subnets_policy
    Description: Default Firewall Subnets Policy
Default Action: deny
      Max Rules: 100
          Pools: -
        Subnets: groupnet0.subnet0
          Rules: rule_subnets_dns_tcp, rule_subnets_dns_udp, rule_icmp, rule_icmp6

Or from the WebUI under Cluster management > Firewall Configuration > Firewall Policies:

b.  Reconfiguring the default policy

Depending on an organization’s threat levels or security mandates, there may be a need to restrict access to certain additional IP addresses and/or management service protocols.

If the default policy is deemed insufficient, reconfiguring the default firewall policy can be a good option if only a small number of rule changes are required. The specifics of creating, modifying, and deleting individual firewall rules are covered in the next article in this series (step 4).

Note that if new rule changes behave unexpectedly, or firewall configuration generally goes awry, OneFS does provide a “get out of jail free” card. In a pinch, the global firewall policy can be quickly and easily restored to its default values. This can be achieved with the following CLI syntax:

# isi network firewall reset-global-policy
This command will reset the global firewall policies to the original system defaults. Are you sure you want to continue? (yes/[no]):

Alternatively, the default policy can also be easily reverted from the WebUI by clicking Reset default policies:

 c.  Cloning the default policy and reconfiguring

Another option is cloning, which can be useful when batch modification or a large number of changes to the current policy are required. By cloning the default firewall policy, an exact copy of the existing policy and its rules is generated, but with a new policy name. For example:

# isi network firewall policies clone default_pools_policy clone_default_pools_policy
# isi network firewall policies list | grep -i clone
clone_default_pools_policy -                           

Cloning can also be initiated from the WebUI under Firewall Configuration > Firewall Policies > More Actions > Clone Policy:

Enter a name for the clone in the Policy Name field in the pop-up window, and click Save:

 Once cloned, the policy can then be easily reconfigured to suit. For example, to modify the policy fw_test1 and change its default-action from deny-all to allow-all:

# isi network firewall policies modify fw_test1 --default-action allow-all

When modifying a firewall policy, you can use the --live CLI option to force it to take effect immediately. Note that the --live option is only valid when issuing a command to modify or delete an active custom policy, or to modify a default policy. Such changes will take effect immediately on all network subnets and pools associated with this policy. Using the --live option on an inactive policy will be rejected, and an error message returned.
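
For example, assuming the fw_test1 policy is already applied to a pool or subnet (and is therefore active), the earlier default-action change could be pushed out immediately:

# isi network firewall policies modify fw_test1 --default-action allow-all --live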

Options for creating or modifying a firewall policy include:

Option

Description

--default-action

Automatically adds one rule to deny all or allow all to the bottom of the rule set for the created policy (index = 100).

--max-rule-num

By default, each policy can contain a maximum of 100 rules (including one default rule), so a user can configure up to 99 rules. The maximum rule number can be expanded to a specified value, currently limited to 200 (allowing up to 199 configurable rules).

--add-subnets

Specify the network subnet(s) to add to the policy, separated by a comma.

--remove-subnets

Specify the network subnets to remove from the policy and fall back to the global policy.

--add-pools

Specify the network pool(s) to add to the policy, separated by a comma.

--remove-pools

Specify the network pools to remove from the policy and fall back to the global policy.

When you modify firewall policies, OneFS issues the following warning to verify the changes and help avoid the risk of a self-induced denial-of-service:   

# isi network firewall policies modify --pools groupnet0.subnet0.pool0 fw_test1
Changing the Firewall Policy associated with a subnet or pool may change the networks and/or services allowed to connect to OneFS. Please confirm you have selected the correct Firewall Policy and Subnets/Pools. Are you sure you want to continue? (yes/[no]): yes

Once again, having the following CLI command handy, plus console access to the cluster is always a prudent move:

# isi network firewall reset-global-policy

Note that adding network pools or subnets to a firewall policy causes the previous policy to be removed from them. Similarly, adding network pools or subnets to the global default policy will revert any custom policy configuration they might have. For example, to apply the firewall policy fw_test1 to IP pools groupnet0.subnet0.pool0 and groupnet0.subnet0.pool1:

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall
       Firewall Policy: default_pools_policy
# isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1
# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall
       Firewall Policy: fw_test1

Or to apply the firewall policy fw_test1 to the IP pool groupnet0.subnet0.pool0 and the subnet groupnet0.subnet0:

# isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0 --add-subnets groupnet0.subnet0
# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall
 Firewall Policy: fw_test1
# isi network subnets view groupnet0.subnet0 | grep -i firewall
 Firewall Policy: fw_test1

To reapply global policy at any time, either add the pools to the default policy:

# isi network firewall policies modify default_pools_policy --add-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1
# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall
 Firewall Policy: default_pools_policy
# isi network subnets view groupnet0.subnet1 | grep -i firewall
 Firewall Policy: default_subnets_policy

Or remove the pool from the custom policy:

# isi network firewall policies modify fw_test1 --remove-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1

You can also manage firewall policies on a network pool in the OneFS WebUI by going to Cluster configuration > Network configuration > External network > Edit pool details. For example:

 

Be aware that cloning is not limited to the default policy: clones can be made of any custom policy, too. For example:

# isi network firewall policies clone clone_default_pools_policy fw_test1

d.  Creating a custom firewall policy

Alternatively, a custom firewall policy can also be created from scratch. This can be accomplished from the CLI using the following syntax, in this case to create a firewall policy named fw_test1:

# isi network firewall policies create fw_test1 --default-action deny
# isi network firewall policies view fw_test1
            ID: fw_test1
          Name: fw_test1
   Description:
Default Action: deny
     Max Rules: 100
         Pools: -
       Subnets: -
         Rules: -

Note that if a default-action is not specified in the CLI command syntax, it will automatically default to deny.

Firewall policies can also be configured in the OneFS WebUI by going to Cluster management > Firewall Configuration > Firewall Policies > Create Policy:

However, in contrast to the CLI, if a default action is not specified when a policy is created in the WebUI, it automatically defaults to Allow, because the drop-down list is ordered alphabetically.

If and when a firewall policy is no longer required, it can be swiftly and easily removed. For example, the following CLI syntax deletes the firewall policy fw_test1, clearing out any rules within this policy container:

# isi network firewall policies delete fw_test1
Are you sure you want to delete firewall policy fw_test1? (yes/[no]): yes

Note that the default global policies cannot be deleted.

# isi network firewall policies delete default_subnets_policy
Are you sure you want to delete firewall policy default_subnets_policy? (yes/[no]): yes
Firewall policy: Cannot delete default policy default_subnets_policy.

4.  Configure firewall rules.

In the next article in this series, we'll turn our attention to this step: configuring the OneFS firewall rules.

 

 

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS cybersecurity

PowerScale OneFS 9.5 Delivers New Security Features and Performance Gains

Nick Trimbee Nick Trimbee

Fri, 28 Apr 2023 19:57:51 -0000

|

Read Time: 0 minutes

PowerScale – the world’s most flexible[1] and cyber-secure scale-out NAS solution[2]  – is powering up the new year with the launch of the innovative OneFS 9.5 release. With data integrity and protection being top of mind in this era of unprecedented corporate cyber threats, OneFS 9.5 brings an array of new security features and functionality to keep your unstructured data and workloads more secure than ever, as well as delivering significant performance gains on the PowerScale nodes – such as up to 55% higher performance on all-flash F600 and F900 nodes as compared with the previous OneFS release.[3]   


OneFS and hardware security features 

New PowerScale OneFS 9.5 security enhancements include those that directly satisfy US Federal and DoD mandates, such as FIPS 140-2, Common Criteria, and DISA STIGs – in addition to general enterprise data security requirements. Multi-factor authentication (MFA), single sign-on (SSO) support, data encryption in-flight and at rest, TLS 1.2, USGv6R1 IPv6 support, SED Master Key rekey, plus a new host-based firewall are all part of OneFS 9.5. 

15TB and 30TB self-encrypting (SED) SSDs now enable PowerScale platforms running OneFS 9.5 to scale up to 186 PB of encrypted raw capacity per cluster – all within a single volume and filesystem, and before any additional compression and deduplication benefit.  

Delivering federal-grade security to protect data under a zero trust model 

Security-wise, the United States Government has stringent requirements for infrastructure providers such as Dell Technologies, requiring vendors to certify that products comply with requirements such as USGv6, STIGs, DoDIN APL, Common Criteria, and so on. Activating the OneFS 9.5 cluster hardening option implements a default maximum security configuration with AES and SHA cryptography, which automatically renders a cluster FIPS 140-2 compliant. 

OneFS 9.5 introduces SAML-based single sign-on (SSO) from both the command line and WebUI using a redesigned login screen. OneFS SSO is compatible with identity providers (IDPs) such as Active Directory Federation Services, and is also multi-tenant aware, allowing independent configuration for each of a cluster’s Access Zones. 

Federal APL requirements mandate that a system must validate all certificates in a chain up to a trusted CA root certificate. To address this, OneFS 9.5 introduces a common Public Key Infrastructure (PKI) library to issue, maintain, and revoke public key certificates. These certificates provide digital signature and encryption capabilities, using public key cryptography to provide identification and authentication, data integrity, and confidentiality. This PKI library is used by all OneFS components that need PKI certificate verification support, such as SecureSMTP, ensuring that they all meet Federal PKI requirements. 

This new OneFS 9.5 PKI and certificate authority infrastructure enables multi-factor authentication, allowing users to swipe a CAC or PIV smartcard containing their login credentials to gain access to a cluster, rather than manually entering username and password information. Additional account policy restrictions in OneFS 9.5 automatically disable inactive accounts, provide concurrent administrative session limits, and implement a delay after a failed login.  

As part of FIPS 140-2 compliance, OneFS 9.5 introduces a new key manager, providing a secure central repository for secrets such as machine passwords, Kerberos keytabs, and other credentials, with the option of using MCF (modular crypt format) with SHA256 or SHA512 hash types. OneFS protocols and services may be configured to support FIPS 140-2 data-in-flight encryption compliance, while SED clusters and the new Master Key re-key capability provide FIPS 140-2 data-at-rest encryption. Plus, any unused or non-compliant services are easily disabled.  

On the network side, the Federal APL has several IPv6 (USGv6) requirements that are focused on allowing granular control of individual components of a cluster’s IPv6 stack, such as duplicate address detection (DAD) and link local IP control. Satisfying both STIG and APL requirements, the new OneFS 9.5 front-end firewall allows security admins to restrict the management interface to a specified subnet and implement port blocking and packet filtering rules from the cluster’s command line or WebUI, in accordance with federal or corporate security policy.

Improving performance for the most demanding workloads

OneFS 9.5 unlocks dramatic performance gains, particularly for the all-flash NVMe platforms, where the PowerScale F900 can now support line-rate streaming reads. SmartCache enhancements allow OneFS 9.5 to deliver streaming read performance gains of up to 55% on the F-series F600 and F900 nodes,[3] delivering benefit to media and entertainment workloads, plus AI, machine learning, deep learning, and more.

Enhancements to SmartPools in OneFS 9.5 introduce configurable transfer limits. These limits include maximum capacity thresholds, expressed as a percentage, above which SmartPools will not attempt to move files to a particular tier, boosting both reliability and tiering performance. 

Granular cluster performance control is enabled with the debut of PowerScale SmartQoS, which allows admins to configure limits on the maximum number of protocol operations that NFS, S3, SMB, or mixed protocol workloads can consume. 

Enhancing enterprise-grade supportability and serviceability

OneFS 9.5 enables SupportAssist, Dell’s next-generation remote connectivity system for transmitting events, logs, and telemetry from a PowerScale cluster to Dell Support. SupportAssist provides a full replacement for ESRS, as well as enabling Dell Support to perform remote diagnosis and remediation of cluster issues. 

Upgrading to OneFS 9.5 

The new OneFS 9.5 code is available on the Dell Technologies Support site, as both an upgrade and reimage file, allowing both installation and upgrade of this new release.  

Author: Nick Trimbee

[1] Based on Dell analysis, August 2021.

[2] Based on Dell analysis comparing cybersecurity software capabilities offered for Dell PowerScale vs. competitive products, September 2022.

[3] Based on Dell internal testing, January 2023. Actual results will vary.


Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

Announcing PowerScale OneFS 9.4!

Nick Trimbee Nick Trimbee

Fri, 28 Apr 2023 19:52:18 -0000

|

Read Time: 0 minutes

Arriving in time for Dell Technologies World 2022, the new PowerScale OneFS 9.4 release shipped on Monday 4th April 2022. 

OneFS 9.4 brings with it a wide array of new features and functionality, including:

Feature

Description

SmartSync Data Mover

  • Introduction of a new OneFS SmartSync data mover, allowing flexible data movement and copying, incremental resyncs, push and pull data transfer, and one-time file to object copy. Complementary to SyncIQ, SmartSync provides an additional option for data transfer, including to object storage targets such as ECS, AWS, and Azure.

IB to Ethernet Backend Migration

  • Non-disruptive rolling InfiniBand to Ethernet back-end network migration for legacy Gen6 clusters.

Secure Boot

  • Secure boot support is extended to include the F900, F600, F200, H700/7000, and A700/7000 platforms.

Smarter SmartConnect Diagnostics

  • Identifies non-resolvable nodes and provides their detailed status, allowing the root cause to be easily pinpointed.

In-line Dedupe

  • In-line deduplication will be enabled by default on new OneFS 9.4 clusters. Clusters upgraded to OneFS 9.4 will maintain their current dedupe configuration.

Healthcheck Auto-updates

  • Automatic monitoring, download, and installation of new healthcheck packages as they are released.

CloudIQ Protocol Statistics

  • New protocol statistics ‘count’ keys are added, allowing performance to be measured over a specified time window and providing point-in-time protocol information.

SRS Alerts and CELOG Event Limiting

  • Prevents CELOG from sending unnecessary event types to Dell SRS and restricts CELOG alerts from customer-created channels.

CloudPools Statistics

  • Automated statistics gathering on CloudPools accounts and policies, providing insights for planning and troubleshooting CloudPools-related activities. 

We’ll be taking a deeper look at some of these new features in blog articles over the course of the next few weeks. 

Meanwhile, the new OneFS 9.4 code is available for download on the Dell Online Support site, in both upgrade and reimage file formats. 

Enjoy your OneFS 9.4 experience!

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS

OneFS Host-Based Firewall

Nick Trimbee Nick Trimbee

Wed, 26 Apr 2023 15:40:15 -0000

|

Read Time: 0 minutes

Among the array of security features introduced in OneFS 9.5 is a new host-based firewall. This firewall allows cluster administrators to configure policies and rules on a PowerScale cluster in order to meet the network and application management needs and security mandates of an organization.

The OneFS firewall protects the cluster’s external, or front-end, network and operates as a packet filter for inbound traffic. It is available upon installation or upgrade to OneFS 9.5 but is disabled by default in both cases. It can be enabled manually, and applying the OneFS STIG hardening profile also enables the firewall and its default policies automatically.

The firewall manages IP packet filtering in accordance with the OneFS Security Configuration Guide, especially in regard to network port usage. Packet control is governed by firewall policies, each of which comprises one or more individual rules.

Item: Firewall Policy

Description: Each policy is a set of firewall rules.

Match: Rules are matched by index in ascending order.

Action: Each policy has a default action.

Item: Firewall Rule

Description: Each rule specifies which network packets should be matched by the firewall engine and what action should be taken upon them.

Match: Matching criteria include protocol, source ports, destination ports, and source network address.

Action: Options are allow, deny, or reject.
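
To illustrate, a rule permitting SSH from a single management subnet might look something like the following sketch. Rule configuration is covered in depth later in this series, so treat the rule name and exact flags here as illustrative:

# isi network firewall rules create fw_test1.allow_ssh --index 10 --action allow --protocol TCP --dst-ports 22 --src-networks 10.20.30.0/24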

 A security best practice is to enable the OneFS firewall using the default policies, with any adjustments as required. The recommended configuration process is as follows:

Step

Details

1.  Access

Ensure that the cluster uses a default SSH or HTTP port before enabling. The default firewall policies block all nondefault ports until you change the policies.

2.  Enable

Enable the OneFS firewall.

3.  Compare

Compare your cluster network port configurations against the default ports listed in Network port usage.

4.  Configure

Edit the default firewall policies to accommodate any non-standard ports in use in the cluster.

NOTE: The firewall policies do not automatically update when port configurations are changed.

5.  Constrain

Limit access to the OneFS Web UI to specific administrator terminals.

Under the hood, the OneFS firewall is built upon the ubiquitous ipfirewall, or ipfw, which is FreeBSD’s native stateful firewall, packet filter, and traffic accounting facility.

Firewall configuration and management is available through the CLI, platform API, or WebUI, and OneFS 9.5 introduces a new Firewall Configuration page to support this. Note that the firewall is only available once a cluster is running OneFS 9.5 and the feature has been manually enabled, activating the isi_firewall_d service. The firewall’s configuration is split between gconfig, which handles the settings and policies, and the ipfw table, which stores the rules themselves.
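
As a minimal sketch, enabling the firewall and confirming its state from the CLI might look as follows (the --enabled flag name is an assumption; the isi network firewall settings command itself is referenced later in this series):

# isi network firewall settings modify --enabled true
# isi network firewall settings view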

The firewall gracefully handles SmartConnect dynamic IP movement between nodes since firewall policies are applied per network pool. Additionally, being network pool based allows the firewall to support OneFS access zones and shared/multitenancy models. 

The individual firewall rules, which are essentially simplified wrappers around ipfw rules, work by matching packets through the 5-tuples that uniquely identify an IPv4 UDP or TCP session:

  • Source IP address
  • Source port
  • Destination IP address
  • Destination port
  • Transport protocol

The rules are then organized within a firewall policy, which can be applied to one or more network pools. 

Note that each pool can only have a single firewall policy applied to it. If there is no custom firewall policy configured for a network pool, it automatically uses the global default firewall policy.

When enabled, the OneFS firewall function is cluster wide, and all inbound packets from external interfaces will go through either the custom policy or default global policy before reaching the protocol handling pathways. Packets passed to the firewall are compared against each of the rules in the policy, in rule-number order. Multiple rules with the same number are permitted, in which case they are processed in order of insertion. When a match is found, the action corresponding to that matching rule is performed. A packet is checked against the active ruleset in multiple places in the protocol stack, and the basic flow is as follows: 

  1. Get the logical interface for incoming packets.
  2. Find all network pools assigned to this interface.
  3. Compare these network pools one by one with destination IP address to find the matching pool (either custom firewall policy, or default global policy).
  4. Compare each rule with service (protocol and destination ports) and source IP address in this pool in order of lowest index value.  If matched, perform actions according to the associated rule.
  5. If no rule matches, go to the final rule (deny all or allow all), which is specified upon policy creation.

The OneFS firewall automatically reserves 20,000 rules in the ipfw table for its custom and default policies and rules. By default, each policy can have a maximum of 100 rules, including one default rule. This translates to an effective maximum of 99 user-defined rules per policy, because the default rule is reserved and cannot be modified. At 100 rules apiece, those 20,000 reserved ipfw rules equate to 200 policies, and because the default-pools-policy and default-subnets-policy are reserved and cannot be deleted, a maximum of 198 custom policies can be applied to pools or subnets.

Additional firewall bounds and limits to keep in mind include:

Name

Value

Description

MAX_INTERFACES

500

Maximum number of Layer 2 interfaces per node (including Ethernet, VLAN, LAGG interfaces).

MAX_SUBNETS

100

Maximum number of subnets within a OneFS cluster.

MAX_POOLS

100

Maximum number of network pools within a OneFS cluster.

DEFAULT_MAX_RULES

100

Default value of maximum rules within a firewall policy.

MAX_RULES

200

Upper limit of maximum rules within a firewall policy.

MAX_ACTIVE_RULES

5000

Upper limit of total active rules across the whole cluster.

MAX_INACTIVE_POLICIES

200

Maximum number of policies that are not applied to any network subnet or pool. They will not be written into ipfw table.

The firewall default global policy is ready to use out of the box and, unless a custom policy has been explicitly configured, all network pools use this global policy. Custom policies can be configured by either cloning and modifying an existing policy or creating one from scratch. 

Component

Description

Custom policy

A user-defined container with a set of rules. A policy can be applied to multiple network pools, but a network pool can only apply one policy.

Firewall rule

An ipfw-like rule that can be used to restrict remote access. Each rule has an index that is valid within its policy. Index values range from 1 to 99, with lower numbers having higher priority. Source networks are described by IP and netmask, and services can be expressed either by port number (for example, 80) or service name (for example, http, ssh, smb). The * wildcard can also be used to denote all services. Supported actions include allow, deny, and reject.

Default policy

A global policy to manage all default services, used to maintain minimal OneFS operation and management access. While deny-all is the policy’s default action, the predefined service rules default to allowing remote access to those services. All packets not matching any of the rules are automatically dropped.  

Two default policies: 

  • default-pools-policy
  • default-subnets-policy

Note that these two default policies cannot be deleted, but individual rule modification is permitted in each.

Default services

The firewall’s default predefined services include the usual suspects, such as: DNS, FTP, HDFS, HTTP, HTTPS, ICMP, NDMP, NFS, NTP, S3, SMB, SNMP, SSH, and so on. A full listing is available in the isi network firewall services list CLI command output.

For a given network pool, either the global policy or a custom policy is assigned and takes effect. Additionally, all configuration changes to either policy type are managed by gconfig and are persistent across cluster reboots.

In the next article in this series we’ll take a look at the CLI and WebUI configuration and management of the OneFS firewall. 

 

 

Home > Storage > PowerScale (Isilon) > Blogs

security PowerScale OneFS snapshots

OneFS Snapshot Security

Nick Trimbee Nick Trimbee

Fri, 21 Apr 2023 17:11:00 -0000

|

Read Time: 0 minutes

In this era of elevated cyber-crime and data security threats, there is increasing demand for immutable, tamper-proof snapshots. Often this need arises as part of a broader security mandate, ideally proactively, but oftentimes as a response to a security incident. OneFS addresses this requirement in the following ways:

On-cluster:

  • Read-only snapshots
  • Snapshot locks
  • Role-based administration

Off-cluster:

  • SyncIQ snapshot replication
  • Cyber-vaulting

Read-only snapshots

At its core, OneFS SnapshotIQ generates read-only, point-in-time, space efficient copies of a defined subset of a cluster’s data.

Only the changed blocks of a file are stored when updating OneFS snapshots, ensuring efficient storage utilization. They are also highly scalable and typically take less than a second to create, while generating little performance overhead. As such, the RPO (recovery point objective) and RTO (recovery time objective) of a OneFS snapshot can be very small and highly flexible, with the use of rich policies and schedules.

OneFS Snapshots are created manually, on a schedule, or automatically generated by OneFS to facilitate system operations. But whatever the generation method, when a snapshot has been taken, its contents cannot be manually altered.
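
For example, a manual snapshot with a predefined expiration can be taken from the CLI as follows (a sketch, assuming --expires accepts the same timestamp format used by the snapshot lock examples below):

# isi snapshot snapshots create /ifs/test --name snaptest1 --expires '2024-04-01T01:00:00'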

Snapshot Locks

In addition to snapshot content immutability, for an enhanced level of tamper-proofing, SnapshotIQ also provides the ability to lock snapshots with the ‘isi snapshot locks’ CLI syntax, preventing them from being inadvertently deleted.

For example, a manual snapshot, ‘snaploc1’ is taken of /ifs/test:

# isi snapshot snapshots create /ifs/test --name snaploc1
# isi snapshot snapshots list | grep snaploc1
79188 snaploc1                                     /ifs/test

A lock is then placed on it (in this case lock ID=1):

# isi snapshot locks create snaploc1
# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1

Attempts to delete the snapshot fail because the lock prevents its removal:

# isi snapshot snapshots delete snaploc1
Are you sure? (yes/[no]): yes
Snapshot "snaploc1" can't be deleted because it is locked

The CLI command ‘isi snapshot locks delete <lock_ID>’ can be used to clear existing snapshot locks, if desired. For example, to remove the only lock (ID=1) from snapshot ‘snaploc1’:

# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1
# isi snapshot locks delete snaploc1 1
Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes
# isi snap locks view snaploc1 1
No such lock

When the lock is removed, the snapshot can then be deleted:

# isi snapshot snapshots delete snaploc1
Are you sure? (yes/[no]): yes
# isi snapshot snapshots list| grep -i snaploc1 | wc -l
       0

Note that a snapshot can have up to a maximum of sixteen locks on it at any time. Also, lock numbers are continually incremented and not recycled upon deletion.

Like snapshot expiration, snapshot locks can also have an expiration time configured. For example, to set a lock on snapshot ‘snaploc1’ that expires at 1am on April 1st, 2024:

# isi snap locks create snaploc1 --expires '2024-04-01T01:00:00'
# isi snap locks list snaploc1
ID
----
36
----
Total: 1
# isi snap locks view snaploc1 36
     ID: 36
Comment:
Expires: 2024-04-01T01:00:00
  Count: 1

Note that if the duration period of a particular snapshot lock expires but others remain, OneFS will not delete that snapshot until all the locks on it have been deleted or expired.

The following example snapshot expiration schedule uses monthly locked snapshots to prevent deletion:

  • Every other hour (start at 12:00AM, end at 11:59AM): expires after 1 day
  • Every day (at 12:00AM): expires after 1 week
  • Every week (Saturday at 12:00AM): expires after 1 month
  • Every month (first Saturday of month at 12:00AM): locked, so never expires

With this schedule, a maximum of 27 snapshots are retained at any one time.
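
As a rough sketch, the daily tier of the schedule above could be configured from the CLI along the following lines. The naming pattern, schedule grammar, and --duration flag here are assumptions, so verify against the isi snapshot schedules syntax for your OneFS release:

# isi snapshot schedules create daily_snap /ifs/test daily_%Y-%m-%d "every day at 12:00 AM" --duration 1W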

Role-based Access Control

Read-only snapshots plus locks provide physically secure snapshots on a cluster. However, anyone who can log in to the cluster with the required elevated administrator privileges can still remove locks and delete snapshots.

Because data security threats come from inside an environment as well as out, such as from a disgruntled IT employee or other internal bad actor, another key to a robust security profile is to constrain the use of the all-powerful ‘root’, ‘administrator’, and ‘sudo’ accounts as much as possible. Instead of granting cluster admins full rights, a preferred security best practice is to leverage the comprehensive authentication, authorization, and accounting framework that OneFS natively provides.

OneFS role-based access control (RBAC) can be used to explicitly limit who has access to manage and delete snapshots. This granular control allows you to craft administrative roles that can create and manage snapshot schedules, but prevent their unlocking and/or deletion. Similarly, lock removal and snapshot deletion can be isolated to a specific security role (or to root only).

A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members.
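
From the CLI, that sequence might resemble the following sketch, in which the role name, access zone, and user are hypothetical, and the --add-priv-read flag is assumed by analogy with the --add-priv-write flag used later in this article:

# isi auth roles create snap --zone zone1
# isi auth roles modify snap --zone zone1 --add-priv-read ISI_PRIV_SNAPSHOT_LOCKS
# isi auth roles modify snap --zone zone1 --add-user jsmith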

For example, from the WebUI under Access > Membership and roles > Roles:

When these members access the cluster through the WebUI, PlatformAPI, or CLI, they inherit their assigned privileges.

The specific privileges that can be used to segment OneFS snapshot management include:

Privilege

Description

ISI_PRIV_SNAPSHOT_ALIAS

Aliasing for snapshots

ISI_PRIV_SNAPSHOT_LOCKS

Locking of snapshots from deletion

ISI_PRIV_SNAPSHOT_PENDING

Upcoming snapshot based on schedules

ISI_PRIV_SNAPSHOT_RESTORE

Restoring directory to a particular snapshot

ISI_PRIV_SNAPSHOT_SCHEDULES

Scheduling for periodic snapshots

ISI_PRIV_SNAPSHOT_SETTING

Service and access settings

ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT

Manual snapshots and locks

ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY

Snapshot summary and usage details

Each privilege can be assigned one of four permission levels for a role, including:

Permission Indicator

Description

No permission

R

Read-only permission

X

Execute permission

W

Write permission

The ability for a user to delete a snapshot is governed by the ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ privilege. Similarly, the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege governs lock creation and removal.

In the following example, the ‘snap’ role has ‘read’ rights for the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege, allowing a user associated with this role to view snapshot locks:

# isi auth roles view snap | grep -i -A 1 locks
             ID: ISI_PRIV_SNAPSHOT_LOCKS
     Permission: r
--
# isi snapshot locks list snaploc1
ID
----
1
----
Total: 1

However, attempts to remove the lock ‘ID 1’ from the ‘snaploc1’ snapshot fail without write privileges:

# isi snapshot locks delete snaploc1 1
Privilege check failed. The following write privilege is required: Snapshot locks (ISI_PRIV_SNAPSHOT_LOCKS)

Write privileges are then added to ‘ISI_PRIV_SNAPSHOT_LOCKS’ in the ‘snap’ role:

# isi auth roles modify snap --add-priv-write ISI_PRIV_SNAPSHOT_LOCKS
# isi auth roles view snap | grep -i -A 1 locks
             ID: ISI_PRIV_SNAPSHOT_LOCKS
     Permission: w
--

This allows the lock ‘ID 1’ to be successfully deleted from the ‘snaploc1’ snapshot:

# isi snapshot locks delete snaploc1 1
Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes
# isi snap locks view snaploc1 1
No such lock

Using OneFS RBAC, an enhanced security approach for a site could be to create three OneFS roles on a cluster, each with an increasing realm of trust:

1.  First, an IT ops/helpdesk role with ‘read’ access to the snapshot attributes would permit monitoring and troubleshooting, but no changes:

Snapshot Privilege

Permission

ISI_PRIV_SNAPSHOT_ALIAS

Read

ISI_PRIV_SNAPSHOT_LOCKS

Read

ISI_PRIV_SNAPSHOT_PENDING

Read

ISI_PRIV_SNAPSHOT_RESTORE

Read

ISI_PRIV_SNAPSHOT_SCHEDULES

Read

ISI_PRIV_SNAPSHOT_SETTING

Read

ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT

Read

ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY

Read

2.  Next, a cluster admin role, with ‘read’ privileges for ‘ISI_PRIV_SNAPSHOT_LOCKS’ and ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ would prevent snapshot and lock deletion, but provide ‘write’ access for schedule configuration, restores, and so on.

Snapshot Privilege

Permission

ISI_PRIV_SNAPSHOT_ALIAS

Write

ISI_PRIV_SNAPSHOT_LOCKS

Read

ISI_PRIV_SNAPSHOT_PENDING

Write

ISI_PRIV_SNAPSHOT_RESTORE

Write

ISI_PRIV_SNAPSHOT_SCHEDULES

Write

ISI_PRIV_SNAPSHOT_SETTING

Write

ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT

Read

ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY

Write

3.  Finally, a cluster security admin role (root equivalence) would provide full snapshot configuration and management, lock control, and deletion rights:

Snapshot Privilege

Permission

ISI_PRIV_SNAPSHOT_ALIAS

Write

ISI_PRIV_SNAPSHOT_LOCKS

Write

ISI_PRIV_SNAPSHOT_PENDING

Write

ISI_PRIV_SNAPSHOT_RESTORE

Write

ISI_PRIV_SNAPSHOT_SCHEDULES

Write

ISI_PRIV_SNAPSHOT_SETTING

Write

ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT

Write

ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY

Write

Note that when configuring OneFS RBAC, remember to remove the ‘ISI_PRIV_AUTH’ and ‘ISI_PRIV_ROLE’ privileges from all but the most trusted administrators.

Additionally, enterprise security management tools such as CyberArk can also be incorporated to manage authentication and access control holistically across an environment. These can be configured to frequently change passwords on trusted accounts (for example, every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.

While this article focuses exclusively on OneFS snapshots, the expanded use of RBAC granular privileges for enhanced security is germane to most key areas of cluster management and data protection, such as SyncIQ replication, and so on.

Snapshot replication

In addition to using snapshots for its own checkpointing system, SyncIQ, the OneFS data replication engine, supports snapshot replication to a target cluster.

OneFS SyncIQ replication policies contain an option for triggering a replication policy when a snapshot of the source directory is completed. Additionally, at the onset of a new policy configuration, when the “Whenever a Snapshot of the Source Directory is Taken” option is selected, a checkbox appears to enable any existing snapshots in the source directory to be replicated. More information is available in this SyncIQ paper.
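
As a sketch, a snapshot-triggered replication policy might be created from the CLI as follows. The policy name, paths, and target host are hypothetical, and the when-snapshot-taken schedule keyword is an assumption based on the trigger described above:

# isi sync policies create snap-repl sync /ifs/test target-cluster.example.com /ifs/test-dr --schedule when-snapshot-taken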

Cyber-vaulting

File data is arguably the most difficult to protect, because:

  • It is the only type of data where potentially all employees have a direct connection to the storage (with other storage types, access is typically through an application).
  • File data is linked (or mounted) to the operating system of the client, meaning that gaining file access to the OS is sufficient to reach potentially critical data.
  • Users are the largest breach points.

The Cyber Security Framework (CSF) from the National Institute of Standards and Technology (NIST) categorizes the security lifecycle, from threat identification through recovery:

Within the ‘Protect’ phase, there are two core aspects:

  • Applying all the core protection features available on the OneFS platform, namely:

Feature

Description

Access control

Where the core data protection functions are being executed. Assess who actually needs write access.

Immutability

Having immutable snapshots, replica versions, and so on. Augmenting the backup strategy with an archiving strategy using SmartLock WORM.

Encryption

Encrypting both data in-flight and data at rest.

Anti-virus

Integrating with anti-virus/anti-malware protection that does content inspection.

Security advisories

Dell Security Advisories (DSA) inform customers about fixes to common vulnerabilities and exposures. 

  • Data isolation provides a last resort copy of business critical data, and can be achieved by using an air gap to isolate the cyber vault copy of the data. The vault copy is logically separated from the production copy of the data. Data syncing happens only intermittently by closing the air gap after ensuring that there are no known issues.

The combination of OneFS snapshots and SyncIQ replication allows for granular data recovery. This means that only the affected files are recovered, while the most recent changes are preserved for the unaffected data. While an on-prem air-gapped cyber vault can still provide secure network isolation, in the event of an attack, the ability to failover to a fully operational ‘clean slate’ remote site provides additional security and peace of mind.

We’ll explore PowerScale cyber protection and recovery in more depth in a future article.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS SupportAssist

OneFS SupportAssist Architecture and Operation

Nick Trimbee Nick Trimbee

Fri, 21 Apr 2023 16:41:36 -0000

|

Read Time: 0 minutes

The previous article in this series looked at an overview of OneFS SupportAssist. Now, we’ll turn our attention to its core architecture and operation.

Under the hood, SupportAssist relies on the following infrastructure and services:

Service

Name

ESE

Embedded Service Enabler.

isi_rice_d

Remote Information Connectivity Engine (RICE).

isi_crispies_d

Coordinator for RICE Incidental Service Peripherals including ESE Start.

Gconfig

OneFS centralized configuration infrastructure.

MCP

Master Control Program – starts, monitors, and restarts OneFS services.

Tardis

Configuration service and database.

Transaction journal

Task manager for RICE.

Of these, ESE, isi_crispies_d, isi_rice_d, and the Transaction Journal are new in OneFS 9.5 and exclusive to SupportAssist. By contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.

The Remote Information Connectivity Engine (RICE) represents the new SupportAssist ecosystem for OneFS to connect to the Dell backend. The high level architecture is as follows:

Dell’s Embedded Service Enabler (ESE) is at the core of the connectivity platform and acts as a unified communications broker between the PowerScale cluster and Dell Support. ESE runs as a OneFS service and, on startup, looks for an on-premises gateway server. If none is found, it connects back to the connectivity pipe (SRS). The collector service then interacts with ESE to send telemetry, obtain upgrade packages, transmit alerts and events, and so on.

Depending on the available resources, ESE provides a base functionality with additional optional capabilities to enhance serviceability. ESE is multithreaded, and each payload type is handled by specific threads. For example, events are handled by event threads, binary and structured payloads are handled by web threads, and so on. Within OneFS, ESE gets installed to /usr/local/ese and runs as ‘ese’ user and group.

The responsibilities of isi_rice_d include listening for network changes, getting eligible nodes elected for communication, monitoring notifications from CRISPIES, and engaging Task Manager when ESE is ready to go.

The Task Manager is a core component of the RICE engine. Its responsibility is to watch the incoming tasks that are placed into the journal and to assign workers to step through the tasks until completion. It controls resource utilization (Python threads) and distributes waiting tasks on a priority basis.

The ‘isi_crispies_d’ service exists to ensure that ESE is only running on the RICE active node, and nowhere else. It acts, in effect, like a specialized MCP just for ESE and RICE-associated services, such as IPA. This entails starting ESE on the RICE active node, re-starting it if it crashes on the RICE active node, and stopping it and restarting it on the appropriate node if the RICE active instance moves to another node. We are using ‘isi_crispies_d’ for this, and not MCP, because MCP does not support a service running on only one node at a time.

The core responsibilities of ‘isi_crispies_d’ include:

  • Starting and stopping ESE on the RICE active node
  • Monitoring ESE and restarting, if necessary. ‘isi_crispies_d’ restarts ESE on the node if it crashes. It will retry a couple of times and then notify RICE if it’s unable to start ESE.
  • Listening for gconfig changes and updating ESE. Stopping ESE if unable to make a change and notifying RICE.
  • Monitoring other related services.

The state of ESE, and of other RICE service peripherals, is stored in the OneFS tardis configuration database so that it can be checked by RICE. Similarly, ‘isi_crispies_d’ monitors the OneFS Tardis configuration database to see which node is designated as the RICE ‘active’ node.

The ‘isi_telemetry_d’ daemon is started by MCP and runs when SupportAssist is enabled. It does not have to be running on the same node as the active RICE and ESE instance. Only one instance of ‘isi_telemetry_d’ will be active at any time, and the other nodes will be waiting for the lock.

You can query the current status and setup of SupportAssist on a PowerScale cluster by using the ‘isi supportassist settings view’ CLI command. For example:

# isi supportassist settings view
        Service enabled: Yes
       Connection State: enabled
      OneFS Software ID: ELMISL08224764
          Network Pools: subnet0:pool0
        Connection mode: direct
           Gateway host: -
           Gateway port: -
    Backup Gateway host: -
    Backup Gateway port: -
  Enable Remote Support: Yes
Automatic Case Creation: Yes
       Download enabled: Yes

You can also do this from the WebUI by navigating to Cluster management > General settings > SupportAssist:

You can enable or disable SupportAssist by using the ‘isi services’ CLI command set. For example:

# isi services isi_supportassist disable
The service 'isi_supportassist' has been disabled.
# isi services isi_supportassist enable
The service 'isi_supportassist' has been enabled.
# isi services -a | grep supportassist
   isi_supportassist    SupportAssist Monitor                    Enabled

You can check the core services, as follows:

# ps -auxw | grep -e 'rice' -e 'crispies' | grep -v grep
root    8348    9.4   0.0 109844  60984  -   Ss   22:14        0:00.06 /usr/libexec/isilon/isi_crispies_d /usr/bin/isi_crispies_d
root    8183    8.8   0.0 108060  64396  -   Ss   22:14        0:01.58 /usr/libexec/isilon/isi_rice_d /usr/bin/isi_rice_d

Note that when a cluster is provisioned with SupportAssist, ESRS can no longer be used. However, customers that have not previously connected their clusters to Dell Support can still provision ESRS, but will be presented with a message encouraging them to adopt the best practice of using SupportAssist.

Additionally, SupportAssist in OneFS 9.5 does not currently support IPv6 networking, so clusters deployed in IPv6 environments should continue to use ESRS until SupportAssist IPv6 integration is introduced in a future OneFS release.

Author: Nick Trimbee

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

OneFS SupportAssist Management and Troubleshooting

Nick Trimbee Nick Trimbee

Tue, 18 Apr 2023 20:07:18 -0000

|

Read Time: 0 minutes

In this final article in the OneFS SupportAssist series, we turn our attention to management and troubleshooting.

Once the provisioning process covered in the previous articles is complete, the isi supportassist settings view CLI command reports the status and health of SupportAssist operations on the cluster.

# isi supportassist settings view
        Service enabled: Yes
       Connection State: enabled
      OneFS Software ID: xxxxxxxxxx
          Network Pools: subnet0:pool0
        Connection mode: direct
           Gateway host: -
           Gateway port: -
    Backup Gateway host: -
    Backup Gateway port: -
  Enable Remote Support: Yes
Automatic Case Creation: Yes
       Download enabled: Yes

This can also be obtained from the WebUI by going to Cluster management > General settings > SupportAssist:

 There are some caveats and considerations to keep in mind when upgrading to OneFS 9.5 and enabling SupportAssist, including:

  • SupportAssist is disabled when STIG hardening is applied to a cluster.
  • Using SupportAssist on a hardened cluster is not supported.
  • Clusters with the OneFS network firewall enabled (isi network firewall settings) might need to allow outbound traffic on port 9443.
  • SupportAssist is supported on a cluster that’s running in Compliance mode.
  • Secure keys are held in key manager under the RICE domain.

Also, note that Secure Remote Services can no longer be used after SupportAssist has been provisioned on a cluster.

SupportAssist has a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support and backend services through the Embedded Service Enabler (ESE). These workflows include CELOG events; in-product activation (IPA) information; CloudIQ telemetry data; Isi-Gather-info (IGI) logsets; and provisioning, configuration, and authentication data to ESE and the various backend services.

Activity

Information

Events and alerts

SupportAssist can be configured to send CELOG events.

Diagnostics

The OneFS isi diagnostics gather and isi_gather_info logfile collation and transmission commands have a SupportAssist option. 

HealthChecks

HealthCheck definitions are updated using SupportAssist.

License Activation

The isi license activation start command uses SupportAssist to connect.

Remote Support

Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters.

Telemetry

CloudIQ telemetry data is sent using SupportAssist. 

CELOG

Once SupportAssist is up and running, it can be configured to send CELOG events and attachments through ESE to CLM. This can be managed with the isi event channels CLI command set. For example:

# isi event channels list
ID   Name                Type          Enabled
-----------------------------------------------
1    RemoteSupport       connectemc    No
2    Heartbeat Self-Test heartbeat     Yes
3    SupportAssist       supportassist No
-----------------------------------------------
Total: 3
# isi event channels view SupportAssist
     ID: 3
   Name: SupportAssist
   Type: supportassist
Enabled: No
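
Enabling the channel should then be a matter of flipping its enabled flag through the same command set (the exact flag name here is an assumption):

# isi event channels modify SupportAssist --enabled true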

Or from the WebUI:

CloudIQ telemetry

In OneFS 9.5, SupportAssist provides an option to send telemetry data to CloudIQ. This can be enabled from the CLI as follows:

# isi supportassist telemetry modify --telemetry-enabled 1 --telemetry-persist 0
# isi supportassist telemetry view
        Telemetry Enabled: Yes
        Telemetry Persist: No
        Telemetry Threads: 8
Offline Collection Period: 7200

Or in the SupportAssist WebUI:

Diagnostics gather

Also in OneFS 9.5, the isi diagnostics gather and isi_gather_info CLI commands both include a --supportassist upload option for log gathers, which also allows them to continue to function through a new “emergency mode” when the cluster is unhealthy. For example, to start a gather from the CLI that will be uploaded through SupportAssist:

# isi diagnostics gather start --supportassist 1

Similarly, for ISI gather info:

# isi_gather_info --supportassist

Or to explicitly avoid using SupportAssist for ISI gather info log gather upload:

# isi_gather_info --nosupportassist

This can also be configured from the WebUI at Cluster management > General configuration > Diagnostics > Gather:

License Activation through SupportAssist

PowerScale License Activation (previously known as In-Product Activation) facilitates the management of the cluster's entitlements and licenses by communicating directly with Software Licensing Central through SupportAssist.

To activate OneFS product licenses through the SupportAssist WebUI:

  1. Go to Cluster management > Licensing. 
    For example, on a new cluster without any signed licenses:


     
  2. Click the Update & Refresh button in the License Activation section. In the Activation File Wizard, select the software modules that you want in the activation file.

     

  3.  Select Review changes, review the summary, click Proceed, and finally click Activate. 

Note that it can take up to 24 hours for the activation to occur.

Alternatively, cluster license activation codes (LAC) can also be added manually.

Troubleshooting

When it comes to troubleshooting SupportAssist, the basic process flow is as follows:

 
The OneFS components and services above are:

Component

Info

ESE

Embedded Service Enabler

isi_rice_d

Remote Information Connectivity Engine (RICE)

isi_crispies_d

Coordinator for RICE Incidental Service Peripherals including ESE Start

Gconfig

OneFS centralized configuration infrastructure

MCP

Master Control Program; starts, monitors, and restarts OneFS services

Tardis

Configuration service and database

Transaction journal

Task manager for RICE

Of these, ESE, isi_crispies_d, isi_rice_d, and the transaction journal are new in OneFS 9.5 and exclusive to SupportAssist. In contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components. 

For its connectivity, SupportAssist elects a single leader node within the subnet pool, and NANON nodes are automatically avoided. Ports 443 and 8443 must be open for bi-directional communication between the cluster and Connectivity Hub, and port 9443 is used for communicating with a gateway. The SupportAssist ESE component communicates with a number of Dell backend services:

  • SRS
  • Connectivity Hub
  • CLM
  • ELMS/Licensing
  • SDR
  • Lightning
  • Log Processor
  • CloudIQ
  • ESE

Debugging backend issues might involve one or more services, and Dell Support can assist with this process.

The main log files for investigating and troubleshooting SupportAssist issues and idiosyncrasies are isi_rice_d.log and isi_crispies_d.log. There is also an ese_log, which can be useful, too. These logs can be found at:

Component

Logfile location

Info

Rice

/var/log/isi_rice_d.log

Per node

Crispies

/var/log/isi_crispies_d.log

Per node

ESE

/ifs/.ifsvar/ese/var/log/ESE.log

Cluster-wide for single instance ESE

Debug level logging can be configured from the CLI as follows:

# isi_for_array isi_ilog -a isi_crispies_d --level=debug+
# isi_for_array isi_ilog -a isi_rice_d --level=debug+

Note that the OneFS log gathers (such as the output from the isi_gather_info utility) will capture all the above log files, plus the pertinent SupportAssist Gconfig contexts and Tardis namespaces, for later analysis.

If needed, the Rice and ESE configurations can also be viewed as follows:

# isi_gconfig -t ese
[root] {version:1}
ese.mode (char*) = direct
ese.connection_state (char*) = disabled
ese.enable_remote_support (bool) = true
ese.automatic_case_creation (bool) = true
ese.event_muted (bool) = false
ese.primary_contact.first_name (char*) =
ese.primary_contact.last_name (char*) =
ese.primary_contact.email (char*) =
ese.primary_contact.phone (char*) =
ese.primary_contact.language (char*) =
ese.secondary_contact.first_name (char*) =
ese.secondary_contact.last_name (char*) =
ese.secondary_contact.email (char*) =
ese.secondary_contact.phone (char*) =
ese.secondary_contact.language (char*) =
(empty dir ese.gateway_endpoints)
ese.defaultBackendType (char*) = srs
ese.ipAddress (char*) = 127.0.0.1
ese.useSSL (bool) = true
ese.srsPrefix (char*) = /esrs/{version}/devices
ese.directEndpointsUseProxy (bool) = false
ese.enableDataItemApi (bool) = true
ese.usingBuiltinConfig (bool) = false
ese.productFrontendPrefix (char*) = platform/16/supportassist
ese.productFrontendType (char*) = webrest
ese.contractVersion (char*) = 1.0
ese.systemMode (char*) = normal
ese.srsTransferType (char*) = ISILON-GW
ese.targetEnvironment (char*) = PROD
 
# isi_gconfig -t rice
[root] {version:1}
rice.enabled (bool) = false
rice.ese_provisioned (bool) = false
rice.hardware_key_present (bool) = false
rice.supportassist_dismissed (bool) = false
rice.eligible_lnns (char*) = []
rice.instance_swid (char*) =
rice.task_prune_interval (int) = 86400
rice.last_task_prune_time (uint) = 0
rice.event_prune_max_items (int) = 100
rice.event_prune_days_to_keep (int) = 30
rice.jnl_tasks_prune_max_items (int) = 100
rice.jnl_tasks_prune_days_to_keep (int) = 30
rice.config_reserved_workers (int) = 1
rice.event_reserved_workers (int) = 1
rice.telemetry_reserved_workers (int) = 1
rice.license_reserved_workers (int) = 1
rice.log_reserved_workers (int) = 1
rice.download_reserved_workers (int) = 1
rice.misc_task_workers (int) = 3
rice.accepted_terms (bool) = false
(empty dir rice.network_pools)
rice.telemetry_enabled (bool) = true
rice.telemetry_persist (bool) = false
rice.telemetry_threads (uint) = 8
rice.enable_download (bool) = true
rice.init_performed (bool) = false
rice.ese_disconnect_alert_timeout (int) = 14400
rice.offline_collection_period (uint) = 7200

The -q flag can also be used in conjunction with the isi_gconfig command to identify any values that are not at their default settings. For example, the stock (default) Rice gconfig context will not report any configuration entries:

# isi_gconfig -q -t rice
[root] {version:1}

 

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

OneFS SupportAssist Provisioning – Part 2

Nick Trimbee Nick Trimbee

Thu, 13 Apr 2023 21:29:24 -0000

|

Read Time: 0 minutes

In the previous article in this OneFS SupportAssist series, we reviewed the off-cluster prerequisites for enabling OneFS SupportAssist:

  1. Upgrading the cluster to OneFS 9.5.
  2. Obtaining the secure access key and PIN.
  3. Selecting either direct connectivity or gateway connectivity.
  4. If using gateway connectivity, installing Secure Connect Gateway v5.x.

In this article, we turn our attention to step 5: Provisioning SupportAssist on the cluster.

As part of this process, we’ll be using the access key and PIN credentials previously obtained from the Dell Support portal in step 2 above.

Provisioning SupportAssist on a cluster

SupportAssist can be configured from the OneFS 9.5 WebUI by going to Cluster management > General settings > SupportAssist. To initiate the provisioning process on a cluster, click the Connect SupportAssist link, as shown here:

If SupportAssist is not configured, the Remote support page displays the following banner, warning of the future deprecation of SRS:

Similarly, when SupportAssist is not configured, the SupportAssist WebUI page also displays verbiage recommending the adoption of SupportAssist:

There is also a Connect SupportAssist button to begin the provisioning process.

Selecting the Configure SupportAssist button initiates the setup wizard.

1.  Telemetry Notice

 


The first step requires checking and accepting the Infrastructure Telemetry Notice:



2.  Support Contract



For the next step, enter the details for the primary support contact, as prompted:

 
You can also provide the information from the CLI by using the isi supportassist contacts command set. For example:

# isi supportassist contacts modify --primary-first-name=Nick --primary-last-name=Trimbee --primary-email=trimbn@isilon.com


3.  Establish Connections

Next, complete the Establish Connections page:

This involves the following steps:

      • Selecting the network pool(s)
      • Adding the secure access key and PIN
      • Configuring either direct or gateway access
      • Selecting whether to allow remote support, CloudIQ telemetry, and auto case creation

a.  Select network pool(s).

At least one statically allocated IPv4 network subnet and pool are required for provisioning SupportAssist. OneFS 9.5 does not support IPv6 networking for SupportAssist remote connectivity. However, IPv6 support is planned for a future release.

Select one or more network pools or subnets from the options displayed. In this example, we select subnet0:pool0:



Alternatively, from the CLI, select one or more static subnets or pools for outbound communication, using the following syntax:

# isi supportassist settings modify --network-pools="subnet0.pool0"

Additionally, if the cluster has the OneFS 9.5 network firewall enabled (“isi network firewall settings”), ensure that outbound traffic is allowed on port 9443.

b.  Add secure access key and PIN.

In this next step, add the secure access key and pin. These should have been obtained in an earlier step in the provisioning procedure from the following Dell Support site: https://www.dell.com/support/connectivity/product/isilon-onefs.


Alternatively, if configuring SupportAssist from the OneFS CLI, add the key and pin by using the following syntax:

# isi supportassist provision start --access-key <key> --pin <pin>


c.  Configure access.

  • Direct access

To configure direct access (the default) from the CLI, ensure that the following parameter is set:

# isi supportassist settings modify --connection-mode direct
# isi supportassist settings view | grep -i "connection mode"
        Connection mode: direct
  • Gateway access

Alternatively, to connect through a gateway, select the Connect via Secure Connect Gateway button:

Complete the Gateway host and Gateway port fields as appropriate for the environment.

Alternatively, to set up a gateway configuration from the CLI, use the isi supportassist settings modify syntax. For example, to use the gateway FQDN secure-connect-gateway.yourdomain.com and the default port 9443:

# isi supportassist settings modify --connection-mode gateway
# isi supportassist settings view | grep -i "connection mode"
        Connection mode: gateway
# isi supportassist settings modify --gateway-host secure-connect-gateway.yourdomain.com --gateway-port 9443

When setting up the gateway connectivity option, Secure Connect Gateway v5.0 or later must be deployed within the data center. SupportAssist is incompatible with either ESRS gateway v3.52 or SAE gateway v4. However, Secure Connect Gateway v5.x is backward compatible with PowerScale OneFS ESRS, which allows the gateway to be provisioned and configured ahead of a cluster upgrade to OneFS 9.5.

d. Configure support options.

Finally, configure the support options:



When you have completed the configuration, the WebUI will confirm that SupportAssist is successfully configured and enabled, as follows:

 
Or from the CLI:

# isi supportassist settings view
        Service enabled: Yes
       Connection State: enabled
      OneFS Software ID: ELMISL0223BJJC
          Network Pools: subnet0.pool0, subnet0.testpool1, subnet0.testpool2, subnet0.testpool3, subnet0.testpool4
        Connection mode: gateway
           Gateway host: eng-sea-scgv5stg3.west.isilon.com
           Gateway port: 9443
    Backup Gateway host: eng-sea-scgv5stg.west.isilon.com
    Backup Gateway port: 9443
  Enable Remote Support: Yes
Automatic Case Creation: Yes
       Download enabled: Yes

 

 

Home > Storage > PowerScale (Isilon) > Blogs

PowerScale OneFS

OneFS SupportAssist Provisioning – Part 1

Nick Trimbee Nick Trimbee

Thu, 13 Apr 2023 20:20:31 -0000

|

Read Time: 0 minutes

In OneFS 9.5, several OneFS components now leverage SupportAssist as their secure off-cluster data retrieval and communication channel. These components include:

Component

Details

Events and Alerts

SupportAssist can send CELOG events and attachments through Embedded Service Enabler (ESE) to CLM.

Diagnostics

Logfile gathers can be uploaded to Dell through SupportAssist.

License activation

License activation uses SupportAssist for the isi license activation start CLI command.

Telemetry

Telemetry is sent through SupportAssist to CloudIQ for analytics.

Health check

Health check definition downloads now leverage SupportAssist.

Remote Support

Remote Support now uses SupportAssist along with Connectivity Hub.

For existing clusters, SupportAssist supports the same basic workflows as its predecessor, ESRS, so the transition from old to new is generally pretty seamless.

The overall process for enabling OneFS SupportAssist is as follows:

  1. Upgrade the cluster to OneFS 9.5.
  2. Obtain the secure access key and PIN.
  3. Select either direct connectivity or gateway connectivity.
  4. If using gateway connectivity, install Secure Connect Gateway v5.x.
  5. Provision SupportAssist on the cluster.

 We’ll go through each of these configuration steps in order:

1.  Upgrading to OneFS 9.5

First, the cluster must be running OneFS 9.5 to configure SupportAssist.

There are some additional considerations and caveats to bear in mind when upgrading to OneFS 9.5 and planning on enabling SupportAssist. These include:

  • SupportAssist is disabled when STIG hardening is applied to the cluster; using SupportAssist on a hardened cluster is not supported.
  • Clusters with the OneFS network firewall enabled ("isi network firewall settings") might need to allow outbound traffic on ports 443 and 8443, plus port 9443 if gateway (SCG) connectivity is configured.
  • SupportAssist is supported on a cluster that's running in Compliance mode.
  • If you are upgrading from an earlier release, the OneFS 9.5 upgrade must be committed before SupportAssist can be provisioned, as shown in the check below.
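
As a quick pre-flight check, both the upgrade commit and the firewall configuration can be handled from the CLI. For example (a brief sketch; output will vary by cluster, and the firewall command is the one referenced in the list above):

# isi upgrade cluster commit
# isi network firewall settings view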

Also, ensure that the user account that will be used to enable SupportAssist belongs to a role with the ISI_PRIV_REMOTE_SUPPORT read and write privilege:

# isi auth privileges | grep REMOTE
ISI_PRIV_REMOTE_SUPPORT                           
  Configure remote support

 For example, for an ese user account:

# isi auth roles view SupportAssistRole
       Name: SupportAssistRole
Description: -
    Members: ese
 Privileges
             ID: ISI_PRIV_LOGIN_PAPI
     Permission: r
             ID: ISI_PRIV_REMOTE_SUPPORT
     Permission: w
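
If a suitable role does not already exist, one can be created and granted the necessary privileges along these lines (a sketch using the example ese account and role name from above; the exact per-permission flags, such as --add-priv-read and --add-priv-write, should be confirmed with isi auth roles modify --help):

# isi auth roles create SupportAssistRole
# isi auth roles modify SupportAssistRole --add-user ese --add-priv-read ISI_PRIV_LOGIN_PAPI --add-priv-write ISI_PRIV_REMOTE_SUPPORT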

2.  Obtaining secure access key and PIN

An access key and PIN are required to provision SupportAssist; these secure keys are held in the key manager under the RICE domain. The access key and PIN can be obtained from the following Dell Support site: https://www.dell.com/support/connectivity/product/isilon-onefs.

In the Quick link navigation bar, select the Generate Access key link:

On the following page, select the appropriate button:

The credentials required to obtain an access key and PIN vary depending on prior cluster configuration. Sites that have previously provisioned ESRS will need their OneFS Software ID (SWID) to obtain their access key and PIN.

The isi license list CLI command can be used to determine a cluster’s SWID. For example:

# isi license list | grep "OneFS Software ID"
OneFS Software ID: ELMISL999CKKD

However, customers with new clusters, or those who have not previously provisioned ESRS or SupportAssist, will require their Site ID to obtain the access key and PIN.

Note that any new cluster hardware shipping after January 2023 will already have an integrated key, so this key can be used in place of the Site ID.

For example, if this is the first time registering this cluster and it does not have an integrated key, select Yes, let’s register:


 Enter the Site ID, site name, and location information for the cluster:

Choose a 4-digit PIN and save it for future reference. After that, click Create My Access Key:

The access key is then generated.
 

An automated email containing the pertinent key info is sent from the Dell | Services Connectivity Team. For example:

This access key is valid for one week, after which it automatically expires.

Next, in the cluster’s WebUI, go back to Cluster management > General settings > SupportAssist and enter the access key and PIN information in the appropriate fields. Finally, click Finish Setup to complete the SupportAssist provisioning process:



3.  Deciding between direct or gateway topology 


A topology decision must be made between direct connectivity and gateway connectivity, depending on the needs of the environment:

  • Direct connect: