OneFS Virtual Hot Spare
Fri, 28 Jan 2022 21:12:37 -0000
|Read Time: 0 minutes
There have been several recent questions from the field around how a cluster manages space reservation and pre-allocation of capacity for data repair and drive rebuilds.
OneFS provides a mechanism called Virtual Hot Spare (VHS), which helps ensure that node pools maintain enough free space to successfully re-protect data in the event of drive failure.
Although globally configured, Virtual Hot Spare actually operates at the node pool level so that nodes with different size drives reserve the appropriate VHS space. This helps ensure that, while data may move from one disk pool to another during repair, it remains on the same class of storage. VHS reservations are cluster wide and configurable as either a percentage of total storage (0-20%) or as a number of virtual drives (1-4). To achieve this, the reservation mechanism allocates a fraction of the node pool’s VHS space in each of its constituent disk pools.
No space is reserved for VHS on SSDs unless the entire node pool consists of SSDs. This means that a failed SSD may have data moved to HDDs during repair, but without adding additional configuration settings. This avoids reserving an unreasonable percentage of the SSD space in a node pool.
The default for new clusters is for Virtual Hot Spare to have both “subtract the space reserved for the virtual hot spare…” and “deny new data writes…” enabled with one virtual drive. On upgrade, existing settings are maintained.
It is strongly encouraged to keep Virtual Hot Spare enabled on a cluster, and a best practice is to configure 10% of total storage for VHS. If VHS is disabled and you upgrade OneFS, VHS will remain disabled. If VHS is disabled on your cluster, first check to ensure the cluster has sufficient free space to safely enable VHS, and then enable it.
VHS can be configured via the OneFS WebUI, and is always available, regardless of whether SmartPools has been licensed on a cluster. For example:
From the CLI, the cluster’s VHS configuration is part of the storage pool settings, and can be viewed with the following syntax:
# isi storagepool settings view Automatically Manage Protection: files_at_default Automatically Manage Io Optimization: files_at_default Protect Directories One Level Higher: Yes Global Namespace Acceleration: disabled Virtual Hot Spare Deny Writes: Yes Virtual Hot Spare Hide Spare: Yes Virtual Hot Spare Limit Drives: 1 Virtual Hot Spare Limit Percent: 10 Global Spillover Target: anywhere Spillover Enabled: Yes SSD L3 Cache Default Enabled: Yes SSD Qab Mirrors: one SSD System Btree Mirrors: one SSD System Delta Mirrors: one
Similarly, the following command will set the cluster’s VHS space reservation to 10%.
# isi storagepool settings modify --virtual-hot-spare-limit-percent 10
Bear in mind that reservations for virtual hot sparing will affect spillover. For example, if VHS is configured to reserve 10% of a pool’s capacity, spillover will occur at 90% full.
Spillover allows data that is being sent to a full pool to be diverted to an alternate pool. Spillover is enabled by default on clusters that have more than one pool. If you have a SmartPools license on the cluster, you can disable Spillover. However, it is recommended that you keep Spillover enabled. If a pool is full and Spillover is disabled, you might get a “no space available” error but still have a large amount of space left on the cluster.
If the cluster is inadvertently configured to allow data writes to the reserved VHS space, the following informational warning will be displayed in the SmartPools WebUI:
There is also no requirement for reserved space for snapshots in OneFS. Snapshots can use as much or little of the available file system space as desirable and necessary.
A snapshot reserve can be configured if preferred, although this will be an accounting reservation rather than a hard limit and is not a recommend best practice. If desired, snapshot reserve can be set via the OneFS command line interface (CLI) by running the ‘isi snapshot settings modify –reserve’ command.
For example, the following command will set the snapshot reserve to 10%:
# isi snapshot settings modify --reserve 10
It’s worth noting that the snapshot reserve does not constrain the amount of space that snapshots can use on the cluster. Snapshots can consume a greater percentage of storage capacity specified by the snapshot reserve.
Additionally, when using SmartPools, snapshots can be stored on a different node pool or tier than the one the original data resides on.
For example, as above, the snapshots taken on a performance aligned tier can be physically housed on a more cost effective archive tier.
Author: Nick Trimbee
Related Blog Posts
OneFS and HTTP Security
Mon, 22 Apr 2024 20:35:30 -0000
|Read Time: 0 minutes
To enable granular HTTP security configuration, OneFS provides an option to disable nonessential HTTP components selectively. This can help reduce the overall attack surface of your infrastructure. Disabling a specific component’s service still allows other essential services on the cluster to continue to run unimpeded. In OneFS 9.4 and later, you can disable the following nonessential HTTP services:
Service | Description |
PowerScaleUI | The OneFS WebUI configuration interface. |
Platform-API-External | External access to the OneFS platform API endpoints. |
Rest Access to Namespace (RAN) | REST-ful access by HTTP to a cluster’s /ifs namespace. |
RemoteService | Remote Support and In-Product Activation. |
SWIFT (deprecated) | Deprecated object access to the cluster using the SWIFT protocol. This has been replaced by the S3 protocol in OneFS. |
You can enable or disable each of these services independently, using the CLI or platform API, if you have a user account with the ISI_PRIV_HTTP RBAC privilege.
You can use the isi http services CLI command set to view and modify the nonessential HTTP services:
# isi http services list ID Enabled ------------------------------ Platform-API-External Yes PowerScaleUI Yes RAN Yes RemoteService Yes SWIFT No ------------------------------ Total: 5
For example, you can easily disable remote HTTP access to the OneFS /ifs namespace as follows:
# isi http services modify RAN --enabled=0
You are about to modify the service RAN. Are you sure? (yes/[no]): yes
Similarly, you can also use the WebUI to view and edit a subset of the HTTP configuration settings, by navigating to Protocols > HTTP settings:
That said, the implications and impact of disabling each of the services is as follows:
Service | Disabling impacts |
WebUI | The WebUI is completely disabled, and access attempts (default TCP port 8080) are denied with the warning Service Unavailable. Please contact Administrator. If the WebUI is re-enabled, the external platform API service (Platform-API-External) is also started if it is not running. Note that disabling the WebUI does not affect the PlatformAPI service. |
Platform API | External API requests to the cluster are denied, and the WebUI is disabled, because it uses the Platform-API-External service. Note that the Platform-API-Internal service is not impacted if/when the Platform-API-External is disabled, and internal pAPI services continue to function as expected. If the Platform-API-External service is re-enabled, the WebUI will remain inactive until the PowerScaleUI service is also enabled. |
RAN | If RAN is disabled, the WebUI components for File System Explorer and File Browser are also automatically disabled. From the WebUI, attempts to access the OneFS file system explorer (File System > File System Explorer) fail with the warning message Browse is disabled as RAN service is not running. Contact your administrator to enable the service. This same warning also appears when attempting to access any other WebUI components that require directory selection. |
RemoteService | If RemoteService is disabled, the WebUI components for Remote Support and In-Product Activation are disabled. In the WebUI, going to Cluster Management > General Settings and selecting the Remote Support tab displays the message The service required for the feature is disabled. Contact your administrator to enable the service. In the WebUI, going to Cluster Management > Licensing and scrolling to the License Activation section displays the message The service required for the feature is disabled. Contact your administrator to enable the service. |
SWIFT | Deprecated object protocol and disabled by default. |
You can use the CLI command isi http settings view to display the OneFS HTTP configuration:
# isi http settings view Access Control: No Basic Authentication: No WebHDFS Ran HTTPS Port: 8443 Dav: No Enable Access Log: Yes HTTPS: No Integrated Authentication: No Server Root: /ifs Service: disabled Service Timeout: 8m20s Inactive Timeout: 15m Session Max Age: 4H Httpd Controlpath Redirect: No
Similarly, you can manage and change the HTTP configuration using the isi http settings modify CLI command.
For example, to reduce the maximum session age from four to two hours:
# isi http settings view | grep -i age Session Max Age: 4H # isi http settings modify --session-max-age=2H # isi http settings view | grep -i age Session Max Age: 2H
The full set of configuration options for isi http settings includes:
Option | Description |
--access-control <boolean> | Enable Access Control Authentication for the HTTP service. Access Control Authentication requires at least one type of authentication to be enabled. |
--basic-authentication <boolean> | Enable Basic Authentication for the HTTP service. |
--webhdfs-ran-https-port <integer> | Configure Data Services Port for the HTTP service. |
--revert-webhdfs-ran-https-port | Set value to system default for --webhdfs-ran-https-port. |
--dav <boolean> | Comply with Class 1 and 2 of the DAV specification (RFC 2518) for the HTTP service. All DAV clients must go through a single node. DAV compliance is NOT met if you go through SmartConnect, or using 2 or more node IPs. |
--enable-access-log <boolean> | Enable writing to a log when the HTTP server is accessed for the HTTP service. |
--https <boolean> | Enable the HTTPS transport protocol for the HTTP service. |
--https <boolean> | Enable the HTTPS transport protocol for the HTTP service. |
--integrated-authentication <boolean> | Enable Integrated Authentication for the HTTP service. |
--server-root <path> | Document root directory for the HTTP service. Must be within /ifs. |
--service (enabled | disabled | redirect | disabled_basicfile) | Enable/disable the HTTP Service or redirect to WebUI or disabled BasicFileAccess. |
--service-timeout <duration> | The amount of time (in seconds) that the server will wait for certain events before failing a request. A value of 0 indicates that the service timeout value is the Apache default. |
--revert-service-timeout | Set value to system default for --service-timeout. |
--inactive-timeout <duration> | Get the HTTP RequestReadTimeout directive from both the WebUI and the HTTP service. |
--revert-inactive-timeout | Set value to system default for --inactive-timeout. |
--session-max-age <duration> | Get the HTTP SessionMaxAge directive from both WebUI and HTTP service. |
--revert-session-max-age | Set value to system default for --session-max-age. |
--httpd-controlpath-redirect <boolean> | Enable or disable WebUI redirection to the HTTP service. |
Note that while the OneFS S3 service uses HTTP, it is considered a tier-1 protocol, and as such is managed using its own isi s3 CLI command set and corresponding WebUI area. For example, the following CLI command forces the cluster to only accept encrypted HTTPS/SSL traffic on TCP port 9999 (rather than the default TCP port 9021):
# isi s3 settings global modify --https-only 1 –https-port 9921 # isi s3 settings global view HTTP Port: 9020 HTTPS Port: 9999 HTTPS only: Yes S3 Service Enabled: Yes
Additionally, you can entirely disable the S3 service with the following CLI command:
# isi services s3 disable The service 's3' has been disabled.
Or from the WebUI, under Protocols > S3 > Global settings:
Author: Nick Trimbee
OneFS and PowerScale F-series Management Ports
Mon, 22 Apr 2024 20:12:20 -0000
|Read Time: 0 minutes
Another security enhancement that OneFS 9.5 and later releases brings to the table is the ability to configure 1GbE NIC ports dedicated to cluster management on the PowerScale F900, F710, F600, F210, and F200 all-flash storage nodes and P100 and B100 accelerators. Since these platforms were released, customers have been requesting the ability to activate the 1GbE NIC ports so that the node management activity and front end protocol traffic can be separated on physically distinct interfaces.
For background, since their introduction, the F600 and F900 have shipped with a quad port 1GbE rNDC (rack Converged Network Daughter Card) adapter. However, these 1GbE ports were non-functional and unsupported in OneFS releases prior to 9.5. As such, the node management and front-end traffic was co-mingled on the front-end interface.
In OneFS 9.5 and later, 1GbE network ports are now supported on all of the PowerScale PowerEdge based platforms for the purposes of node management, and are physically separate from the other network interfaces. Specifically, this enhancement applies to the F900, F600, F200 all-flash nodes, and P100 and B100 accelerators.
Under the hood, OneFS has been updated to recognize the 1GbE rNDC NIC ports as usable for a management interface. Note that the focus of this enhancement is on factory enablement and support for existing F600 customers that have the unused 1GbE rNDC hardware. This functionality has also been back-ported to OneFS 9.4.0.3 and later RUPs. Since the introduction of this feature, there have been several requests raised about field upgrades, but that use case is separate and will be addressed in a later release through scripts, updates of node receipts, procedures, and so on.
Architecturally, aside from some device driver and accounting work, no substantial changes were required to the underlying OneFS or platform architecture to implement this feature. This means that in addition to activating the rNDC, OneFS now supports the relocated front-end NIC in PCI slots 2 or 3 for the F200, B100, and P100.
OneFS 9.5 and later recognizes the 1GbE rNDC as usable for the management interface in the OneFS Wizard, in the same way it always has for the H-series and A-series chassis-based nodes.
All four ports in the 1GbE NIC are active, and for the Broadcom board, the interfaces are initialized and reported as bge0, bge1, bge2, and bge3.
The pciconf CLI utility can be used to determine whether the rNDC NIC is present in a node. If it is, a variety of identification and configuration details are displayed. For example, let’s look at the following output from a Broadcom rNDC NIC in an F200 node:
# pciconf -lvV pci0:24:0:0
bge2@pci0:24:0:0: class=0x020000 card=0x1f5b1028 chip=0x165f14e4 rev=0x00 hdr=0x00 class = network subclass = ethernet VPD ident = ‘Broadcom NetXtreme Gigabit Ethernet’ VPD ro PN = ‘BCM95720’ VPD ro MN = ‘1028’ VPD ro V0 = ‘FFV7.2.14’ VPD ro V1 = ‘DSV1028VPDR.VER1.0’ VPD ro V2 = ‘NPY2’ VPD ro V3 = ‘PMT1’ VPD ro V4 = ‘NMVBroadcom Corp’ VPD ro V5 = ‘DTINIC’ VPD ro V6 = ‘DCM1001008d452101000d45’
We can use the ifconfig CLI utility to determine the specific IP/interface mapping on the Broadcom rNDC interface. For example:
# ifconfig bge0 TME-1: bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 TME-1: ether 00:60:16:9e:X:X TME-1: inet 10.11.12.13 netmask 0xffffff00 broadcast 10.11.12.255 zone 1 TME-1: inet 10.11.12.13 netmask 0xffffff00 broadcast 10.11.12.255 zone 0 TME-1: media: Ethernet autoselect (1000baseT <full-duplex>) TME-1: status: active
In this output, the first IP address of the management interface’s pool is bound to bge0, which is the first port on the Broadcom rNDC NIC.
We can use the isi network pools CLI command to determine the corresponding interface. Within the system zone, the management interface is allocated an address from the configured IP range within its associated interface pool. For example:
# isi network pools list ID SC Zone IP Ranges Allocation Method ---------------------------------------------------------------------------------------------- groupnet0.mgt.mgt cluster_mgt_isln.com 10.11.12.13-10.11.12.20 static # isi network pools view groupnet0.mgt.mgt | grep -i ifaces Ifaces: 1:mgmt-1, 2:mgmt-1, 3:mgmt-1, 4:mgmt-1, 5:mgmt-1
Or from the WebUI, under Network configuration > External network:
Drilling down into the mgt pool details shows the 1GbE management interfaces as the pool interface members:
Note that the 1GbE rNDC network ports are solely intended as cluster management interfaces. As such, they are not supported for use with regular front-end data traffic.
The F900 and F600 nodes already ship with a four port 1GbE rNDC NIC installed. However, the F200, B100, and P100 platform configurations have also been updated to include a quad port 1GbE rNDC card. These new configurations have been shipping by default since January 2023. This required relocating the front end network’s 25GbE NIC (Mellanox CX4) to PCI slot 2 in the motherboard. Additionally, the OneFS updates needed for this feature have also now allowed the F200 platform to be offered with a 100GbE option too. The 100GbE option uses a Mellanox CX6 NIC in place of the CX4 in slot 2.
With this 1GbE management interface enhancement, the same quad-port rNDC card (typically the Broadcom 5720) that has been shipped in the F900 and F600 since their introduction, is now included in the F200, B100 and P100 nodes as well. All four 1GbE rNDC ports are enabled and active under OneFS 9.5 and later, too.
Node port ordering continues to follow the standard, increasing numerically from left to right. However, be aware that the port labels are not visible externally because they are obscured by the enclosure’s sheet metal.
The following back-of-chassis hardware images show the new placements of the NICs in the various F-series and accelerator platforms:
F600
F900
For both the F600 and F900, the NIC placement remains unchanged, because these nodes have always shipped with the 1GbE quad port in the rNDC slot since their launch.
F200
The F200 sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.
Because the B100 backup accelerator has a fibre-channel card in slot 2, it sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.
Finally, the P100 accelerator sees its front-end NIC moved to slot 3, freeing up the rNDC slot for the quad-port 1GbE Broadcom 5720.
Note that, while there is currently no field hardware upgrade process for adding rNDC cards to legacy F200 nodes or B100 and P100 accelerators, this will be addressed in a future release.
Author: Nick Trimbee