OneFS SupportAssist Management and Troubleshooting
Tue, 18 Apr 2023 20:07:18 -0000
In this final article in the OneFS SupportAssist series, we turn our attention to management and troubleshooting.
Once the provisioning process is complete, the isi supportassist settings view CLI command reports the status and health of SupportAssist operations on the cluster.
# isi supportassist settings view
Service enabled: Yes
Connection State: enabled
OneFS Software ID: xxxxxxxxxx
Network Pools: subnet0:pool0
Connection mode: direct
Gateway host: -
Gateway port: -
Backup Gateway host: -
Backup Gateway port: -
Enable Remote Support: Yes
Automatic Case Creation: Yes
Download enabled: Yes
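For scripted health checks, a single field can be pulled out of this output. The following is a minimal sketch rather than anything shipped with OneFS: the helper name sa_field is hypothetical, and it assumes the "Key: value" layout shown above.

```shell
# Hypothetical helper: print the value of one field from
# `isi supportassist settings view` output supplied on stdin.
# Assumes each line has the form "Key: value", possibly indented.
sa_field() {
    awk -v key="$1" '
    {
        sub(/^[ \t]+/, "")          # drop any leading indentation
        i = index($0, ": ")         # split on the first ": " only
        if (i > 0 && substr($0, 1, i - 1) == key) {
            print substr($0, i + 2)
            exit
        }
    }'
}

# On a cluster node (not run here):
#   isi supportassist settings view | sa_field "Connection State"
```

Splitting on the first ": " only keeps values such as subnet0:pool0 intact.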
This can also be obtained from the WebUI by going to Cluster management > General settings > SupportAssist:
There are some caveats and considerations to keep in mind when upgrading to OneFS 9.5 and enabling SupportAssist, including:
- SupportAssist is disabled when STIG hardening is applied to a cluster.
- Using SupportAssist on a hardened cluster is not supported.
- Clusters with the OneFS network firewall enabled (isi network firewall settings) might need to allow outbound traffic on port 9443.
- SupportAssist is supported on a cluster that’s running in Compliance mode.
- Secure keys are held in key manager under the RICE domain.
Also, note that Secure Remote Services can no longer be used after SupportAssist has been provisioned on a cluster.
SupportAssist has a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support and backend services through the Embedded Service Enabler (ESE). These workflows include CELOG events; in-product activation (IPA) information; CloudIQ telemetry data; Isi-Gather-info (IGI) logsets; and provisioning, configuration, and authentication data to ESE and the various backend services.
Activity | Information |
Events and alerts | SupportAssist can be configured to send CELOG events. |
Diagnostics | The OneFS isi diagnostics gather and isi_gather_info logfile collation and transmission commands have a SupportAssist option. |
HealthChecks | HealthCheck definitions are updated using SupportAssist. |
License Activation | The isi license activation start command uses SupportAssist to connect. |
Remote Support | Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters. |
Telemetry | CloudIQ telemetry data is sent using SupportAssist. |
CELOG
Once SupportAssist is up and running, it can be configured to send CELOG events and attachments through ESE to CLM. This is managed with the isi event channels CLI command. For example:
# isi event channels list
ID   Name                 Type           Enabled
-----------------------------------------------
1    RemoteSupport        connectemc     No
2    Heartbeat Self-Test  heartbeat      Yes
3    SupportAssist        supportassist  No
-----------------------------------------------
Total: 3

# isi event channels view SupportAssist
ID: 3
Name: SupportAssist
Type: supportassist
Enabled: No
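To spot channels that are not yet enabled, the list output can be filtered. This is a sketch only: disabled_channels is a hypothetical helper, and it assumes the four-column layout (ID, Name, Type, Enabled) shown above.

```shell
# Hypothetical helper: print the names of disabled event channels from
# `isi event channels list` output on stdin. Assumes the columns are
# ID, Name, Type, Enabled, and that channel names may contain spaces.
disabled_channels() {
    awk '$1 ~ /^[0-9]+$/ && $NF == "No" {
        name = $2
        for (i = 3; i <= NF - 2; i++) name = name " " $i
        print name
    }'
}

# On a cluster node (not run here):
#   isi event channels list | disabled_channels
```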
Or from the WebUI:
CloudIQ telemetry
In OneFS 9.5, SupportAssist provides an option to send telemetry data to CloudIQ. This can be enabled from the CLI as follows:
# isi supportassist telemetry modify --telemetry-enabled 1 --telemetry-persist 0
# isi supportassist telemetry view
Telemetry Enabled: Yes
Telemetry Persist: No
Telemetry Threads: 8
Offline Collection Period: 7200
Or in the SupportAssist WebUI:
Diagnostics gather
Also in OneFS 9.5, the isi diagnostics gather and isi_gather_info CLI commands both include a --supportassist upload option for log gathers, which also allows them to continue to function through a new “emergency mode” when the cluster is unhealthy. For example, to start a gather from the CLI that will be uploaded through SupportAssist:
# isi diagnostics gather start --supportassist 1
Similarly, for isi_gather_info:
# isi_gather_info --supportassist
Or to explicitly avoid using SupportAssist for the isi_gather_info log gather upload:
# isi_gather_info --nosupportassist
This can also be configured from the WebUI at Cluster management > General configuration > Diagnostics > Gather:
License Activation through SupportAssist
PowerScale License Activation (previously known as In-Product Activation) facilitates the management of the cluster's entitlements and licenses by communicating directly with Software Licensing Central through SupportAssist.
To activate OneFS product licenses through the SupportAssist WebUI:
- Go to Cluster management > Licensing.
For example, on a new cluster without any signed licenses:
- Click the Update & Refresh button in the License Activation section. In the Activation File Wizard, select the software modules that you want in the activation file.
- Select Review changes, review, click Proceed, and finally Activate
Note that it can take up to 24 hours for the activation to occur.
Alternatively, cluster license activation codes (LAC) can also be added manually.
Troubleshooting
When it comes to troubleshooting SupportAssist, the basic process flow is as follows:
The OneFS components and services above are:
Component | Info |
ESE | Embedded Service Enabler |
isi_rice_d | Remote Information Connectivity Engine (RICE) |
isi_crispies_d | Coordinator for RICE Incidental Service Peripherals including ESE Start |
Gconfig | OneFS centralized configuration infrastructure |
MCP | Master Control Program; starts, monitors, and restarts OneFS services |
Tardis | Configuration service and database |
Transaction journal | Task manager for RICE |
Of these, ESE, isi_crispies_d, isi_rice_d, and the transaction journal are new in OneFS 9.5 and exclusive to SupportAssist. In contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.
For its connectivity, SupportAssist elects a single leader node within the subnet pool, and NANON nodes are automatically avoided. Ports 443 and 8443 are required to be open for bi-directional communication between the cluster and Connectivity Hub, and port 9443 is for communicating with a gateway. The SupportAssist ESE component communicates with a number of Dell backend services:
- SRS
- Connectivity Hub
- CLM
- ELMS/Licensing
- SDR
- Lightning
- Log Processor
- CloudIQ
- ESE
Debugging backend issues might involve one or more services, and Dell Support can assist with this process.
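When checking firewall rules, it can help to have the port requirements noted earlier in one place. The sketch below simply encodes them; sa_ports is a hypothetical helper, "direct" is the connection mode shown in the settings output, and "gateway" is an assumed name for gateway-based connectivity.

```shell
# Hypothetical helper: print the TCP ports described in this article
# for a given connection mode.
#   direct  -> 443 and 8443 (cluster <-> Connectivity Hub)
#   gateway -> 9443 (cluster <-> gateway); the mode name is an assumption
sa_ports() {
    case "$1" in
        direct)  echo "443 8443" ;;
        gateway) echo "9443" ;;
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Example: for p in $(sa_ports direct); do echo "check outbound TCP $p"; done
```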
The main log files for investigating and troubleshooting SupportAssist issues are isi_rice_d.log and isi_crispies_d.log. There is also an ESE log (ESE.log), which can be useful, too. These logs can be found at:
Component | Logfile location | Info |
Rice | /var/log/isi_rice_d.log | Per node |
Crispies | /var/log/isi_crispies_d.log | Per node |
ESE | /ifs/.ifsvar/ese/var/log/ESE.log | Cluster-wide for single instance ESE |
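A quick way to confirm that these logs exist and are readable on a node is sketched below. The helper name sa_log_status is hypothetical; the paths in the usage comment are the ones listed in the table above.

```shell
# Hypothetical helper: report whether each log file passed as an
# argument is readable on the local node.
sa_log_status() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
        fi
    done
}

# On a cluster node (not run here):
#   sa_log_status /var/log/isi_rice_d.log /var/log/isi_crispies_d.log \
#       /ifs/.ifsvar/ese/var/log/ESE.log
```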
Debug level logging can be configured from the CLI as follows:
# isi_for_array isi_ilog -a isi_crispies_d --level=debug+
# isi_for_array isi_ilog -a isi_rice_d --level=debug+
Note that the OneFS log gathers (such as the output from the isi_gather_info utility) will capture all the above log files, plus the pertinent SupportAssist Gconfig contexts and Tardis namespaces, for later analysis.
If needed, the Rice and ESE configurations can also be viewed as follows:
# isi_gconfig -t ese
[root] {version:1}
ese.mode (char*) = direct
ese.connection_state (char*) = disabled
ese.enable_remote_support (bool) = true
ese.automatic_case_creation (bool) = true
ese.event_muted (bool) = false
ese.primary_contact.first_name (char*) =
ese.primary_contact.last_name (char*) =
ese.primary_contact.email (char*) =
ese.primary_contact.phone (char*) =
ese.primary_contact.language (char*) =
ese.secondary_contact.first_name (char*) =
ese.secondary_contact.last_name (char*) =
ese.secondary_contact.email (char*) =
ese.secondary_contact.phone (char*) =
ese.secondary_contact.language (char*) =
(empty dir ese.gateway_endpoints)
ese.defaultBackendType (char*) = srs
ese.ipAddress (char*) = 127.0.0.1
ese.useSSL (bool) = true
ese.srsPrefix (char*) = /esrs/{version}/devices
ese.directEndpointsUseProxy (bool) = false
ese.enableDataItemApi (bool) = true
ese.usingBuiltinConfig (bool) = false
ese.productFrontendPrefix (char*) = platform/16/supportassist
ese.productFrontendType (char*) = webrest
ese.contractVersion (char*) = 1.0
ese.systemMode (char*) = normal
ese.srsTransferType (char*) = ISILON-GW
ese.targetEnvironment (char*) = PROD

# isi_gconfig -t rice
[root] {version:1}
rice.enabled (bool) = false
rice.ese_provisioned (bool) = false
rice.hardware_key_present (bool) = false
rice.supportassist_dismissed (bool) = false
rice.eligible_lnns (char*) = []
rice.instance_swid (char*) =
rice.task_prune_interval (int) = 86400
rice.last_task_prune_time (uint) = 0
rice.event_prune_max_items (int) = 100
rice.event_prune_days_to_keep (int) = 30
rice.jnl_tasks_prune_max_items (int) = 100
rice.jnl_tasks_prune_days_to_keep (int) = 30
rice.config_reserved_workers (int) = 1
rice.event_reserved_workers (int) = 1
rice.telemetry_reserved_workers (int) = 1
rice.license_reserved_workers (int) = 1
rice.log_reserved_workers (int) = 1
rice.download_reserved_workers (int) = 1
rice.misc_task_workers (int) = 3
rice.accepted_terms (bool) = false
(empty dir rice.network_pools)
rice.telemetry_enabled (bool) = true
rice.telemetry_persist (bool) = false
rice.telemetry_threads (uint) = 8
rice.enable_download (bool) = true
rice.init_performed (bool) = false
rice.ese_disconnect_alert_timeout (int) = 14400
rice.offline_collection_period (uint) = 7200
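When only one setting is of interest, its value can be extracted from this output. The following is a rough sketch: gconfig_value is a hypothetical helper that assumes the "name (type) = value" line format shown above.

```shell
# Hypothetical helper: print the value of one key from `isi_gconfig -t <tree>`
# output on stdin. Assumes lines of the form "name (type) = value";
# an unset value prints as an empty line.
gconfig_value() {
    awk -v key="$1" '$1 == key {
        sub(/^[^=]*=[ \t]*/, "")    # strip everything up to the first "="
        print
        exit
    }'
}

# On a cluster node (not run here):
#   isi_gconfig -t rice | gconfig_value rice.telemetry_threads
```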
The -q flag can also be used in conjunction with the isi_gconfig command to identify any values that are not at their default settings. For example, the stock (default) Rice gconfig context will not report any configuration entries:
# isi_gconfig -q -t rice
[root] {version:1}
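Building on the -q behavior described above, the names of any non-default settings can be listed for a quick comparison between clusters. This is a sketch under the same line-format assumption; gconfig_nondefault_keys is a hypothetical helper.

```shell
# Hypothetical helper: print the key names found in `isi_gconfig -q -t <tree>`
# output on stdin. Header lines such as "[root] {version:1}" and
# "(empty dir ...)" entries are skipped, so a stock context prints nothing.
gconfig_nondefault_keys() {
    awk '$2 ~ /^\(.*\)$/ && $3 == "=" { print $1 }'
}

# On a cluster node (not run here):
#   isi_gconfig -q -t rice | gconfig_nondefault_keys
```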
Related Blog Posts
PowerScale OneFS 9.7
Wed, 13 Dec 2023 13:55:00 -0000
Dell PowerScale is already powering up the holiday season with the launch of the innovative OneFS 9.7 release, which shipped today (13th December 2023). This new 9.7 release is an all-rounder, introducing PowerScale innovations in Cloud, Performance, Security, and ease of use.
After the debut of APEX File Storage for AWS earlier this year, OneFS 9.7 extends and simplifies the PowerScale in the public cloud offering, delivering more features on more instance types across more regions.
In addition to providing the same OneFS software platform on-prem and in the cloud, and customer-managed for full control, APEX File Storage for AWS in OneFS 9.7 sees a 60% capacity increase, providing linear capacity and performance scaling up to six SSD nodes and 1.6 PiB per namespace/cluster, and up to 10GB/s reads and 4GB/s writes per cluster. This can make it a solid fit for traditional file shares and home directories, vertical workloads like M&E, healthcare, life sciences, finserv, and next-gen AI, ML and analytics applications.
Enhancements to APEX File Storage for AWS
PowerScale’s scale-out architecture can be deployed on customer managed AWS EBS and EC2 infrastructure, providing the scale and performance needed to run a variety of unstructured workflows in the public cloud. Plus, OneFS 9.7 provides an ‘easy button’ for streamlined AWS infrastructure provisioning and deployment.
Once in the cloud, you can further leverage existing PowerScale investments by accessing and orchestrating your data through the platform's multi-protocol access and APIs.
This includes the common OneFS control plane (CLI, WebUI, and platform API), and the same enterprise features: Multi-protocol, SnapshotIQ, SmartQuotas, Identity management, and so on.
With OneFS 9.7, APEX File Storage for AWS also sees the addition of support for HDFS and FTP protocols, in addition to NFS, SMB, and S3. Granular performance prioritization and throttling is also enabled with SmartQoS, allowing admins to configure limits on the maximum number of protocol operations that NFS, S3, SMB, or mixed protocol workloads can consume on an APEX File Storage for AWS cluster.
Security
With data integrity and protection being top of mind in this era of unprecedented cyber threats, OneFS 9.7 brings a bevy of new features and functionality to keep your unstructured data and workloads more secure than ever. These new OneFS 9.7 security enhancements help address US Federal and DoD mandates, such as FIPS 140-2 and DISA STIGs – in addition to general enterprise data security requirements. Included in the new OneFS 9.7 release is a simple cluster configuration backup and restore utility, address space layout randomization, and single sign-on (SSO) lookup enhancements.
Data mobility
On the data replication front, SmartSync sees the introduction of GCP as an object storage target in OneFS 9.7, in addition to ECS, AWS and Azure. The SmartSync data mover allows flexible data movement and copying, incremental resyncs, push and pull data transfer, and one-time file to object copy.
Performance improvements
Building on the streaming read performance delivered in a prior release, OneFS 9.7 also unlocks dramatic write performance enhancements, particularly for the all-flash NVMe platforms - plus infrastructure support for future node hardware platform generations. A sizable boost in throughput to a single client helps deliver performance for the most demanding GenAI workloads, particularly for the model training and inferencing phases. Additionally, the scale-out cluster architecture enables performance to scale linearly as GPUs are increased, allowing PowerScale to easily support AI workflows from small to large.
Cluster support for InsightIQ 5.0
The new InsightIQ 5.0 software expands PowerScale monitoring capabilities, including a new user interface, automated email alerts, and added security. InsightIQ 5.0 is available today for all existing and new PowerScale customers at no additional charge. These innovations are designed to simplify management, expand scale and security, and automate operations for PowerScale performance monitoring for AI, GenAI, and all other workloads.
In summary, OneFS 9.7 brings the following new features and functionality to the Dell PowerScale ecosystem:
We’ll be taking a deeper look at these new features and functionality in blog articles over the course of the next few weeks.
Meanwhile, the new OneFS 9.7 code is available on the Dell Support site, as both an upgrade and reimage file, allowing both installation and upgrade of this new release.
OneFS SSL Certificate Renewal – Part 1
Thu, 16 Nov 2023 04:57:00 -0000
When using either the OneFS WebUI or platform API (pAPI), all communication sessions are encrypted using SSL (Secure Sockets Layer) or, more precisely, its successor, Transport Layer Security (TLS). In this series, we will look at how to replace or renew the SSL certificate for the OneFS WebUI.
SSL requires a certificate that serves two principal functions: It grants permission to use encrypted communication using Public Key Infrastructure and authenticates the identity of the certificate’s holder.
Architecturally, SSL consists of four fundamental components:
SSL Component | Description |
Alert | Reports issues. |
Change cipher spec | Implements negotiated crypto parameters. |
Handshake | Negotiates crypto parameters for SSL session. Can be used for many SSL/TCP connections. |
Record | Provides encryption and MAC. |
These sit in the stack as follows:
The basic handshake process begins with a client requesting an HTTPS WebUI session to the cluster. OneFS then returns the SSL certificate and public key. The client creates a session key, encrypted with the public key it received from OneFS. At this point, only the client knows the session key. The client now sends its encrypted session key to the cluster, which decrypts it with the private key. Now, both the client and OneFS know the session key. So, finally, the session, encrypted using a symmetric session key, can be established. OneFS automatically defaults to the best supported version of SSL, based on the client request.
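The key exchange described above can be imitated with the openssl command line. This is a toy sketch that mirrors the simplified description in this article, not a real TLS handshake; the function name and file names are hypothetical, and it assumes an openssl binary that provides the genrsa, pkey, rand, and pkeyutl subcommands.

```shell
# Toy illustration: a "client" encrypts a random session key with the
# "cluster" public key, and the "cluster" recovers it with its private key.
demo_session_key_exchange() {
    work=$(mktemp -d) || return 1
    # Cluster side: private key, plus the public key handed to clients.
    openssl genrsa -out "$work/server.key" 2048 2>/dev/null || return 1
    openssl pkey -in "$work/server.key" -pubout -out "$work/server.pub" || return 1
    # Client side: random session key, encrypted with the public key.
    openssl rand -hex 16 > "$work/session.key" || return 1
    openssl pkeyutl -encrypt -pubin -inkey "$work/server.pub" \
        -in "$work/session.key" -out "$work/session.enc" || return 1
    # Cluster side: decrypt the session key with the private key.
    openssl pkeyutl -decrypt -inkey "$work/server.key" \
        -in "$work/session.enc" -out "$work/session.dec" || return 1
    if cmp -s "$work/session.key" "$work/session.dec"; then
        echo "session key recovered"
    else
        echo "session key mismatch"
    fi
    rm -rf "$work"
}
```

Once both sides hold the same session key, the remainder of the session uses symmetric encryption, which is far cheaper than public-key operations.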
A PowerScale cluster initially contains a self-signed certificate, which can be used as-is or replaced with a third-party certificate authority (CA)-issued certificate. If the self-signed certificate is used, then upon expiry it must be replaced with either a third-party (public or private) CA-issued certificate or another self-signed certificate that is generated on the cluster. The following are the default locations for the server.crt and server.key files.
File | Location |
SSL certificate | /usr/local/apache2/conf/ssl.crt/server.crt |
SSL certificate key | /usr/local/apache2/conf/ssl.key/server.key |
The ‘isi certificate settings view’ CLI command displays all of the certificate-related configuration options. For example:
# isi certificate settings view
Certificate Monitor Enabled: Yes
Certificate Pre Expiration Threshold: 4W2D
Default HTTPS Certificate
    ID: default
    Subject: C=US, ST=Washington, L=Seattle, O="Isilon", OU=Isilon, CN=Dell, emailAddress=tme@isilon.com
    Status: valid
The ‘certificate monitor enabled’ and ‘certificate pre expiration threshold’ configuration options above govern a nightly cron job, which monitors the expiration of each managed certificate and fires a CELOG alert if a certificate is set to expire within the configured threshold. Note that the default threshold is 30 days (4W2D, which represents 4 weeks plus 2 days). The ‘ID: default’ configuration option indicates that this certificate is the default TLS certificate.
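The threshold arithmetic can be checked with a small conversion function. This is a sketch only: duration_to_days is a hypothetical helper, and it assumes the threshold uses just the week (W) and day (D) units seen in the example; any other units that OneFS may accept are rejected rather than guessed at.

```shell
# Hypothetical helper: convert a duration such as "4W2D" into days.
# Only the W (weeks) and D (days) units are handled; anything else is
# reported as unsupported.
duration_to_days() {
    printf '%s\n' "$1" | awk '{
        s = toupper($0); days = 0
        while (match(s, /^[0-9]+[WD]/)) {
            n = substr(s, 1, RLENGTH - 1) + 0   # numeric part
            u = substr(s, RLENGTH, 1)           # unit letter
            days += (u == "W") ? n * 7 : n
            s = substr(s, RLENGTH + 1)          # consume this component
        }
        if (s != "") {
            print "unsupported duration: " $0 > "/dev/stderr"
            exit 1
        }
        print days
    }'
}

# duration_to_days 4W2D   # 4 * 7 + 2 = 30
```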
The basic certificate renewal or creation flow is as follows:
The steps below include options to complete a self-signed certificate replacement or renewal, or to request an SSL replacement or renewal from a Certificate Authority (CA).
Backing up the existing SSL certificate
The first task is to obtain the list of certificates by running the following CLI command, and identify the appropriate one to renew:
# isi certificate server list
ID      Name    Status Expires
-------------------------------------------
eb0703b default valid  2025-10-11T10:45:52
-------------------------------------------
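To identify certificates that are due for renewal, the Expires column can be compared against a cutoff date. The sketch below relies on the fact that ISO 8601 timestamps in the same format sort correctly as plain strings; certs_expiring_before is a hypothetical helper, and it assumes certificate names contain no spaces.

```shell
# Hypothetical helper: from `isi certificate server list` output on stdin,
# print "name expiry" for each certificate that expires before the cutoff.
# The cutoff must use the same YYYY-MM-DDTHH:MM:SS format as the listing.
certs_expiring_before() {
    awk -v cutoff="$1" '
    $NF ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ && ($NF "") < (cutoff "") {
        print $2, $NF
    }'
}

# On a cluster node (not run here):
#   isi certificate server list | certs_expiring_before 2026-01-01T00:00:00
```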
It’s always a prudent practice to save a backup of the original certificate and key. This can be easily accomplished using the following CLI commands, which, in this case, create the ‘/ifs/data/ssl_bkup’ directory, set the perms to root-only access, and copy the original key and certificate to it:
# mkdir -p /ifs/data/ssl_bkup
# chmod 700 /ifs/data/ssl_bkup
# cp /usr/local/apache24/conf/ssl.crt/server.crt /ifs/data/ssl_bkup
# cp /usr/local/apache24/conf/ssl.key/server.key /ifs/data/ssl_bkup
# cd !$
cd /ifs/data/ssl_bkup
# ls
server.crt server.key
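The same backup steps can be wrapped in a small function so that they are repeatable. This is a minimal sketch: backup_ssl is a hypothetical helper, and it uses cp -p (unlike the commands above) so that the key file keeps its original mode and timestamps.

```shell
# Hypothetical helper: copy certificate and key files into a backup
# directory that only its owner can access.
# Usage: backup_ssl <dest_dir> <file>...
backup_ssl() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    chmod 700 "$dest" || return 1
    for f in "$@"; do
        cp -p "$f" "$dest/" || return 1
    done
    ls "$dest"
}

# On a cluster node (not run here), using the paths from the commands above:
#   backup_ssl /ifs/data/ssl_bkup \
#       /usr/local/apache24/conf/ssl.crt/server.crt \
#       /usr/local/apache24/conf/ssl.key/server.key
```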
Renewing or creating a certificate
The next step in the process involves either the renewal of an existing certificate or creation of a certificate from scratch. In either case, first, create a temporary directory, for example /ifs/tmp:
# mkdir /ifs/tmp; cd /ifs/tmp
a) Renew an existing self-signed Certificate.
The following syntax creates a renewal certificate based on the existing ssl.key. The value of the ‘-days’ parameter can be adjusted to generate a certificate with the desired validity period. For example, the following command will create a one-year certificate.
# cp /usr/local/apache2/conf/ssl.key/server.key ./ ; openssl req -new -days 365 -nodes -x509 -key server.key -out server.crt
Answer the system prompts to complete the self-signed SSL certificate generation process, entering the pertinent location and contact information. For example:
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Washington
Locality Name (eg, city) []:Seattle
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon
Organizational Unit Name (eg, section) []:TME
Common Name (e.g. server FQDN or YOUR name) []:isilon.com
Email Address []:tme@isilon.com
When all the information has been successfully entered, the new server.crt file will be generated under the /ifs/tmp directory, alongside the copied server.key.
Optionally, the attributes and integrity of the certificate can be verified with the following syntax:
# openssl x509 -text -noout -in server.crt
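Beyond inspecting the certificate text, it can be worth confirming that the certificate actually corresponds to the private key before installing it. The following is a sketch of one common way to do this, by comparing the public key embedded in the certificate with the one derived from the key file; cert_matches_key is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) if the certificate and private key
# share the same public key, fail (exit 1) if they differ, and return 2 if
# either file cannot be read by openssl.
cert_matches_key() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
    key_pub=$(openssl pkey -in "$2" -pubout) || return 2
    [ "$cert_pub" = "$key_pub" ]
}

# Example:
#   cert_matches_key server.crt server.key && echo "certificate matches key"
```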
Next, proceed directly to the ‘Add the certificate to the cluster’ steps in section 4 of this article.
b) Alternatively, a certificate and key can be generated from scratch, if preferred.
The following CLI command can be used to create a 2048-bit RSA private key:
# openssl genrsa -out server.key 2048
Generating RSA private key, 2048 bit long modulus
............+++++
...........................................................+++++
e is 65537 (0x10001)
Next, create a certificate signing request:
# openssl req -new -nodes -key server.key -out server.csr
For example:
# openssl req -new -nodes -key server.key -out server.csr -reqexts SAN -config <(cat /etc/ssl/openssl.cnf <(printf "[SAN]\nsubjectAltName=DNS:isilon.com"))
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:WA
Locality Name (eg, city) []:Seattle
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Isilon
Organizational Unit Name (eg, section) []:TME
Common Name (e.g. server FQDN or YOUR name) []:h7001
Email Address []:tme@isilon.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:1234
An optional company name []:
#
Answer the system prompts to complete the certificate signing request generation process, entering the pertinent location and contact information. Additionally, a ‘challenge password’ at least 4 bytes long will need to be selected and entered.
As prompted, enter the information to be incorporated into the certificate request. When completed, the server.csr and server.key files will appear in the /ifs/tmp directory.
If wanted, a CSR file for a Certificate Authority that includes Subject Alternative Names (SAN) can be generated, as in the example above. Additional host name entries can be added as a comma-separated list (for example, DNS:isilon.com,DNS:www.isilon.com).
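When several host names are involved, the subjectAltName value can be assembled programmatically instead of typed by hand. This is a small sketch; san_value is a hypothetical helper that only builds the string and does not call openssl.

```shell
# Hypothetical helper: build a subjectAltName value from a list of
# host names, in the comma-separated "DNS:<name>" form used above.
san_value() {
    out=""
    for host in "$@"; do
        out="${out:+$out,}DNS:$host"    # add a comma only between entries
    done
    printf '%s\n' "$out"
}

# san_value isilon.com www.isilon.com
#   prints: DNS:isilon.com,DNS:www.isilon.com
```

The result can then be substituted after "subjectAltName=" in the [SAN] section passed to openssl req.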
In the next article, we will look at the certificate signing, addition, and verification steps of the process.