

Mitigating Slow Drain with Per Initiator Bandwidth Limits

Pat Tarrant

Tue, 21 Jun 2022 21:06:23 -0000


This blog discusses functionality recently introduced for PowerMax and VMAX All Flash arrays as part of the Foxtail release (PowerMaxOS 5978 Q2 2019 SR, 5978.479.479). This new feature allows customers to leverage QoS settings on the PowerMax/VMAX All Flash to reduce or eliminate Slow Drain issues caused by servers with slow HBAs. From a PowerMax or VMAX All Flash perspective, this can help customers alleviate congestion spreading caused by Slow Drain devices, such as HBAs that have a lower link speed than the storage array SLIC. Customers have been burdened by the Slow Drain phenomenon for many years, and it can lead to severe fabric-wide performance degradation.

While there are many definitions and descriptions of Slow Drain, in this blog we define Slow Drain as:

Slow Drain is an FC SAN phenomenon where a single FC end point, due to an inability to accept data from the link at the speed at which it is being sent, causes switch/link buffers and credits to be consumed, resulting in data being “backed up” on the SAN. This causes overall SAN degradation as central components, such as switches, encounter resources that are monopolized by traffic that is destined for the slow drain device, impacting all SAN traffic.

In short, Slow Drain is caused by an end device that cannot accept data at the rate at which it is being sent.

For example, if an 8 Gb/s HBA sends a series of large block read requests to a 32 Gb/s PowerMax front-end Coldspell SLIC, the array transmits the data back to the host at 32 Gb/s. Because the HBA can only receive data at 8 Gb/s, congestion builds at the host end. Too much congestion can lead to congestion spreading, which can cause unrelated server and storage ports to experience performance degradation.

The goal of the Foxtail release is to reduce Slow Drain issues, which are difficult to prevent and diagnose. It does this by letting the customer prevent an end device from becoming a Slow Drain device by limiting the amount of data that is sent to it.

On PowerMax, this can be accomplished by applying a per initiator bandwidth limit, which caps the amount of data sent to the end device (host) at a rate it can actually receive. We have provided customers the ability to leverage Unisphere or Solutions Enabler QoS settings to keep faster array SLICs from overwhelming slower host HBAs. Figure 1 shows a scenario of congestion spreading caused by Slow Drain devices, which can lead to severe fabric-wide performance degradation.

 

Figure 1: Congestion spreading

Implementing per initiator bandwidth limits

Customers can now configure per initiator bandwidth limits at the Initiator Group (IG) or Host level. PowerMaxOS throttles the I/O for that initiator to the configured limit. This can be configured through Unisphere, REST API, Solutions Enabler (9.1 and above), and Inlines.

Note: The Array must be running PowerMaxOS 5978 Q2 2019 or later.

Figure 2: Unisphere for PowerMax 9.1 and higher support

This release also includes a bandwidth limit setting. Users can reach this new menu item by clicking the “traffic lights” icon, which opens the Set Bandwidth Limit dialog. The range for the bandwidth limit is between zero and the maximum rate that the initiators can support (for example, 16 or 32 Gb/s).

Note: The menu item that brings up the dialog is only enabled for Fibre Channel (FC) hosts and is disabled for iSCSI hosts.

The bandwidth limit is set as a value in MB/sec. For example, Figure 3 shows setting the bandwidth limit to 800 MB/s, which roughly matches the maximum throughput of an 8 Gb/s HBA. The details panel for the Host displays an extra field for the bandwidth limit.

Figure 3: Unisphere Set Host Bandwidth Limit

Slow Drain monitoring

Typically, B2B credit starvation or drops in R_RDYs are symptoms of Slow Drain. Writes can take a long time to complete if they are stuck in the queue behind reads, which may also cause the XFER_RDY to be delayed.

The Bandwidth Limit Exceeded seconds metric is available with Solutions Enabler 9.1. In the Performance dashboard, this is the number of seconds that the director port and initiator have run at the maximum quota. This metric uses the bw_exceeded_count counter. The KPI is available under the initiator objects section.

Solutions Enabler 9.1 also features enhanced support for setting bandwidth limits at the Initiator Group level. It allows the user to create an IG with bandwidth limits, set bandwidth limits on an IG, clear bandwidth limits on an IG, modify the bandwidth limits on an IG, and, of course, display the IG bandwidth limits.

Figure 4: Solutions Enabler bandwidth limit

REST API Support

The REST API can also be used to set bandwidth limits. All REST communication is HTTPS over IP, and calls are authenticated against the Unisphere for PowerMax server. The REST API supports these four main verbs:

GET

POST

PUT

DELETE

To set the bandwidth limit, we used a REST PUT call, as shown in Figure 6.
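The screenshot of that call is not reproduced here, so the following curl sketch only illustrates the general shape of such a request. The Unisphere address, array serial number, host name, and especially the payload key names used for the bandwidth limit are illustrative assumptions, not the documented schema; consult the Unisphere for PowerMax REST API documentation for the exact resource and parameter names.

# Illustrative only: the payload keys (editHostActionParam / setHostBWLimitParam /
# host_bw_limit) are assumptions -- verify against the Unisphere for PowerMax
# REST API documentation before use.
curl -k -u smc:smc \
  -X PUT \
  -H "Content-Type: application/json" \
  -d '{"editHostActionParam": {"setHostBWLimitParam": {"host_bw_limit": 800}}}' \
  "https://unisphere.example.com:8443/univmax/restapi/91/sloprovisioning/symmetrix/000197900123/host/VMWARE_HOST1"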

Figure 5: Inlines support

Note: Inlines support is also available with the 8F Utility.

Additional information

The following list provides more information about this release:

  • Host I/O limits and initiator limits can co-exist, one at the SG level and the other at the initiator level.
  • PowerMaxOS supports a maximum of 4,096 limits; host I/O limits and initiator limits share this limit.
  • When an initiator connects to multiple directors, the per initiator limit is distributed evenly across the directors.
  • Limits can be set for a child IG (Host) only, not a parent IG (Host Group).
  • An initiator must be in an IG to set a bandwidth limit on it.
  • The IG must contain an initiator in order to set a bandwidth limit for it.
  • For PowerMaxOS downgrades (NDDs), if the system has initiators with bandwidth limits set, the downgrade is blocked until the limits are cleared from the configuration.
  • Currently, only FC support is offered.

Note: The bandwidth limit is set per initiator and split across directors. Although the limit is applied on the IG/Host group screen within Unisphere for PowerMax, it applies to every initiator in that group. In other words, the limit is not an aggregate across all the initiators in the group; it is applied individually to each of them and split across the directors that the initiator logs in to. For example, an 800 MB/s limit on an initiator that is logged in to four directors works out to roughly 200 MB/s enforced per director.


Author Information

Author: Pat Tarrant, Principal Engineering Technologist


Using Snapshot Policies to Mitigate Ransomware Attacks

Richard Pace

Tue, 10 May 2022 21:33:41 -0000


Cyber security remains a priority for organizations. A cyber or ransomware attack occurs every 11 seconds [1], causing organizations to continually implement security requirements to safeguard mission-critical and sensitive data. There is an important need not only to protect this data, but also to be able to recover and restore it in the event of a ransomware attack. PowerMax SnapVX snapshots are a powerful tool to help protect, recover, and restore data in the event of a cyber attack.

SnapVX provides space-saving and efficient local replication in PowerMax arrays. SnapVX snapshots use a pointer-based structure that preserves a point-in-time view of a source volume. Snapshots provide the ability to manage consistent point-in-time copies for storage groups. Host-accessible target volumes can be linked if a point-in-time snapshot needs to be accessed without affecting the point-in-time of the source.

SnapVX snapshots can be set as secure snaps. Secure snaps are snapshots that cannot be deleted, either accidentally or intentionally. They are retained in resource-limited situations in which conventional snapshots are placed in a failed state to release resources.

SnapVX snapshot users can take advantage of automated scheduling using Snapshot Policies. Snapshot Policies are customizable with rules that specify when to take snapshots, how many to take, how long to keep them, and whether they are standard or secure snaps.  

The following is an example snapshot policy dashboard:

 

SnapVX snapshots with Snapshot Policies allow for 1,024 snapshots per source device and 65 million per PowerMax array. Users can take advantage of the frequency and large snapshot scale of policy-driven snapshots to provide enhanced data resiliency.

Because secure snaps cannot be maliciously or accidentally deleted prior to any planned expiration date, organizations can leverage them to preserve multiple point-in-time copies that can be recovered from in the event of a malware or ransomware attack. Snapshot policies can be automated to take secure snaps with a high frequency and a short retention duration for fine granularity, with a lower frequency and longer retention for added security, or a mixture of both. If an attack occurs, the user can review the secure snaps to determine which point in time has the most relevant and up-to-date copy of data without malware impact. When the precise point in time is identified, restoring critical data can be done almost instantaneously, bringing application data back to its original state prior to the attack.

Secure snaps also provide an additional layer of security in the case of multiple attacks and can be used for forensic work to help determine what happened during the attack and when it originally occurred. With the lower frequency and longer retention period, secure snaps can be used to validate data and data change rate to help identify any suspicious activity.

The following figure provides an example of creating secure snaps with snapshot policies:

Traditional snapshots can be set with a policy to take snapshots at a frequency and retention that works best for the organization. These snapshots can be used for daily business continuity, such as development, operations, and data analytics. They can also assist in any forensic analysis and can be compared against secure snaps to help determine what changed and when it started to change. Unlike secure snaps, traditional snapshots can be deleted or fail in array resource constraint situations. However, the data on an existing snapshot cannot be changed and could be used for additional recovery options. 

Both secure and traditional snaps are powerful tools that organizations can leverage to help protect and restore data rapidly, minimizing the impact of a malware or ransomware attack. The large scalability of snapshots can be easily managed using snapshot policies for scheduling frequency and retention duration to fit any size organization.

The following is an operational example of the frequency, retention, and scale of SnapVX secure snaps. The numbers are based on an average of 5,000 production volumes in a PowerMax array.

  • Secure snaps every 10 minutes with a 48-hour retention
    • 288 point-in-time copies per volume (6 per hour × 48 hours)
    • Fine-grained protection and recovery
  • Secure snaps every 60 minutes with a 7-day retention
    • 168 point-in-time copies per volume (24 per day × 7 days)
    • Extended protection and data validation

Total of 2,040,000 secure point-in-time copies

The flexible and scalable features for PowerMax SnapVX traditional and secure snapshots are powerful tools for protecting against ransomware attacks.

Resources

1 "Cybercrime To Cost The World $10.5 Trillion Annually By 2025," by Cybercrime Magazine, November 2020.

Author: Richard Pace, PowerMax Engineering Technologist

Twitter: @pace_rich


What’s New with Ansible Collection for PowerMax Version 1.7

Paul Martin

Mon, 25 Apr 2022 14:39:01 -0000


Ansible Modules for Dell PowerMax help automate and orchestrate the configuration and management of Dell PowerMax arrays. Specifically, they are used for managing volumes, storage groups, ports, port groups, hosts, host groups, masking views, initiators, SRDF links, RDF groups, snapshots, jobs, snapshot policies, storage pools, the role for automatic volume provisioning, and Metro DR environments for PowerMax arrays. The modules use playbooks to list, show, create, delete, and modify each of these entities.

Ansible Modules for Dell PowerMax support the following features:

  • Create volumes, storage groups, hosts, host groups, port groups, masking views, Metro DR environments, snapshot policies, and snapshots of a storage group.
  • Modify volumes, storage groups, hosts, host groups, Metro DR environments, snapshot policies, initiators and port groups in the array.
  • Delete volumes, storage groups, hosts, host groups, port groups, masking views, Metro DR environments, snapshot policies, and snapshots of a storage group.
  • Get details of volumes, storage groups, hosts, host groups, ports, port groups, masking views, Metro DR environments, jobs, RDF groups, snapshot policies, storage pools, initiators, and snapshots of a storage group.

Each quarter we are improving our Ansible collections and modules for our storage platforms. This quarter sees the release of version 1.7 of the Ansible Collection for PowerMax available on GitHub and Ansible Galaxy. This blog highlights a few major changes in Version 1.7, and a few minor ones too. Full release notes are here.
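If you want to try version 1.7, the collection can be installed or upgraded straight from Ansible Galaxy; the --force flag simply overwrites an existing copy of the collection if one is already installed:

# Install (or upgrade) the PowerMax collection from Ansible Galaxy
ansible-galaxy collection install dellemc.powermax --force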

Module name changes

To start off, there have been some naming changes to the Ansible modules. In previous releases, the modules were given very long names of the form dellemc_powermax_<function>. This made sense when modules were installed standalone. However, with the advent of collections, if the user followed Ansible best practices and used the Fully Qualified Collection Name (FQCN) when referencing the modules in playbooks, the name became redundant and quite long.

For example, with Ansible collection for PowerMax <=1.6.x, calling the modules in a playbook would look like this:

  tasks:
    - name: Create Storage group
      dellemc.powermax.dellemc_powermax_storagegroup:
        <<: *uni_connection_vars
        sg_name: "{{ sg_name }}"
        service_level: "Diamond"
        state: 'present'

With Ansible Modules for PowerMax 1.7 and higher, the new syntax is shorter:

  tasks:
    - name: Create Storage group
      dellemc.powermax.storagegroup:
        <<: *uni_connection_vars
        sg_name: "{{ sg_name }}"
        service_level: "Diamond"
        state: 'present'   

The name changes have been made in such a way that upgrading will not affect your existing playbooks for some time. Redirection has been implemented so that existing playbooks continue to work exactly as before, but you’ll get a warning message if you use the older names, as shown here:

(You won’t see this message if you have turned off deprecation warnings in your ansible.cfg file.)

The warning states that you have two years to update playbooks to use the new naming conventions, so there’s plenty of time!

Another naming change is the dellemc_powermax_gather_facts module. This has been renamed to info, in line with what other vendors are doing, and with current Ansible standards. All existing functionality remains. If the old name is used in a playbook, redirection yields a similar deprecation warning.
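As a quick sketch of the renamed module in use (the connection variables follow the same pattern as the other examples in this post, and the gather_subset values shown are just illustrative):

- name: Get storage group and host details using the renamed info module
  dellemc.powermax.info:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    gather_subset:
      - sg
      - host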

New initiator module

PowerMax customers have historically used initiator aliases for host WWNs to make management tasks easier. From the UI, it is easy to see which HBA belongs to which host when listing all initiators. Some of our customers have asked for the ability to set these aliases in the Ansible management suite. The initiator module allows us to do just that.

The new module lets you get initiator details using a WWN or an alias, and to rename the initiator alias using either the alias or the WWN. Some example tasks are shown here:

- name: Get initiator details using initiator WWN
  dellemc.powermax.initiator:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    initiator_id: 1000000000000001
    state: 'present'
 
- name: Get initiator details using alias
  dellemc.powermax.initiator:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    alias: 'test/host_initiator'
    state: 'present'
 
- name: Rename initiator alias using initiator id
  dellemc.powermax.initiator:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    initiator_id: 1000000000000001
    new_alias:
      new_node_name: 'test_rename'
      new_port_name: 'host_initiator_rename'
    state: 'present'
 
- name: Rename initiator alias using alias
  dellemc.powermax.initiator:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    alias: 'test/host_initiator'
    new_alias:
      new_node_name: 'test_rename'
      new_port_name: 'host_initiator_rename'
    state: 'present'

The host module has also been updated with this release, making it possible for you to modify a host initiator by either WWN or Host Alias, as shown here:

- name: Create host with host_flags
  dellemc.powermax.host:
    unispherehost: "{{unispherehost}}"
    universion: "{{universion}}"
    verifycert: "{{verifycert}}"
    user: "{{user}}"
    password: "{{password}}"
    serial_no: "{{serial_no}}"
    host_name: "VMWARE_HOST1"
    initiators:
      - 1000000000000001
      - 'VMWARE_HOST1/VMHBA1'
    host_type: default
    state: 'present'
    initiator_state: 'present-in-host'

If you run into any issues or need assistance, the GitHub Issues tracking section has been opened to the world. You can also request new features, report problems, seek assistance from the Dell Technologies engineering team here, and see where other customers have commented about their own experiences.

I hope you found this information useful! Check back on our Info Hub site regularly for more updates about our products and services.

Author: Paul Martin, Senior Principal Technical Marketing Engineer

Twitter: @rawstorage


Creating Ansible Execution Environments for Dell Technologies Storage

Paul Martin

Mon, 11 Apr 2022 13:52:13 -0000


Dell Technologies has been providing Ansible collections via Ansible Galaxy and GitHub for some time. This has greatly simplified the setup of Ansible configurations so that playbooks can be run quickly. However, in some environments the need to be even more dynamic means that even this process can present challenges. This is where Ansible execution environments can help.

Ansible execution environments are self-contained, reproducible container images that can be run as needed, keeping configuration to a bare minimum. They also ensure that the storage automation environment and its dependencies can be spun up when needed, knowing that these environments are tested and have no configuration clashes with other automation environments. Here’s a great introduction to the Ansible execution environment.

Execution environments are created with an optional Python package called ansible-builder, which helps users create their own Ansible execution environments packaged as containers. An organization can build up a repository of execution environment images for use with their playbook libraries from multiple different vendors, to reduce complexity and eliminate any dependency clashes. Organizations can even maintain versions of execution environments for different configurations of the same products, ensuring that what is tested and known to work will always work.

In order to create an execution environment, you will need to create three files:

  • execution-environment.yml
  • requirements.txt
  • requirements.yml

The contents and purpose of these files are as follows.

execution-environment.yml

execution-environment.yml – This YAML file describes the build setup. It points to the Ansible Galaxy collection dependencies in requirements.yml and to any Python requirements in requirements.txt. Here’s an example of its contents:

# sample execution-environment.yml
---
version: 1
build_arg_defaults:
  EE_BASE_IMAGE: 'quay.io/ansible/ansible-runner:latest'
ansible_config: 'ansible.cfg'
dependencies:
  galaxy: requirements.yml
  python: requirements.txt

An example of the requirements.yml file is below.

# Sample requirements.yml
collections:
- name: dellemc.powermax
  version: 1.6

requirements.txt

Here is a sample requirements.txt file:

# Sample requirements.txt 
PyU4V==9.2.1.4

After the requirements.yml, requirements.txt and execution-environment.yml files have been created, you can go on to create the execution environment container.

To create the container, simply run the command as follows. (Note that the tag can be anything that makes sense in your environment.)

ansible-builder build --tag dellemc_ee_powermax1.6 --container-runtime podman

 

After the execution environment has been created, run the build command for podman or Docker to create the execution environment container image. You must run this from the context directory that was created by the ansible-builder command.
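For example, assuming podman and the default context directory that ansible-builder generates, the manual image build might look like this:

# Build the image from the context directory generated by ansible-builder
cd context
podman build -f Containerfile -t dellemc_ee_powermax1.6 .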

After the image is built and in the local registry, you can run the container and verify that everything is installed as expected.

The picture above shows the Ansible execution environment running with an interactive terminal. The pip list command executed in the terminal shows all the Python dependencies installed as expected, and the ansible-galaxy collection list command shows the collections installed.
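Since the screenshot is not reproduced here, the equivalent checks from a terminal look roughly like the following, using the image tag from the build step above:

# Start the execution environment with an interactive shell
podman run -it --rm dellemc_ee_powermax1.6 /bin/bash

# Inside the container: list the Python packages and Ansible collections
pip3 list
ansible-galaxy collection list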

This environment is ready to execute playbooks or to be distributed to a container registry to be used on demand.

Uploading the container to a registry such as quay.io makes it available to anyone in your organization who requires it, as well as to tools like Ansible Automation Controller.

To upload to quay.io, follow the steps detailed here.
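In short, the linked steps come down to logging in, tagging the local image for your registry namespace, and pushing it; the organization name below is a placeholder:

# Tag and push the execution environment image to quay.io
podman login quay.io
podman tag localhost/dellemc_ee_powermax1.6 quay.io/<your-org>/dellemc_ee_powermax1.6:latest
podman push quay.io/<your-org>/dellemc_ee_powermax1.6:latest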

For additional information about getting started with Ansible, and using Ansible automation controller with Dell Technologies Storage products, see the Dell PowerMax: Ansible Modules Best Practices white paper on the PowerMax Info Hub.

With each new release of Ansible collections, we will publish a copy of the requirements.txt, requirements.yml, and execution-environment.yml files on https://github.com/dell/{{collection_name}}. We hope this will help people streamline the process of creating new execution environments. Users will create their own execution-environment.yml file and build as described, using the processes outlined here.

Author: Paul Martin, Senior Principal Technical Marketing Engineer

Twitter: @rawstorage

