Cyber Intrusion Detection for z Systems (zCID)
Tue, 12 Dec 2023 18:42:09 -0000
Any cyber security event can have a devastating impact on a company’s financials. Stolen credit cards, identity theft, hacked emails, and so on hurt both the customer and the company’s brand, potentially even ruining that company. Data recovery takes time, but rebuilding customer trust may take even longer.
Dell Technologies has made major investments in a series of continuous security product enhancements to help protect companies and their end users from data loss and/or compromise in the event of an attack. Whether it’s an attack on open systems data or mainframe data, the result of any attack is the same: loss of productivity and concern over theft and exposure of sensitive information.
Ideally, technologies like storage should be able to detect a cyber threat, protect data from the threat, and, in the event of a loss or corruption of data, recover to a known good point. Eight years ago, Dell Technologies developed the first snapshot-based recovery capability for mainframe and open systems data and, as of the latest release of PowerMax in October 2023, has moved into the “intrusion detection” realm of cyber resiliency.
This blog is about a new enhancement to our Mainframe Enabler Software for PowerMax that is designed to provide advanced threat detection for PowerMax mainframe environments.
Mainframe Enabler for intrusion detection
Mainframe Enabler Software (MFE) runs on a z/OS LPAR and is designed to manage the PowerMax 2500/8500 and 8000. During discussions about customer requirements for this release of MFE, it became apparent that customers urgently needed a way to determine whether a cyber event was imminent or in progress. The ask was to send the equivalent of a ‘flare in the sky’ to single out any atypical behavior in mainframe data access. Upon learning of zCID’s capability within the larger Dell cyber solution, a large mainframe service provider commented, “Dell’s innovation around detection of cyber events within PowerMax and CloudIQ is ahead of any other storage provider we talked to.”
Dell Mainframe Solutions development, Product Management, and other organizations within Dell designed a way to enhance MFE to provide awareness of atypical data access behavior. The result of that work was delivered as an enhancement in MFE 10.1.0, released 17 October 2023. This enhancement is known as ‘Cyber Intrusion Detection for z Systems’ or zCID for short.
We will jump into the technical details of zCID shortly, but first, let’s cover the What, Why, and How of this valuable new feature.
What: zCID is a utility that detects atypical data access patterns in mainframe workloads.
Why: To warn PowerMax mainframe customers that atypical access is occurring so that it can be investigated as a possible cyber intrusion.
How: zCID monitors the number of unique tracks accessed for mainframe CKD devices and SMS groups within a customer-specified time interval. First, the storage administrator confirms a baseline of “normal/typical” access. Next, a set of rules is created to generate warning statements whenever an anomaly is detected in how data is accessed. zCID is then started and runs continually in the background. Finally, if an intrusion is suspected, zCID raw data can be converted to CSV format for detailed analysis.
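zCID’s control statement syntax is proprietary, but the core idea in the How above — counting unique tracks touched per interval and comparing the count against a calibrated threshold — can be sketched in a few lines. The device names, thresholds, and message format below are illustrative assumptions, not zCID syntax:

```python
from collections import defaultdict

def check_interval(io_events, warn_thresholds):
    """Count unique tracks touched per device in one interval and
    flag any device that exceeds its calibrated threshold.

    io_events:       iterable of (device, track) access records
    warn_thresholds: device -> max unique tracks considered typical
    """
    unique_tracks = defaultdict(set)
    for device, track in io_events:
        unique_tracks[device].add(track)

    warnings = []
    for device, tracks in unique_tracks.items():
        limit = warn_thresholds.get(device)
        if limit is not None and len(tracks) > limit:
            warnings.append(
                f"WARN: {device} touched {len(tracks)} unique tracks "
                f"(threshold {limit}) in this interval"
            )
    return warnings

# Hypothetical interval: a sudden sweep across many tracks on one device
events = [("CKD100", t) for t in range(5000)] + [("CKD200", 7), ("CKD200", 9)]
print(check_interval(events, {"CKD100": 1200, "CKD200": 1200}))
```

In zCID itself, the thresholds come from the WARN statements you calibrate against your observed baseline, as described below.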
Technical and install requirements for zCID
The minimum technical requirements for zCID are:
- MFE 10.1.0 with available SCF address space
- PowerMax 8000, 2500, or 8500
- A list of CKD volumes or SMS groups to monitor
Customers must APF-authorize the MFE 10.1.0 LINKLIB dataset and add a STEPLIB DD statement in their zCID batch jobs. (zCID can also run as a started task.)
zCID is delivered in two programs:
- ECTRAARD is the zCID utility program
- ECTREXTR is a zCID program that converts the raw zCID data to a CSV file. This CSV file is intended to be imported into Microsoft Excel for additional analysis and reporting as determined by a storage analyst.
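The exact columns ECTREXTR emits depend on what you monitored; the sketch below assumes a hypothetical CSV layout (INTERVAL, RESOURCE, UNIQUE_TRACKS) purely to show the kind of per-resource peak analysis a storage analyst might do in Excel or a script:

```python
import csv
import io

def peak_access_rates(csv_text):
    """Return the highest unique-track count observed per resource.

    Assumes hypothetical columns: INTERVAL,RESOURCE,UNIQUE_TRACKS
    (not the actual ECTREXTR layout).
    """
    peaks = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["RESOURCE"]
        count = int(row["UNIQUE_TRACKS"])
        peaks[name] = max(count, peaks.get(name, 0))
    return peaks

sample = """INTERVAL,RESOURCE,UNIQUE_TRACKS
2023-10-17T00:00,SMSGRP1,340
2023-10-17T00:05,SMSGRP1,910
2023-10-17T00:05,CKD100,120
"""
print(peak_access_rates(sample))  # → {'SMSGRP1': 910, 'CKD100': 120}
```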
zCID modes of operation and high-level implementation strategy
ECTRAARD can run in “Live Run mode” or “Batch Run mode”. It is important to understand these two modes before deploying zCID:
- Live Run mode: processes data in real time and collects data from the resources you tell zCID to monitor.
- Batch Run mode: takes the output produced in Live Run mode and creates reports from that historical information.
To maximize the benefits of zCID, follow these steps:
- Decide on your Live Run mode capture window; this will vary from customer to customer. You run zCID in Live Run mode to capture access rates for the z/OS resources you are monitoring. Typically, I would start Live Run mode for one week (seven days), then capture a month-end batch processing cycle and, ideally, a quarterly and year-end closing cycle. With that information, you can calibrate your WARN statements for the most heavily accessed z/OS workloads that zCID is monitoring.
- Run zCID in Live Run mode over a “long” period. View this period of time as an opportunity to collect access rate information for the z/OS resources that zCID is monitoring. In the future, you can use this information to test your warning statements for atypical access rates on monitored z/OS resources.
- Stop Live Run mode at the end of the “long period" so that the datasets zCID was building can be closed.
- Run zCID Batch Mode to create reports, then analyze the results.
- Create warning statements for the atypical access rates for which you want to be notified. To calibrate the warning statements, take the datasets created in Step 2 and run zCID in Batch Run mode: are zCID warning messages being issued from the warning statements you created? Calibrating the WARN statements ensures that the z/OS SYSLOG, z/OS master console, and zCID started tasks are not spammed with zCID warning messages.
- Restart zCID in Live Run mode with the calibrated warning control statements.
zCID will now actively monitor the z/OS resources you provided and generate an alert every time an atypical access rate occurs!
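The calibration loop above boils down to setting each WARN threshold comfortably above the highest access rate observed during the baseline window. A minimal sketch of that arithmetic, assuming a 1.5x headroom margin (an arbitrary example, not a zCID recommendation):

```python
def calibrate_thresholds(baseline_peaks, margin=1.5):
    """Derive WARN thresholds from baseline peak unique-track counts.

    baseline_peaks: resource -> highest unique-track count observed
    margin:         headroom multiplier so legitimate peaks (month end,
                    year end) do not trigger warnings
    """
    return {name: int(peak * margin) for name, peak in baseline_peaks.items()}

print(calibrate_thresholds({"SMSGRP1": 910, "CKD100": 120}))
# → {'SMSGRP1': 1365, 'CKD100': 180}
```

In practice, whether the margin is adequate is exactly what the Batch Run mode replay in Step 5 verifies.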
Summary
Cyber Intrusion Detection for z Systems (zCID) gives Dell PowerMax the industry’s first intrusion detection mechanism for on-array mainframe storage [1]. zCID is a layer of intelligence that detects atypical data access patterns for specified workloads, giving PowerMax customers insight into their z/OS workloads’ access rates for the first time. Customers can then automate the monitoring of those workloads with the goal of detecting cyber events within their mainframe storage infrastructure.
Check out https://infohub.delltechnologies.com/ for more information about zCID and Dell’s PowerMax mainframe solutions.
Author: Justin Bastin, Senior Principal Engineer
[1] Based on Dell's internal analysis comparing PowerMax 2500/8500 cyber detection for mainframe storage to mainstream mainframe competitors. August 2023.
Related Blog Posts
Important Updates in Dell’s Geographically Dispersed Disaster Restart (GDDR)
Tue, 30 Aug 2022 20:53:25 -0000
Dell Technologies created Geographically Dispersed Disaster Restart (GDDR) to provide mainframe customers a comprehensive business continuity automation product for their Dell PowerMax Storage and Disk Library for mainframe virtual tape environments. GDDR achieves this by reacting to events within your IT environment.
The three functions of automate, react, and monitor (ARM) combine to enable continuous operations across both planned and unplanned outages. GDDR is designed to perform planned data-center site-switch operations and to restart operations following disasters. These incidents can range from the loss of compute capacity or disk-array access, to the total loss of a single data center, or a regional disaster resulting in the loss of dual data centers. GDDR also provides automation to protect data from cyberattack in a separate physical vault array. For more information about GDDR, see the document GDDR (Geographically Dispersed Disaster Restart) for PowerMax 8000 and VMAX ALL FLASH 950F.
Dell’s GDDR 5.3 enhancements
GDDR 5.3 introduces an exciting new feature called Cyber Protection Automation (zCPA), which populates a separate physical cyber vault for your mainframe environment. zCPA automates cyber protection copy creation and preservation by using Dell’s Data Protector for z Systems (zDP), automating the creation and transmission of PowerMax snapshots to a physically separate cyber vault PowerMax array. This provides a protected copy of data that can be used for testing, recovery from a cyber event, or an analytical process to better understand the extent of damage caused by a cyberattack.
The transmission of data to the cyber vault leverages SRDF/Adaptive Copy. To take advantage of zCPA, customers need GDDR 5.3 with the zCPA PTF, Mainframe Enabler 8.5, and a PowerMax at code level 5978.711.711 or higher.
Unique benefits of GDDR zCPA types
zCPA supports air-gapped and non-air-gapped physical cyber vaults. Any site in a GDDR topology can be an attached cyber vault array managed by zCPA. To provide customers with choice, there are three methods for populating zCPA vault arrays. The three zCPA types are defined by different configuration and operational attributes that dictate how zCPA functions.
zCPA Type 1
- Type 1 is defined as an environment that has no airgap in connectivity between a data center and the cyber vault. Copying data to the cyber vault is initiated when a newly created zDP Snapset is detected. The cyber vault in Type 1 does not have to be a dedicated physical vault and could be another storage array within the production data center. This type is the default for zCPA.
zCPA Type 2
- Type 2 is an air-gapped environment between two storage environments. Copying data to the cyber vault is triggered by an SRDF link online operation. GDDR monitors SRDF link operations to know when SRDF connectivity to the vault has been established, and closes the link once the vault has been populated.
zCPA Type 3
- Type 3 is an environment that does not provide an airgap solution. Copying data to the cyber vault is triggered by the SCHEDULE or INTERVAL parameter in GDDR.
The airgap support between the production and vault site is optional.
For more information about GDDR’s zCPA with respect to cyber, see the white paper Dell PowerMax: Cyber Security for Mainframe Storage or contact us at mainframe@dell.com.
Resources
- Mainframe Enablers TimeFinder SnapVX and zDP 8.5 Product Guide
- Data Protector for z Systems (zDP) Essentials White Paper
- Dell PowerMax: Cyber Security for Mainframe Storage
- GDDR (Geographically Dispersed Disaster Restart) for PowerMax 8000 and VMAX ALL FLASH 950F
Author: Justin F. Bastin
Kubernetes on Z with PowerMax: Modern Software Running on Mainframe
Mon, 02 Oct 2023 13:21:45 -0000
Benefits of Kubernetes on System Z and LinuxOne
When I was a customer, I consistently evaluated how to grow the technical influence of the mainframe platform. When discussing the financials of the platform, I would evaluate the total cost of ownership (TCO) of various IT solutions and the value derived from them. When discussing existing technical pain points, I would evaluate technical solutions that might alleviate the issue.
For example, when challenged with finding a solution for a client organization aiming to refresh various x86 servers, I searched online presentations, YouTube videos, and technical websites for a spark. The client organization had already identified the pain point. The hard part was how.
Over time, I found the ability to run Linux on a mainframe (called Linux on Z), using an Integrated Facility for Linux (IFL) engine. Once the idea was formed, I started baking the cake. I created a proof-of-concept environment installing Linux and a couple of applications and began testing.
The light-bulb moment came not in resolving the original pain point, but in discovering new opportunities I had not originally thought of. More specifically:
- Physical server consolidation – I’ll create a plethora of virtual servers when needed
- License Consolidation – Certain x86 applications were licensed on a per-engine basis: a quad-core x86 server might need four application licenses to function, whereas I needed only one license for my Linux on Z environment (at the time of testing)
- Scalability – I could scale horizontally by adding more virtual machines and vertically by increasing the network ports accessible to the server and adding more memory/storage
- Reliability – Mainframe technology has been known to be reliable, utilizing fault tolerant mechanisms within the software and hardware to continue business operations
With the 2023 addition of Kubernetes on LinuxOne (a mainframe that only runs Linux), you can scale, reduce TCO, and build the hybrid cloud your IT management requires. With Kubernetes providing container orchestration independent of the underlying hardware and architecture, you can leverage the benefits of LinuxOne to deploy your applications in a structured fashion.
Benefits when deploying Kubernetes to Linux on Z may include:
- Enablement of DevOps processes
- Container Scalability – using one LinuxOne box with hundreds (if not thousands) of containers
- Hybrid Cloud Strategy – where LinuxOne is servicing various internal business organizations with their compute and storage needs
With Dell providing storage to mainframe environments with PowerMax 8500/2500, a Container Storage Interface (CSI) was created to simplify your experience with allocating storage to Kubernetes environments when using Linux on Z with Kubernetes.
The remaining content will focus on the CSI for PowerMax. Continue reading to explore what’s possible.
Deploy Kubernetes
Linux on IBM Z runs on s390x architecture. This means that all the software we use needs to be compiled with that architecture in mind.
Luckily, Kubernetes, the CSI sidecars, and the Dell CSI drivers are built in Golang. Since the early days of Go, portability across operating systems and architectures has been one of the project’s goals. You can list the OS/architecture pairs supported by your Go version with the command:
go tool dist list
The easiest and most straightforward way of trying Kubernetes on LinuxOne is by using the k3s distro. It installs with the following one-liner:
curl -sfL https://get.k3s.io | sh -
Build Dell CSI driver
The Dell CSI Driver for PowerMax is composed of a container that runs all actions against Unisphere and mounts LUNs to pods, together with a set of official CSI sidecars that interact with Kubernetes calls.
The official Kubernetes sidecars are published for multiple architectures, including s390x, while Dell publishes images only for x86_64.
To build the driver, we will first build the binary and then the image.
Binary
First, let’s clone the driver from https://github.com/dell/csi-powermax into your GOPATH. To build the driver, go into the directory and execute:
CGO_ENABLED=0 GOOS=linux GOARCH=s390x GO111MODULE=on go build
At the end of the build, you will have a single binary with statically linked libraries, compiled for s390x:
file csi-powermax
csi-powermax: ELF 64-bit MSB executable, IBM S/390, version 1 (SYSV), statically linked, Go BuildID=…, with debug_info, not stripped
Container
The distributed driver uses a minimal Red Hat Universal Base Image (UBI). There is no s390x-compatible UBI image, so we need to rebuild the container image from a Fedora base image.
The following is the Dockerfile:
# Dockerfile to build PowerMax CSI Driver
FROM docker.io/fedora:37
# Install dependencies, followed by cleaning the cache
RUN yum install -y \
util-linux \
e2fsprogs \
which \
xfsprogs \
device-mapper-multipath \
&& \
yum clean all \
&& \
rm -rf /var/cache/yum
# validate some cli utilities are found
RUN which mkfs.ext4
RUN which mkfs.xfs
COPY "csi-powermax" .
COPY "csi-powermax.sh" .
ENTRYPOINT ["/csi-powermax.sh"]
We can now build our container image with the help of docker buildx, which makes building cross-architecture a breeze:
docker buildx build -o type=registry -t coulof/csi-powermax:v2.8.0 --platform=linux/s390x -f Dockerfile.s390x .
The last step is to change the image in the helm chart to point to the new one: https://github.com/dell/helm-charts/blob/main/charts/csi-powermax/values.yaml
Et voilà! Everything else is the same as with a regular CSI driver.
Wrap-up, limitations, and disclaimer
Thanks to the open-source model of Kubernetes and Dell CSM, it’s easy to build and utilize them for many different architectures.
The CSI driver for PowerMax supports FBA devices over Fibre Channel and iSCSI. There is no support for CKD devices, which would require code changes.
The CSI driver for PowerMax accepts only CSI-compliant calls.
Note: Dell officially supports (through GitHub tickets, Service Requests, and Slack) the published image and binary, but not this custom build.
Useful links
Stay informed of the latest updates to the Dell CSM ecosystem.
Authors: Justin Bastin & Florian Coulombel