Blogs

Blogs on various topics exploring Dell Technologies APEX

APEX Private Cloud SUSE Rancher Terraform

Using Terraform to Deploy SUSE Rancher in an APEX Private Cloud Environment

Juan Carlos Reyes

Sat, 28 Jan 2023 23:44:51 -0000

Automating deployments and managing hardware through code is a beautiful thing. Not only does it free up time, it also enables environment standardization. Infrastructure as Code (IaC) manages and provisions infrastructure through machine-readable definition files rather than manual processes.

In this blog, I demonstrate how to use HashiCorp’s Terraform, a popular open-source Infrastructure-as-Code software tool, in an APEX Private Cloud environment to deploy a fully functional SUSE Rancher cluster. By doing so, infrastructure engineers can set up environments for developers in a short time. All of this is accomplished by leveraging vSphere, Rancher, and standard Terraform providers.

Note: The PhoenixNAP GitHub served as the basis for this deployment.

 

Pre-requisites

In this blog, we assume that the following steps have been completed:

  1. Network – This is a three-node RKE2 cluster and an optional HAProxy load balancer. Assign three IPs and DNS names for the RKE2 nodes and the same for the single load balancer.
  2. Virtual Machine Gold Image – This virtual machine template will be the basis for the RKE2 nodes and the load balancer. To create a SLES 15 SP4 template with the required add-ons, see the blog Using HashiCorp Packer in Dell APEX Private Cloud.
  3. Admin Account – Have a valid vCenter account with enough permissions to create and provision components.
  4. Here is the GitHub repo with all the files and templates to follow along. Using Terraform, here are the files used to provision Rancher:
    • Main.tf file – Contains the providers and resources that create the vSphere infrastructure and the provisioners that run the deployment commands. It uses the secrets, tokens, and certificates defined in the Variables.tf file. This is also where the Rancher integration is outlined.
    • Versions.tf – Specifies the versions of Terraform and of the Rancher and vSphere providers that are required to run the code without syntax or fatal errors.
    • Variables.tf – This file can be used for defining defaults for certain variables. It includes CPU, memory, and vCenter information.
    • Templates Folder – During the RKE2 clustering, we require a configuration YAML file that contains the RKE2 token. This folder stores the Terraform templates that are used to create the RKE2 configuration files. These files contain Subject Alternative Name (SAN) information and the secret required for subsequent nodes to join the cluster. Keeping the configuration in template form is one way to obfuscate it, which makes it safer to upload the code to a GitHub repo.
    • HAProxy Folder – This folder contains the Privacy Enhanced Mail (PEM) certificate file, the key, and the configuration file for the HAProxy load balancer.
    • Files folder – The configuration files are stored here after being created from the templates. You will also find the scripts to deploy RKE2 and Rancher here.

Creating the HAProxy Node

The first virtual machine defined in the Main.tf file is the HAProxy load balancer. The “vsphere_virtual_machine” resource has a standard configuration, such as assigning memory, CPU, network, and so on. The critical part is when we start provisioning files onto the virtual machine created from the template. We use file provisioners to add the HAProxy configuration, certificate, and key files to the virtual machine.

Note: HashiCorp recommends using provisioners as a last-resort option. The reason is that they do not track the state of the object that they are modifying, and they require credentials that can be exposed if not appropriately handled.

I used the following command to create a valid self-signed certificate in SLES 15. The name of the PEM file must be “cacerts.pem” because Rancher requires that name to propagate the certificate appropriately.

openssl req -newkey rsa:2048 -nodes -keyout certificate.key -x509 -days 365 -out cacerts.pem -addext "subjectAltName = DNS:rancher.your.domain"
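Before handing the certificate to HAProxy and Rancher, it can be worth confirming that the SAN entry actually made it into the file. The following is a minimal sketch, not part of the original repo: the helper name cert_has_san is hypothetical, and it only assumes that the openssl CLI is installed.

```shell
# Hypothetical helper (not from the original repo): succeeds if the text dump
# of the certificate in $1 lists the DNS name $2, which openssl prints as
# "DNS:<name>" under the Subject Alternative Name extension.
cert_has_san() {
  cert_file=$1
  dns_name=$2
  openssl x509 -in "$cert_file" -noout -text | grep -qF "DNS:$dns_name"
}

# Example:
#   cert_has_san cacerts.pem rancher.your.domain && echo "SAN present"
```

The check is a simple substring match on the text dump, so it is meant as a quick sanity test rather than a full certificate validation.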

Next, we use a remote execution provisioner that outlines the commands to install and configure HAProxy in the virtual machine:

    inline = [
      "sudo zypper addrepo https://download.opensuse.org/repositories/server:http/SLE_15/server:http.repo",
      "sudo zypper --gpg-auto-import-keys ref",
      "sudo zypper install -y haproxy",
      "sudo mv /tmp/haproxy.cfg /etc/haproxy",
      "sudo mv /tmp/certificate.pem /etc/ssl/",
      "sudo mv /tmp/certificate.pem.key /etc/ssl/",
      "sudo mkdir -p /run/haproxy/",
      "sudo systemctl enable haproxy",
      "sudo systemctl start haproxy"
    ]

We add a standard openSUSE repo with access to the HAProxy binaries that are compatible with SLES 15. Next, the commands install HAProxy and move the critical files to the correct locations. The last two systemctl commands enable and start the HAProxy service.

Here is the sample HAProxy configuration file:

global
        log /dev/log    daemon
        log /var/log    local0
        log /var/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
 
        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private
        maxconn 1024
        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
        ssl-default-bind-options ssl-min-ver TLSv1.2 prefer-client-ciphers
        tune.ssl.default-dh-param 2048
        cpu-map  1 1
        cpu-map  2 2
        cpu-map  3 3
        cpu-map  4 4
 
defaults
        log     global
        mode    http
        option  httplog
        option   forwardfor
        option  dontlognull
        timeout connect 50000s
        timeout client  50000s
        timeout server  50000s
        retries 4
        maxconn 2000000
 
frontend www-http
        mode http
        stats enable
        stats uri /haproxy?stats
        bind *:80
        http-request set-header X-Forwarded-Proto http
        option http-server-close
        option forwardfor except 127.0.0.1
        option forwardfor header X-Real-IP
        # MODIFY host
        acl host_rancher hdr(host) -i rancher.apca1.apextme.dell.com
        acl is_websocket hdr(Upgrade) -i WebSocket
        acl is_websocket hdr_beg(Host) -i wss
        use_backend rancher if host_rancher
 
frontend www-https
        bind *:443 ssl crt /etc/ssl/certificate.pem alpn h2,http/1.1
        option http-server-close
        http-request set-header X-Forwarded-Proto https if { ssl_fc }
        redirect scheme https code 301 if !{ ssl_fc }
        option forwardfor except 127.0.0.1
        option forwardfor header X-Real-IP
        # MODIFY host
        acl host_rancher hdr(host) -i rancher.apca1.apextme.dell.com
        acl is_websocket hdr(Upgrade) -i WebSocket
        acl is_websocket hdr_beg(Host) -i wss
        use_backend rancher if host_rancher
 
frontend kubernetes
        # MODIFY IP
        bind 100.80.28.72:6443
        option tcplog
        mode tcp
        default_backend kubernetes-master-nodes
 
frontend supervisor_FE
        # MODIFY IP
        bind 100.80.28.72:9345
        option tcplog
        mode tcp
        default_backend supervisor_BE
 
backend rancher
        redirect scheme https code 301 if !{ ssl_fc }
        mode http
        balance roundrobin
        option httpchk HEAD /healthz HTTP/1.0
        # MODIFY IPs
        server rke-dev-01 100.80.28.73:80 check
        server rke-dev-02 100.80.28.74:80 check
        server rke-dev-03 100.80.28.75:80 check
 
backend kubernetes-master-nodes
        mode tcp
        balance roundrobin
        option tcp-check
        # MODIFY IPs
        server rke-dev-01 100.80.28.73:6443 check
        server rke-dev-02 100.80.28.74:6443 check
        server rke-dev-03 100.80.28.75:6443 check
 
backend supervisor_BE
        mode tcp
        balance roundrobin
        option tcp-check
        # MODIFY IPs
        server rke-dev-01 100.80.28.73:9345 check
        server rke-dev-02 100.80.28.74:9345 check
        server rke-dev-03 100.80.28.75:9345 check

To check the configuration file for errors, you can execute the following command:

haproxy -f /path/to/haproxy.cfg -c

Another helpful troubleshooting tip for HAProxy is to inspect the status page for more information about connections to the load balancer. This is defined in the configuration file as stats uri /haproxy?stats. Use a browser to navigate to the page http://serverip/haproxy?stats.
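The same statistics can be checked from a script, because HAProxy can export the stats page as CSV when ;csv is appended to the stats URI. The sketch below is an illustration only: the function name haproxy_down_servers is hypothetical, and it locates the status column by its header name rather than assuming a fixed position.

```shell
# Hypothetical helper: reads HAProxy stats CSV on stdin and prints every
# server line whose status does not start with "UP" (for example DOWN or MAINT).
# FRONTEND and BACKEND summary rows and unchecked servers are skipped.
haproxy_down_servers() {
  awk -F, '
    NR == 1 {
      sub(/^# /, "", $1)                       # header line starts with "# "
      for (i = 1; i <= NF; i++) if ($i == "status") col = i
      next
    }
    col && $2 != "FRONTEND" && $2 != "BACKEND" &&
    $col != "" && $col != "no check" && $col !~ /^UP/ {
      print $1 "/" $2 ": " $col
    }
  '
}

# Example (serverip is a placeholder, as in the text above):
#   curl -s "http://serverip/haproxy?stats;csv" | haproxy_down_servers
```

An empty result means that every checked server reported an UP state at the time of the request.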

After HAProxy starts successfully, the script deploys the RKE2 nodes. Again, the initial infrastructure configuration is standard. Let’s take a closer look at the files config.yaml and script.sh that are used to configure RKE2. The script.sh file contains the commands that download and start the RKE2 service on the node. The script.sh file is copied to the virtual machine via the file provisioner and made executable in the remote-exec provisioner. In a separate file provisioner block, the config.yaml file is moved to a newly created rke2 folder, the default location where the rke2 service looks for such a file.

Here is a look at the script.sh file:

sudo curl -sfL https://get.rke2.io | sudo sh -
sudo systemctl enable rke2-server.service
n=0
until [ "$n" -ge 5 ]
do
   sudo systemctl start rke2-server.service && break  # stop retrying once the service starts
   n=$((n+1))
   sleep 60
done

Notice that the start service command is in a retry loop (up to five attempts, 60 seconds apart) to ensure that the service is running before moving on to the next node.
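The same retry pattern can be factored into a small reusable function. This is a sketch under assumptions rather than the script from the repo: the function name retry and its argument order (attempts, delay in seconds, then the command) are made up for illustration.

```shell
# Hypothetical helper: run a command until it succeeds, up to a maximum
# number of attempts, sleeping between failed attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  n=0
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$max_attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
  done
}

# Example, mirroring the loop in script.sh:
#   retry 5 60 sudo systemctl start rke2-server.service
```

Unlike the inline loop, this version returns a non-zero status when all attempts fail, so a calling provisioner can stop instead of continuing with a node that never started.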

Next, we make sure to add the host information of the other nodes to the hosts file of the current virtual machine.
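One way to add these entries without creating duplicates on repeated runs is sketched below. The function name add_host_entry is hypothetical, and the example addresses and names are the sample values from the HAProxy configuration above; the actual repo may handle this step differently.

```shell
# Hypothetical helper: append "<ip> <name>" to a hosts file unless a line
# already lists that exact host name.
# Usage: add_host_entry <hosts_file> <ip> <name>
add_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  # awk exits 0 only if some line already carries the name as a host field.
  if ! awk -v n="$name" '
         { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
         END { exit !found }' "$hosts_file"; then
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

# Example (writing to /etc/hosts needs root privileges):
#   add_host_entry /etc/hosts 100.80.28.74 rke-dev-02
#   add_host_entry /etc/hosts 100.80.28.75 rke-dev-03
```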

The subsequent two nodes follow the same sequence of events but use the config_server.yaml file, which contains the first node’s API address. The final node has an additional step: using the rancher_install.sh file to install Rancher on the cluster.
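For orientation, the configuration of a joining RKE2 node typically carries the registration address of the first server and the shared token. The sketch below prints such a file. The function name, the token value, and the example arguments are placeholders, and the real config_server.yaml in the repo may contain different or additional keys.

```shell
# Hypothetical helper: print a minimal RKE2 config for a node that joins an
# existing cluster. Port 9345 is the supervisor port that the HAProxy
# configuration above forwards.
# Usage: render_join_config <first_server_address> <token> <tls_san>
render_join_config() {
  server_addr=$1
  token=$2
  san=$3
  cat <<EOF
server: https://${server_addr}:9345
token: ${token}
tls-san:
  - ${san}
EOF
}

# Example (placeholder values; RKE2 reads /etc/rancher/rke2/config.yaml by default):
#   render_join_config 100.80.28.73 example-token rancher.your.domain > config.yaml
```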

Here is a look at the rancher_install.sh file:

echo "Create ~/.kube"
mkdir -p /root/.kube
echo "Grab kubeconfig"
while [ ! -f /etc/rancher/rke2/rke2.yaml ]
do
  echo "waiting for kubeconfig"
  sleep 2
done
echo "Put kubeconfig to /root/.kube/config"
cp -a /etc/rancher/rke2/rke2.yaml /root/.kube/config
echo "Wait for nodes to come online."
i=0
echo "i have $i nodes"
while [ "$i" -le 2 ]
do
  # Count only whole-word "Ready"; a plain "grep Ready" would also match NotReady.
  i=$(/var/lib/rancher/rke2/bin/kubectl get nodes | grep -cw Ready)
  echo "I have: $i nodes"
  sleep 2s
done
echo "Wait for complete deployment of node three, 60 seconds."
sleep 60
echo "Install helm 3"
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
echo "Modify ingress controller to use-forwarded-headers."
cat << EOF > /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx-config.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      config:
        use-forwarded-headers: "true"
EOF
echo "Install stable Rancher chart"
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
/var/lib/rancher/rke2/bin/kubectl create namespace cattle-system
/var/lib/rancher/rke2/bin/kubectl -n cattle-system create secret generic tls-ca --from-file=cacerts.pem=/tmp/cacerts.pem
# Modify hostname and bootstrap password if needed
helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.your.domain \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=secret \
  --set tls=external \
  --set additionalTrustedCAs=true \
  --set privateCA=true
 
/var/lib/rancher/rke2/bin/kubectl -n cattle-system create secret generic tls-ca-additional --from-file=ca-additional.pem=/tmp/cacerts.pem
echo "Wait for Rancher deployment rollout."
/var/lib/rancher/rke2/bin/kubectl -n cattle-system rollout status deploy/rancher

Before the installation begins, there is a step that waits for all the RKE2 nodes to be ready. This rancher_install.sh script follows the installation steps from the Rancher website. For this example, we are using an external load balancer. We therefore modified the ingress controller to set use-forwarded-headers, as stated in the Rancher documentation. The other key parts of this script are the default bootstrap password and the TLS/CA flags assigned in the helm command. For the Rancher provider to change the administrator password successfully, the bootstrap password must be the same as the password used by the provider. The TLS and CA flags let the pods know that a self-signed certificate is being used and not to create additional internal certificates.
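One subtlety in the node wait step is that a substring match on Ready also matches nodes reported as NotReady. A stricter count can be sketched as follows; the function name count_ready_nodes is hypothetical, and it assumes the usual kubectl get nodes layout in which the second column is the status.

```shell
# Hypothetical helper: read "kubectl get nodes" output on stdin and print how
# many nodes have Ready as the first entry of their STATUS column.
# "NotReady" is not counted; "Ready,SchedulingDisabled" is.
count_ready_nodes() {
  awk '{ split($2, s, ","); if (s[1] == "Ready") n++ } END { print n + 0 }'
}

# Example:
#   /var/lib/rancher/rke2/bin/kubectl get nodes --no-headers | count_ready_nodes
```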

Note: The wait timers are critical for this deployment because they allow the cluster to be fully available before moving to the next step. Shorter wait times can cause processes to hang and leave steps incomplete.

Navigate to the working directory, then use the following command to initialize Terraform:

terraform init

This command downloads and installs the appropriate versions of the project’s providers so that they are available for the following steps.

Next, execute the ‘plan’ and ‘apply’ commands.

terraform plan

terraform apply -auto-approve
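For unattended runs, a common variation is to save the plan to a file and apply exactly that plan. The wrapper below is a sketch, not part of the original project: the function name deploy, the TERRAFORM override variable, and the plan file name tfplan are assumptions chosen for illustration.

```shell
# Hypothetical wrapper: initialize, validate, plan to a file, then apply that
# saved plan. TERRAFORM can be overridden, for example for a dry run.
TERRAFORM=${TERRAFORM:-terraform}

deploy() {
  "$TERRAFORM" init -input=false &&
  "$TERRAFORM" validate &&
  "$TERRAFORM" plan -input=false -out=tfplan &&
  "$TERRAFORM" apply -input=false tfplan
}

# Example: run from the working directory that contains Main.tf
#   deploy
```

Applying a saved plan does not prompt for approval, and it guarantees that what is applied is what was shown by the plan step.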

 

The deployment takes about 15 minutes. After a successful deployment, users can log in to Rancher and deploy downstream clusters (which can also be deployed using Terraform). This project also has a non-HAProxy version if users are interested in that deployment. The main difference is setting up manual round-robin load balancing within your DNS provider.

With this example, we have demonstrated how engineers can use Terraform to set up a SUSE Rancher environment quickly for their developers within Dell APEX Private Cloud.

Author: Juan Carlos Reyes

APEX APEX Private Cloud HashiCorp Packer

Using HashiCorp Packer in Dell APEX Private Cloud

Juan Carlos Reyes

Wed, 01 Mar 2023 16:42:16 -0000

Throughout the lifespan of a project, infrastructure engineers can spend hours redeploying virtual machines with the proper configurations, such as for a VDI project in which each team needs a specific configuration. The infrastructure team might have a base image template and customize it for each group. That means booting the base virtual machine, configuring it, and saving it as a newly configured template. Rinse and repeat for each team’s requirements, and it can mean a lot of manual work.

HashiCorp Packer can help create all these virtual machine templates with less manual work. Virtual machine templates help infrastructure administrators standardize their offerings and speed delivery. There are multiple ways to create a template, from manually to a full CI/CD pipeline. In this blog, we will create a SLES 15 SP4 golden image for a Dell APEX Private Cloud environment using HashiCorp Packer. Packer is an open-source tool for creating templates for virtual machines, servers, or hard disk drives (golden images). We’ll design this template to work with Dell APEX Data Storage Services, SUSE RKE2, and Rancher. Within Rancher, we can use this template to deploy downstream clusters using vSphere as the provider.

There are a few prerequisites. We’ll need:

  • To install Packer on our workstation
  • A SLES 15 SP4 image
  • A SLES15 license
  • A DHCP network configured in the Dell APEX Private Cloud environment

Plenty of GitHub repositories have Packer templates to get users started. I forked the David-VTUK repo and used it as my starting point. This SLES 15 template works great for an RKE deployment. However, the same template did not work for an RKE2 deployment, possibly because some packages included in (or excluded from) this image conflict with RKE2.

I started by manually creating a virtual machine with SLES 15 SP4 installed, then verified that RKE2 could be installed. (This document has many steps I followed to configure this image.) The only added extension in the configuration was the public cloud module.

After installing SLES 15 SP4, I booted up the virtual machine and installed the cloud-init and open-iscsi services as root.

sudo -i
zypper addrepo https://download.opensuse.org/repositories/Cloud:Tools/SLE_15_SP4/Cloud:Tools.repo
zypper refresh
zypper install cloud-init
zypper install open-iscsi

Cloud-init is required to automate Rancher downstream RKE2 clusters. Open-iscsi allows the image to use Dell APEX Data Storage Services (ADSS) for external storage. In a SUSE Rancher cluster, it is useful for persistent volumes. Make sure to disable the AppArmor and Firewalld services.
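Disabling the two services can be done with systemctl. The helper below is a minimal sketch: the function name disable_services is hypothetical, the unit names apparmor and firewalld are assumed to match the services installed in the image, and it is meant to be run as root like the commands above.

```shell
# Hypothetical helper: stop each listed service now and keep it from starting
# at boot. "disable --now" combines "systemctl stop" and "systemctl disable".
disable_services() {
  for svc in "$@"; do
    systemctl disable --now "$svc"
  done
}

# Example:
#   disable_services apparmor firewalld
```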

Note: The open-iscsi package can be installed as part of the cloud-config setup later on, so we can skip this part.

With the manual image appropriately configured, I used AutoYaST to create a new autoinst.xml file. I compared this new file with the autoinst.xml file from the original repo and added the new packages and services that were found in the new autoinst.xml file to the original file.

Now, with an autoinst.xml file compatible with an RKE2 cluster, I ran the Packer script from the GitHub repo to create the new virtual machine template. Don’t forget to modify the variables.json file with the Dell APEX Private Cloud credentials and add the SLES 15 SP4 image to the working directory. The entire creation process takes about 10 minutes to complete.

There you have it, short and sweet. Packer with Dell APEX Private Cloud services is simple, and you can use existing vSphere Packer scripts within APEX Private Cloud.

Note: SUSE recently made available a SLES 15 image with cloud-init enabled. With this new image, creating a template is easier because it only requires spinning up a VM and creating a template inside vCenter. However, Packer makes life easier with specific VM configurations and software installation.

Author: Juan Carlos Reyes, Senior Engineer

VMware VxRail APEX APEX Private Cloud

Day 1 – Deploying an APEX Private Cloud Subscription

David O'Dell

Fri, 04 Nov 2022 19:03:21 -0000

Ordering and deploying new physical infrastructure for a business-critical application is often challenging. 

This series of blogs reveals the Dell differences that simplify the complex task of infrastructure deployment[1], specifically the processes of fulfillment, configuration, and workload creation. These steps are typically referred to as Day 0, Day 1, and Day 2, respectively. Each blog in this series will show how an APEX subscription can remove complexity and achieve quicker time to value and operational efficiency. (In this blog series, we assume that the application being built requires the compute resources of a 4-node general purpose APEX Private Cloud Dell-integrated rack, ordered through the APEX console with typical network and power requirements.)

Before we dive in, let’s review briefly what has happened so far during the fulfillment stage after an order for a subscription is submitted. To get to this point, the APEX backend business coordination team has been orchestrating the entire fulfillment process, including people, parts, and procedures. The Dell Project Manager and Planning & Optimization Manager have been in frequent contact with the customer, assisting them with configuration and site review. Dell team members support the customer through the Enterprise Project Services portal: a planning and approval tool that allows customer visibility throughout the deployment process, from setting up the host network, to verifying and validating the new hardware. During planning, the Dell Customer Success Manager meets the customer and becomes the customer’s main point of contact for Day 2 operations and afterward.  

Delivery day begins when Dell’s preferred shipping partner carefully escorts the rack from the customer loading area to the tile where it needs to be installed inside the customer’s data center. While the rack is being shipped and installed, the Dell Project Manager assigns and coordinates with an on-site professional services technician. 

Day 1

Day 1 starts when the professional services technician arrives at the customer site, ready to configure the rack with the agreed-upon options. The technician first inspects the rack inside and out, making sure that the wiring is secure and that there are no electrical or physical hazards. The technician then guides the customer or the customer’s electrician to plug the PDUs into datacenter power and to power up the rack. The technician also plugs the customer-provided network uplink cables into the APEX switches. When power and networking are connected, the technician verifies that all systems are in compliance and validated for the next steps.

The technician then configures the APEX switches and works with the customer to get the switches communicating on the customer’s core network, according to the specifications previously agreed upon during the planning meetings. Each APEX Private Cloud rack is pre-wired for 24 compute nodes, regardless of the number of nodes in a subscription. This forward-thinking feature is yet another Dell difference that simplifies rapid expansion. (When the need for an expansion arises, the customer can contact their CSM directly to expedite the order process. Both Dell-integrated and customer-provided rack options come with on-site configuration by a professional services technician.)

After the technician performs network health checks, the technician initiates the cluster build. Upon verification and validation of the APEX compute nodes, the technician installs the latest VxRail Manager and vCenter on each, to tie all nodes together into a robust, highly available cluster. 

With APEX VxRail compute nodes, customers get a broad range of benefits from the only hardware platform that is co-engineered with VMware. VxRail is a highly-trusted platform with thousands of tested and validated firmware and hypervisor installations around the globe. Each node hosts an instance of the integrated Dell Remote Access Controller (iDRAC) with the latest security and update features. Built-in automations include hot node addition and removal, capacity expansions, and graceful cluster shutdown.  

An APEX subscription also includes Dell’s Secure Connect Gateway appliance, which proactively monitors compute nodes and switches. If an anomaly is detected, the appliance gathers logs and issues a support ticket, reducing the time it takes to resolve problems if they arise.  

VMware vCenter on VxRail, included with each APEX Private Cloud subscription, comes equipped with Dell integrations such as firmware, driver, and power alerts, and an intuitive physical view to help resolve any hardware issues simply and quickly. Dell is the customer’s single point of contact for help with our streamlined Pro Deploy Plus service and Pro Support Plus with Mission Critical Support - all included in the customer’s APEX Private Cloud subscription.  

After the latest versions of VxRail Manager and vCenter are installed, the technician brings up the vCenter interface at an IP address, in accordance with the customer’s network requests. If additional help is needed after the technician is gone, customers can ask support to review and help guide updates twice a year at no additional cost.

While the underlying hardware is essential and a major differentiator when comparing Dell to the rest of the market, the spirit behind APEX is to provide the best possible outcome for the customer by removing the complexity when deploying a rack-scale solution. To achieve this goal, the APEX Console simplifies the planning process with a wide variety of subscription choices with preconfigured compute, memory, and GPU accelerators. This means that the customer can easily select the number and type of instances they need, or use the Cloud Sizer for assistance to match their workload needs to the available subscription options. The customer can use the APEX Console to contact support directly, manage console users, and assign roles and permissions to those with console access to facilitate the entire lifecycle of their subscription.  

Licensing

After vCenter is up and running, the technician installs enterprise licenses for both vCenter and vSAN. APEX is flexible enough that the customer can also bring their own licenses for a potential discount on their subscription. If this is the case, during the planning phase and prior to the subscription order, VMware will review the licenses to eliminate any lapses during the APEX subscription term.  

All APEX Private Cloud subscriptions include 60-day full-feature trial licenses for VMware Tanzu and NSX Advanced Load Balancer. After licenses are installed and all software stacks are running successfully, the Dell technician securely hands the usernames and passwords to the customer and requests that they change the passwords.  

Additional Services

The technician is also available to configure additional services such as a stretched cluster within the rack, deduplication and compression, and in-flight or at-rest encryption. The technician can also help stretch a cluster across racks and configure fault domains. Although these additional services and costs need to be declared and agreed to during the planning phase, they are well within the capabilities of Dell professional services.

When all the customer-requested services are up and running, the technician updates the EPS portal to conclude their tasks and to offer any notes and feedback on the process.

At this point the customer’s subscription is activated! Customers can now move into Day 2 operations and start using new resources for various business workloads.

Author: David O’Dell, Senior Principal Tech Marketing Engineer - APEX

[1] Deployment time is measured between order acceptance and activation. The 28-day deployment applies to single rack deployments of select APEX Cloud services pre-configured solutions and does not include customizations to the standard configuration.

data protection APEX APEX Backup Services ransomware

Say No to Costly Ransom Demands with APEX Backup Services

Vinod Kumar Kumaresan

Tue, 14 Jun 2022 15:27:47 -0000

Ransomware attacks are constantly evolving. Whatever you build today will be obsolete sooner than you might expect because multiple groups are constantly releasing new ransomware packages. You need to be able to evolve with them.

Ransomware is a relentless threat to every enterprise, and the attacks are becoming more frequent, advanced, and expensive. Ransomware is a form of malicious software, or malware, that, when downloaded to a computer or server, encrypts the victim’s data so that threat actors can demand a “ransom” in exchange for the decryption key needed to unlock the data. Even if the ransom is paid, there is no guarantee that the attacker will provide the decryption key.

Ransomware attacks today play out across the entire IT infrastructure, targeting all forms of data and requiring a more robust response. In addition to solutions for preventing attacks, organizations need response and data recovery plans that can quickly bring information back online.

How does ransomware work?

Infection and attack vectors

There are several vectors that ransomware can take to access the target system. One of the most common delivery systems is phishing spam: attachments or links that come to the victim in an email. When the victim falls for the phish, the ransomware is downloaded and executed on the target system, where it starts encrypting the data.

Encrypt data

When ransomware gains access to the system, it begins to encrypt the files. It encrypts the files with an attacker-controlled key and replaces the original files with the encrypted versions.

Ransom demand

When the file encryption is complete, the victim is presented with a message explaining that the files are now inaccessible and will only be decrypted if the victim sends the requested payment to the attacker through an untraceable medium.

Defensive steps to prevent ransomware

Some of the steps that can be taken to prevent ransomware infection include:

  • Ensure that the operating system is patched regularly and up to date
  • Install antivirus software that detects malicious programs such as ransomware
  • Do not install unauthorized software
  • Do not open email attachments automatically or trust email attachments with macros
  • And of course, frequently back up important data externally

In today’s diverse and distributed IT environment, restoring your organization’s applications and data quickly in the event of a ransomware attack is a significant challenge. Reliable backup and recovery are a crucial line of defense against ransomware.

Dell Technologies APEX Backup Services ransomware recovery

Dell Technologies APEX Backup Services ransomware recovery is a fast, reliable data recovery solution that eliminates any reason to even think of paying a ransom. APEX Backup Services is based on a secure and robust cloud architecture that can help you protect your business assets, limit the impact of ransomware, and accelerate recovery. Data protected in APEX Backup Services cannot be modified or deleted by ransomware. APEX Backup Services can ensure that your backup data is safe, help you operationalize security across your backup and primary environments, and accelerate the recovery process so you can get back to normal faster.

APEX Backup Services empowers security operations and IT teams to protect, detect, respond, and recover faster from external or internal attacks, ransomware, and accidental or malicious data deletion.

 

How does APEX Backup Services protect against ransomware?

Zero-trust security and immutable backups

Prevent infection of backups with air-gapped, immutable data and zero-trust security including MFA (multi-factor authentication), single sign-on (SSO) and role-based access control (RBAC).

APEX Backup Services was designed around a zero-trust security architecture and offers rich multi-layered defense features including MFA. Built natively on AWS’s security framework, APEX Backup Services also inherits the global security, compliance, and data residency controls, thus adhering to the highest standards for privacy and data security.

APEX Backup Services provides RBAC. It is strongly recommended to ensure that only a small group of administrators can perform destructive actions such as deleting backup data.

APEX Backup Services offers immutable protection in which backup data is isolated from the customer’s network and protected within the APEX Backup Services platform.

Another key to security is encryption for data, both in flight and at rest. APEX Backup Services provides a secure, multi-tenant environment for customer data. APEX Backup Services encrypts the data using envelope encryption technology, making it impossible for anyone other than the customer to access the data.

Unusual data activity monitoring and user access insights

Identify anomalous data sets and activity to understand which data may be affected with unusual data activity and access monitoring.

Suspicious modification of data on a resource is called Unusual Data Activity (UDA). A user or malicious software can make such changes. The UDA feature with APEX Backup Services provides continuous monitoring of your backup data. With UDA and access monitoring, you can identify anomalous data sets and activity to understand which data might be affected. Our proprietary entropy-based algorithm uses machine learning (ML) to understand norms for your specific backup environment and provides automated alerts for unusual data activity, including bulk deletion and encryption.
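To give a feel for why entropy is a useful signal here, the sketch below computes the Shannon entropy of a file's bytes. Encrypted data tends to score close to the maximum of 8 bits per byte, while typical text scores much lower. This is a generic illustration only: it is not the APEX Backup Services algorithm, which is proprietary, and the function name byte_entropy is hypothetical.

```shell
# Hypothetical helper: print the Shannon entropy of a file in bits per byte
# (0.000 for a file of one repeated byte, up to 8.000 for uniformly random bytes).
byte_entropy() {
  od -An -v -tu1 "$1" | awk '
    { for (i = 1; i <= NF; i++) { count[$i]++; total++ } }
    END {
      if (total == 0) { print "0.000"; exit }
      for (b in count) {
        p = count[b] / total
        h -= p * log(p) / log(2)     # -sum(p * log2(p))
      }
      printf "%.3f\n", h
    }'
}

# Example: compare a document before and after a suspected change
#   byte_entropy report.docx
```

A sudden jump toward 8 bits per byte across many files in one backup cycle is the kind of pattern that an entropy-based detector could flag for review.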

APEX Backup Services offers “Security Events”, a dashboard that shows you upfront the count of all administrator login events, data access events, API requests, and unusual data activity alerts, and nudges you to take remedial actions if required. This data helps you gain situational awareness about the backed-up data by gathering events from all APEX Backup Services products.

The APEX Backup Services accelerated ransomware recovery module provides access insights and anomaly detection that help you quickly identify possible ransomware attacks. The APEX Backup Services dashboard provides a single pane of glass where you can see all access attempts and activity across all your data sources. You can use these insights to quickly identify affected snapshots during recovery.

Contain the spread of infection and orchestrate response using API-based integrations

Contain the spread of infection with APIs that automate ransomware playbooks, including quarantining affected resources and deleting infected snapshots.

The APEX Backup Services accelerated ransomware recovery enables you to quarantine infected snapshots on the impacted resources, which helps safeguard your system from further infection by barring users or administrators from downloading or restoring data to other resources.

The APEX Backup Services accelerated ransomware recovery module offers robust API integrations that make it easy to fit the solution into your overall security ecosystem. Orchestrating response activities using security information and event management (SIEM) and security orchestration, automation, and response (SOAR) solutions can dramatically reduce your mean time to respond (MTTR) by automatically completing actions, such as quarantining infected systems or snapshots, based on a predetermined ransomware playbook.

After you quarantine snapshots, access to the quarantined snapshots is blocked for the administrators and the users of that resource. Administrators and users cannot download data or restore data from the quarantined snapshots.

Automated recovery with curated snapshots

Automate the recovery of clean and complete data by scanning snapshots for malware before restoring them and by automatically finding the most recent clean version of each file.

The APEX Backup Services cloud platform backs up workloads directly to the cloud, so that they are ready for immediate recovery in the event of a ransomware attack. The accelerated ransomware recovery module enables you to recover with confidence by ensuring the hygiene of recovery data.

You can scan snapshots for malware and indicators of compromise (IOCs) using built-in antivirus detection or threat intelligence from your own forensic investigations or threat intelligence feeds. Scanning snapshots before recovery helps prevent reinfection.

Accelerated ransomware recovery also solves the problem of data loss due to point-in-time recovery. Flexible recovery options allow you to restore full backups or specific files from a previous point in time. Now you can automatically identify the most recent clean version of every file within a specified timeframe and consolidate those versions into a single “Golden Snapshot”. Eliminating the manual search and recovery process drastically reduces time to recovery and prevents data loss.

The Curated Recovery feature automatically finds the most recent clean version of every file and compiles it into a single curated snapshot.  
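
To make the selection logic concrete, here is a minimal sketch of choosing the newest clean version of each file within a time window. It is not the product's implementation: the `FileVersion` record, the `curate` function, and the assumption that each version already carries a scan verdict are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileVersion:
    path: str
    snapshot_time: int  # for example, a Unix timestamp of the snapshot
    clean: bool         # verdict of a malware/IOC scan (assumed to exist)


def curate(versions, start, end):
    """Pick, for each path, the newest clean version within [start, end]."""
    newest = {}
    for v in versions:
        if not v.clean or not (start <= v.snapshot_time <= end):
            continue
        best = newest.get(v.path)
        if best is None or v.snapshot_time > best.snapshot_time:
            newest[v.path] = v
    return newest


history = [
    FileVersion("docs/plan.txt", 100, True),
    FileVersion("docs/plan.txt", 200, True),
    FileVersion("docs/plan.txt", 300, False),  # infected
    FileVersion("db/data.bin", 100, True),
    FileVersion("db/data.bin", 200, False),    # infected
]

curated = curate(history, start=0, end=300)
for path, version in sorted(curated.items()):
    print(path, version.snapshot_time)
```

Each file is restored from a different point in time here, which is what distinguishes a curated snapshot from rolling everything back to the last fully clean snapshot.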

Paying the ransom is unlikely to resolve the problem or allow the victim to recover the encrypted data. It may even make the situation worse by confirming for the attacker that they have found a good target. By selecting the right ransomware recovery solution, you can ensure that your organization has a rock-solid multi-layer defense plan in place to reduce the impact of ransomware or malware.

With the APEX Backup Services accelerated ransomware recovery solution, you’ll be far less vulnerable to costly ransom demands and debilitating downtime. The APEX Backup Services cloud-based ransomware protection and accelerated ransomware recovery module prevent data loss, reduce costs, and accelerate ransomware attack response and recovery.

For more details about APEX Backup Services ransomware recovery, visit the dell.com/apex-backup-services website.

Author: Vinod Kumar Kumaresan, Principal Engineering Technologist, Data Protection Division

APEX APEX Private Cloud

Serverless Workload and APEX Private Cloud

Juan Carlos Reyes

Wed, 11 May 2022 17:45:02 -0000

What is a serverless service?

To begin answering this question, let’s build upon my previous blog in which I walked through how a developer can deploy a machine-learning workload on APEX Private Cloud Services. Now, I’ll expand on this workload example and demonstrate how to deploy it as a serverless service.

A serverless service is constructed in a serverless architecture, a development model that allows developers to build and run applications without managing the infrastructure. Combining serverless architecture and APEX Cloud Services can provide developers with a robust environment for their application development.

Knative Serving and Eventing

Knative is a popular open-source Kubernetes-based platform to deploy and manage modern serverless workloads. It consists of two main components: Serving and Eventing.

Knative Serving builds on Kubernetes and a network layer to support deploying and serving serverless applications/functions. Serving is easy to get started with, and it scales to support complex scenarios.

The Knative Serving project provides middleware components that enable:

  • Rapid deployment of serverless containers
  • Autoscaling, including scaling pods down to zero
  • Support for multiple networking layers such as Ambassador, Contour, Kourier, Gloo, and Istio for integration into existing environments
  • Point-in-time snapshots of deployed code and configurations

Knative Eventing enables developers to use an event-driven architecture with serverless applications. An event-driven architecture is based on the concept of decoupled relationships between event producers that create events and event consumers, or sinks, that receive events.
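
The decoupling described above can be illustrated with a toy, in-process sketch. This is not the Knative Eventing API; the `Broker` class and the event type names are hypothetical and exist only to show that a producer publishes without knowing which sinks, if any, will receive the event.

```python
from collections import defaultdict


class Broker:
    """Toy in-process broker that routes events to sinks by event type."""

    def __init__(self):
        self._sinks = defaultdict(list)

    def subscribe(self, event_type, sink):
        """Register a callable sink for one event type."""
        self._sinks[event_type].append(sink)

    def publish(self, event_type, payload):
        """Deliver the payload to every sink subscribed to this type."""
        # The producer only knows the broker, never the sinks.
        for sink in self._sinks[event_type]:
            sink(payload)


received = []
broker = Broker()
broker.subscribe("image.uploaded", received.append)
broker.subscribe("image.uploaded", lambda p: print("classify", p["name"]))

broker.publish("image.uploaded", {"name": "lunch.jpg"})
broker.publish("image.deleted", {"name": "old.jpg"})  # no sink subscribed
```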

Examples of event sources for applications include Slack, Zendesk, and VMware.

Deployment demo

Following the Knative installation instructions, I configured Knative in my cluster. Next, I configured real DNS in my environment.

I also installed the Knative CLI through Homebrew to make deploying Knative services easier. Using the kn CLI, I wrapped my Flask server in the serving framework. After a successful deployment, I used the following command to view the current Knative services:

kubectl get ksvc

You can see from the screenshots how the pods get created and destroyed as the service receives traffic.

 

Now, the serverless user interface can request predictions from my model.

KServe

My first attempt to wrap the TensorFlow service with Knative wasn't effective. The service dropped the opening requests, and the response times were slower. The spinning up and down of the pods was creating the delay and the execution drops. I fixed these issues by having a constant heartbeat so that the pods would stay active. Unfortunately, this workaround defeats some of the benefits of Knative. This was not the way for me to move forward.

In my quest to have the model in a serverless framework, I came across Kubeflow.

Kubeflow is a free and open-source machine-learning platform designed to use machine-learning pipelines to orchestrate complicated workflows running on Kubernetes.

Kubeflow integrates with Knative to deploy and train ML models. KServe is the part of Kubeflow used for serving machine-learning models on arbitrary frameworks. KServe recently graduated from the Kubeflow project, and you can configure it by itself without installing the whole Kubeflow suite.

Following the KServe installation guide, I configured it in my cluster.

 

Creating the YAML file for this service is straightforward. The tricky part was entering the correct storageUri for my APEX Private Cloud environment. This parameter is the path to the model’s location, and it can look a little different depending on the storage used. For APEX Private Cloud (APC), we need to save the model in a persistent volume claim (PVC).

Here is the YAML snippet I used to create the PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim-1
spec:
  storageClassName: vsan-default-storage-policy
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Once the PVC is created, we need to copy the model to the persistent volume (PV). I did this by creating a pod with the volume attached. After the pod is running, we can copy the model to the PVC directory.

#pv-model-store.yaml
apiVersion: v1
kind: Pod
metadata:
  name: model-store-pod
spec:
  volumes:
    - name: model-store
      persistentVolumeClaim:
        claimName: task-pv-claim-1
  containers:
    - name: model-store
      image: ubuntu
      command: [ "sleep" ]
      args: [ "infinity" ]
      volumeMounts:
        - mountPath: "/pv"
          name: model-store
      resources:
        limits:
          memory: "1Gi"
          cpu: "1"
  imagePullSecrets:
    - name: regcred

By running the following command, we can copy the model to the PVC:

kubectl cp [model folder location] [name of pod with PVC]:[new location within PVC] -c model-store
kubectl cp /home/jreyes/HotDog_App/hotdog_image_classification_model/new_model model-store-pod:/pv/hotdog/1 -c model-store

The critical part is not forgetting to add a version number to the model. In this case, I added version number 1 to the end of the path.

Once the model is stored, we can log in to the pod to verify the contents using the following command:

kubectl exec -it model-store-pod -- bash

After verification, we need to delete the pod to free up the PVC.

We can now apply the KServe InferenceService YAML file that uses the PVC.

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "hotdog"
spec:
  predictor:
    tensorflow:
      storageUri: "pvc://task-pv-claim-1/hotdog"

The TensorFlow serving container automatically looks for the version inside the folder, so there is no need to add the version number in the storageUri path.
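
As a rough illustration of that lookup, the sketch below picks the highest-numbered subdirectory under a model folder, which matches my understanding of TensorFlow Serving's default behavior of serving the latest version. It is not TensorFlow Serving code; `latest_version_dir` is a hypothetical helper, and the folder names are created in a temporary directory for the example.

```python
import tempfile
from pathlib import Path


def latest_version_dir(model_dir: Path):
    """Return the numeric subdirectory with the highest version, or None."""
    versions = [p for p in model_dir.iterdir() if p.is_dir() and p.name.isdigit()]
    return max(versions, key=lambda p: int(p.name), default=None)


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "hotdog"
    # Numeric names are treated as versions; "notes" is ignored.
    for name in ("1", "2", "10", "notes"):
        (model_dir / name).mkdir(parents=True)
    chosen_name = latest_version_dir(model_dir).name
    print(chosen_name)
```

Versions are compared as integers rather than strings, so a folder named 10 is considered newer than one named 2.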

After applying the YAML file, we can find the address of our KServe service with the following command:

kubectl get isvc

With this address, we can update the resnet client to test the model.
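
For readers who want to test over REST rather than with the resnet client, here is a minimal sketch that builds a predict request in the TensorFlow Serving V1 REST format, which posts an "instances" list to /v1/models/<name>:predict. The host name and the input values are placeholders, and `build_predict_request` is a hypothetical helper; a real request would carry an image tensor in the shape the model expects.

```python
import json
import urllib.request


def build_predict_request(base_url: str, model_name: str, instances: list) -> urllib.request.Request:
    """Build (but do not send) a V1 REST predict request."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address; substitute the URL reported by `kubectl get isvc`.
request = build_predict_request(
    "http://hotdog.default.example.com", "hotdog", [[0.0, 0.5, 1.0]]
)
print(request.full_url)
# To send it, pass the request to urllib.request.urlopen() and parse the JSON reply.
```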

Here are the predictions when we run the client with two different images:

We have successfully made our user interface and model use a serverless framework. The final step is to update the Flask server to point to the new address.

Note: I could not get an inference service to listen to two ports at a time (REST and gRPC). My solution was to create two inference services and adjust the Flask code as necessary.

Conclusion

Now, we have a complete image-recognition application on a serverless architecture. The serverless architecture grants us greater resource flexibility with autoscaling and facilitates a canary deployment for the machine-learning model. Furthermore, combining this architecture with APEX Private Cloud services provides an environment that is powerful and flexible for many edge application deployments. In my next blog, I will cover migrating the application to the public cloud to compare the differences and provide a cost analysis.

Until next time!

 Author: Juan Carlos Reyes

PowerStore PowerScale APEX

Dell APEX Data Storage Service (DSS) in Colocation

Vincent Shen

Wed, 19 Jan 2022 21:03:40 -0000

With the December 2021 update for APEX DSS, Dell Technologies now offers a colocation option for APEX DSS customers. This article walks you through the new feature in three parts:

  • APEX DSS in Colocation: Overview
  • APEX DSS in Colocation: Architecture
  • APEX DSS in Colocation: Shared responsibility model

APEX DSS in Colocation: Overview

Dell Technologies APEX Data Storage Services is an as-a-Service portfolio of scalable, elastic, outcome-based storage resources. Customers pay only for what they use, can scale up and down, and receive the service level they need on infrastructure that is owned and maintained by Dell Technologies.

APEX Data Storage Services in colocation are storage services deployed in Dell-managed colocation facilities, which are hosted by Dell Technologies partners that provide colocation data centers for customers. Dell Technologies offers storage services for file, block, and object storage, backed by proven Dell storage technologies. File and object storage are provided with Dell PowerScale appliances; block storage is provided with Dell PowerStore appliances.

Storage Services includes a core set of infrastructure management capabilities, from deployment to ongoing monitoring, operations, optimization, and support, plus a clearly defined process for renewals and decommissioning at the end of service. A self-service portal, the APEX Console, allows customers to identify, configure, deploy, monitor, and expand the solutions quickly. As with non-colocation APEX deployments, you can file a service ticket for advanced operations.

APEX DSS in Colocation: Architecture

The following figure shows the overall architecture of the APEX DSS in Colocation.

Dell Technologies data centers host the APEX Console and APEX backend systems. APEX Console is a secure portal for customers to manage and monitor their storage in the APEX Data Storage Services in colocation.

The Management Zone in the colocation is used for managing the service components in the management and customer zones, including availability management, patch management, and logging and monitoring.

The Customer Zone is where the storage appliances reside, and where customer data is stored. Customers have three options for accessing the customer zone and its data:

  • (A) From the customer’s own colocation environment
    • Direct connection from a customer-provided instance, such as a bare-metal server or virtual machine, in the same colocation provider/partner that Dell Technologies is using; connects with a dedicated fabric port and VLAN.
  • (B) Through the customer’s cloud service provider (CSP)
    • Direct connection from a cloud (a hyperscaler or other cloud connection) available in the same colocation provider that Dell Technologies is using; connects with a dedicated fabric port and VLAN.
  • (C) From the customer’s on-premises data center
    • On-premises replication over MPLS or direct internet; connects into the Dell Technologies colocation partner fabric with a dedicated cross-connect.

APEX DSS in Colocation: Shared responsibility model

The security of APEX Data Storage Services in colocation is a shared responsibility between Dell Technologies and the customer:

  • Dell Technologies is responsible for securing the data storage service and protecting the infrastructure that runs the service. This infrastructure is composed of hardware, software, networking, and facilities.
  • The customer is responsible for securing their data within the storage service. Ensuring data security and maintaining security controls for accessing the data are always the responsibility of the customer.

The following figure shows the areas of responsibility between Dell Technologies and customers.

The overall security of this storage service is achieved through the shared responsibilities of Dell Technologies and customers.

Conclusion

To recap, for customer-owned storage inside a customer’s on-premises data center, the whole stack is owned, maintained, and paid for by the customer.

The difference when consuming Dell APEX Data Storage Services in colocation is that many responsibilities shift from you to Dell Technologies, relieving you of much of the operational burden of securing your infrastructure.

Author: Vincent Shen

PowerScale AWS object storage APEX

How to Create Object Storage in Dell APEX Data Storage Services (DSS)

Vincent Shen

Wed, 19 Jan 2022 21:18:36 -0000

As of the December 2021 release of APEX DSS, Dell now supports creating object storage! APEX File Services provides multi-protocol data access and includes support for the S3 (Simple Storage Service) Object protocol.

During activation of APEX File deployments (or subsequently, in response to a Service Request), Dell Services will enable the specific data access protocols (SMB, NFS, and S3) as requested by the customer.

Object capabilities are a good fit for file users who are leveraging complex application designs that demand File and Object access to the same data, thus expanding file storage to include cloud-native workloads without the need to make a data set copy.

Here is a walkthrough of how to create S3 object storage in APEX DSS:

  1. Launch the OneFS web UI. Make sure the S3 object service is enabled by clicking Protocols > Object storage (S3) > Global settings:

  2. Create the secret key for the end user. In this case, I create the key for the user vince. Under the Key management tab in the Object storage (S3) panel, click Select user, select the user vince, and click Create a key. Note the Access id and the corresponding secret key for future use. In my case, they are:

Access id: 1_vince_accid

Secret key: yHVUjcEJR1u1wq3glGJleAqXyVh6

  3. To create the S3 bucket, select the Buckets tab under the Object storage (S3) panel and click Create bucket. In my example, I create a bucket using the following parameters:

Bucket name: vince

Owner: vince

Path: /ifs/vince

  4. Test your S3 object storage. You can use any S3 client tool for this purpose. In my case, I am using CloudBerry Explorer to set up the connection:

Note: By default, the connection is encrypted with an SSL certificate. The default HTTPS port is 9021, which you can change in the OneFS web UI under Global settings.
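
If you prefer a scripted test over a GUI client, the following is a minimal sketch of pointing a generic S3 client at the service. It assumes the third-party boto3 library as the client, which is my choice and not something the walkthrough above uses; the host name is a placeholder, and `s3_endpoint` and `make_client` are hypothetical helpers.

```python
def s3_endpoint(host: str, port: int = 9021, use_https: bool = True) -> str:
    """Build the endpoint URL for an S3-compatible service."""
    scheme = "https" if use_https else "http"
    return f"{scheme}://{host}:{port}"


def make_client(host: str, access_id: str, secret_key: str):
    """Create a boto3 S3 client for the endpoint (requires boto3 to be installed)."""
    import boto3  # imported lazily so s3_endpoint() works without boto3

    return boto3.client(
        "s3",
        endpoint_url=s3_endpoint(host),
        aws_access_key_id=access_id,
        aws_secret_access_key=secret_key,
    )


# "onefs.example.com" is a placeholder host name.
print(s3_endpoint("onefs.example.com"))
# client = make_client("onefs.example.com", "<access id>", "<secret key>")
# client.list_objects_v2(Bucket="vince")
```

If the cluster presents a certificate that your client does not trust, the client will also need to be given the appropriate CA bundle before the connection succeeds.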

Conclusion

Using APEX DSS, you can easily deploy your S3 object storage in minutes. With this capability, clients can access APEX DSS file-based data as objects efficiently. OneFS S3 in APEX DSS is designed as a first-class protocol including features for bucket and object operations, security implementation, and management interface.

In our next blog, we will go through the colocation feature in APEX DSS for file.

Author: Vincent Shen

cloud video analytics APEX

Cloud-Native Workloads: Object Detection in Video Streams

Bob Ganley

Wed, 02 Mar 2022 22:18:43 -0000

See containers and Kubernetes in action with a streaming video analysis Advanced Driver Assistance System on APEX Cloud Services.

Initially published on November 11, 2021 at https://www.dell.com/en-us/blog/cloud-native-workloads-object-detection-in-video-streams/.

A demo may be the best way to make a concept real in the IT world. This blog describes one of a series of recorded demonstrations illustrating the use of VMware Tanzu on APEX Cloud Services as the platform for open-source based cloud-native applications leveraging containers with Kubernetes orchestration.

 This week we’re showcasing an object detection application for an Advanced Driver-Assistance System (ADAS) monitoring road traffic in video sources that leverages several open-source projects to analyze streaming data using an artificial intelligence and machine learning (AI/ML) algorithm.

 The base platform is VMware Tanzu running on APEX Private Cloud Service. APEX Private Cloud simplifies VMware cloud adoption as a platform for application modernization. It is based on Dell EMC VxRail with VMware vSphere Enterprise Plus and vSAN Enterprise available as a 1- or 3-year subscription with hardware, software and services (deployment, rack integration, support and asset recovery) components included in a single monthly price.  VMware Tanzu Basic Edition was added post-deployment to create the Container-as-a-Service (CaaS) platform with Kubernetes running integrated in the vSphere hypervisor.

Object detection in video sources requires managing streaming data, both analyzing it in real time and storing it for later analysis. This demo includes the newly announced Dell EMC ObjectScale object storage platform, which was designed for Kubernetes, as well as the Dell EMC Streaming Data Platform for ingesting, storing, and analyzing continuously streaming data in real time.

The image recognition application leverages several open-source components:

  • Pravega software that implements a storage abstraction called a “stream” which is at the heart of the Streaming Data Platform.
  • Apache Flink real-time analytics engine for the object detection computations.
  • TensorFlow for the object detection model.
  • Jupyter as the development environment for data flow and visualization.

The demo shows these components running in Tanzu Kubernetes Grid clusters. It takes the perspective of a data scientist who configures the projects and data flows in the Streaming Data Platform. The Jupyter notebooks are configured to push the data into the Pravega stream and display the video with the object detection results.

 You can view the demo here.

 Demos like these are a great way to see how the Dell Technologies components can be combined to create a modern application environment. Please view this demo and provide us some feedback on other demos you’d like to see in the future.

 You can find more information on Dell Technologies solutions with VMware Tanzu here.

Author: Bob Ganley
