Bare Metal Compared with Kubernetes
Thu, 04 Jun 2020 16:19:26 -0000
It has been fascinating to watch the tide of application containerization build from stateless cloud native web applications to every type of data-centric workload. These workloads include high performance computing (HPC), machine learning and deep learning (ML/DL), and now most major SQL and NoSQL databases. As an example, I recently read the following Dell Technologies knowledge base article: Bare Metal vs Kubernetes: Distributed Training with TensorFlow.
Bare metal and bare metal server refer to implementations of applications that run directly on physical hardware, without virtualization, containerization, or cloud hosting. Bare metal is often compared with virtualization and containerization to contrast performance and manageability characteristics. For example, contrasting an application on bare metal with the same application running in a container can provide insights into the potential performance differences between the two implementations.
Figure 1: Comparison of bare metal to containers implementations
Containers encapsulate an application with its supporting binaries and libraries so that it can run on one shared operating system. A container runtime engine or a management application, such as Kubernetes, manages the container. Because of the shared operating system, a container’s infrastructure is lightweight, which makes performance comparisons with bare metal all the more interesting.
In the case of comparing bare metal with Kubernetes, distributed training with TensorFlow performance was measured in terms of throughput. That is, we measured the number of images per second when training CheXNet. Five tests were run in which each test consecutively added more GPUs across the bare metal and Kubernetes systems. The solid data points in Figure 2 show that the tests were run using 1, 2, 3, 4, and 8 GPUs.
Figure 2: Running CheXNet training on Kubernetes compared to bare metal
Figure 2 shows that the Kubernetes configuration performed similarly to the bare metal configuration through 4 GPUs. At 8 GPUs, bare metal shows an eight percent throughput advantage over Kubernetes. However, the article that I referenced offers factors that might contribute to the delta:
- The bare metal system takes advantage of the full bandwidth and low latency of raw InfiniBand, while the Kubernetes configuration uses software-defined networking based on flannel.
- The Kubernetes configuration uses IP over InfiniBand, which can reduce available bandwidth.
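To make that kind of comparison concrete, here is a small Python sketch that computes the percent advantage of bare metal over Kubernetes at each GPU count. The throughput numbers are hypothetical placeholders for illustration only, not the measurements from the referenced article.

```python
# Hypothetical throughput figures in images/sec -- illustrative only,
# not the actual measurements from the referenced study.
bare_metal = {1: 150, 2: 295, 4: 580, 8: 1120}
kubernetes = {1: 148, 2: 290, 4: 570, 8: 1035}

def relative_delta(bm, k8s):
    """Percent advantage of bare metal over Kubernetes per GPU count."""
    return {gpus: round((bm[gpus] - k8s[gpus]) / k8s[gpus] * 100, 1)
            for gpus in bm}

print(relative_delta(bare_metal, kubernetes))
# → {1: 1.4, 2: 1.7, 4: 1.8, 8: 8.2}
```

With these made-up numbers, the delta stays small through 4 GPUs and widens at 8, which mirrors the shape of the result described above.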
Studies like this are useful because they provide performance insight that customers can use. I hope we see more studies that encompass other workloads. For example, a study about Oracle and SQL Server databases in containers compared with running on bare metal would be interesting. The goal would be to understand how a Kubernetes ecosystem can support a broad ecosystem of different workloads.
Hope you like the blog!
Related Blog Posts
SAP HANA Tiering: The Pressures of Data Growth
Fri, 01 May 2020 14:24:20 -0000
“Data growth is accelerating!” Quotes like this appear frequently in studies, papers, and blogs. You will not find more data growth quotes in this blog article, however, because I think it is more interesting to look at this from a data management investment policy perspective. A data management investment policy has benefits similar to those of a corporate travel policy: the goal is to efficiently maximize the bottom line. In most customer accounts, the SAP HANA licensing investment happens early, and the business must maximize the benefits over the long term. First cost is not the sole driver, because volume, variety, veracity, and velocity are all considerations when a company is shaping a strategy. Evaluating a data management investment policy for the long term can be complex. SAP provides Native Storage Extension (NSE) to address both cost pressures and the intelligent placement of data over time.
SAP HANA with NSE offers the ability to tier data across different storage media based on the age of the data. The NSE data management policy categorizes data into three classes: hot, warm, or cold. This blog post focuses on the hot and warm data tiers. Hot data can use both volatile and nonvolatile memory, as follows:
- DRAM: DRAM is the fastest storage media. DRAM is volatile, however, meaning the data must be loaded into memory on restart of the database or server.
- PMEM: Persistent Memory (or PMEM) is faster than SSD storage but not as fast as DRAM. PMEM is also nonvolatile memory, meaning the data does not have to be loaded into memory on restart of the database or server. PMEM is used for the SAP HANA Fast Restart option.
SAP HANA on-premises Native Storage Extension
If you are interested in learning more about maximizing your data management investment strategy, the SAP HANA TDI on Dell EMC PowerEdge Servers Validation Guide provides detailed configurations. The hot data tier delivers the fastest performance and is also the most expensive tier (hardware + SAP HANA licensing + annual support). Maximizing the performance-to-cost trade-off of hot data placement requires consideration of two factors:
- Keep actively used data that is critical to the business in the hot data tier
- Migrate less frequently used data out of the hot data tier to contain costs
The first consideration is a performance guideline for when the responsiveness of the database and applications is at a premium for the business. Data that is used less frequently can be placed in the warm tier to minimize the impact on queries against the hot data tier. Another benefit relates to SAP HANA restarts. For example, planned maintenance events involving Linux operating system or SAP HANA database updates can require a restart. An SAP HANA system with NSE can have less data in the hot tier than the same system without NSE, thus improving restart times.
NOTE: SAP HANA also has a Fast Restart Option that uses file storage to speed up restarts. Fast Restart leverages PMEM to accelerate file storage access, significantly reducing the start time of the database. SAP HANA Fast Restart applies to scenarios in which only SAP HANA is restarted and not the operating system.
The second consideration is an avoidance guideline that sustains existing investments while mitigating additional ones. Success could be defined as a strategy in which performance increases with each new server generation while SAP HANA costs remain constant if the size of the hot data tier remains the same. The business impact is a continual increase in performance combined with efficiently maximizing the bottom line.
The warm data tier is for less frequently accessed data that occasionally resides in SAP HANA memory. If kept in memory, warm data accelerates costs, mainly through the additional licensing that is needed to increase the memory size. To mitigate the impact of rapid data growth, maximize the usage of the warm data tier. Keep in mind that the warm data tier is limited to four times the size of the hot data tier. For example, a hot data tier of 1 TB means the warm data tier can be up to 4 TB. The warm data tier also cannot exceed 10 TB in size. The 10 TB maximum is a first-release restriction.
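The warm tier limits described above can be captured in a few lines of Python. This is an illustrative sketch of the first-release sizing rules, not an SAP tool; the function name is my own.

```python
def max_warm_tier_tb(hot_tier_tb: float) -> float:
    """Maximum warm data tier size under the NSE first-release limits:
    at most 4x the size of the hot data tier, capped at 10 TB overall."""
    return min(4.0 * hot_tier_tb, 10.0)

print(max_warm_tier_tb(1))  # → 4.0  (the 4x rule applies)
print(max_warm_tier_tb(5))  # → 10.0 (the 10 TB cap applies)
```

So a hot tier of 1 TB allows up to a 4 TB warm tier, while any hot tier of 2.5 TB or more runs into the 10 TB first-release cap.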
Data in the warm tier is transactionally consistent with the hot data tier. This means that the warm data tier must be protected in conjunction with the hot data tier so that the entire database backup is consistent. While the hot and warm data tiers are transactionally consistent, they differ in how data is loaded into memory. The hot data tier is “column loadable,” meaning that entire columnar tables are loaded into memory. In contrast, the warm data tier is “page loadable,” meaning that granular portions (pages) of data are loaded into memory on demand. The page-loadable design has two key benefits for the warm data tier:
- It does not significantly impact the memory footprint.
- It does not significantly impact the start time of the database.
Use of the warm data tier depends on the SAP HANA NSE buffer cache. This buffer cache is enabled by default and is initially sized at 10 percent of SAP HANA memory (for the sizing reference, see the SAP HANA Administration Guide for SAP HANA Platform 2.0 SPS 04). In addition, the NSE buffer cache is recommended to be at least 12.5 percent of the total size of the warm data tier. You can modify the NSE buffer cache size by using the ALTER SYSTEM ALTER CONFIGURATION command.
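As a rough illustration of those two percentages, the following sketch computes a starting buffer cache size. The function name and example values are hypothetical; actual sizing should follow the SAP HANA Administration Guide.

```python
def nse_buffer_cache_gb(hana_memory_gb: float, warm_tier_gb: float) -> float:
    """Illustrative NSE buffer cache sizing: start from the default of
    10% of SAP HANA memory, but honor the recommended floor of
    12.5% of the total warm data tier size."""
    default_size = 0.10 * hana_memory_gb
    recommended_floor = 0.125 * warm_tier_gb
    return max(default_size, recommended_floor)

# Example: 2 TB of HANA memory with a 4 TB warm tier.
print(nse_buffer_cache_gb(2048, 4096))  # → 512.0 (the 12.5% floor dominates)
```

In this example the warm-tier floor (512 GB) exceeds the 10 percent default (about 205 GB), so the buffer cache would be grown with ALTER SYSTEM ALTER CONFIGURATION.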
Warm Data Tier
Overall, use of the warm data tier enables customers to balance fast performance with increased data volumes while minimizing cost, thus achieving greater value. Customers have the flexibility to design an amazingly fast warm data tier with storage I/O latencies measured in microseconds, narrowing the difference between the hot and warm data tiers in terms of performance.
Dell Technologies has a team of experienced SAP HANA experts who can assist with accurate sizing and design of an infrastructure solution for your databases. Our goal is to work closely with you to maximize the value of NSE and create an extremely fast warm data tier that narrows the performance gap with the hot data tier. Your Dell Technologies representative can put you in contact with one of our SAP HANA experts.
SQL Server in containers: Dell EMC CSI plug-in—It's about manageability!
Mon, 30 Mar 2020 18:46:49 -0000
A picture can be worth a thousand words; however, not every slide in a presentation is self-explanatory, and sometimes even the speaker notes don’t provide enough real estate to cover the full meaning of the content. That happened to me recently with this slide in a technical presentation that I created:
The unanswered question was: what does this sentence mean? “Get fixes and upgrades faster as Dell EMC’s plug-in doesn’t require Kubernetes updates and upgrades!” I wrote this blog to give more background and details about that statement. Before we get to that, let’s discuss the value that the CSI plug-in has for customers using XtremIO X2 and VxRack FLEX. The Container Storage Interface (CSI) is a standard used by Dell EMC and other storage providers to give container orchestration systems an interface for exposing storage services to containers. Thus, the CSI plug-in enables orchestration between containers and storage via Kubernetes. Other orchestration systems, such as Mesos, Docker, and Cloud Foundry, use the same CSI specification for managing containers and storage together.
The CSI plug-in has another advantage for both orchestration systems (like Kubernetes) and storage providers. For example, Kubernetes development can progress independently without requiring storage vendors to check code into the core Kubernetes repository. Similarly, storage vendors update the CSI plug-in only when required, not with every update or upgrade of Kubernetes. Overall, there is less complexity for both Kubernetes developers and storage vendors because the CSI plug-in simplifies the integration between the orchestration and storage layers. Thus, the CSI plug-in enables Dell EMC to deliver fixes and upgrades faster, independent of the Kubernetes release cycle. I hope that answers the question above. You can also take a look at this Kubernetes blog that goes into greater detail: Introducing Container Storage Interface (CSI) Alpha for Kubernetes.
We also recently wrote a white paper about SQL Server Containers that provides an overview of how the XtremIO X2 features available with our CSI plug-in can be used with SQL Server 2019 Linux containers. Here is a shortcut to the CSI plug-in overview in the paper. With the CSI plug-in, the Kubernetes administrator can:
- Dynamically provision and decommission volumes
- Attach and detach volumes from a host node
- Mount and unmount a volume from a host node
The Kubernetes administrator can even use the XtremIO X2 snapshot capabilities to provision a copy of a SQL Server database. It’s these capabilities that really make automation and orchestration of SQL Server containers easier and faster. Want to learn more? The SQL Server Containers white paper is the right starting place because it takes you through the technology and shows how the XtremIO X2 CSI plug-in with Kubernetes and Docker can address traditional challenges.
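As an illustration of the dynamic provisioning capability, the following Python sketch builds a PersistentVolumeClaim manifest that a Kubernetes administrator could submit to the cluster. The storage class name `csi-xtremio` and the claim name are placeholders for this example, not the actual names registered by the Dell EMC plug-in.

```python
import json

def make_pvc(name: str, size: str, storage_class: str = "csi-xtremio") -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict.
    The storage class name is a placeholder -- substitute the class
    that your CSI plug-in registers in your cluster."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }

# A claim for a hypothetical SQL Server data volume:
print(json.dumps(make_pvc("mssql-data", "50Gi"), indent=2))
```

Applying a manifest like this (for example, via kubectl) is what triggers the CSI plug-in to provision a matching volume on the backing array.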
Please rate this blog and provide us with ideas for future solutions. Thanks!