Dell evaluated the potential benefits of Granulate on a cluster of 11 Dell PowerEdge servers (one head node and 10 compute nodes) powered by 3rd Generation Intel® Xeon® Scalable processors and running Cloudera Data Platform Base 7.1.7. The cluster ran a set of end-to-end data science pipelines, each of which included training machine learning (ML) and deep learning (DL) models as part of its workflow. The algorithms were Logistic Regression (LogR), Support-Vector Machines (SVMs), Seasonal ARIMA (Holt-Winters), K-means clustering, and Recurrent Neural Networks (RNNs).
These algorithms are widely used in both industry and academia to solve a range of problems, from fraud detection to speech translation. Because their runtime characteristics differ, they are well suited to testing Granulate’s ability to optimize varied workloads. Table 1 presents the characteristics of the algorithms used for this test. The characteristics of the systems used for the test are shown in Table 2.
Table 1. Characteristics of ML and DL algorithms used for testing
Domain | Algorithm | Class | Datatype |
Fraud detection | Logistic Regression (LogR) | Classification | Number |
Hardware failure | Support-Vector Machines (SVMs) | Classification | Number |
Price prediction | Recurrent Neural Network (RNN) | Regression | Text |
Sales forecasting | Seasonal ARIMA (Holt-Winters) | Regression | Number |
Customer segmentation | K-means clustering | Clustering | Number |