Sizing of AutoML infrastructure
This section provides sizing recommendations for H2O Driverless AI platforms. Data scientists deploy an individual instance for each experiment, and each instance can be sized separately.
Consider several factors when sizing an H2O Driverless AI deployment: the number of projects, the type of use cases, the number of concurrent users, and the number of active tasks. The size of the dataset drives storage and memory requirements. With its modular, container-based design, an H2O Driverless AI deployment can be scaled easily as resource use grows.
CPU and GPU recommendations—CPU-only H2O Driverless AI containers can be used for exploratory data analysis and classical machine learning algorithms that do not require GPU acceleration. For example, data scientists use these instances for classification, regression, or clustering problems. Because H2O Driverless AI can schedule and run experiments in parallel, it benefits from multicore CPUs with sufficient system memory and from GPUs with sufficient GPU RAM.
H2O Driverless AI containers with GPUs are useful for building and training deep learning models. Deep learning problem types include image classification and natural language processing (NLP). Ampere-based NVIDIA GPUs are supported on x86 processors, as H2O Driverless AI ships with the NVIDIA CUDA 11.2.2 toolkit. NVIDIA Multi-Instance GPU (MIG) capability also enables effective partitioning of GPUs across use cases and improves GPU utilization.
Memory recommendations—As guidance, the memory requirement per experiment is approximately 5 to 10 times the size of the dataset. Dataset size can be estimated as the number of rows × columns × 4 bytes; if the data contains text, allow more bytes per element.
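The rule of thumb above can be expressed as a short calculation. This is a minimal sketch; the function name and the example dataset shape are illustrative, not part of the H2O Driverless AI documentation:

```python
def estimate_experiment_memory_gb(rows, columns, bytes_per_element=4):
    """Estimate the per-experiment memory range (in GB) as 5-10x the
    in-memory dataset size, where dataset size is rows * columns * 4 bytes
    for numeric data (use a larger bytes_per_element for text-heavy data)."""
    dataset_gb = rows * columns * bytes_per_element / 1024**3
    return 5 * dataset_gb, 10 * dataset_gb

# Example: 10 million rows x 100 numeric columns is roughly a 3.7 GB dataset,
# so plan for roughly 19-37 GB of memory for that experiment.
low_gb, high_gb = estimate_experiment_memory_gb(10_000_000, 100)
```

For text-heavy datasets, raise `bytes_per_element` rather than the multiplier, since the extra cost comes from wider elements, not from the experiment overhead.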
H2O Driverless AI supports automatic queuing of experiments to avoid system overload. You can launch multiple experiments simultaneously; they are queued automatically and run when the necessary resources become available.
We recommend three deployment sizes for our validated design, described in the following table:
| Deployment | Number of H2O Driverless AI instances and total resource requirements | Recommended storage |
|---|---|---|
| Minimum production | 5 H2O Driverless AI instances with total resource recommendation | 5 TB |
| Mainstream | 10 H2O Driverless AI instances with total resource recommendation | 10 TB |
| High performance | 20 H2O Driverless AI instances with total resource recommendation | > 20 TB |
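The sizing tiers in the table can be captured in a small lookup helper. This is an illustrative sketch only; the function name and the threshold logic are assumptions layered on the table's three tiers, not part of the validated design itself:

```python
def recommended_tier(num_instances):
    """Map a planned number of H2O Driverless AI instances to the
    validated-design sizing tiers from the table above.
    Returns (tier name, minimum recommended storage in TB)."""
    if num_instances <= 5:
        return "Minimum production", 5
    if num_instances <= 10:
        return "Mainstream", 10
    # The table recommends more than 20 TB for this tier; 20 is the floor.
    return "High performance", 20

tier, storage_tb = recommended_tier(8)  # → ("Mainstream", 10)
```

A deployment falling between tiers (for example, 8 instances) is rounded up to the next tier, since under-provisioning storage is harder to correct than over-provisioning.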