The generative AI solution architecture addresses three primary workflows.
Each of these workflows has distinct compute, storage, network, and software requirements. The solution design is modular, and each component can be scaled independently to match the customer's workflow and application requirements. Some modules are also optional, or can be replaced with equivalent solutions already in an organization's AI infrastructure, such as a preferred MLOps and data preparation stack or a preferred data platform. The following table shows the functional modules in the solution architecture:
Table 1. Functional architecture modules for generative AI solution
| Module | Description |
|---|---|
| Training | AI-optimized servers for training, powered by PowerEdge XE9680 and XE8640 servers with NVIDIA H100 GPUs |
| Inferencing | AI-optimized servers for inferencing, powered by PowerEdge XE9680 servers with NVIDIA H100 GPUs, or PowerEdge R760xa servers with NVIDIA L40 or L4 GPUs |
| Management | System and cluster management, including a head node for NVIDIA Base Command Manager (BCM), powered by PowerEdge R660 servers |
| MLOps and Data Prep | Machine learning operations and data preparation, running MLOps software, databases, and other CPU-based data preparation tasks, powered by PowerEdge R660 servers |
| Data | High-throughput, scale-out network-attached storage (NAS) powered by Dell PowerScale, plus high-throughput, scale-out object storage powered by Dell ECS and ObjectScale |
| InfiniBand | Very low-latency, high-bandwidth GPU-to-GPU communication, powered by NVIDIA QM9700 InfiniBand switches |
| Ethernet | High-throughput, high-bandwidth communication between the other modules in the solution, powered by Dell PowerSwitch Z9432F-ON switches |
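As a minimal sketch of this modular composition, the following Python example models each module from Table 1 as a record whose node count is set independently of the others, with the swappable MLOps and Data modules flagged as optional. All class names, fields, and node counts here are illustrative assumptions for this document, not part of any Dell or NVIDIA software.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Module:
    """One functional module from Table 1 (field names are illustrative)."""
    name: str
    platform: str                      # server or switch family
    accelerator: Optional[str] = None  # GPU option, if any
    count: int = 1                     # scaled independently per module
    optional: bool = False             # swappable with an existing equivalent

def build_solution(train_nodes: int, infer_nodes: int) -> List[Module]:
    """Compose a deployment; each module's count is chosen independently."""
    return [
        Module("Training", "PowerEdge XE9680/XE8640", "NVIDIA H100", train_nodes),
        Module("Inferencing", "PowerEdge XE9680 or R760xa",
               "NVIDIA H100, L40, or L4", infer_nodes),
        Module("Management", "PowerEdge R660"),
        Module("MLOps and Data Prep", "PowerEdge R660", optional=True),
        Module("Data", "PowerScale NAS + ECS/ObjectScale object storage",
               optional=True),
        Module("InfiniBand", "NVIDIA QM9700"),
        Module("Ethernet", "Dell PowerSwitch Z9432F-ON"),
    ]

# Example: a training-heavy deployment with a small inference tier.
for m in build_solution(train_nodes=8, infer_nodes=2):
    tag = " (optional/swappable)" if m.optional else ""
    print(f"{m.name:22s} x{m.count:<2d} {m.platform}{tag}")
```

Treating each module as an independent record reflects the design intent described above: scaling the training tier, for example, changes only its own count, and the optional modules can be dropped or substituted without affecting the rest of the composition.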