Hardware design
The PowerEdge R760xa is a high-performance, scalable server purpose-built to accelerate GPU-intensive AI workloads such as inferencing and RAG. We use two R760xa servers, each containing two NVIDIA H100 GPUs, for a total of four GPUs. Each R760xa can connect to PowerScale storage over high-speed 100 Gb Ethernet networking. A third server, such as the PowerEdge R660, should be added to create a three-node Kubernetes deployment.
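The node layout described above can be written down as a small inventory, which makes the GPU arithmetic easy to check. This is a minimal sketch, not part of the validated design: the hostnames are hypothetical, and the third node is assumed to carry no GPUs, which the text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str   # hypothetical hostname, for illustration only
    model: str  # server model named in the text
    gpus: int   # number of NVIDIA H100 GPUs installed


# Two GPU servers plus a third node to complete the 3-node cluster.
# Assumption: the R660 has no GPUs; the text does not specify this.
CLUSTER = [
    Node("gpu-node-1", "PowerEdge R760xa", 2),
    Node("gpu-node-2", "PowerEdge R760xa", 2),
    Node("cpu-node-1", "PowerEdge R660", 0),
]


def total_gpus(nodes):
    """Sum the GPUs across all nodes."""
    return sum(node.gpus for node in nodes)


def gpu_node_names(nodes):
    """Return the names of nodes that have at least one GPU."""
    return [node.name for node in nodes if node.gpus > 0]


if __name__ == "__main__":
    print(f"nodes: {len(CLUSTER)}, GPUs: {total_gpus(CLUSTER)}")
    print(f"GPU nodes: {', '.join(gpu_node_names(CLUSTER))}")
```

An inventory like this could be used to sanity-check a deployment plan before scheduling GPU workloads, for example by confirming that the number of GPUs requested does not exceed the four available.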