ResNet-50
ResNet-50 is the canonical image classification benchmark. Its dataset is over 140 GiB in size, and it requires fast data ingestion. On a DGX system, single-node training requires approximately 3 GiB per second of ingest throughput, and the dataset is small enough to fit into cache. Preprocessing can vary, but the typical image size is approximately 128 KiB. One challenge of this benchmark is that, at NVIDIA, the processed images are stored in the RecordIO format (that is, one large file for the entire dataset), because this format provides the best performance for MLPerf. Because the dataset is a single file, this method can stress shared file system architectures that do not distribute the data across multiple targets or controllers.
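To put these figures in perspective, the following minimal sketch works through the arithmetic they imply. It uses only the approximate values quoted above (140 GiB, 3 GiB per second, 128 KiB) and assumes, for illustration, that every image is exactly the typical size and that all reads come from storage rather than cache. The function and variable names are hypothetical and are not part of any NVIDIA or Dell tool.

```python
# Back-of-the-envelope sizing from the approximate figures quoted in the text.
# All names here are illustrative; the inputs are the document's rounded values.

GIB = 1024 ** 3
KIB = 1024

DATASET_BYTES = 140 * GIB          # "over 140 GiB", so treat this as a lower bound
IMAGE_BYTES = 128 * KIB            # typical preprocessed image size
NODE_THROUGHPUT_BPS = 3 * GIB      # approximate single-node ingest rate


def images_per_second(throughput_bps: int, image_bytes: int) -> float:
    """Images a single node must read each second to sustain the ingest rate."""
    return throughput_bps / image_bytes


def approx_image_count(dataset_bytes: int, image_bytes: int) -> int:
    """Rough number of images, assuming every image is the typical size."""
    return dataset_bytes // image_bytes


def seconds_per_pass(dataset_bytes: int, throughput_bps: int) -> float:
    """Time to stream the whole dataset once at the given rate (uncached)."""
    return dataset_bytes / throughput_bps


if __name__ == "__main__":
    print(f"images/s per node : {images_per_second(NODE_THROUGHPUT_BPS, IMAGE_BYTES):,.0f}")
    print(f"approx. images    : {approx_image_count(DATASET_BYTES, IMAGE_BYTES):,}")
    print(f"seconds per pass  : {seconds_per_pass(DATASET_BYTES, NODE_THROUGHPUT_BPS):.1f}")
```

Under these assumptions, a single node reads roughly 24,576 images per second and would stream the full dataset in under a minute. All of those reads land on one RecordIO file, which is why a file system that serves a file from a single target or controller is stressed by this workload.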