High-Performance Linpack (HPL) is a benchmark that solves a uniformly random system of linear equations and reports a floating-point execution rate using a standard formula for the operation count. See https://github.com/amd/InfinityHub-CI/tree/main/rochpl for build instructions and run steps for rocHPL, the ROCm implementation of HPL for AMD GPUs.
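The operation count is conventionally taken to be 2/3·N³ + 3/2·N² floating-point operations for an N×N system. This is the count used by the reference HPL code, and the sketch below assumes rocHPL reports its rate the same way. It shows how a rate follows from a problem size and a wall-clock time. The function names and the example timing are hypothetical and do not come from a real run.

```python
def hpl_flops(n: int) -> float:
    """Conventional HPL operation count for solving an n x n linear system."""
    return (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2


def hpl_rate_tflops(n: int, seconds: float) -> float:
    """Execution rate in TFLOPS for a solve of size n that took `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return hpl_flops(n) / seconds / 1e12


if __name__ == "__main__":
    # Hypothetical inputs: both the problem size and the timing are made up.
    n, seconds = 100_000, 10.0
    print(f"N={n}: {hpl_flops(n):.3e} FLOP, "
          f"{hpl_rate_tflops(n, seconds):.2f} TFLOPS")
```

Because the count depends only on N, two runs with the same N can be compared directly by their solve times.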
The following command runs the HPL benchmark on eight MI300X accelerators:
./mpirun_rochpl -P 2 -Q 4 -N 430080 --NB 512
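In this command, `-P` and `-Q` set the rows and columns of the MPI process grid, `-N` sets the dimension of the global matrix, and `--NB` sets the panel (block) size. As a rough sanity check of these values, the sketch below derives a few quantities from them. It assumes one MPI rank per GPU and double-precision storage at 8 bytes per element, and the helper name is hypothetical.

```python
def describe_hpl_run(p: int, q: int, n: int, nb: int) -> dict:
    """Derive basic quantities from HPL grid and problem-size parameters."""
    ranks = p * q                 # total MPI ranks in the P x Q grid
    matrix_bytes = 8 * n * n      # N x N matrix of 8-byte doubles (assumed FP64)
    return {
        "ranks": ranks,
        "blocks_per_dim": n // nb,            # whole NB-wide blocks per dimension
        "n_is_multiple_of_nb": n % nb == 0,
        "matrix_gb_total": matrix_bytes / 1e9,
        "matrix_gb_per_rank": matrix_bytes / ranks / 1e9,
    }


if __name__ == "__main__":
    # Values from the command above: -P 2 -Q 4 -N 430080 --NB 512
    for key, value in describe_hpl_run(2, 4, 430080, 512).items():
        print(f"{key}: {value}")
```

With these values the grid has 8 ranks, N is an exact multiple of NB (840 blocks per dimension), and the matrix occupies roughly 185 GB per rank. That is consistent with the usual practice of sizing N to fill most of the accelerator memory, although the guide does not state how N was chosen.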
The following table summarizes the results, which were measured with ROCm 6.1.1. Per GPU, the MI300X accelerator delivers 2.88 times the HPL throughput of the previous-generation MI210 accelerator (about 52 TFLOPS compared with 18 TFLOPS). At the system level, a single PowerEdge XE9680 server with eight MI300X accelerators delivers 415 TFLOPS, which is 5.76 times the 72 TFLOPS of a PowerEdge R760xa server with four MI210 GPUs.
Number of GPUs | R760xa_MI210 (TFLOPS) | XE9680_MI300X (TFLOPS) | Per GPU (TFLOPS) | MI300X over MI210 per card |
4 | 72 | - | 18 | - |
8 | - | 415 | 52 | 2.88x |
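The per-GPU and speedup figures in the table can be reproduced from the two system totals. The sketch below shows that arithmetic using the reported 72 TFLOPS (four MI210 GPUs) and 415 TFLOPS (eight MI300X GPUs). The function name is hypothetical.

```python
def compare_systems(base_tflops: float, base_gpus: int,
                    new_tflops: float, new_gpus: int) -> dict:
    """Per-GPU throughput and speedup ratios between two systems."""
    base_per_gpu = base_tflops / base_gpus
    new_per_gpu = new_tflops / new_gpus
    return {
        "base_per_gpu": base_per_gpu,
        "new_per_gpu": new_per_gpu,
        "per_gpu_speedup": new_per_gpu / base_per_gpu,
        "system_speedup": new_tflops / base_tflops,
    }


if __name__ == "__main__":
    # System totals taken from the results table above.
    result = compare_systems(base_tflops=72, base_gpus=4,
                             new_tflops=415, new_gpus=8)
    for key, value in result.items():
        print(f"{key}: {value:.2f}")
```

The system-level ratio is exactly twice the per-GPU ratio here because the XE9680 has twice as many GPUs as the R760xa.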