Table 1 summarizes the test results:
| Precision | FP32 (128 input / 32 output tokens) | BF16 (128 input / 32 output tokens) | INT8 (128 input / 32 output tokens) | FP32 (2048 input / 32 output tokens) | BF16 (2048 input / 32 output tokens) | INT8 (2048 input / 32 output tokens) |
|---|---|---|---|---|---|---|
| TDX | 61.2 | 35.9 | 25.63 | 67.03 | 39.93 | 29.33 |
| No TDX | 59.93 | 34.97 | 24.93 | 65.2 | 38.8 | 28.83 |
| TDX overhead | 2.12% | 2.66% | 2.81% | 2.81% | 2.91% | 1.73% |
Table 1 compares Llama2-7B performance on the 5th Gen Intel® Xeon® Platinum 8562Y+ with and without TDX. The TDX overhead stays below 3% across FP32, BF16, and INT8 for both input lengths. For most use cases, an overhead of this size is minimal and acceptable, which reinforces that the benefits of security need not be traded off against performance.
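The overhead row in Table 1 can be checked against the two measurement rows. The sketch below assumes that overhead is defined as the relative increase of the TDX value over the non-TDX value; the paper does not state the formula or the unit of the measurements, so both the definition and the names used here (`TDX`, `NO_TDX`, `overhead_percent`) are illustrative assumptions. Under that assumption, the computed values match the published percentages.

```python
# Values copied from Table 1. The unit is not stated in the source.
COLUMNS = [
    "128/32 FP32", "128/32 BF16", "128/32 INT8",
    "2048/32 FP32", "2048/32 BF16", "2048/32 INT8",
]
TDX = [61.2, 35.9, 25.63, 67.03, 39.93, 29.33]
NO_TDX = [59.93, 34.97, 24.93, 65.2, 38.8, 28.83]


def overhead_percent(with_tdx: float, without_tdx: float) -> float:
    """Relative increase of the TDX value over the non-TDX baseline, in percent.

    Assumed definition: (TDX - no TDX) / no TDX * 100.
    """
    return (with_tdx - without_tdx) / without_tdx * 100.0


# One overhead figure per column, rounded to two decimals like the table.
overheads = [round(overhead_percent(t, n), 2) for t, n in zip(TDX, NO_TDX)]

for name, value in zip(COLUMNS, overheads):
    print(f"{name}: {value:.2f}%")
```

Every computed value falls below 3%, consistent with the claim above.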
Figure 4 shows Llama2-7B performance on the 5th Gen Intel® Xeon® Platinum 8562Y+ secured with TDX. Relative to FP32 using Intel® AVX-512, Intel® AMX delivers a performance improvement of 1.7x for BF16 and 2.3x for INT8. The acceleration that Intel® AMX provides for AI workloads therefore persists when security is enabled with TDX.
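As a rough cross-check, the speedup ratios can be estimated from the TDX row of Table 1. This sketch assumes that the table values are lower-is-better measurements (the TDX values are higher than the non-TDX ones), so speedup is taken as the FP32 value divided by the BF16 or INT8 value. It is not necessarily the method used to produce Figure 4, and the names `TDX_VALUES` and `speedup_over_fp32` are hypothetical.

```python
# TDX row of Table 1, grouped by input/output token configuration.
# Assumption: lower values are better, so speedup = FP32 value / other value.
TDX_VALUES = {
    "128 in / 32 out": {"fp32": 61.2, "bf16": 35.9, "int8": 25.63},
    "2048 in / 32 out": {"fp32": 67.03, "bf16": 39.93, "int8": 29.33},
}


def speedup_over_fp32(row: dict) -> dict:
    """Return the ratio of the FP32 value to each lower-precision value."""
    return {
        precision: row["fp32"] / value
        for precision, value in row.items()
        if precision != "fp32"
    }


speedups = {config: speedup_over_fp32(row) for config, row in TDX_VALUES.items()}

for config, ratios in speedups.items():
    for precision, ratio in ratios.items():
        print(f"{config} {precision}: {ratio:.2f}x")
```

Under these assumptions the ratios come out at roughly 1.7x for BF16 and between about 2.3x and 2.4x for INT8, which is close to the figures quoted above.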
Benchmark results are highly dependent upon the workload, the specific application requirements, and the system design and implementation. Relative system performance will vary due to these and other factors. Therefore, this workload should not be used as a substitute for a specific customer application benchmark when critical capacity planning and/or product evaluation decisions are contemplated.
All performance data contained in this report was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary significantly. Dell Technologies does not guarantee or represent that a user can or will achieve similar performance results.