Performance Test Results Single VM Minimum Configuration
This testing determined system performance with a minimum configuration. A full single GPU board was allocated to establish a baseline with minimum hardware. We also tested a single VM using only 25% of the GPU resources to estimate performance for a minimally resourced multi-tenant configuration.
The following graph shows that with one stream the system easily processes at 12 fps with very little deviation.
Figure 1 Processed frames per second
The CPU utilization stays primarily between 5% and 15% with some excursions.
Figure 2 CPU Utilization
GPU utilization remained between 5% and 15% with some variations.
Figure 3 GPU Utilization
The following graph shows a more complete picture of the CPU, GPU, Frame Buffer, and System Memory utilization.
Note: Hyperthreading was enabled. The CPU counts given are vCPUs, not physical CPUs or cores. With hyperthreading, a single CPU core can support two vCPUs.
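The vCPU counts quoted in these tests can be converted to physical cores with a short calculation. This is a minimal sketch that assumes each vCPU maps to exactly one hardware thread with no oversubscription; the function name `physical_cores` is illustrative and not part of any tool used in the testing.

```python
# Assumption: hyperthreading gives two hardware threads per core, and each
# vCPU is backed by one hardware thread (no CPU oversubscription).
THREADS_PER_CORE = 2


def physical_cores(vcpus, threads_per_core=THREADS_PER_CORE):
    """Return the number of physical cores needed to back `vcpus`."""
    # Ceiling division, so an odd vCPU count still reserves a whole core.
    return -(-vcpus // threads_per_core)


for vcpus in (4, 8, 16):
    print(f"{vcpus} vCPUs -> {physical_cores(vcpus)} physical cores")
```

Under this assumption, the 4 vCPU configuration corresponds to 2 physical cores, 8 vCPUs to 4 cores, and 16 vCPUs to 8 cores.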
System memory utilization was steady at 69%. GPU utilization never exceeded 30% and spent most of the time around 10% to 15%. The GPU frame buffer held steady at around 35%. CPU utilization varied over a wider range, but most of the activity stayed below 20%. The T4-16Q vGPU profile indicates that the entire resources of a single GPU were allocated.
Figure 4 GPU and System Utilization (T4-16Q-16GB FB)
This test uses only a quarter of the GPU capability and frame buffer, with the same 2-core, 8 GB memory configuration as the first test. The test was started with a single input stream, and a second video stream was added during the test to show the change in resource utilization with a second camera. Even with 25% of the full T4 resources, the system is not overloaded. CPU utilization begins to show excursions above 80% once the second stream is added. The GPU still operates comfortably below 30%, and the decoder stays below 10% utilization.
The following graph shows System Resource Utilization with one stream and then adding a second stream.
Figure 5 GPU and System Utilization (T4-4Q-4GB FB)
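The profile names in the figure captions (T4-16Q, T4-4Q) carry the frame buffer size in GB, so the share of the board assigned to a VM can be read from the name. The sketch below assumes the 16 GB frame buffer of the T4 and a simple `T4-<GB><series letter>` naming pattern; the helper names are illustrative only.

```python
import re

# Total frame buffer on a single T4 board, in GB.
T4_FRAME_BUFFER_GB = 16


def parse_profile(profile):
    """Split a profile name such as 'T4-4Q' into (frame_buffer_gb, series)."""
    match = re.fullmatch(r"T4-(\d+)([A-Z])", profile)
    if match is None:
        raise ValueError(f"unrecognized profile name: {profile!r}")
    return int(match.group(1)), match.group(2)


def frame_buffer_fraction(profile):
    """Fraction of the board's frame buffer assigned by this profile."""
    frame_buffer_gb, _series = parse_profile(profile)
    return frame_buffer_gb / T4_FRAME_BUFFER_GB


for name in ("T4-16Q", "T4-4Q"):
    print(f"{name}: {frame_buffer_fraction(name):.0%} of the frame buffer")
```

This covers only the frame buffer share. How GPU compute time is divided between VMs depends on the vGPU scheduler and is not modeled here.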
This system was allocated 25% GPU capacity (25% processor and 4 GB GPU memory) with 4 vCPUs and 8 GB system memory. The system processes one stream with steady performance at 12 fps. When a second stream is added, the frame processing time lengthens somewhat, but the overall drop is not outside acceptable limits. The value for the 95% confidence interval with 2 streams was 11.5 fps.
Figure 6 Processed fps with the addition of a second camera stream
The following graph shows that performance at 12 fps is relatively steady, with a number of outliers. The quartiles are so close together that they are compressed into what appears as a solid line. The 95% confidence interval was 11.6 fps.
Figure 7 Two-stream performance with the addition of 2 cores and 8 GB of memory (Full GPU)
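The text does not say how the 95% figures were derived. One possible reading, assumed here purely for illustration, is that the figure is the frame rate that at least 95% of the samples meet or exceed. The sketch below implements that reading; the sample values are hypothetical and are not measurements from these tests.

```python
def fps_met_by(samples, percent):
    """Highest fps value that at least `percent`% of the samples meet or exceed.

    Assumption: this is one plausible interpretation of the reported 95%
    figure, not a method confirmed by the test report.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples, reverse=True)
    # Integer ceiling of percent * n / 100 avoids floating-point rounding.
    k = (percent * len(ordered) + 99) // 100
    return ordered[k - 1]


# Hypothetical fps samples: mostly steady at 12 fps with a few dips.
samples = [12.0] * 17 + [11.8, 11.5, 9.0]

print(fps_met_by(samples, 90))
print(fps_met_by(samples, 95))
```

A statistic of this kind is sensitive to the sampling interval, so values taken at different intervals are not directly comparable.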
Figure 8 System Resource Utilization with one full GPU
The previous graph shows the System Resource Utilization with one full GPU (T4-16Q, 16 GB memory), 8 vCPUs, 16 GB system RAM, and two camera streams. All resources are well below critical levels.
This test was run on a single VM with a single full GPU (T4-16Q) with 16 GB of memory, 16 vCPUs, and 24 GB of system memory. Frame processing time lengthened as more streams were added. The average frame processing time was 59.2 ms, which should support 6 streams at 12 fps (6 × 59.2 = 355 ms; 1000/12 = 83 ms). However, there were many instances when the frame processing time within a window expanded to 110-150 ms/frame/stream, which exceeds the 83 ms available per frame at 12 fps. This results in the pattern we see below, with lengthy excursions below 12 fps on all streams. The sampling interval on the graph below was 5 seconds. We concluded that six streams is the maximum a single GPU can support without degrading to unacceptable performance.
Figure 9 GPU and System Resource Utilization (6 Streams)
The previous graph shows system utilization well below critical levels. The 90% confidence interval was 8.2 fps and the 95% confidence interval was 7.23 fps. Configuration: single GPU (T4-16Q), 16 vCPUs, 16 GB RAM.
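The frame-time reasoning for the six-stream test can be checked with a short calculation. The sketch assumes that each stream's frames are processed independently, so a stream's per-frame processing time alone determines the rate it can sustain; the function names are illustrative.

```python
def frame_budget_ms(target_fps):
    """Time available to process one frame at the target rate, in ms."""
    return 1000.0 / target_fps


def achievable_fps(frame_time_ms, target_fps):
    """Per-stream rate for a given frame time, capped at the target rate.

    Assumption: streams are processed independently of one another.
    """
    return min(target_fps, 1000.0 / frame_time_ms)


TARGET_FPS = 12
budget_ms = frame_budget_ms(TARGET_FPS)

# Average frame time, then the two ends of the observed slow windows.
for frame_time_ms in (59.2, 110.0, 150.0):
    fps = achievable_fps(frame_time_ms, TARGET_FPS)
    within = frame_time_ms <= budget_ms
    print(f"{frame_time_ms:6.1f} ms/frame -> {fps:5.2f} fps "
          f"(within {budget_ms:.1f} ms budget: {within})")
```

At the 59.2 ms average, a stream keeps up with 12 fps. In windows where the frame time stretches to 110-150 ms, the same arithmetic gives roughly 9.1 to 6.7 fps per stream.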