The test VxRail Host contained 3 Genetec Archivers:
The data for one of the Archivers under load with 75 cameras is shown below:
The CV testing can proceed with this Genetec design.
Our performance testing in support of the recommendations in this Design Guide focused on the BriefCam RESPOND real-time analytics feature (Alert Processing Service). The integration between the RESPOND functionality and the Genetec VMS uses the Real-Time Streaming Protocol (RTSP) interface to access video streams with the lowest latency possible.
A real-time task processing request goes through several preprocessing steps before it can generate alerts. The BriefCam RESPOND user interface allows operators to monitor the status of real-time tasks to track how many are queued, recovering (between the queued and processing states), and actively processing alerts. We tested performance by adding requests for new real-time alert processing tasks until we saw queued requests accumulate without transitioning to the processing state.
We tested both the maximum number of streams for a single use-case workload (face recognition or person detection in a restricted zone) and various mixtures of the workload specifications. In all tests, the results were consistent. In this test, five cameras were added every 3 to 4 minutes until new streams began remaining in the queued state.
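The ramp-up method above can be sketched as a short simulation: cameras are added in batches of five, and the test stops once new tasks stay queued instead of reaching the processing state. The `AlertService` class below is a simplified stand-in for the real Alert Processing Service, not its actual API, and the capacity value is assumed for illustration.

```python
class AlertService:
    """Simulated alert service with a fixed processing capacity (assumed)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.processing = 0
        self.queued = 0

    def add_cameras(self, n):
        # Streams start processing until capacity is reached; the rest queue up.
        started = min(n, self.capacity - self.processing)
        self.processing += started
        self.queued += n - started

def find_max_streams(service, batch=5):
    """Add cameras in batches until requests begin to remain queued."""
    while service.queued == 0:
        service.add_cameras(batch)
        # In the real test we waited 3 to 4 minutes between batches;
        # the simulation advances instantly.
    return service.processing  # streams that reached the processing state

max_streams = find_max_streams(AlertService(capacity=30))
print(max_streams)  # capacity of 30 is an assumption for this sketch
```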
The total CPU and GPU utilization in the build-up to processing for 30 Face Recognition cameras is as follows:
Detailed GPU utilization metrics are shown below:
The green line in the chart above indicates a problem with data collection for the percent utilization of the A40 hardware encoder. The counter was available in our monitoring tool; however, it returned only zero values, even though hardware encoding is an important GPU feature for CV applications. Upgrading to the latest vGPU bundle from Nvidia and VMware is expected to resolve this issue.
The BriefCam Alert Processing Service can function as a scale-out active/active cluster by installing and configuring multiple instances of the service that run on different servers. When multiple service instances are available, a service mesh detects how many instances are running and on which machines. When new real-time alert processing tasks are requested, the request is queued in the PostgreSQL database. The service mesh then assigns tasks from the queue to available servers that are running the Alert Processing Service.
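The queue-then-assign pattern described above can be sketched in a few lines. The real implementation queues tasks in PostgreSQL; here an in-memory deque stands in for the database table, and the "servers" are simple capacity counters. All names are illustrative, not BriefCam's API.

```python
from collections import deque

def assign_tasks(queue, servers):
    """Drain queued tasks onto whichever service instance has free slots."""
    assigned = {}
    while queue:
        # Pick the least-loaded instance that still has capacity.
        candidates = [s for s in servers if servers[s]["busy"] < servers[s]["slots"]]
        if not candidates:
            break  # every instance is full; remaining tasks stay queued
        target = min(candidates, key=lambda s: servers[s]["busy"])
        task = queue.popleft()
        servers[target]["busy"] += 1
        assigned[task] = target
    return assigned

servers = {"vm1": {"slots": 2, "busy": 0},
           "vm2": {"slots": 2, "busy": 0},
           "vm3": {"slots": 1, "busy": 0}}
queue = deque(["task1", "task2", "task3", "task4", "task5", "task6"])
placed = assign_tasks(queue, servers)
print(len(placed), len(queue))  # 5 tasks placed, 1 left queued
```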
We configured three VMs running the BriefCam server components to process real-time alerts. Our goal was to compare the total number of streams that the 3-node cluster could process against the single-VM results. The maximum number of streams for the cluster was 95 before new requests started to build up in the queue.
The Ipsotek testing contained a mixture of Face Recognition and Object Detection scenarios. The Ipsotek platform supports multiple tracking modes that can be configured for a camera:
A "Face in Watchlist" rule was created to report people on a watchlist and was used for all Face Detection cameras. An "In Zone" rule was created to detect people entering a restricted zone.
Ipsotek uses the Nvidia GPU encoder to re-encode the video received from different cameras and generate synchronized video with I-frames produced every second at a fixed bit rate.
This allows Ipsotek to:
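As a rough sketch of the fixed-GOP re-encoding described above: producing an I-frame every second means the group-of-pictures (GOP) length equals the stream's frame rate, and a fixed bit rate keeps each GOP roughly the same size. The helper below computes these derived values; the parameter names are illustrative, not Ipsotek's configuration.

```python
def fixed_gop_settings(fps, bitrate_kbps, iframe_interval_s=1):
    """An I-frame every `iframe_interval_s` seconds implies a GOP length of
    fps * interval; a constant bit rate makes each GOP a predictable size."""
    gop = int(fps * iframe_interval_s)
    return {
        "gop_length": gop,             # keyframe interval in frames
        "bitrate_kbps": bitrate_kbps,  # fixed (constant) bit rate
        # kilobits per GOP interval divided by 8 gives kilobytes per GOP
        "approx_gop_size_kb": bitrate_kbps * iframe_interval_s / 8,
    }

# Example: a 30 fps camera re-encoded at a fixed 4000 kbps
print(fixed_gop_settings(30, 4000))
```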
We tested both the maximum number of streams for a single use-case workload (face recognition or person detection in a restricted zone) and various mixtures of the workload specifications. In all tests, the results were consistent. Cameras were added five at a time until the maximum of 30 cameras was active.
The total CPU and GPU utilization in the build-up to processing for 30 Face Recognition cameras is as follows:
The breakdown of GPU utilization is as follows:
The Ipsotek platform does not currently support vGPUs, so all GPU resources must be assigned to the VM as a passthrough device. This means that it is not possible to migrate VMs as was shown in High availability validation. To build a viable Ipsotek cluster in a virtualized environment, it is necessary to provision all Ipsotek Processing nodes at system setup time. The goal is then to load the system so that sufficient capacity exists across the cluster to ensure that during a failover the cameras can migrate to an available node and continue processing.
In this testing, the approach was to load three of the four nodes in the cluster to their maximum. This ensures that there is sufficient capacity to disperse the workload across the cluster in a failure scenario. A total of 45 Face Recognition cameras and 45 Object Detection cameras were enabled.
The maximum count of active cameras across three processing nodes was 86.
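The sizing rule above (load three of four nodes so a failed node's cameras can be absorbed by the remainder) can be expressed as a simple N+1 capacity check. The per-node figure below is derived from the test numbers in this section (86 active streams across three fully loaded nodes, so roughly 29 per node) and is an approximation, not a published Ipsotek limit.

```python
def survives_node_failure(per_node_max, nodes, active_cameras):
    """After losing one node, the remaining nodes must be able to host
    every active camera stream for processing to continue."""
    remaining_capacity = per_node_max * (nodes - 1)
    return active_cameras <= remaining_capacity

# In this test: 4 processing nodes, ~29 streams max per node
# (86 active across 3 fully loaded nodes), 86 active cameras.
print(survives_node_failure(per_node_max=29, nodes=4, active_cameras=86))  # True
```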
The purpose of this test was to validate whether a combination of three applications from three different vendors can share a common platform using VxRail and VMware without introducing processing delays. The individual application results above serve as our baseline. Our performance data for this test was collected while the following workloads were processed in parallel.
The following chart shows the relative utilization of each host in the cluster while the system was processing the full load of 675 camera streams, 150 of which were running real-time analytics.
When the system was under full load, a snapshot was taken of the utilization of the VMs running across the VxRail cluster.