A camera at a POS captures a video stream that is analyzed to detect the presence of products in the scanning area. The solution uses the Real Time Streaming Protocol (RTSP) to stream the video. The RTSP stream is received and decoded using one or more NVIDIA GPUs, as shown in the following figure:
This Ready Solution for Retail Loss Prevention supports cameras with up to 4K resolution. Both H.264 and H.265 are supported for encoding and decoding. H.264 is a digital video compression standard that needs roughly half the space of MPEG-2, the standard used for DVDs, for video of equal quality. The NVIDIA DeepStream SDK is used to decode the video on the T4 GPU. The decoded video stream is then sent to the product recognition model.
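The receive-and-decode stage can be pictured as a GStreamer pipeline, which is the framework that DeepStream builds on. The following is a minimal sketch, not the solution's actual pipeline: the function name `build_decode_pipeline`, the example URL, and the choice of `fakesink` as a placeholder for the downstream recognition stage are assumptions made for illustration. It only assembles the pipeline description; running it requires a system with GStreamer and the DeepStream plugins installed.

```python
def build_decode_pipeline(rtsp_url: str, codec: str = "h264") -> str:
    """Return a GStreamer pipeline description that decodes an RTSP stream.

    Hypothetical helper: the element chain is an assumed, simplified layout.
    """
    codec = codec.lower()
    if codec not in ("h264", "h265"):
        raise ValueError(f"unsupported codec: {codec!r}")

    elements = [
        f'rtspsrc location="{rtsp_url}"',  # receive the RTSP stream
        f"rtp{codec}depay",                # extract the video from RTP packets
        f"{codec}parse",                   # parse the H.264/H.265 bitstream
        "nvv4l2decoder",                   # DeepStream hardware decoder (GPU)
        "fakesink",                        # placeholder for the recognition stage
    ]
    return " ! ".join(elements)


# Example with a made-up camera URL.
print(build_decode_pipeline("rtsp://camera.example/stream", "h265"))
```

The resulting string is in the form accepted by `gst-launch-1.0`, so it can be tried from the command line before it is embedded in an application.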
In this solution, the product recognition model is a key differentiator from traditional asset protection approaches. A deep learning-based AI model from our software partner Malong Technologies is used for product recognition. The decoded video stream from the T4 GPU is fed into the input layer of the deep learning model. The AI model outputs a list of the most likely product matches in that video frame.
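To make "a list of the most likely product matches" concrete, the sketch below ranks per-product scores and keeps the best few. It is illustrative only: the partner's model and its output format are not described here, so the softmax normalization, the function name `top_k_matches`, the UPC values, and the default of three candidates are all assumptions.

```python
from __future__ import annotations

import math


def top_k_matches(scores: dict[str, float], k: int = 3) -> list[tuple[str, float]]:
    """Turn raw per-product scores into probabilities and return the k best.

    Hypothetical post-processing step; the real model output may differ.
    """
    if not scores:
        return []

    # Subtract the largest score before exponentiating for numerical stability.
    peak = max(scores.values())
    exps = {upc: math.exp(score - peak) for upc, score in scores.items()}
    total = sum(exps.values())

    # Sort products by descending weight and keep the first k.
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)
    return [(upc, weight / total) for upc, weight in ranked[:k]]


# Made-up scores for one video frame, keyed by UPC.
frame_scores = {"012345678905": 4.1, "036000291452": 2.3, "078000113464": 0.2}
for upc, probability in top_k_matches(frame_scores, k=2):
    print(upc, round(probability, 3))
```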
At the same time, the shopper scans the UPC barcode of the product, and the scan is fed into the decision module. The decision module compares the output of the deep learning model with the product that the retail database returns for the scanned UPC barcode:
The retail store then takes the appropriate action.
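The comparison performed by the decision module can be sketched as follows. This is an assumption-laden illustration rather than the solution's logic: the function name `check_scan`, the three outcome labels, and the 0.5 confidence threshold are hypothetical, and the action the store takes for each outcome is left to the retailer.

```python
from __future__ import annotations


def check_scan(
    scanned_upc: str,
    matches: list[tuple[str, float]],
    min_confidence: float = 0.5,
) -> str:
    """Compare the scanned UPC with the model's candidate products.

    Returns "match", "mismatch", or "uncertain" (hypothetical labels).
    `matches` is a list of (upc, probability) pairs, best candidate first.
    """
    # Without a confident top candidate, do not accuse or confirm anything.
    if not matches or matches[0][1] < min_confidence:
        return "uncertain"

    predicted_upcs = {upc for upc, _ in matches}
    return "match" if scanned_upc in predicted_upcs else "mismatch"


# Made-up candidates for one frame.
candidates = [("012345678905", 0.86), ("036000291452", 0.11)]
print(check_scan("012345678905", candidates))  # scanned item is among the candidates
print(check_scan("078000113464", candidates))  # scanned item is not among them
```

Treating low-confidence frames as a separate outcome keeps the module from flagging a shopper when the model itself is unsure what it saw.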
Importantly, the AI algorithm considers only the visual image of the item that is being scanned. It does not take the image of the shopper into consideration. This protects customer privacy and avoids bias based on the shopper's appearance, both of which are fundamental principles of responsible AI.