Test methodology
Speech recognition in Riva is a GPU-accelerated compute pipeline with optimized performance and accuracy. Riva supports offline/batch and streaming recognition modes.
Automatic Speech Recognition (ASR) takes an audio stream or audio buffer as input and returns one or more text transcripts, along with additional optional metadata.
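The difference between the two recognition modes can be illustrated with a minimal sketch. It does not use the Riva client API: every name below (`Transcript`, `fake_decode`, `chunk_audio`, `offline_recognize`, `streaming_recognize`) is a hypothetical stand-in, and the "decoder" only reports how many bytes it has received. The sketch assumes that offline mode returns a single final transcript for the whole buffer, while streaming mode returns interim results as chunks arrive, followed by a final result.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Transcript:
    text: str        # recognized text (a placeholder string in this sketch)
    is_final: bool   # False for interim streaming results


def fake_decode(audio: bytes) -> str:
    """Hypothetical stand-in for a real acoustic model and decoder."""
    return f"<{len(audio)} bytes of audio>"


def chunk_audio(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split a buffer into fixed-size chunks, as a streaming client might."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def offline_recognize(audio: bytes) -> List[Transcript]:
    """Offline/batch mode: send the whole buffer, get one final result."""
    return [Transcript(text=fake_decode(audio), is_final=True)]


def streaming_recognize(chunks: Iterable[bytes]) -> Iterator[Transcript]:
    """Streaming mode: one interim result per chunk, then a final result."""
    received = b""
    for chunk in chunks:
        received += chunk
        yield Transcript(text=fake_decode(received), is_final=False)
    yield Transcript(text=fake_decode(received), is_final=True)


if __name__ == "__main__":
    audio = bytes(10)  # ten bytes of silence as toy input
    print(offline_recognize(audio))
    for result in streaming_recognize(chunk_audio(audio, chunk_size=4)):
        print(result)
```

In this toy version both modes end with the same final transcript. The streaming path differs only in that partial results become available before the full buffer has been received.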
The Riva TTS service implements text-to-speech (TTS) as a two-stage pipeline. Riva first generates a mel-spectrogram using the first model, and then generates speech using the second model. This pipeline forms a TTS system that enables you to synthesize natural-sounding speech from raw transcripts without any additional information such as patterns or rhythms of speech.
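The two-stage structure can be sketched as the composition of two functions. This is a toy illustration only, not Riva's models or API: `spectrogram_stage`, `vocoder_stage`, `synthesize`, `N_MELS`, and `HOP_LENGTH` are hypothetical names, and both stages use simple arithmetic in place of neural networks. The sketch shows only the data flow: text goes in, a frame-by-band spectrogram is produced, and that spectrogram is expanded into audio samples.

```python
import math
from typing import List

N_MELS = 4       # toy number of mel bands (real models use many more)
HOP_LENGTH = 8   # toy number of audio samples produced per spectrogram frame


def spectrogram_stage(text: str) -> List[List[float]]:
    """Stage 1 stand-in: map each character to one toy mel frame."""
    frames = []
    for ch in text:
        level = (ord(ch) % 32) / 32.0  # arbitrary value in [0, 1)
        frames.append([level * (band + 1) / N_MELS for band in range(N_MELS)])
    return frames


def vocoder_stage(mel: List[List[float]]) -> List[float]:
    """Stage 2 stand-in: expand each frame into HOP_LENGTH audio samples."""
    samples = []
    for frame in mel:
        amplitude = sum(frame) / len(frame)
        for n in range(HOP_LENGTH):
            samples.append(amplitude * math.sin(2 * math.pi * n / HOP_LENGTH))
    return samples


def synthesize(text: str) -> List[float]:
    """Full pipeline: text -> mel-spectrogram -> waveform samples."""
    return vocoder_stage(spectrogram_stage(text))


if __name__ == "__main__":
    mel = spectrogram_stage("hi")
    audio = synthesize("hi")
    print(len(mel), "frames ->", len(audio), "samples")
```

Keeping the stages separate means that either one could be replaced independently, provided the intermediate spectrogram format stays the same.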
For this paper, the PowerFlex engineering team chose the most common Riva ASR and Riva TTS use cases, along with basic performance tests, to demonstrate that the PowerFlex family is well suited for NVIDIA A100 GPUs in a Red Hat OpenShift environment.
For Riva ASR, they considered the following use cases:
For Riva TTS, they considered the following use cases: