Data Transform

Training neural networks on images requires that the images first be normalized. Images are also usually stored in compressed form to save space, so they must be decoded before use. Developers have therefore built multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentation operators. Here are some key design considerations for data ingestion and data management for an ADAS DL workflow:
- Data augmentation strategy: Data augmentation applies transformations to the training data. A transformation can be as simple as flipping an image or as complex as applying neural style transfer. Augmentation increases the size and diversity of the training dataset, which helps prevent the neural network from learning irrelevant patterns (a common problem with small datasets) and so improves how well the trained model generalizes. Here are two common methods for data augmentation:
 Offline augmentation creates new augmented data and writes it to storage. Using a variety of augmentation techniques, this can multiply the training sample size many times over.
 Online augmentation applies transformations to the data in real time, just before the images are fed to the neural network. During training, CPUs are used heavily in the background, in parallel with the training itself, to perform these augmentations.
- Use NVIDIA Data Loading Library (DALI): DALI is a portable, open-source library for decoding and augmenting images, videos, and speech to accelerate DL applications. As shown in Figure 8, DALI is a set of highly optimized building blocks plus an execution engine that accelerates input data pre-processing for DL applications. DALI provides performance and flexibility for accelerating different data pipelines. By overlapping training with pre-processing, DALI reduces latency and training time and mitigates input bottlenecks. It provides a drop-in replacement for the built-in data loaders and data iterators in popular DL frameworks, which makes it easy to integrate or to retarget to a different framework. Here are some key features:
        Easy-to-use Python API
        Transparent scaling across multiple GPUs
        Accelerated image classification (ResNet-50) and object detection (SSD) workloads, as well as speech recognition models such as Jasper and RNN-T
        Flexible graphs that let developers create custom pipelines
        Support for multiple data formats: LMDB, RecordIO, TFRecord, COCO, JPEG, WAV, FLAC, OGG, H.264, and HEVC
        Support for custom audio, image, and video processing operators added by developers
For more information, refer to the NVIDIA blog.
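The idea of overlapping pre-processing with training can be illustrated with a small producer/consumer sketch. This is not DALI's API or its execution engine; it only shows the general prefetching pattern under simple assumptions. The names `preprocess`, `train_step`, and `prefetching_loader` are hypothetical placeholders for the decode/augment stage, the training step, and the loader.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the sample stream


def preprocess(sample):
    """Hypothetical stand-in for decoding, cropping, and resizing."""
    return sample * 2


def train_step(batch):
    """Hypothetical stand-in for one training iteration."""
    return batch + 1


def prefetching_loader(samples, prefetch_depth=2):
    """Yield preprocessed samples while a background thread prepares the next ones."""
    buffer = queue.Queue(maxsize=prefetch_depth)  # bounded, so the producer cannot run far ahead

    def producer():
        for sample in samples:
            buffer.put(preprocess(sample))  # blocks while the buffer is full
        buffer.put(_SENTINEL)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()  # blocks until the next sample is ready
        if item is _SENTINEL:
            return
        yield item


# The consumer trains on sample N while the producer is already preparing N+1.
results = [train_step(batch) for batch in prefetching_loader(range(5))]
print(results)  # [1, 3, 5, 7, 9]
```

The bounded queue is the design choice that matters here: it lets pre-processing run ahead of training by a fixed number of samples without unbounded memory growth. In CPython, threads overlap only while one side waits on I/O or runs native code that releases the global interpreter lock, which is one reason a dedicated library is used for this stage in practice.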
Figure 8.   DALI inside architecture
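The difference between the offline and online augmentation methods described earlier can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular framework implements augmentation. Images are represented as nested lists of pixel values, and the names `hflip`, `normalize`, `offline_augment`, and `online_batches`, as well as the mean and standard deviation of 127.5, are assumptions made for the example.

```python
import random


def hflip(image):
    """Mirror each row of a 2D image left to right."""
    return [row[::-1] for row in image]


def normalize(image, mean, std):
    """Shift and scale every pixel: (p - mean) / std."""
    return [[(p - mean) / std for p in row] for row in image]


def offline_augment(dataset):
    """Offline: build the enlarged dataset once; the result is what would be written to storage."""
    return dataset + [hflip(image) for image in dataset]


def online_batches(dataset, batch_size, rng, mean=127.5, std=127.5):
    """Online: augment on the fly for each batch; nothing is stored."""
    order = list(range(len(dataset)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = []
        for index in order[start:start + batch_size]:
            image = dataset[index]
            if rng.random() < 0.5:  # flip roughly half of the images
                image = hflip(image)
            batch.append(normalize(image, mean, std))
        yield batch


dataset = [[[0, 255], [255, 0]], [[10, 20], [30, 40]]]

stored = offline_augment(dataset)
print(len(stored))  # 4: the originals plus one flipped copy of each

for batch in online_batches(dataset, batch_size=1, rng=random.Random(0)):
    print(len(batch))  # 1 image per batch, augmented at read time
```

The offline variant trades storage for compute: the augmented copies are produced once and reread every epoch. The online variant keeps storage constant and can produce a different random variant of each image every epoch, at the cost of CPU work during training.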
