Training and validating new deep neural networks, such as those used in ADAS / AD development, require large datasets along with significant IT infrastructure that includes compute, networking and storage. The right infrastructure is crucial for safety-critical system development. These advanced algorithms must operate even within complex circumstances like varying weather conditions, visibility and road surface quality.
Key challenges of the DL training workload for ADAS are:
- Explosive Data Growth: A typical vehicle used for data collection in the ADAS system test use case includes multiple sensors such as LiDAR, RADAR, ultrasonic, GPS and cameras, all of which continuously generate data. In addition, the vehicle's controller area network (CAN) bus data and the test driver's inputs capture control information. This high level of visibility and redundancy builds a detailed picture that enables the vehicle to make reliable decisions in adverse weather conditions or in the event of an individual component failure. Because of the safety requirements for driving, development engineers need to ensure that the system can detect objects far enough away to operate safely at high speeds. This combination of vehicle speed and safety-critical requirements demands higher image resolutions than those used in other industries, which in turn generates more data. The main challenge is the sheer scale of the unstructured sensor data (videos, point clouds, images, text) that must be captured and replayed to test ADAS subsystems.
To illustrate, a typical SAE Level 2 ADAS project, capturing 200,000 km of driving at an average speed of 65 km/h, would generate over 3,076 hours of test data, requiring approximately 6.2 petabytes (PB) of storage. Note that even within SAE L2 solutions, the total number of ADAS sensors required varies with functionality (lane departure warning, self-parking, and more), and multiple sensors are typically required. A SAE Level 3 ADAS project, which typically requires 1,000,000 km of driving, could generate 30 PB of raw sensor test data per test vehicle. Because most ADAS developers operate multiple test vehicles, total storage for the development of a single production vehicle typically falls between 50 and 200 PB of data.
- Fast training cycle: To assure safety and reliability, the neural networks must use millions of parameters, which places compute-intensive demands on the underlying systems and hardware architecture. To accelerate time-to-market, neural network training must be as fast and efficient as possible. First, the deeper the network, the higher the number of parameters and operations, and the more intermediate results must be stored in GPU memory. Second, training usually proceeds in mini-batches, so I/O throughput is the primary performance metric of concern in DL training.
- Test and validation: Validation is a key stage of the ADAS development cycle. Since most ADAS systems are intended to improve safety, the robustness and reliability of the trained model are paramount. This demands exhaustive testing and verification of the trained algorithm across diverse traffic scenarios and dimensions, which might include road geometry, driver and pedestrian behaviors, traffic conditions, weather conditions, vehicle characteristics and variants, spontaneous component faults, security, and more.
- High-quality labeled data: The availability of labeled data is critical for ADAS DL training, and high-quality labels yield better model performance. Labels are added either manually (often via crowdsourcing) or automatically by image analysis, depending on the complexity of the problem. Labeling massive collections of training data is a tedious task and requires significant effort.
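The storage figures quoted under Explosive Data Growth can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, decimal units (1 PB = 1,000 TB) are assumed, and the Level 3 estimate assumes the same average speed and per-hour data rate as the Level 2 example, which the text does not state explicitly.

```python
# Back-of-the-envelope check of the storage figures quoted in the text.
# Assumptions (not from the source): decimal units (1 PB = 1000 TB) and
# the same speed and data rate for the Level 3 project as for Level 2.

def driving_hours(distance_km: float, avg_speed_kmh: float) -> float:
    """Hours of recording needed to cover a distance at an average speed."""
    return distance_km / avg_speed_kmh

def implied_rate_tb_per_hour(storage_pb: float, hours: float) -> float:
    """Average data rate implied by a storage total and a recording time."""
    return storage_pb * 1000 / hours

def storage_pb(distance_km: float, avg_speed_kmh: float, rate_tb_per_hour: float) -> float:
    """Storage in PB for a distance driven at a given speed and data rate."""
    return driving_hours(distance_km, avg_speed_kmh) * rate_tb_per_hour / 1000

# SAE Level 2 example from the text: 200,000 km at 65 km/h, about 6.2 PB.
l2_hours = driving_hours(200_000, 65)
l2_rate = implied_rate_tb_per_hour(6.2, l2_hours)

# SAE Level 3 example: 1,000,000 km, assuming the same speed and data rate.
l3_pb = storage_pb(1_000_000, 65, l2_rate)

print(f"Level 2 recording time: {l2_hours:,.0f} hours")
print(f"Implied data rate: {l2_rate:.2f} TB/hour")
print(f"Level 3 storage estimate: {l3_pb:.1f} PB")
```

Under these assumptions the Level 2 figures imply roughly 2 TB of sensor data per hour of driving, and scaling that rate to 1,000,000 km gives about 31 PB, in line with the 30 PB quoted per test vehicle.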