In Figure 2, the forward path shows data collected from test vehicles, which serves as the starting point of an ADAS/AD control loop that implements continuous model improvement and model verification and validation (V/V). Starting with the vehicle sensor buildup, data is recorded, filtered, and collected inside the vehicle, then moved from the vehicle to a copy station. Data storage requirements for a copy station vary with fleet size, the amount of in-vehicle persistent storage, and the size of the individual solid-state drives.
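The in-vehicle record-and-filter step can be sketched as a simple trigger-priority filter. This is an illustrative sketch only; the trigger names and priority values below are assumptions, not part of the DARA specification.

```python
# Sketch of in-vehicle event filtering: persist only frames whose
# trigger priority clears a threshold, to conserve onboard SSD space.
# Trigger names and priority values are illustrative assumptions.

TRIGGER_PRIORITY = {
    "disengagement": 3,   # safety driver took over
    "hard_brake": 2,
    "novel_object": 2,
    "routine": 0,
}

def keep_frame(triggers, threshold=2):
    """Persist a frame if any of its triggers meets the threshold."""
    return any(TRIGGER_PRIORITY.get(t, 0) >= threshold for t in triggers)

print(keep_frame(["routine"]))                # False: nothing notable
print(keep_frame(["routine", "hard_brake"]))  # True: hard-brake event
```

A real recorder would apply such rules continuously per sensor stream; the point is that filtering happens before data ever leaves the vehicle.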
In the copy station, data is curated and filtered again based on priority. Copy stations (the red box in Figure 2) usually reside close to the data center and are sized with enough storage to accommodate an entire fleet of vehicles. The copy station is typically attached through a high-speed Ethernet connection to a data lake inside the data center. Data enrichment (the first blue box) sits inside the data center and marks the beginning of DARA functionality.
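Because copy-station capacity depends on fleet size and per-vehicle ingest, a back-of-the-envelope sizing calculation is a useful starting point. The sensor rates, drive hours, and fleet size below are illustrative assumptions, not validated figures.

```python
# Hypothetical copy-station sizing estimate. All figures (sensor
# rates, drive hours, fleet size) are illustrative assumptions.

SENSOR_RATES_MBPS = {       # sustained recording rate per sensor group, MB/s
    "cameras_x8": 8 * 350,  # eight cameras at ~350 MB/s each
    "lidar": 70,
    "radars_x5": 5 * 1,
    "gnss_imu": 0.1,
}

def daily_storage_tb(fleet_size, drive_hours):
    """Rough per-day raw ingest for the whole fleet, in terabytes."""
    total_mbps = sum(SENSOR_RATES_MBPS.values())
    per_vehicle_tb = total_mbps * drive_hours * 3600 / 1_000_000  # MB -> TB
    return per_vehicle_tb * fleet_size

print(round(daily_storage_tb(fleet_size=20, drive_hours=8), 1))
```

Even this toy estimate shows why copy stations are sized for whole fleets: a single vehicle recording raw sensor data for a full shift can generate tens of terabytes per day.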
Data enrichment for autonomous driving is a critical preprocessing step that enhances the quality and usefulness of the collected data. It supplements raw sensor data with additional objects, weather conditions, and lighting conditions, a process called "scene decoration," to improve the performance of the autonomous driving system. The supplementary data is produced through Synthetic Data Generation (SDG). Data enrichment plays a crucial role in building robust and reliable autonomous driving systems by providing a more comprehensive and diverse dataset for training and testing. It enhances the perception and decision-making capabilities of the autonomous vehicle, making it better equipped to handle real-world driving scenarios.
AD companies typically curate data by equipping test vehicles with many sensors (video, RADAR, LiDAR, SONAR, GPS, IMU), driving around, and recording sensor data to an onboard storage repository. After a drive, the data is uploaded to a data lake, where it is cleaned and prepared for AI model development. This process is not only expensive, it also makes it difficult or impossible to capture unique weather conditions, varied lighting environments, or underrepresented objects, or to re-create dangerous scenarios. SDG has become increasingly popular because it allows a data engineer to take an existing scene and address all of these gaps at a lower cost. Typically, each scene is generated manually based on the particular training gap.
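The core idea behind scene decoration, taking one recorded scene and sweeping it across the conditions a fleet rarely encounters, can be sketched as below. The `Scene` record and its field names are hypothetical, not the schema of any real SDG tool.

```python
from dataclasses import dataclass, replace
from itertools import product

# Hypothetical scene record; field names are illustrative, not a
# real SDG tool's schema.
@dataclass(frozen=True)
class Scene:
    scene_id: str
    weather: str
    lighting: str

def decorate(base, weathers, lightings):
    """Generate synthetic variants of one recorded scene by sweeping
    weather and lighting conditions ("scene decoration")."""
    return [
        replace(base, weather=w, lighting=l)
        for w, l in product(weathers, lightings)
        if (w, l) != (base.weather, base.lighting)  # skip the original
    ]

base = Scene("drive_0042_frame_0190", weather="clear", lighting="day")
variants = decorate(base,
                    ["clear", "rain", "fog", "snow"],
                    ["day", "dusk", "night"])
print(len(variants))  # 4 x 3 combinations minus the original = 11
```

One real drive in clear daylight thus yields eleven additional labeled condition variants, which is exactly the coverage gap (rare weather, rare lighting) that physical test drives struggle to fill.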
Coupling generative AI with SDG eliminates this manual step: photorealistic scenes can be created automatically from a single prompt. A few forward-thinking companies also use generative AI in the form of a Vision Language Model (VLM), the multimodal equivalent of an LLM, to make autonomous driving decisions, allowing the vehicle to "learn as it drives." This approach is known as end-to-end AI, or AV2.0.
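The end-to-end (AV2.0) idea, mapping sensor frames directly to a driving action through a VLM, can be sketched as a control loop. Everything here is a stand-in: `query_vlm` is a placeholder for a multimodal model call, and the action vocabulary is an assumption.

```python
# Minimal sketch of an end-to-end (AV2.0) control loop in which a
# Vision Language Model maps a camera frame plus an instruction
# directly to a driving action. `query_vlm` is a placeholder for a
# multimodal model endpoint; it is not a real API.

ACTIONS = {"continue", "slow_down", "stop",
           "change_lane_left", "change_lane_right"}

def query_vlm(frame, instruction):
    # Placeholder: a real system would call a multimodal model here.
    return "slow_down" if frame.get("pedestrian_near") else "continue"

def control_step(frame):
    action = query_vlm(frame, "Choose the safest next maneuver.")
    if action not in ACTIONS:   # guard against malformed model output
        action = "slow_down"    # fail safe
    return action

print(control_step({"pedestrian_near": True}))  # slow_down
print(control_step({}))                         # continue
```

The guard clause illustrates a design point any such system needs: a free-form model output must be validated against a closed action set before it reaches the vehicle.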
Other key data enrichment tasks include data cleaning, data balancing, ground-truth annotation, semantic segmentation, instance segmentation, and sensor fusion for unfused data.
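Of these tasks, data balancing is the most self-contained to illustrate: frames containing underrepresented object classes are oversampled so the training set is less skewed. This is a generic sketch with made-up labels, not a DARA component.

```python
import random
from collections import Counter

# Illustrative data-balancing step: oversample frames containing
# underrepresented classes. Labels and counts are made up.
random.seed(0)

frames = ([{"id": i, "label": "car"} for i in range(95)]
          + [{"id": 100 + i, "label": "cyclist"} for i in range(5)])

def oversample(frames, target_per_class):
    """Duplicate randomly chosen frames of minority classes until
    every class reaches target_per_class examples."""
    counts = Counter(f["label"] for f in frames)
    balanced = list(frames)
    for label, n in counts.items():
        pool = [f for f in frames if f["label"] == label]
        balanced += random.choices(pool, k=max(0, target_per_class - n))
    return balanced

balanced = oversample(frames, target_per_class=95)
print(Counter(f["label"] for f in balanced))  # 95 of each class
```

Naive duplication is only one strategy; in practice balancing is often combined with SDG, generating new cyclist scenes rather than repeating the same five frames.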
After the raw scenes pass through the data enrichment process, the labeled and segmented images are ready for AI training. The following sections describe the AI software stack required to create and develop AI models.
The green boxes in Figure 2 make up the return path of the control loop and focus on hardware-in-the-loop (HiL) testing and sensor integration, complementing the software-in-the-loop (SiL) forward path shown in the figure. The return path employs physical ECUs and sensors to improve the precision of AI models and their interaction with sensors. This step serves as the final preparation before the trained AI models are deployed back into the test vehicle for further test drives, completing the control loop.