Here are some key design considerations for data ingestion and data management in an ADAS DL workflow:
DMS is integrated into the managed service offering to automate data ingestion from test vehicles. Metadata types and tags are configurable and can be customized for each customer. The service provider can manage the logistics of receiving storage media from the vehicles and inserting them into ingestion stations. Inserting the media automatically initiates processes that dynamically decide which cluster or location should receive the raw data, so that ingestion load stays balanced across the storage environment. DMS then starts the data transfer automatically and enriches the data with the appropriate metadata tags; all data and tags are logged in a global index that tracks the data sets across the namespace and clusters. When transfer and indexing complete, DMS can launch post-processing jobs on the DL and HPC server grid, as preconfigured through the scheduling platform. Available functions can include splitting and merging data files as required, decoding the data into human-readable form, automatically updating the index upon completion, and other functions defined by the customer. These processes are wrapped in real-time monitoring, status dashboards, and logs that alert operators to any errors or anomalies.
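The placement and indexing steps described above can be illustrated with a minimal sketch. This is not the DMS implementation or its API: the `Cluster` and `GlobalIndex` classes, the `pick_cluster` and `ingest` functions, and the "lowest utilization wins" rule are all assumptions chosen only to show one plausible way an ingestion station could balance uploads across clusters and record tags in a shared index.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical storage cluster; capacities are in terabytes."""
    name: str
    capacity_tb: float
    used_tb: float = 0.0

    @property
    def utilization(self) -> float:
        return self.used_tb / self.capacity_tb


@dataclass
class GlobalIndex:
    """Hypothetical global index mapping data set IDs to location and tags."""
    entries: dict = field(default_factory=dict)

    def record(self, dataset_id: str, cluster: str, tags: dict) -> None:
        self.entries[dataset_id] = {"cluster": cluster, "tags": dict(tags)}


def pick_cluster(clusters, size_tb):
    """Choose the least-utilized cluster that can still hold the data set."""
    candidates = [c for c in clusters if c.capacity_tb - c.used_tb >= size_tb]
    if not candidates:
        raise RuntimeError("no cluster has enough free capacity")
    return min(candidates, key=lambda c: c.utilization)


def ingest(dataset_id, size_tb, tags, clusters, index):
    """Place one data set, reserve its space, and log it with its tags."""
    target = pick_cluster(clusters, size_tb)
    target.used_tb += size_tb  # stands in for the actual data transfer
    index.record(dataset_id, target.name, tags)
    return target.name


# Example run with made-up cluster names, sizes, and tags.
clusters = [Cluster("site-a", 100, 50), Cluster("site-b", 100, 10)]
index = GlobalIndex()
placements = [
    ingest(f"drive-{i}", size, {"vehicle": "test-01"}, clusters, index)
    for i, size in enumerate([20, 30, 10])
]
```

Balancing on utilization is only one possible policy; a real service would likely also weigh network locality, concurrent transfers, and site-specific rules.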
DataIQ also equips organizations to search rapidly across billions of files and locate data, which can then be moved or copied to on-premises or public cloud storage as needed, so that the right projects have access to pertinent data. Moving data in this way can free up space on higher-performance tiers and lets organizations archive data for as long as required. The cost savings can be significant, and data can be retrieved easily from a well-curated, searchable archive for future re-simulation and DL development.
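As a rough illustration of the search-then-move idea, the sketch below filters an in-memory catalog by tag and picks idle files on a performance tier as archive candidates. It does not use or represent the DataIQ API: the `FileRecord` type, the `search` and `archive_candidates` functions, the tier names, the paths, and the idle-time threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical catalog entry for one file."""
    path: str
    tier: str            # e.g. "performance" or "archive" (assumed names)
    size_gb: float
    last_accessed: date
    tags: frozenset


def search(records, required_tags):
    """Return the records that carry every requested tag."""
    wanted = frozenset(required_tags)
    return [r for r in records if wanted <= r.tags]


def archive_candidates(records, today, min_idle_days, tier="performance"):
    """Return records on the given tier not accessed for min_idle_days or more."""
    return [
        r for r in records
        if r.tier == tier and (today - r.last_accessed).days >= min_idle_days
    ]


# Example catalog with made-up paths, sizes, dates, and tags.
records = [
    FileRecord("/data/drive_001.bin", "performance", 500.0,
               date(2023, 1, 10), frozenset({"highway", "rain"})),
    FileRecord("/data/drive_002.bin", "performance", 300.0,
               date(2023, 12, 20), frozenset({"urban"})),
    FileRecord("/data/drive_003.bin", "archive", 800.0,
               date(2022, 6, 1), frozenset({"highway"})),
]

highway_files = search(records, {"highway"})
to_archive = archive_candidates(records, today=date(2024, 1, 1), min_idle_days=180)
reclaimed_gb = sum(r.size_gb for r in to_archive)
```

Keeping the tags in the catalog after a file moves is what makes later retrieval for re-simulation practical, since the search still works regardless of which tier holds the data.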