A modern data stack is an open data management architecture that combines the best aspects of a data lake and a data warehouse in a single platform.
A traditional data lake tends to be flexible and cost-effective because it stores data in its raw or natural form, which is typically unstructured or semistructured. A data warehouse is a more advanced repository of data for reporting and analysis that tends to store more structured data. That data has typically been cleansed or operationalized for better data quality, often through extract-transform-load (ETL) or extract-load-transform (ELT) operations.
Data analytics usage is increasingly widespread and changing in nature. Those trends, coupled with the need for many different users to access large amounts of disparate data, call for a modern, integrated approach to data access. By combining the best of data lakes and data warehouses, a modern data stack supports business intelligence and machine learning technologies in one platform. The platform can store all types of data and provide it to data scientists and other users through a cloud-like, multiresource, self-service interface.
The Dell Validated Design for Analytics — Modern Data Stack has been developed to address the needs of organizations deploying advanced analytics. It incorporates the concepts of a modern data stack architecture together with a container platform using decoupled compute and storage.
This document provides design guidance for data analytics infrastructure managers and architects by describing a predesigned, validated, and scalable architecture for advanced analytics on Dell hardware infrastructure.