Data science involves large amounts of data, and the same data is frequently stored in multiple formats or with different transformations applied. This duplication can cause storage requirements to grow as you build and deploy more models across the enterprise. Consistent, high-performance access to data is critical for developing models, maximizing GPU resource utilization, and deriving new insights.
The following Dell EMC storage options are typically used to build an environment for the Domino Data Science Platform:
- Dell EMC Isilon Scale-Out NAS: Isilon is a scale-out network-attached storage platform that Dell EMC offers for high-volume storage, backup, and archiving of unstructured data. The Isilon H series can provide up to 960 TB of capacity per chassis. Features built into the OneFS operating system, such as SmartPools and SmartConnect, help ensure that users have fast and consistent access to their data.
- Dell EMC NFS Storage Solutions (NSS): NSS comes in three configurations that offer 20 TB, 40 TB, or 80 TB of usable space. NSS consists of an NFS gateway that is powered by the PowerEdge R740 server and one or more PowerVault ME4 Series storage enclosures. With the NSS 7.3 release, the maximum capacity of this solution increases to 768 TB of usable, protected space. For more information about the NSS 7.3 solution, including how to configure it for high availability and the testing that Dell engineering performed to validate this new maximum capacity, see NFS Storage Solution with the latest Dell EMC storage -- Performance Results.
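The storage growth described above can be estimated with simple arithmetic. The following Python sketch is illustrative only and is not part of the validated design: the function names, the example dataset size, the copy ratios, and the `usable_fraction` parameter are all hypothetical assumptions. Only the 960 TB per chassis and 768 TB figures come from the text above. The share of chassis capacity that remains usable after data protection depends on the OneFS configuration, so treat the default value as a placeholder.

```python
import math
from typing import Sequence

# Figures quoted in the list above.
ISILON_H_MAX_TB_PER_CHASSIS = 960  # upper bound per Isilon H series chassis
NSS_MAX_USABLE_TB = 768            # NSS 7.3 maximum usable, protected space


def total_footprint_tb(raw_dataset_tb: float,
                       derived_copy_ratios: Sequence[float]) -> float:
    """Return the raw dataset size plus every derived copy.

    Each ratio is the size of one transformed or reformatted copy,
    expressed as a fraction of the raw dataset size.
    """
    return raw_dataset_tb * (1 + sum(derived_copy_ratios))


def isilon_chassis_needed(footprint_tb: float,
                          usable_fraction: float = 0.8) -> int:
    """Estimate how many fully populated H series chassis hold the footprint.

    usable_fraction is a hypothetical placeholder for the share of chassis
    capacity left after data protection overhead; it is not a Dell EMC figure.
    """
    usable_per_chassis = ISILON_H_MAX_TB_PER_CHASSIS * usable_fraction
    return math.ceil(footprint_tb / usable_per_chassis)


def fits_single_nss(footprint_tb: float) -> bool:
    """Check whether the footprint fits within one maximum-size NSS 7.3 system."""
    return footprint_tb <= NSS_MAX_USABLE_TB


if __name__ == "__main__":
    # Hypothetical example: 400 TB of raw data kept in three additional forms
    # (a full-size reformatted copy, a half-size feature set, a quarter-size sample).
    footprint = total_footprint_tb(400, [1.0, 0.5, 0.25])
    print(f"Total footprint: {footprint} TB")
    print(f"Fits one NSS 7.3 system: {fits_single_nss(footprint)}")
    print(f"Isilon H chassis needed: {isilon_chassis_needed(footprint)}")
```

In this hypothetical example, the derived copies nearly triple the footprint of the raw data, which is the kind of growth that makes a scale-out platform attractive as the number of models increases.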