GPU-accelerated modern data stack worker nodes support the platform runtime services and customer workloads that benefit from GPU acceleration. Symcloud Platform Compute and Storage roles define the runtime services, while user deployments define the workloads. Dell Technologies recommends the configuration in GPU-accelerated modern data stack worker node configuration as a starting point for GPU-accelerated workloads.
Table 8. GPU-accelerated modern data stack worker node configuration

| Component | Details |
| --- | --- |
| Platform | PowerEdge R760 server |
| Chassis | 2.5-inch chassis with up to 16 SAS or SATA drives, Smart Flow, front PERC 12, two CPUs |
| Chassis configuration | Riser configuration 5, full-length, with two full-height 16-channel slots (Gen4), two low-profile 16-channel slots (Gen4), one full-height 16-channel slot (Gen5), and one full-height, double-wide, 16-channel, GPU-capable slot (Gen5) |
| Power supply | Dual, hot-plug, 2400 W redundant, configuration D, mixed-mode power supplies |
| Processor | Intel Xeon Gold 6438Y, 2.0 GHz, 32 cores/64 threads, 16 GT/s, 60 MB cache, Turbo, HT (205 W), DDR5-4800 |
| Memory capacity | 512 GB (sixteen 32 GB RDIMMs, 3200 MT/s, dual rank) |
| Internal RAID storage controllers | Dell PERC H965i with rear load bracket |
| Disk (SSD) | Six 3.84 TB hot-plug, SAS, mixed-use, up to 24 Gbps, 512e, 2.5-inch, Federal Information Processing Standard (FIPS) Self-Encrypting Drives (SEDs) |
| Boot-optimized storage cards | BOSS-N1 controller card with two M.2 960 GB SSDs (RAID 1) |
| Network interface controllers | NVIDIA ConnectX-6 Lx dual-port 10/25 GbE SFP28 adapter, PCIe low profile |
| GPU, FPGA, or acceleration cards | NVIDIA Ampere A30, PCIe, 165 W, 24 GB, passive, double-wide, full-height GPU with cable |
Dell Technologies recommends the disk volume and partition layouts that are listed in GPU-accelerated modern data stack worker node volumes and GPU-accelerated modern data stack worker node partitions for this set of machines.
Table 9. GPU-accelerated modern data stack worker node volumes

| Volume | RAID level | Drives | Disk numbers |
| --- | --- | --- | --- |
| Operating system | RAID 1 | Two 960 GB SSDs | 0 |
| Symcloud Storage | No RAID | Six 3.84 TB SAS SSDs | 1-6 |
Table 10. GPU-accelerated modern data stack worker node partitions

| Partition | Size | File system | Disk | Type | Description |
| --- | --- | --- | --- | --- | --- |
| /boot | 1024 MB | XFS | 0 | Primary | This partition contains BIOS start-up files that must be within the first 2 GB of the disk. |
| /boot/efi | 650 MB | VFAT | 0 | Extended | This partition contains EFI start-up files. |
| / | Approximately 100 GB | XFS | 0 | LVM | This partition contains the root file system. |
| /home | 300 GB | XFS | 0 | LVM | This partition contains the user home directories. |
| /var | 400 GB | XFS | 0 | LVM | This partition contains variable data such as system logging files, databases, mail and printer spool directories, and transient and temporary files. |
| None | Six 3.84 TB | Symcloud Storage | 1-6 | Raw partitions | Symcloud Storage manages these partitions. |
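As a quick check of how the recommended layout fits on the boot device, the sketch below sums the partition sizes from the table and reports the headroom that remains on the 960 GB RAID 1 operating system volume for later LVM growth. The usable-capacity figure is an assumption based on one 960 GB member of the mirror; it is not a value defined by this guide.

```python
# Illustrative sketch only: confirm that the recommended OS partitions fit on
# the BOSS-N1 RAID 1 volume. Sizes come from the table above; the usable
# capacity is assumed to be one 960 GB member of the RAID 1 mirror.
partitions_gb = {
    "/boot": 1.0,        # 1024 MB
    "/boot/efi": 0.65,   # 650 MB
    "/": 100.0,          # approximate
    "/home": 300.0,
    "/var": 400.0,
}

usable_gb = 960.0  # assumption: RAID 1 exposes the capacity of a single 960 GB M.2 SSD
allocated_gb = sum(partitions_gb.values())
headroom_gb = usable_gb - allocated_gb

print(f"Allocated: {allocated_gb:.1f} GB of {usable_gb:.0f} GB")
print(f"Unallocated headroom for LVM growth: {headroom_gb:.1f} GB")
```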
GPU-accelerated worker nodes have capabilities that are similar to general-purpose worker nodes, while adding GPU acceleration. Dell Technologies recommends four worker nodes for a minimum deployment. You can use any mix of general-purpose and GPU-accelerated nodes.
The configuration includes two network ports to support high-availability (HA) networking. These ports can come from a single network card or from a pair of network cards for additional adapter-level HA.
Two M.2 960 GB SSDs in a RAID 1 configuration are used for the operating system volume. The home directories are allocated in a separate small partition since user files are not stored at the operating system level on production nodes. Most of the storage is allocated to the /var partition for runtime files. You can use LVM to adjust the storage allocation between /, /home, and /var for specific needs.
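When a workload profile needs a different split, the reallocation is a standard LVM resize followed by an XFS grow. The sketch below is illustrative only: the volume group and logical volume names (vg_root, lv_var) are hypothetical placeholders that depend on how the node was provisioned, and XFS file systems can be grown online but not shrunk, so plan any reductions before provisioning.

```python
# Illustrative sketch, not part of the guide: grow the /var logical volume into
# free space in the volume group and expand the XFS file system to match.
# The volume group and logical volume names below are assumptions; substitute
# the names used on your nodes.
import subprocess

VG = "vg_root"      # hypothetical volume group name
LV = "lv_var"       # hypothetical logical volume backing /var
GROW_BY = "100G"    # example amount to add to /var

def run(cmd):
    """Echo a command, run it, and stop on failure."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Extend the logical volume by the requested amount.
run(["lvextend", "-L", f"+{GROW_BY}", f"/dev/{VG}/{LV}"])

# Grow the mounted XFS file system to fill the enlarged volume.
run(["xfs_growfs", "/var"])
```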
Six SSDs are allocated for use by Symcloud Storage. The services to support this storage are deployed with the Symcloud Storage role. This storage is exposed to workloads running on the cluster through the Kubernetes CSI interface. The recommended configuration provides approximately 23 TB of storage per node. This capacity is enough for typical runtime storage in a modern data stack environment where the bulk of the data is stored on external storage. The external storage can be either HDFS provided by PowerScale, or object storage provided by ECS. If more local storage is needed, up to 10 more SSDs can be added, and drive sizes can be increased.
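Workloads consume this capacity by requesting persistent volumes through the CSI storage class that the Symcloud Storage services register with Kubernetes. The sketch below, using the Kubernetes Python client, shows the general shape of such a request; the storage class name (symcloud-storage) and namespace (analytics) are placeholders rather than names defined by this guide.

```python
# Illustrative sketch: request a persistent volume from the CSI-backed storage
# class provided by Symcloud Storage. The storage class name and namespace are
# placeholders; use the names configured on your cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="spark-scratch"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="symcloud-storage",  # placeholder storage class name
        resources=client.V1ResourceRequirements(
            requests={"storage": "500Gi"}
        ),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="analytics", body=pvc
)
```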
Memory has been sized to support all the required services in a production deployment with enough headroom for user workloads. The most common change is to increase the memory size to support more containers or workloads requiring more memory.
The processors have been chosen to support compute-intensive AI and ML workloads and include dual Intel Advanced Vector Extensions (AVX) units for maximum compute speed. Other processor choices are possible but should be made with memory requirements and overall power consumption in mind.
The GPU has been chosen to support Spark acceleration of SQL and dataframe operations using the NVIDIA RAPIDS Accelerator for Apache Spark. These workload operations are typical in a modern data stack environment. One or two GPUs can be used in this configuration. AI and ML workloads may benefit from alternative GPU models.
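As a sketch of what enabling this acceleration looks like from the Spark side, the PySpark session below loads the RAPIDS Accelerator plugin and requests one GPU per executor. The jar path, discovery script location, and resource amounts are deployment-specific assumptions; consult the RAPIDS Accelerator for Apache Spark documentation for the settings that match your Spark and CUDA versions.

```python
# Illustrative sketch: enable the NVIDIA RAPIDS Accelerator for Apache Spark on
# a GPU-accelerated worker. Paths, resource amounts, and the discovery script
# are placeholders that depend on how Spark is deployed on the cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("rapids-accelerated-etl")
    # Load the RAPIDS Accelerator SQL plugin.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    # One A30 GPU per executor; fractional task amounts let tasks share the GPU.
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "0.25")
    # Placeholder paths: the plugin jar and the GPU discovery script shipped with it.
    .config("spark.jars", "/opt/sparkRapidsPlugin/rapids-4-spark.jar")
    .config("spark.executor.resource.gpu.discoveryScript",
            "/opt/sparkRapidsPlugin/getGpusResources.sh")
    .getOrCreate()
)

# SQL and DataFrame operations such as joins and aggregations are candidates
# for GPU execution once the plugin is active.
df = spark.range(0, 100_000_000).withColumnRenamed("id", "key")
df.groupBy((df.key % 1000).alias("bucket")).count().show(5)
```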