Dell Technologies recommends the General purpose Worker node configuration as a starting point for most clusters. This configuration is optimized for reliability, provides high performance, and is consistent with recommendations from Cloudera.
Table 14. General purpose Worker node configuration

| Component | Details |
|---|---|
| Platform | PowerEdge R760 server |
| Chassis | 3.5 in. chassis with up to 12 SAS or SATA drives, two 2.5 in. rear NVMe Direct drives, full-height adapter, two CPUs, and PERC 11 |
| Chassis configuration | Riser configuration 7: four x8 full-height slots and two x16 low-profile slots (Gen4) |
| Power supply | Dual hot-plug, fully redundant (1+1), 1400 W, mixed-mode power supplies |
| Processor | Intel Xeon Gold 6426Y, 2.5 GHz, 16C/32T, 16 GT/s, 38 MB cache, Turbo, HT (185 W), DDR5-4800 |
| Memory | 512 GB (sixteen 32 GB RDIMMs, 4800 MT/s, dual rank) |
| OCP network card | Intel E810-XXV dual-port 10/25 GbE SFP28, OCP NIC 3.0 |
| Extra network card | None |
| Storage controller | Dell HBA355 full-height adapter |
| Disk - HDD | Twelve 8 TB 7.2K RPM SATA 6 Gbps 512e 3.5 in. hot-plug hard drives |
| Disk - SSD | None |
| Disk - NVMe | Two 3.2 TB enterprise NVMe mixed-use, U.2, Gen4, FlexBay AG drives |
| Boot configuration | BOSS-N1 controller card with two 960 GB M.2 SSD drives (RAID 1) |
Dell Technologies recommends the disk volume and partition layouts listed in the following tables for these machines:
Table 15. General purpose Worker node volumes

| Volume | RAID level | Drives | Disk numbers |
|---|---|---|---|
| Operating system | RAID 1 | Two 960 GB M.2 SSD | 0 |
| HDFS or Ozone data | No RAID | Twelve 8 TB SATA HDD | 1-12 |
| Temporary data | No RAID | Two 3.2 TB NVMe | 13-14 |
Table 16. General purpose Worker node partitions

| Partition | Size | File system | Disk numbers | Type | Description |
|---|---|---|---|---|---|
| boot | 1024 MB | ext4 | 0 | Primary | Contains the BIOS start-up files, which must be within the first 2 GB of the disk. |
| / | 100 GB | ext4 | 0 | LVM | Contains the root file system. |
| swap | 4 GB | swap | 0 | swap | Contains the operating system swap space. |
| /home | 1 GB | ext4 | 0 | LVM | Contains the user home directories. |
| /var | ~350 GB | ext4 | 0 | LVM | Contains variable data such as system log files, databases, mail and printer spool directories, and transient and temporary files. |
| /data/<n> | 8 TB | ext4 | 1-12 | Primary | Used for HDFS data as 12 individual file systems. |
| /data/ssd<n> | 3.2 TB | ext4 | 13-14 | Primary | Used for temporary files. |
The General purpose Worker node configuration is sized for a typical mix of storage and compute in a Cloudera CDP Private Cloud Base cluster.
Two network ports are included for connection to the Cluster Data network.
Two SSDs in a RAID 1 configuration using the Boot Optimized Storage Solution (BOSS) card are used for the operating system volume. The swap partition is small since swapping causes excessive latency for running jobs. The home directories are allocated in a separate, small partition since user files should not be stored on Worker nodes. Most of the storage is allocated to the /var partition for runtime files. You can use LVM to adjust the storage allocation between /, /home, and /var for specific needs.
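As an illustrative sketch only, /var could be grown after installation with standard LVM tools; the volume group and logical volume names below (rootvg, var) are assumptions, so check the real names with vgs and lvs first:

```bash
# Sketch only: the volume group and logical volume names (rootvg, var) are
# assumptions -- list the real names on the system first.
vgs
lvs rootvg
# Add 50 GB to the /var logical volume from free space in the volume group,
# then grow the mounted ext4 file system to match.
lvextend -L +50G /dev/rootvg/var
resize2fs /dev/rootvg/var
```

Growing an ext4 file system in this way can be done online; shrinking requires unmounting the file system and carries more risk.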
Twelve 3.5 in. hard drives are used for the primary data storage. These drives are mounted as individual partitions and provide approximately 96 TB of raw storage. Cloudera supports a maximum of 100 TB per node for HDFS storage. A larger HDFS storage capacity increases the time that is required for background scans and block reports. It also increases the recovery time should a node fail.
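For illustration, the data drives could be prepared along the following lines; the device names (/dev/sdb through /dev/sdm), the mount options, and the use of whole-disk file systems rather than explicit primary partitions are assumptions made for brevity, and production systems typically mount by UUID:

```bash
# Hypothetical device names -- verify with lsblk before formatting anything.
# One ext4 file system per HDD, mounted at /data/1 .. /data/12.
i=1
for dev in /dev/sd{b..m}; do
  mkfs.ext4 -m 0 "$dev"            # no reserved blocks on data disks
  mkdir -p "/data/$i"
  echo "$dev /data/$i ext4 defaults,noatime 0 0" >> /etc/fstab
  i=$((i + 1))
done
mount -a
```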
For nodes dedicated to Ozone storage, you can increase the drive size to 16 TB.
For nodes that support both HDFS and Ozone storage, individual drives can be dedicated to either Ozone or HDFS, or each drive can be partitioned to split its storage between the two.
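As a rough sketch of the partitioned approach, a single data drive could be split between HDFS and Ozone; the device name (/dev/sdb), the 50/50 split, and the mount point names are assumptions, not prescribed values:

```bash
# Hypothetical device name and an arbitrary 50/50 split between HDFS and Ozone.
parted -s /dev/sdb mklabel gpt
parted -s /dev/sdb mkpart hdfs ext4 0% 50%
parted -s /dev/sdb mkpart ozone ext4 50% 100%
mkfs.ext4 -m 0 /dev/sdb1
mkfs.ext4 -m 0 /dev/sdb2
mkdir -p /data/hdfs1 /data/ozone1
mount /dev/sdb1 /data/hdfs1
mount /dev/sdb2 /data/ozone1
```

The resulting file systems are then referenced from the HDFS DataNode and Ozone DataNode storage directory settings when those roles are configured.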
Two NVMe drives are used for storage of temporary files such as MapReduce temporary and spill files, Spark cache, or Hive Live Long and Process (LLAP) cache. You can also use this storage for HBase tiered cache or tiered HDFS storage. For nodes running Ozone, part of this storage can be allocated for Ozone cache. You can change the size of these drives or remove one of them, depending on the deployment requirements.
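A minimal sketch of preparing the two NVMe devices for temporary data follows; the device names (/dev/nvme0n1, /dev/nvme1n1) are assumptions and must be confirmed with lsblk:

```bash
# Hypothetical NVMe device names -- verify with lsblk before formatting.
mkfs.ext4 -m 0 /dev/nvme0n1
mkfs.ext4 -m 0 /dev/nvme1n1
mkdir -p /data/ssd1 /data/ssd2
cat >> /etc/fstab <<'EOF'
/dev/nvme0n1 /data/ssd1 ext4 defaults,noatime 0 0
/dev/nvme1n1 /data/ssd2 ext4 defaults,noatime 0 0
EOF
mount -a
```

Frameworks then reference these mounts through their local-directory settings, for example YARN's yarn.nodemanager.local-dirs or Spark's spark.local.dir.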
The recommended memory size of 512 GB is intended for Worker nodes that run workloads that benefit from additional memory, such as Spark, Impala, and HBase region servers. You can reduce the memory allocation for nodes that primarily provide storage services with minimal compute capability.