Modern data stack storage sizing is determined primarily by the amount of data expected. This aspect of sizing is independent of compute cluster sizing.
The available network bandwidth between the compute and storage clusters must also be considered. Bandwidth on the storage and compute clusters scales in direct proportion to the number of nodes. However, ECS, ObjectScale, and PowerScale all support dense storage configurations, so a large capacity can be reached with relatively few nodes. Such density can result in a large storage capacity without enough bandwidth to support the data transfer requirements. An analysis of workload data transfer requirements is necessary to correctly size the storage for both capacity and bandwidth.
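The capacity-versus-bandwidth trade-off described above can be illustrated with a minimal sizing sketch. It assumes that both capacity and bandwidth scale linearly with node count; the `StorageNodeSpec` type, the `size_storage` function, and every figure below are hypothetical and are not taken from any product specification.

```python
import math
from dataclasses import dataclass


@dataclass
class StorageNodeSpec:
    """Hypothetical per-node figures; substitute values for the actual hardware."""
    usable_tb: float       # usable capacity per storage node, in TB
    bandwidth_gbps: float  # usable network bandwidth per storage node, in Gb/s


def size_storage(expected_data_tb, required_throughput_gbps, node):
    """Return the node count that satisfies both capacity and bandwidth."""
    nodes_for_capacity = math.ceil(expected_data_tb / node.usable_tb)
    nodes_for_bandwidth = math.ceil(required_throughput_gbps / node.bandwidth_gbps)
    return {
        "nodes_for_capacity": nodes_for_capacity,
        "nodes_for_bandwidth": nodes_for_bandwidth,
        # The larger of the two requirements drives the final node count.
        "node_count": max(nodes_for_capacity, nodes_for_bandwidth),
        # On a tie, capacity is reported as the limiting factor.
        "limiting_factor": (
            "bandwidth" if nodes_for_bandwidth > nodes_for_capacity else "capacity"
        ),
    }


# Assumed example: a dense node type holding 500 TB with 50 Gb/s of bandwidth.
dense_node = StorageNodeSpec(usable_tb=500.0, bandwidth_gbps=50.0)
result = size_storage(
    expected_data_tb=2000.0,
    required_throughput_gbps=400.0,
    node=dense_node,
)
print(result)
```

With these assumed figures, four nodes would hold the data but eight are needed to meet the transfer requirement, which is the situation the paragraph warns about for dense configurations.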
The architecture is not limited to a single type of modern data stack storage. PowerScale can be used with the HDFS protocol, and ECS and ObjectScale can be used with the S3 protocol. Any workload can reference any or all of these storage types. It is also possible to use multiple external PowerScale, ECS, and ObjectScale storage systems.
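As a rough illustration of one workload referencing several storage types, the sketch below groups input locations by access protocol based on their URI scheme. It assumes that data is addressed with `hdfs://` URIs for HDFS and `s3://` or `s3a://` URIs for S3; the host name, bucket name, and `storage_protocol` helper are hypothetical.

```python
from urllib.parse import urlparse

# Assumed mapping from URI scheme to the access protocol named in the text.
PROTOCOL_BY_SCHEME = {
    "hdfs": "HDFS",  # for example, PowerScale
    "s3": "S3",      # for example, ECS or ObjectScale
    "s3a": "S3",     # Hadoop S3A connector scheme
}


def storage_protocol(uri):
    """Return the storage protocol implied by a data location URI."""
    scheme = urlparse(uri).scheme.lower()
    if scheme not in PROTOCOL_BY_SCHEME:
        raise ValueError(f"unsupported storage scheme: {scheme!r}")
    return PROTOCOL_BY_SCHEME[scheme]


# Hypothetical inputs for a single workload that reads from both storage types.
workload_inputs = [
    "hdfs://powerscale.example.com:8020/data/events",
    "s3a://analytics-bucket/curated/events",
]

for location in workload_inputs:
    print(location, "->", storage_protocol(location))
```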
The network architecture allows both compute and storage clusters to use the same fabric. This configuration enables the network bandwidth to scale as either storage or worker nodes are added. The bandwidth available to the external storage systems should also be considered when referencing external storage that is not connected to the core Cluster data network.
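One way to reason about the external storage case is to treat the end-to-end transfer rate as bounded by the slowest of three aggregates: compute, storage, and the link to any storage outside the core network. The sketch below assumes linear per-node scaling on the shared fabric; the `find_bottleneck` function and all figures are hypothetical.

```python
def find_bottleneck(compute_nodes, compute_node_gbps,
                    storage_nodes, storage_node_gbps,
                    external_link_gbps=None):
    """Return (name, Gb/s) for the component that limits data transfer."""
    limits = {
        "compute": compute_nodes * compute_node_gbps,
        "storage": storage_nodes * storage_node_gbps,
    }
    # Only storage reached over a separate link adds a third limit.
    if external_link_gbps is not None:
        limits["external link"] = external_link_gbps
    name = min(limits, key=limits.get)
    return name, limits[name]


# Assumed figures: 10 compute nodes at 25 Gb/s and 8 storage nodes at 50 Gb/s.
on_fabric = find_bottleneck(10, 25, 8, 50)
# The same clusters, with the storage reached over an assumed 100 Gb/s link.
off_fabric = find_bottleneck(10, 25, 8, 50, external_link_gbps=100)
print(on_fabric)
print(off_fabric)
```

In this assumed example, adding nodes raises the compute and storage limits, but the external link stays fixed and becomes the constraint once the storage sits outside the shared fabric.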