The multi-tier architecture uses either a 2-tier or a 3-tier design to provide better scalability than a standalone architecture.
2-Tier
The 2-tier fabric consists of a leaf switch layer and a spine switch layer, which together form a single fabric.
The leaf switches provide workload connectivity into the fabric and have most of the networking features enabled, such as spanning tree, link aggregation, and quality of service (QoS).
The spine switches provide basic switching functions and have few networking features enabled. The spine switch layer is analogous to a "backplane" in a chassis device.
The following figure shows the typical 2-tier leaf and spine fabric.
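As a rough illustration of the leaf and spine wiring described above, the following Python sketch enumerates the links in a 2-tier fabric. It assumes a full mesh in which every leaf has one uplink to every spine, which is the usual leaf and spine pattern but is not stated in this guide. The function name and the switch names (leaf1, spine1) are hypothetical and are not SFM identifiers.

```python
from itertools import product


def build_leaf_spine(num_leaves: int, num_spines: int) -> list[tuple[str, str]]:
    """Return the leaf-to-spine links of a 2-tier fabric.

    Assumption: full mesh, one uplink from every leaf to every spine.
    """
    leaves = [f"leaf{i + 1}" for i in range(num_leaves)]
    spines = [f"spine{j + 1}" for j in range(num_spines)]
    # Leaves connect only to spines, never to each other, so the spine
    # layer behaves like the backplane between the leaf switches.
    return list(product(leaves, spines))


links = build_leaf_spine(num_leaves=4, num_spines=2)
print(len(links))  # 8 links: 4 leaves x 2 spines
```

Under this assumption, adding a leaf adds one link per spine, and any two leaves are always exactly one spine hop apart.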
3-Tier
The 3-tier fabric consists of leaf, spine, and super spine switch layers. In this fabric, the super spine layer connects separate switching fabrics.
The network configuration of the super spine layer is usually a traditional BGP Layer 3 configuration, which provides next-hop routing functions for each fabric.
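To make the role of the super spine concrete, the sketch below lists the switch tiers that traffic would cross between two different leaf switches. It is a simplified model, assuming that traffic inside one fabric turns around at the spine layer and that traffic between fabrics must climb to the super spine. The function name and fabric labels are hypothetical.

```python
def path_tiers(src_fabric: str, dst_fabric: str) -> list[str]:
    """List the tiers crossed between two different leaf switches."""
    if src_fabric == dst_fabric:
        # Same fabric: the spine layer connects the two leaves directly.
        return ["leaf", "spine", "leaf"]
    # Different fabrics: the super spine routes between the two fabrics.
    return ["leaf", "spine", "super spine", "spine", "leaf"]


print(path_tiers("fabric-a", "fabric-a"))
print(path_tiers("fabric-a", "fabric-b"))
```

In this model, only inter-fabric traffic touches the super spine, which is why that layer can run a plain routed configuration while the leaf layer carries the feature-rich settings.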
SFM uses the 2-tier reference architecture for both the converged blueprint and any GPU cluster blueprint larger than 64 GPUs.
The following figure shows a 2-tier GPU fabric. In this case, SFM uses a rail-optimized topology.
SFM supports six rail-optimized GPU blueprints (routed or bridged), ranging from 512 to 8,192 GPUs.
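For readers new to the term, a rail-optimized topology generally means that the GPU in the same position on every server connects to the same leaf switch, called a rail. The sketch below shows that mapping under simplifying assumptions: 8 GPUs per server and a single leaf switch per rail. The names are hypothetical, and the actual SFM blueprints may use different server sizes or several leaves per rail.

```python
def rail_leaf(gpu_index: int, gpus_per_server: int = 8) -> str:
    """Map a GPU's position within its server to a rail leaf switch."""
    if not 0 <= gpu_index < gpus_per_server:
        raise ValueError("gpu_index out of range")
    # Assumption: one leaf per rail, numbered from 1.
    return f"rail-leaf{gpu_index + 1}"


def wire_servers(
    num_servers: int, gpus_per_server: int = 8
) -> dict[str, list[tuple[int, int]]]:
    """Group (server, gpu) pairs by the rail leaf they attach to."""
    rails: dict[str, list[tuple[int, int]]] = {}
    for server in range(num_servers):
        for gpu in range(gpus_per_server):
            leaf = rail_leaf(gpu, gpus_per_server)
            rails.setdefault(leaf, []).append((server, gpu))
    return rails


rails = wire_servers(num_servers=4)
print(rails["rail-leaf1"])  # GPU 0 of every server shares one leaf
```

With this layout, GPUs that hold the same position on different servers can reach each other through a single leaf, while traffic between different rails crosses the spine layer.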