The advent of flash storage media, specifically non-volatile solid-state drives (SSDs), has changed the landscape of the enterprise data center over the last decade.
Advances in Peripheral Component Interconnect Express (PCIe), the continued proliferation of Non-Volatile Memory Express (NVMe) storage, and the introduction of large-scale NVMe over Fabrics (NVMe-oF) infrastructures (particularly Ethernet-based NVMe-oF protocols such as NVMe/TCP) will fundamentally alter the data center landscape and how enterprise applications are developed, deployed, and maintained.
NVMe-oF enables the benefits of NVMe flash-based storage to be realized at a larger scale and no longer be limited to the confines of the internal PCIe fabric and PCIe backplane-based systems.
Ethernet-based NVMe-oF (particularly NVMe/TCP) is the premier choice for this scale-out architecture. NVMe/TCP uses Transmission Control Protocol (TCP), which is one of the most widely used protocols in modern digital network communications.
With the introduction of NVMe/TCP and SmartFabric Storage Software (SFSS), Dell Technologies continues its long history of helping solve customer challenges with innovative solutions that are optimized, cost-effective, and easy to implement. These innovative IP SAN-based solutions are revolutionary and represent what many customers consider to be the future of storage networking.
The following figure shows a basic dual air-gapped infrastructure with dedicated storage traffic from server to storage array. The connections from the server can be active/active or active/standby depending on how the server is configured.
The Dell switches running Enterprise SONiC do not have MC-LAG configured. Storage traffic redundancy is achieved by placing the corresponding storage traffic (SAN A and SAN B) on redundant links towards the storage array.
The connections on the switches from the server and storage devices are configured as trunks.
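As an illustration of the air-gapped design described above, the following Python sketch models the links in such a fabric and checks that SAN A and SAN B never share a switch. The device names (`sw-a`, `sw-b`, `server1`, `array1`), the `Link` class, and both helper functions are hypothetical and exist only for this example; they are not part of Enterprise SONiC or SFSS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One physical connection between two devices, carrying one SAN's traffic."""
    device_a: str
    device_b: str
    san: str  # "A" or "B" (hypothetical labels for the two fabrics)


def check_air_gap(links, switch_of_san):
    """Return a list of problems; an empty list means the fabrics stay separate.

    links: iterable of Link
    switch_of_san: dict mapping a SAN name to the switch dedicated to it
    """
    problems = []
    switches = set(switch_of_san.values())
    for link in links:
        ends = {link.device_a, link.device_b}
        # A link between the two switches would join the fabrics,
        # which is what the absence of MC-LAG is meant to avoid.
        if len(ends) == 2 and ends <= switches:
            problems.append(f"inter-switch link {link.device_a}-{link.device_b}")
            continue
        # Each SAN's traffic must stay on that SAN's own switch.
        if switch_of_san[link.san] not in ends:
            problems.append(f"SAN {link.san} traffic on wrong switch: {sorted(ends)}")
    return problems


def sans_reachable(links, endpoint):
    """Return the set of SANs on which an endpoint (server or array) has a link."""
    return {l.san for l in links if endpoint in (l.device_a, l.device_b)}


# Hypothetical topology: one server and one array, each dual-homed.
SWITCHES = {"A": "sw-a", "B": "sw-b"}
EXAMPLE_LINKS = [
    Link("server1", "sw-a", "A"),
    Link("server1", "sw-b", "B"),
    Link("array1", "sw-a", "A"),
    Link("array1", "sw-b", "B"),
]
```

In this sketch, `check_air_gap(EXAMPLE_LINKS, SWITCHES)` returns an empty list, and `sans_reachable(EXAMPLE_LINKS, "server1")` contains both fabrics, which is the condition for the server to keep a path to storage if one switch fails.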
Deployment best practices
Special network configuration is not required because the network transports NVMe/TCP the way it would transport any other TCP traffic, but there are some recommendations and best practices that ensure optimal performance of an NVMe/TCP network:
- Implement Jumbo frames.
- MLAG is generally not recommended because storage networking best practice is to deploy multiple independent active paths. See Table 5 of the SmartFabric Storage Software Deployment Guide for MLAG best practice information.
- For ports connected to NVMe/TCP endpoints, priority flow control (PFC, IEEE 802.1Qbb) should be off for receive and transmit.
- 25 GbE is best suited for enterprise-grade NVMe/TCP deployments because it presents a more cost-effective alternative to 32 Gb Fibre Channel.
- Avoid congestion by following these best practices:
  - Separate switches for LAN and SAN traffic are preferred.
  - Separate broadcast domains.
  - Dedicated NVMe/TCP ports on endpoints.
  - No oversubscription: edge ports should not overload uplinks.
  - Traffic should ingress and egress at or near line rate.
  - Ensure the network has sufficient capacity.
  - Priority flow control off/off (IEEE 802.1Qbb).
- Separate NVMe/TCP traffic from general traffic through separate physical networks, interfaces, or VLANs.
- Whenever possible, deploy a flat network to minimize latency.
- Storage traffic and SFSS control traffic must not be firewalled. Administrators may want to firewall administrator traffic to the SFSS management interface.
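Two of the recommendations above, jumbo frames and no oversubscription, come down to simple arithmetic. The Python sketch below is one way to reason about them, under stated assumptions: standard Ethernet framing, IPv4 and TCP headers without options, and NVMe/TCP PDU headers ignored. The function names and the port counts in the usage note are illustrative only and do not come from any Dell tool.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble + SFD, header, FCS, inter-frame gap
VLAN_TAG = 4                     # extra bytes when the frame carries an 802.1Q tag
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP, assuming no options


def tcp_payload_efficiency(mtu, vlan_tagged=False):
    """Fraction of wire bytes that carry TCP payload for full-size frames."""
    overhead = ETH_OVERHEAD + (VLAN_TAG if vlan_tagged else 0)
    return (mtu - IP_TCP_HEADERS) / (mtu + overhead)


def oversubscription_ratio(edge_ports_gbps, uplinks_gbps):
    """Total edge bandwidth divided by total uplink bandwidth.

    A ratio above 1.0 means the edge ports could overload the uplinks.
    """
    total_uplink = sum(uplinks_gbps)
    if total_uplink <= 0:
        raise ValueError("at least one uplink with positive bandwidth is required")
    return sum(edge_ports_gbps) / total_uplink
```

Under these assumptions, a 9000-byte MTU carries roughly 99% payload on the wire compared with roughly 95% at 1500 bytes. As a hypothetical sizing example, eight 25 GbE edge ports against two 100 GbE uplinks give `oversubscription_ratio([25] * 8, [100, 100])` equal to 1.0, which is the limit of the no-oversubscription guideline; sixteen such edge ports on the same uplinks give 2.0.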
Architects designing next-generation data centers are looking for lower-cost IP SAN-based solutions as an alternative to Fibre Channel (FC)-based storage area network (SAN) infrastructures.
NVMe/TCP enables architects to build highly scalable storage environments that allow large-scale deployments and operations over distances. These architects are also looking at NVMe/TCP to reduce data center costs. NVMe/TCP infrastructures can provide equal or greater performance at a reduced cost per port when compared to traditional FC storage infrastructures.
Dell Technologies has created a robust network operating system, built on collaboration and a strengthened ecosystem, that delivers a well-rounded set of deployment models and use cases ranging from basic Layer 2 to advanced solutions such as NVMe/TCP.
Detailed deployment guides and white papers are available to help customers learn about and deploy these solutions.