NVIDIA Mellanox MQM9790-NS2F InfiniBand Switch Technical Solution

April 13, 2026


This technical whitepaper is written for network architects, pre-sales engineers, and operations leaders. It centers on the MQM9790-NS2F, a 400Gb/s NDR InfiniBand switch, and provides detailed guidance on architecture design, key technologies, deployment and scaling, and operations and monitoring, with a focus on low-latency interconnect optimization for RDMA/HPC/AI clusters.

1. Project Background & Requirements Analysis

Modern AI training and HPC workloads are driving clusters from thousands to tens of thousands of GPUs. In such environments, network interconnect has become a primary bottleneck. Traditional Ethernet fabrics struggle with tail latency and CPU overhead, while legacy InfiniBand deployments may lack sufficient port density and bandwidth. Key requirements include sub-microsecond switching latency, full line-rate forwarding without packet loss, efficient RDMA support, and seamless scalability to hundreds of switches. The NVIDIA Mellanox MQM9790-NS2F directly addresses these needs with its NDR 400Gb/s capability and advanced in-network computing features.

2. Overall Network & System Architecture Design

The recommended architecture adopts a two-layer Fat-Tree (folded Clos) topology, which balances bisection bandwidth, cost, and scalability. At the leaf layer, GPU servers equipped with ConnectX-7 NDR adapters connect to leaf switches. At the spine layer, MQM9790-NS2F switches provide non-blocking connectivity between the leaves. This design preserves full bisection bandwidth: any leaf switch can communicate with any other leaf at wire speed. For large-scale clusters, a three-layer topology (leaf-spine-super-spine) can be deployed, supporting up to tens of thousands of GPU nodes.

  • Leaf switches: 64-port NDR OSFP-based switches, each dedicating 32 ports to server downlinks (one port of each dual-port adapter) and 32 ports to spine uplinks, maintaining a 1:1 non-blocking ratio.
  • Spine layer: MQM9790-NS2F 400Gb/s NDR 64-port switches, with every spine port terminating an uplink from a leaf. A fully non-blocking design requires each spine to provide at least one port per leaf switch, so that total spine ports match total leaf uplinks (a sizing sketch follows this list).
  • Subnet management: Because the MQM9790-NS2F is the externally managed variant of the Quantum-2 family, a dedicated (and ideally redundant) subnet manager running on a host or UFM appliance handles path calculation, adaptive routing configuration, and failover.
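
The port budget behind these ratios can be sanity-checked with a short calculation. The sketch below is illustrative only; it assumes 64-port switches, a 50/50 split between downlinks and uplinks at each leaf, and one link from every leaf to every spine.

    # Illustrative two-tier non-blocking Fat-Tree sizing for a fixed switch radix.
    # Assumption: each leaf splits its ports 50/50 between server downlinks and
    # spine uplinks, and every leaf has one link to every spine.

    def two_tier_fat_tree(radix: int) -> dict:
        """Maximum non-blocking two-tier fabric for a given switch radix."""
        down_per_leaf = radix // 2            # server-facing ports per leaf
        up_per_leaf = radix - down_per_leaf   # spine-facing ports per leaf
        max_leaves = radix                    # each spine needs one port per leaf
        spines = max_leaves * up_per_leaf // radix
        return {
            "leaves": max_leaves,
            "spines": spines,
            "endpoints": max_leaves * down_per_leaf,
            "leaf_to_spine_links": max_leaves * up_per_leaf,
        }

    # 64 NDR 400Gb/s ports per switch, as in the MQM9790-NS2F design above:
    print(two_tier_fat_tree(64))
    # {'leaves': 64, 'spines': 32, 'endpoints': 2048, 'leaf_to_spine_links': 2048}

With 64-port switches this yields 64 leaves, 32 spines, and 2,048 NDR endpoints, which matches the reference deployment sized in Section 4.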

3. Role & Key Features of the NVIDIA Mellanox MQM9790-NS2F in the Solution

As the core spine switch (and optionally the leaf switch) in this design, the MQM9790-NS2F delivers several critical capabilities:

  • 400Gb/s NDR line-rate performance: The switch exposes 64 NDR 400Gb/s ports through 32 OSFP cages (two ports per cage), each running at full duplex line rate, for an aggregate switching capacity of 51.2Tb/s (bidirectional).
  • Ultra-low latency & adaptive routing: Cut-through switching keeps port-to-port latency under 130ns. Adaptive routing dynamically balances traffic across multiple paths, avoiding hot spots.
  • In-network computing (SHARPv3): Supports Scalable Hierarchical Aggregation and Reduction Protocol operations, offloading collective operations from the CPU/GPU and reducing data movement by up to 10×; an illustrative traffic comparison appears at the end of this section.
  • RDMA-native design: Hardware-accelerated RDMA enables direct GPU memory access, eliminating CPU involvement and dramatically lowering communication overhead.
  • Comprehensive telemetry & QoS: Fine-grained congestion control, buffer monitoring, and flow classification ensure deterministic performance for mixed workloads.

According to the MQM9790-NS2F datasheet, the switch also supports hot-swappable power supplies and fans, redundant management ports, and a full suite of diagnostics, making it suitable for 24×7 production environments.
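
To make the in-network computing benefit concrete, the back-of-envelope model below compares the bytes each endpoint must inject, and the number of sequential steps, for a host-based ring allreduce versus a SHARP-style in-network tree reduction. It is a simple analytic sketch, not the methodology behind the datasheet's 10× figure, and the message size and endpoint count are arbitrary examples.

    # Illustrative allreduce traffic model: ring allreduce vs. in-network reduction.
    # Analytic sketch only; real NCCL/SHARP behaviour depends on message size,
    # algorithm selection, and fabric configuration.

    def ring_allreduce_bytes(message_bytes: float, n: int) -> float:
        """Bytes each endpoint sends in a bandwidth-optimal ring allreduce."""
        return 2 * (n - 1) / n * message_bytes

    def in_network_allreduce_bytes(message_bytes: float) -> float:
        """Bytes each endpoint sends when switches aggregate in-network:
        one copy up the reduction tree, one reduced copy back down."""
        return message_bytes

    n, msg = 2048, float(1 << 30)   # 2,048 endpoints, 1 GiB gradient buffer
    print(f"ring allreduce: {ring_allreduce_bytes(msg, n) / 2**30:.2f} GiB sent per endpoint, "
          f"{2 * (n - 1)} sequential steps")
    print(f"in-network:     {in_network_allreduce_bytes(msg) / 2**30:.2f} GiB sent per endpoint, "
          f"steps proportional to tree depth")

Beyond roughly halving the bytes each endpoint injects, the larger practical win is that the number of sequential steps no longer grows with the endpoint count, which is where the latency and tail-latency benefits of in-network aggregation come from.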

4. Deployment & Scaling Recommendations (with Typical Topology)

A typical 2,048-GPU cluster can be built using 64 leaf switches and 32 spine switches. Each leaf serves 32 GPU servers (one port of each dual-port adapter) and provides 32 uplinks to spines. The spine layer consists of MQM9790-NS2F switches interconnected with the leaves via NDR optics or DAC cables; the port accounting is verified in the sketch below. For expansion to 8,192 GPUs, a super-spine layer is added, interconnecting multiple pods.
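
As a quick consistency check of these numbers, the sketch below counts endpoints, links, and ports for the reference build. It is a planning aid only; any real bill of materials should be validated against NVIDIA's sizing and cabling guides.

    # Consistency check for the 2,048-GPU reference design described above.
    RADIX = 64                 # NDR 400Gb/s ports per switch
    LEAVES, SPINES = 64, 32
    DOWN_PER_LEAF = 32         # GPU/server-facing ports per leaf
    UP_PER_LEAF = RADIX - DOWN_PER_LEAF

    endpoints = LEAVES * DOWN_PER_LEAF          # 2,048 NDR endpoints
    leaf_spine_links = LEAVES * UP_PER_LEAF     # 2,048 cables or optical links
    spine_ports = SPINES * RADIX                # 2,048 spine ports available

    assert UP_PER_LEAF == DOWN_PER_LEAF, "leaf layer is oversubscribed"
    assert leaf_spine_links == spine_ports, "spine ports do not match leaf uplinks"

    print(f"{endpoints} endpoints, {leaf_spine_links} leaf-spine links, "
          f"{endpoints} server-leaf links, {LEAVES + SPINES} switches")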

When scaling, consider the following:

  • Cabling and optics: Use OSFP-to-OSFP DACs for short intra-rack links, and OSFP-to-4xOSFP breakout cables or optical modules for longer distances. Verify compatibility with the MQM9790-NS2F specifications regarding reach and power budget.
  • Subnet sizing: This design assumes a single subnet manager instance for roughly 2,000 nodes; for larger fabrics, validate SM capacity (OpenSM or UFM) against NVIDIA's guidance and consider multiple subnets or a high-availability (master/standby) SM configuration.
  • Redundancy: Dual-homed servers and redundant spine switches remove single points of failure, and losing a single spine costs each leaf only one of its uplinks (quantified in the sketch below). With proper SM failover configuration, the MQM9790-NS2F fabric can recover from spine or SM failures with minimal disruption.
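
The sketch below quantifies the redundancy point for the reference design: with one uplink from each leaf to each of 32 spines, a single spine failure removes only 1/32 of the leaf-spine capacity. The figures are simple arithmetic over the topology above, not measured failover results.

    # Capacity impact of a single spine failure in the reference design above.
    SPINES = 32
    UP_PER_LEAF = 32            # one uplink from each leaf to each spine
    LINK_GBPS = 400             # NDR link speed

    healthy = UP_PER_LEAF * LINK_GBPS
    one_spine_down = (UP_PER_LEAF - 1) * LINK_GBPS
    print(f"per-leaf uplink capacity: {healthy} Gb/s -> {one_spine_down} Gb/s "
          f"({one_spine_down / healthy:.1%} retained)")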

5. Operations, Monitoring, Troubleshooting & Optimization

Effective operations require visibility and automation. The following practices are recommended:

  • Monitoring: Use NVIDIA UFM (Unified Fabric Manager) and its telemetry APIs to track port errors, temperature, power consumption, and link utilization. Set alerts when CRC or symbol error counters exceed thresholds; a simple counter-polling sketch follows this list.
  • Troubleshooting: The MQM9790-NS2F provides per-port counters, buffer occupancy histograms, and congestion logs. In case of performance degradation, check adaptive routing configuration, ensure all fabric links are symmetric, and verify that SHARP aggregation is enabled for supported collectives.
  • Optimization: Tune adaptive routing parameters based on workload characteristics (e.g., latency-sensitive vs. throughput-sensitive). For large AI models, enable InfiniBand congestion control and review buffer thresholds to keep congestion from spreading into victim flows. When planning capacity additions, weigh the MQM9790-NS2F price against delivered performance; upgrading spines often yields better ROI than adding more leaves.
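
For teams without UFM in place yet, a minimal command-line poll can provide similar early warning. The sketch below shells out to perfquery from the standard infiniband-diags package and flags counters above a threshold; the counter names, thresholds, and output parsing are assumptions to adapt to your own fabric.

    # Minimal error-counter poll using perfquery (infiniband-diags).
    # Thresholds and the output-parsing regex are illustrative assumptions.
    import re
    import subprocess

    THRESHOLDS = {
        "SymbolErrorCounter": 10,
        "LinkErrorRecoveryCounter": 5,
        "PortRcvErrors": 100,
        "PortXmitDiscards": 100,
    }

    def check_port(lid: int, port: int) -> list:
        """Query one LID/port and return any counters above threshold."""
        out = subprocess.run(["perfquery", str(lid), str(port)],
                             capture_output=True, text=True, check=True).stdout
        alerts = []
        for name, limit in THRESHOLDS.items():
            # perfquery prints lines like "SymbolErrorCounter:.............0"
            m = re.search(rf"{name}:[.]*(\d+)", out)
            if m and int(m.group(1)) > limit:
                alerts.append(f"LID {lid} port {port}: {name}={m.group(1)} exceeds {limit}")
        return alerts

    for alert in check_port(lid=1, port=1):
        print(alert)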

For organizations evaluating the MQM9790-NS2F for purchase, ensure that your software stack (e.g., NCCL, OpenMPI) supports NDR features such as SHARPv3 and hardware-based reductions; a hedged launcher sketch follows.
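
As one example of that software-stack check, the sketch below launches a training job with the environment variables commonly used to steer NCCL toward its SHARP-capable CollNet path. The torchrun command and train.py are placeholders, and the variable names should be confirmed against the NCCL and SHARP plugin documentation for your installed versions.

    # Hedged launcher sketch: nudging NCCL toward the SHARP/CollNet transport.
    # Variable names and values should be verified against the NCCL and
    # nccl-rdma-sharp plugin documentation for your software versions.
    import os
    import subprocess

    env = os.environ.copy()
    env.update({
        "NCCL_COLLNET_ENABLE": "1",   # allow NCCL to use the CollNet (SHARP) transport
        "NCCL_DEBUG": "INFO",         # log which transport/algorithm NCCL selects
    })

    # Hypothetical training command; substitute your own launcher (mpirun, srun, ...).
    subprocess.run(["torchrun", "--nproc_per_node=8", "train.py"], env=env, check=True)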

6. Summary & Value Assessment

The MQM9790-NS2F InfiniBand switch solution delivers a clear path to building low-latency, high-bandwidth fabrics for demanding RDMA/HPC/AI clusters. Its 64-port 400Gb/s density, sub-microsecond switching, and in-network computing capabilities directly address the scalability and performance challenges of modern workloads. By adopting the architecture outlined above — Fat‑Tree topology, NDR core switches, and RDMA-native operation — organizations can achieve linear GPU scaling, reduce job completion times by over 30%, and simplify fabric management. For detailed planning, refer to the official MQM9790-NS2F datasheet and compatibility guides. To discuss a customized design or obtain MQM9790-NS2F price and availability, please contact an authorized NVIDIA partner.