Strong NVIDIA Momentum in Networking for AI in 2024, Led by NVIDIA Quantum InfiniBand and Surging NVIDIA Spectrum-X Ethernet Sales
2024 was an exciting year for AI. Overall spending on AI equipment in the data center blew past $200B as large hyperscalers, new cloud startups, sovereigns, and nearly every enterprise allocated capital to AI. At the market level, the year marked a milestone: the industry moved from foundational training to the early stages of agent roll-outs, which will require over $1T of spending over the next several years. Networking plays a key role in AI/ML workloads, with multiple networks, from scale-up (the fastest, for cache coherency) to scale-out (the back-end InfiniBand/Ethernet networks connecting the cluster) to front-end networks, all innovating rapidly and surging in size. All this networking infrastructure will exceed $50B a year in spending, moving from a nascent market for HPC to one of the largest categories of data center infrastructure spending in less than 10 years.
Ethernet hit 100K+ GPU scale at X.AI
In the fall, X.AI announced its supercluster, appropriately named Colossus, in Tennessee. This is the first 100K-GPU Ethernet cluster that shows up in our data, and it is the building block for Grok 3, released just a few days ago. NVIDIA was the leading supplier of the networking gear in this cluster. One data point helps frame the size of these clusters: a single cluster ships with more networking bandwidth than all data centers combined did in an entire year a decade ago. The bandwidth deployed is even more impressive considering that, a decade ago, most data center bandwidth was never used. It was provisioned for peak use cases, and data centers normally operated at around 30-50% utilization. With AI, the goal is to run as close to 100% utilization as possible, all the time.
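To make the utilization point concrete, here is a minimal Python sketch of the arithmetic. It assumes a normalized unit of deployed bandwidth (an illustrative value, not a figure from our data) and uses only the utilization range quoted above. The 100% figure is treated as the target that AI clusters aim for, not a measured result, and the function and variable names are hypothetical.

```python
# Rough sketch: how much of each deployed unit of bandwidth is actually used.
# DEPLOYED is a normalized, hypothetical unit; only the utilization figures
# (30-50% historically, ~100% as the AI target) come from the text above.

def used_bandwidth(deployed: float, utilization: float) -> float:
    """Return the bandwidth actually carrying traffic."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return deployed * utilization

DEPLOYED = 1.0  # assumption: one normalized unit of deployed bandwidth

legacy_low = used_bandwidth(DEPLOYED, 0.30)   # low end of the historical range
legacy_high = used_bandwidth(DEPLOYED, 0.50)  # high end of the historical range
ai_target = used_bandwidth(DEPLOYED, 1.00)    # upper bound that AI clusters aim for

# Used bandwidth per deployed unit, AI target relative to the historical range.
ratio_vs_high = ai_target / legacy_high
ratio_vs_low = ai_target / legacy_low

print(f"AI target vs. 50% utilization: {ratio_vs_high:.2f}x")
print(f"AI target vs. 30% utilization: {ratio_vs_low:.2f}x")
```

Under these assumptions, each unit of bandwidth deployed for AI carries roughly two to three times the traffic that the same unit carried in a traditional data center, on top of the growth in bandwidth shipped.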
Vendor Landscape is Changing
While we are still in the early days of AI build-outs, it is clear that vendors are taking a different approach to AI than we have seen in the past. Not only are we seeing new products purpose-built for AI, but we are also witnessing partnerships that would have been unlikely before AI. As evidence, earlier this week, Cisco and NVIDIA partnered to help customers on their journey to AI. This announcement is significant: we will get Cisco switches built on merchant silicon (NVIDIA Spectrum ASICs) running Cisco’s OS, as well as a closer coupling of Cisco Silicon One to NVIDIA’s SuperNIC.
The Rush to $100B+ Deployments and 1M GPU Clusters
Most large cloud operators either already have 100K-GPU clusters or are in the final stages of building them. As we look towards the rest of the year, we expect renewed commitments and plans as the industry advances to the next build-out milestones. $100B+ clusters, followed by 1M GPU clusters, will usher in another level of AI and put the network front and center as the glue connecting those GPUs. Next-gen deployments will bring a whole new level of networking spend and product offerings. We expect significant networking milestones as the market moves through 2025.
###