Higher Radix Switching Drives Faster Training and Lowers Inference Costs
Introduction
AI cluster expansion increases networking demand and requires networks that scale efficiently with the number of XPUs/GPUs. Higher radix switches meet this need directly: newer advanced designs integrate more ports per ASIC and deliver higher aggregate bandwidth. The industry is now in the 512-to-1024-lane transition as it moves from 51.2T to 102.4T, but what happens if we move to an even higher radix? The advantages include substantial CAPEX and OPEX reductions, lower power usage, minimal latency, and maximized GPU productivity.
Radix Fundamentals Drive Network Simplification
Radix measures a switch’s connectivity capacity. Fixed form-factor switches with lower radix require multiple layers, high device counts, and dense interconnects to link large server farms. Chassis-based modular switches can offer higher radix and port counts, but with drawbacks in increased latency, cost, and operational complexity.
Higher radix platforms break this dilemma by allowing massive increases in scale within a fixed form factor. Architects want to consolidate switches aggressively. This is why, at a total switch capacity of 102.4T, we expect 1024x112G to become a popular option. Higher radix switches cut network tiers and eliminate performance bottlenecks. Every reduction in hops translates to faster data movement and better overall system efficiency. This approach proves critical as AI training jobs grow and we look to avoid congestion, jitter, and link flaps.
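The relationship between radix and tier count can be made concrete with the standard capacity formula for a nonblocking folded-Clos (fat-tree) fabric: L tiers of radix-k switches support up to 2*(k/2)^L endpoints. The sketch below is illustrative, not a vendor spec; it assumes uniform lane speeds and a fully nonblocking design, and `max_hosts` is a name chosen here for the example.

```python
# Maximum endpoints in a nonblocking folded-Clos (fat-tree) fabric
# built from radix-k switches with L tiers: 2 * (k/2)**L.
# (One tier reduces to the single switch itself: 2 * (k/2) = k ports.)
def max_hosts(radix: int, tiers: int) -> int:
    return 2 * (radix // 2) ** tiers

# Compare a 512-lane (51.2T at 100G) switch with a 1024-lane
# (102.4T) switch across tier counts:
for radix in (512, 1024):
    for tiers in (1, 2, 3):
        print(f"radix {radix}, {tiers} tier(s): {max_hosts(radix, tiers):,} endpoints")
```

At radix 512, two tiers already reach 131,072 endpoints and three tiers reach tens of millions, which is why each radix doubling lets operators drop a tier (and its hops) at a given cluster size.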
The 51.2T Transition Replaces Twelve 12.8T Switches
The industry already demonstrates clear benefits from radix increases. One 51.2T switch replaces twelve 12.8T equivalents in fabric designs. Operators deploy fewer units overall or can keep pace with server bandwidth growth. They reduce cabling complexity and the associated power draw. Had 51.2T not been available, AI fabrics would have required so many additional tiers that the designs would not have been feasible.
The industry had similar conversations back in the early cloud days, when one cloud provider pointed out that legacy approaches would have required more racks of networking gear than compute gear. Management overhead drops while reliability climbs. Many leading hyperscalers and emerging neoclouds are adopting 51.2T technology to support their expanding GPU clusters and achieve better economics than previous generations allowed. More is needed, however, as many switch-to-switch links remain that could easily be eliminated.
The Current Pace of Fixed Form-Factor Switch Innovation Won’t Expand Radix Fast Enough for AI
By late 2026, we will start to see the next generation of 102.4T switch platforms, based on silicon from leading players. We expect that their next-generation ASICs will support 204.8T switches in 2028, following the recent pattern of doubling radix roughly every two years. But as we’ve discussed previously, given AI demand and xPU scaling velocity, that won’t keep up with the need for higher radix switches to build ever-larger scale-up domains and scale-out networks. Some vendors simulate higher radix by combining four switch ASICs in the same box with a built-in shuffle. While this hides some of the complexity of a multi-plane network, it is really no different from deploying four switches with an external shuffle cable. The bottom line is that the only way to deliver high radix is with new silicon.
A 512T Solution Replaces Thirty 51.2T Switches
Consider an example in which one 512T switch handles the workload of thirty 51.2T switches. The network flattens dramatically. Single-hop domains encompass thousands of GPUs, while scale-out fabrics reach millions of endpoints. Latency falls sharply, and jitter nearly vanishes. Congestion issues fade away. GPU utilization soars because data flows unimpeded. Builders realize significant CAPEX savings along with major reductions in power consumption and operational costs. These gains compound across massive deployments.
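The replacement ratios cited in this piece (twelve 12.8T boxes per 51.2T switch, thirty 51.2T boxes per 512T switch) follow from simple two-tier leaf-spine arithmetic: offering the port count of one big switch from smaller switches requires leaves that split their radix between downlinks and uplinks, plus spines to interconnect them. The sketch below is a simplified model assuming equal lane speeds (100G) and a nonblocking design; `two_tier_switch_count` is an illustrative name, not a product tool.

```python
# Switches needed to build a nonblocking two-tier leaf-spine fabric
# with the same usable port count as a single big switch.
# Assumes big_radix is a multiple of small_radix // 2 and equal lane speeds.
def two_tier_switch_count(big_radix: int, small_radix: int) -> int:
    half = small_radix // 2
    leaves = big_radix // half            # each leaf: half down, half up
    spines = (leaves * half) // small_radix  # spine ports absorb all uplinks
    return leaves + spines

# 51.2T (512 x 100G) built from 12.8T (128 x 100G) boxes:
print(two_tier_switch_count(512, 128))    # -> 12

# 512T (5120 x 100G) built from 51.2T (512 x 100G) boxes:
print(two_tier_switch_count(5120, 512))   # -> 30
```

The same arithmetic also shows where the savings come from: all the leaf-to-spine links and their optics disappear when the fabric collapses into one switch.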

The Next Leap in High Radix Technology
The demand for faster radix scaling is driving innovation by some new entrants. As an example, Eridu recently exited stealth with more than $200 million in funding to attack these challenges. The company is engineering a purpose-built, high radix switch from silicon to system that could achieve an order of magnitude improvement in radix and overall performance. If Eridu or other innovators can deliver on this bold goal, we will see support for single-hop scale-up domains of many thousands of GPUs and increasingly massive scale-out clusters. The customer benefits from lower power and faster AI workload performance. This technology perfectly suits hyperscalers, neoclouds, and sovereign cloud operators who build at unprecedented scale. We expect high radix innovations to become the foundation for future AI infrastructures.
Conclusion
Higher radix switches deliver decisive advantages in AI networking. They consolidate equipment counts, drive down costs and power demands, and enable vastly larger, efficient clusters. The progression we outline from 12.8T through 51.2T to 512T illustrates a clear path to superior economics and performance. Operators who implement these designs position themselves to handle explosive AI growth while controlling expenses. The data center networking industry will naturally advance toward high radix architectures that power the next era of AI innovation.