Taking Networking Power Consumption to Zero Won’t Allow You to Get More AI Into Your Data Centers
With the emergence of AI/ML workloads, there is an industry-wide sense of urgency and panic about how we will power all these new servers. As a result, there is tremendous pressure on all equipment in the data center. Not all equipment consumes the same amount of power, nor will all equipment contribute evenly to solving the need for more power.
Power Background
For nearly twenty years, cloud operators have not been forthcoming about how much power they use. Because operators relied on a mix of secondary companies owning the actual data centers and colocation providers, the market had a tough time knowing exactly how much power was going to data centers. A share of 1-2% of power in the United States differs significantly from 4-5%, and US data center power consumption is still mostly a hidden number today. In general, the “cloudy” nature of the cloud makes it very difficult for policymakers to support the growth needed in the industry or for utility companies to build. A confluence of issues, from very few new builds during COVID to the emerging AI demand, left cloud operators searching for new power sources that didn’t exist. This is why we see data centers in more rural areas, nuclear power plants remaining operational instead of being decommissioned as previously planned, and a new push for power efficiency.
Without gains in efficiency, AI could eventually balloon to 15-25% of US power, which would be more power than Brazil or Germany uses today. Put another way, AI drives the equivalent of a new Spain, Italy, or France worth of new power each year in the US alone. That is why there is a considerable push to lower power consumption everywhere. The industry will get there.
Networking Does Not Use that Much Power
In traditional data centers, networking was about 10-15% of power in a facility. However, in AI data centers, that drops to closer to 5%. In absolute terms, networking power grows, but as a percentage of total power, it shrinks. To put some numbers behind this, 10% of a 15 kW rack is 1.5 kW, and 5% of a 100 kW rack is 5.0 kW. Put another way, networking’s ‘market share’ of data center power is smaller than that of any other computing-related source and not much bigger than miscellaneous (see image below). While 5% might sound like a lot of power across an entire facility, shrinking the networking power budget further doesn’t enable you to add more racks of servers. There are a lot of other options and higher-power devices to tackle first.
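The rack arithmetic above can be reproduced with a minimal sketch. The function name `networking_power_kw` is hypothetical, and the sketch assumes, as the paragraph does, that the facility-level percentages can be applied to a single rack.

```python
def networking_power_kw(rack_kw: float, networking_share: float) -> float:
    """Return the networking portion of a rack's power budget in kW.

    Assumption: the facility-wide networking share applies to one rack.
    """
    return rack_kw * networking_share


# Figures taken from the text: a 15 kW traditional rack at 10% networking,
# and a 100 kW AI rack at 5% networking.
traditional_kw = networking_power_kw(15.0, 0.10)
ai_kw = networking_power_kw(100.0, 0.05)

# Power left for everything that is not networking in each rack.
traditional_other_kw = 15.0 - traditional_kw
ai_other_kw = 100.0 - ai_kw

print(f"Traditional rack: {traditional_kw:.1f} kW networking, "
      f"{traditional_other_kw:.1f} kW everything else")
print(f"AI rack: {ai_kw:.1f} kW networking, "
      f"{ai_other_kw:.1f} kW everything else")
```

Networking power per rack rises in absolute terms while its share falls, and the non-networking remainder grows far more, which is where the larger savings would have to come from.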
Networking Should Focus on Networking, Not Power
Many demos exist today of fully submerged switches and exotic interconnect cages that allow network interfaces to be fully submerged. This push is neither necessary nor a prudent use of engineering effort. Coldplate cooling of the ASIC, and potentially putting coldplates on the high-end, power-hungry modules, is enough to keep the network running cooler without forcing unnatural changes into networking. Networking is already doing its fair share; look at the improvement in power, measured in picojoules per bit, between solutions at 12.8 Tbps and 51.2 Tbps.
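For readers unfamiliar with the metric, picojoules per bit is simply power divided by throughput. The sketch below shows the unit conversion only; the wattages are placeholder inputs chosen for round numbers, not measurements of any real switch, and the function name `picojoules_per_bit` is hypothetical.

```python
def picojoules_per_bit(power_watts: float, throughput_tbps: float) -> float:
    """Convert a power draw and a throughput into energy per bit (pJ/bit)."""
    bits_per_second = throughput_tbps * 1e12   # 1 Tbps = 1e12 bits per second
    joules_per_bit = power_watts / bits_per_second
    return joules_per_bit * 1e12               # 1 J = 1e12 pJ


# PLACEHOLDER wattages for illustration only; substitute real datasheet values.
examples = [
    (12.8, 256.0),   # (throughput in Tbps, assumed power in W)
    (51.2, 512.0),
]

for tbps, watts in examples:
    print(f"{tbps} Tbps at {watts} W -> "
          f"{picojoules_per_bit(watts, tbps):.1f} pJ/bit")
```

With watts and Tbps as inputs, the two factors of 1e12 cancel, so the metric reduces to watts divided by Tbps. Total power can rise between generations while energy per bit still falls, as long as throughput grows faster than power.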
Networking in the Co-packaged World
One of the main things lost in the discussion about reducing networking power today is that most switching inside the data center will use co-packaged and onboard optics within a decade. Co-packaged and onboard optics remain a mammoth technology problem, and adoption will likely need to be preceded by extensive testing, but the shift looks somewhat inevitable. So, the whole concept of connectors, modules, and ports will be reinvented anyway. Those technology transitions are where re-discussing power makes the most sense.