Retimers Needed as Distances Shrink in PCIe Gen 5 and Gen 6 Speeds
The server market remained on Gen 3 (8 Gbps/lane) for several server design cycles, which made the market complacent about how quickly bandwidth could increase inside the server and at the NIC level. At the NIC level, this showed up as multiple generations of 25G NICs. A confluence of Artificial Intelligence / Machine Learning (AI/ML) and increased core counts in CPUs is now driving a more rapid transition to higher-speed PCIe technologies. The industry will move from PCIe Gen 4 to Gen 6 in less time than it spent on PCIe Gen 3. Between Gen 4 (16 Gbps/lane) and Gen 6 (64 Gbps/lane), the reach of a copper trace drops from about 8 inches to under 4. At those distances, the signal can't reach all the components inside the server, such as memory, storage, and networking, so additional silicon (retimers) is needed to carry it.
What is a Retimer?
A retimer is a piece of silicon that regenerates a signal so that it can travel farther. It recovers the incoming data and retransmits a clean copy, which compensates for the shorter reach at higher speeds. Adding retimers lets server connections span longer distances, so the desired reach can be achieved without drastic board redesigns. Marvell launched its PCIe retimer earlier this week; Figure 1 shows an example of what these chips look like.
Distance Limitation as Market Moves From PCIe Gen 4 -> Gen 5 -> Gen 6
Most server racks are 24 inches wide and up to 48 inches deep. Most servers are 19 inches wide and around 24 inches deep after allowing for cabling and airflow. With an intelligent board layout, the 8-inch reach of PCIe Gen 4 was often enough to connect most components inside the server. With PCIe Gen 5, however, that reach shrinks to under 4 inches, and PCIe Gen 6 shrinks it even further. Not everything can connect at those shorter distances. Retimers solve the reach problem, extending it to 12-24+ inches depending on the application.
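The reach figures above can be turned into a simple check of which links would need a retimer. The following is a minimal sketch, not a signal-integrity model: the reach values are the approximate figures quoted in this article, while the function name and the example trace lengths are hypothetical and chosen only for illustration. Gen 6 is left out because the article gives no specific figure for it.

```python
# Approximate unretimed copper reach in inches, using the figures quoted above.
# Gen 5 is described as "under 4 inches", so 4.0 is an optimistic upper bound.
REACH_INCHES = {"Gen 4": 8.0, "Gen 5": 4.0}


def needs_retimer(generation: str, trace_inches: float) -> bool:
    """Return True when the trace is longer than the unretimed reach."""
    return trace_inches > REACH_INCHES[generation]


# Hypothetical trace lengths (inches), for illustration only.
example_traces = {"NIC slot": 3.0, "front NVMe bay": 7.0, "GPU baseboard": 14.0}

for generation in REACH_INCHES:
    flagged = [
        name
        for name, length in example_traces.items()
        if needs_retimer(generation, length)
    ]
    print(f"{generation}: retimer needed for {flagged}")
```

With these assumed lengths, the 7-inch link fits within Gen 4 reach but not Gen 5 reach, which is the effect the section describes.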
NRZ to PAM4 Modulation Change
Concurrently with the move to higher speeds, the modulation used in the PCIe ecosystem is changing. PCIe Gen 5 and earlier use NRZ modulation (one bit per symbol), while PCIe Gen 6 moves to PAM4 (two bits per symbol), which provides twice as much bandwidth at the same symbol rate. Networking went through a similar transition between the 28 Gbps and 56 Gbps SERDES generations (100 Gbps port speed to 400 Gbps port speed).
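The doubling follows from the number of signal levels: NRZ uses two levels and PAM4 uses four, so each PAM4 symbol carries two bits instead of one. A minimal sketch of that arithmetic is below. The function names are illustrative, and the 32 GBaud symbol rate is an assumption based on the Gen 5 per-lane rate, not a figure stated in this article.

```python
import math


def bits_per_symbol(levels: int) -> int:
    """Bits carried by one symbol that can take `levels` distinct amplitudes."""
    return int(math.log2(levels))


def lane_rate_gbps(symbol_rate_gbaud: float, levels: int) -> float:
    """Raw per-lane bit rate, ignoring encoding and FEC overhead."""
    return symbol_rate_gbaud * bits_per_symbol(levels)


NRZ_LEVELS = 2   # one bit per symbol
PAM4_LEVELS = 4  # two bits per symbol

# Assumed symbol rate of 32 GBaud for both generations.
gen5_rate = lane_rate_gbps(32, NRZ_LEVELS)
gen6_rate = lane_rate_gbps(32, PAM4_LEVELS)
print(gen5_rate, gen6_rate)
```

Under that assumption, the PAM4 result matches the 64 Gbps/lane figure quoted for Gen 6 earlier in the article.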
Significant PCIe Retimer Market Develops
As the industry adds more GPUs/AI ASICs to each server, the need for retimers increases significantly. A typical CPU needs one retimer, and a typical GPU needs two. A fully loaded AI server (2 CPUs, 8 GPUs) would therefore require at least 18 retimers (2 × 1 + 8 × 2). Multiplied by the expected growth in AI servers, this means the PCIe retimer market will rapidly become large.
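The per-server count can be reproduced with a few lines of arithmetic. This is a minimal sketch using the per-device ratios given above; the function names are illustrative, and the server shipment figure is a hypothetical placeholder, not a forecast.

```python
RETIMERS_PER_CPU = 1  # ratio stated in the article
RETIMERS_PER_GPU = 2  # ratio stated in the article


def retimers_per_server(cpus: int, gpus: int) -> int:
    """Minimum retimer count for one server with the given device mix."""
    return cpus * RETIMERS_PER_CPU + gpus * RETIMERS_PER_GPU


def total_retimers(servers: int, cpus: int, gpus: int) -> int:
    """Retimer units needed across a fleet of identical servers."""
    return servers * retimers_per_server(cpus, gpus)


per_server = retimers_per_server(cpus=2, gpus=8)
print(per_server)  # 18

# Hypothetical shipment volume, for illustration only.
print(total_retimers(servers=1_000, cpus=2, gpus=8))  # 18000
```

Swapping in a different device mix or shipment estimate shows how quickly unit volume scales with AI server growth.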
Future Markets for Retimers
Early adoption in the retimer market will focus on the connection between the CPU motherboard and AI midplane/backplane, but the market should expand beyond that in future generations. Future server designs aim to disaggregate CPUs, GPUs, memory, and storage. Retimers will play an essential role as PCIe moves out of the server chassis and begins to connect inside the rack and eventually between multiple racks. With the help of CXL, PCIe will become an additional back-end fabric network connecting these racks with customers using a mix of DAC, Active Copper, AOC, and transceivers.