Paul McLellan
Tags: Ethernet, PCIe, hyperscale, SerDes, datacenter

How to Build a Data Center: It's All About the SerDes...and Thermal

14 Sep 2021 • 5 minute read

Okay, you are probably not going to have to build a data center. But you might well be involved in designing chips that go in data centers. Or looking at IP for data centers. Or worrying about some aspect of design tools for advanced nodes, much of which goes in data centers. So a basic understanding of how a modern data center is put together is important.

One of the biggest changes in designing chips since I've been in the industry has been the switch from low-speed parallel interfaces to high-speed serial interfaces. For long-distance wide-area-networks (WANs), we have always used serial interfaces since running lots of wires around the country or the world is expensive. These typically ran over wires supplied by the phone company at 56kbps or 64kbps, or, if you had a big budget, a T1 line at 1.5Mb/s. Today, long-distance signaling is almost all done using optical fiber, with data rates getting higher. Sometimes, these are point-to-point where the data center owner owns its own fiber, or these might just go to an internet network service provider's node, either to access the public network, or to piggy-back on the internet backbone to get from A to B.

In the late 1970s, local-area networks (LANs) became widespread, using Ethernet or various ring technologies (Apollo and Cambridge University both used token rings). But chips on boards used lots of pins, sometimes multiplexed to keep costs down. The signals were relatively slow. In fact, they were so slow that it was a surprise when they got faster and we suddenly had to worry about package-pin inductance. But even with the invention of ball-grid arrays (BGAs) with lots more pins, there still were not enough, especially as buses moved from 16-bit, to 32-bit, to 64-bit, and wider. The solution was to run each datastream over a single pin (or a few pins) using very fast serial interfaces known as SerDes (for serializer-deserializer). The signals were parallel on the chip, but the serializer would transmit them as a serial stream of bits, and at the other end the deserializer would do the complex equalization needed to recover the clock and data, and then parallelize them again.
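To make the serializer/deserializer idea concrete, here is a minimal Python sketch, purely illustrative and not any real SerDes design, ignoring the equalization and clock recovery that make real ones hard. It just turns a 16-bit parallel word into a stream of bits and back:

    def serialize(word, width=16):
        # Parallel word in, serial stream of bits out (LSB first)
        return [(word >> i) & 1 for i in range(width)]

    def deserialize(bits):
        # Reassemble the bit stream back into a parallel word
        word = 0
        for i, bit in enumerate(bits):
            word |= bit << i
        return word

    assert deserialize(serialize(0xBEEF)) == 0xBEEF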

From a communication point of view, the two most important technologies in data center signaling are PCI Express (PCIe) and Ethernet. There are also CXL (Compute Express Link) and CCIX (pronounced "see six", Cache Coherent Interconnect for Accelerators), but since these use PCIe as the underlying physical transport technology, we don't need to worry about them separately.

Both PCIe and Ethernet are serial communication technologies. Both networking technologies have gone through various versions over the many years of their existence.

We are at PCIe 5.0 right now, with PCIe 6.0 expected to be formally standardized by the end of the year. For more on that, see my post The History of PCIe: Getting to Version 6. The big change between the current version 5.0 and the upcoming version 6.0 is the switch from NRZ signaling (one bit per symbol) to PAM4 (two bits per symbol). This is the same technology as is used in 112G SerDes, and you can read about that in my post The World's First Working 7nm 112G Long Reach SerDes Silicon.
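To see why PAM4 doubles the data rate at the same symbol rate, here is a hedged Python sketch. The level mapping below is a made-up illustration; real PAM4 uses Gray coding and standard-defined voltage levels:

    # Hypothetical level mapping -- real PAM4 uses Gray coding and specific voltages
    PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}

    def nrz_symbols(bits):
        # NRZ: one bit per symbol (two levels)
        return [1 if b else -1 for b in bits]

    def pam4_symbols(bits):
        # PAM4: two bits per symbol (four levels), so half as many symbols for the same data
        return [PAM4_LEVELS[pair] for pair in zip(bits[0::2], bits[1::2])]

    bits = [1, 0, 1, 1, 0, 0, 1, 0]
    assert len(pam4_symbols(bits)) == len(nrz_symbols(bits)) // 2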

Ethernet originated at Xerox Palo Alto Research Center (PARC) in the mid-1970s at 3Mb/s. The first commercial Ethernet standard was 10Mb/s. I think the slowest Ethernet that anyone actually uses today is 10G (so 10Gb/s), but the fastest is 400G, with 800G on the drawing board. You can read the history of Ethernet in my somewhat misleadingly titled post Automotive Ethernet, and there is a roadmap for Ethernet in data centers in the 112G SerDes Silicon post linked to above. Also, in Andy Bechtolsheim's CDNLive keynote, he laid out that roadmap and the implications for the silicon business. I covered that in Andy Bechtolsheim: 85 Slides in 25 Minutes, Even the Keynote Went at 400Gbps.

Both technologies use multiple serial signals to get off a chip and onto the network. PCIe can use 1, 4, 8, or 16 lanes in a single PCIe slot. High-performance Ethernet at more than 100G requires multiple serial signals to feed the high-speed optics. Both PCIe 6.0 (the upcoming standard) and current and future versions of 100+G Ethernet require 112G SerDes interfaces to the silicon. 224G interfaces have been hinted at and are in development but are not yet available.
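For a rough sense of how the lanes add up, here is a back-of-the-envelope sketch. The rates are rounded, and encoding and protocol overhead are ignored:

    def raw_gbps(lane_rate_gbps, lanes):
        # Raw aggregate rate; usable throughput is lower once encoding and protocol overhead are included
        return lane_rate_gbps * lanes

    print(raw_gbps(32, 16))    # PCIe 5.0 x16 at 32 GT/s per lane -> 512 Gb/s raw
    print(raw_gbps(112, 4))    # 400G Ethernet over four ~112G electrical lanes -> 448 Gb/s raw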

Within the data center, signals are increasingly transmitted using optical technologies between servers, routers, attached storage, and other "boxes". Within each box, the signals run on traditional printed circuit board traces, connectors, and cables.

Thermal

Another big challenge in data centers is thermal. You may have read that over the lifetime of a data center, the operational expenses (opex) add up to more than the capital expenses (capex) required to build the data center in the first place. The two big opex line items are electrical power for all the equipment, and electrical power for all the air-conditioning to get the heat out again. This requires specialized design of the airflows inside the data center itself, but it also puts a premium on making the chips as low power as possible so that they don't generate more heat than necessary, while still being adequately cooled by fans, heat sinks, heat pipes, and sometimes even water cooling. Cadence's tool for thermal analysis of electronic systems is the Celsius Thermal Solver.
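To put the opex point in perspective, here is a hedged back-of-the-envelope sketch. The IT load, PUE, and electricity price below are assumed numbers for illustration, not figures from this post:

    # All inputs are illustrative assumptions
    it_power_kw = 1000          # IT equipment load (a modest 1 MW data center)
    pue = 1.5                   # power usage effectiveness: total facility power / IT power
    price_per_kwh = 0.10        # electricity price in dollars
    hours_per_year = 24 * 365
    years = 10

    total_kwh = it_power_kw * pue * hours_per_year * years
    print(f"Electricity over {years} years: ${total_kwh * price_per_kwh:,.0f}")
    # Roughly $13 million for a 1 MW IT load, before staff, maintenance, and everything else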

You can read about Celsius in my posts:

  • Celsius: Thermal and Electrical Analysis Together at Last
  • Under the Hood of Clarity and Celsius Solvers
  • Thermal Analysis of Protium X1

That last post is probably the most relevant for data centers. Protium is our FPGA prototyping system, but in this context it is a full rack intended to go in an enterprise data center. It contains large FPGAs running fast and so produces a lot of heat that needs to be dumped into the data center itself by fans blowing air across heatsinks and baffles. So it is very similar to the problem of designing a rack of server "pizza boxes" and cooling them.
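For a feel of the airflow involved, here is a rough sketch of the standard heat-removal calculation; the rack power and allowed air temperature rise are assumed numbers:

    # Rough airflow needed to carry heat out of one rack (all numbers assumed)
    power_w = 10_000             # heat load of a hypothetical 10 kW rack
    delta_t = 15.0               # allowed air temperature rise through the rack, in deg C
    rho, cp = 1.2, 1005.0        # density (kg/m^3) and specific heat (J/kg/K) of air

    flow_m3_per_s = power_w / (rho * cp * delta_t)
    print(f"{flow_m3_per_s:.2f} m^3/s, about {flow_m3_per_s * 2119:.0f} CFM")
    # ~0.55 m^3/s, i.e. well over a thousand CFM of air through a single 10 kW rack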

Learn More

See the Celsius Thermal Solver product page.

https://www.youtube.com/watch?v=awumTiOkeQY

 

Sign up for Sunday Brunch, the weekly Breakfast Bytes email.