Way back in the 1960s, E. Rent, who was working at IBM at the time, noticed a connection between the number of pins P on the integrated circuits being used and the number of gates G on those integrated circuits. It was a power law: the number of pins was cG^ρ, where c and ρ are constants (traditionally a Greek rho is used for the exponent). The exponent ρ usually has a value between 0.5 and 0.8. If ρ is 0.5, then the number of pins is proportional to √G, which is what I remember from my VLSI design classes. So the number of pins grows more slowly than the number of gates, but it grows inexorably. This is now known as Rent's Rule. We can combine Rent's Rule with Moore's Law, that the number of gates G doubles every two years, and so every two years the number of pins P grows by a factor of √2, about 1.4.
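To make the scaling concrete, here is a minimal sketch of Rent's Rule in Python. The constants c and ρ below are purely illustrative, not taken from any real chip:

```python
# Rent's Rule: P = c * G**rho. The values of c and rho here are
# illustrative assumptions, not measurements from a real design.

def rent_pins(gates, c=2.5, rho=0.5):
    """Estimate pin count P from gate count G via Rent's Rule."""
    return c * gates ** rho

# Moore's Law doubles G every two years, so P grows by 2**rho per period.
gates = 1_000_000
for generation in range(4):
    print(f"{gates:>10,} gates -> {rent_pins(gates):,.0f} pins")
    gates *= 2

# With rho = 0.5, each doubling of gates multiplies pins by sqrt(2) ~ 1.41.
```

Running this shows pins climbing by roughly 1.41x per generation while gates double, which is exactly why pin count grows more slowly than gate count but never stops growing.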
Until the invention of the ball-grid-array (BGA), we didn't have a huge number of pins to play with, so this was a problem. I think the largest quad-flat-packs (QFP) had 256 pins. The first trick was to multiplex pins, essentially using the same pin for two different functions.
So the next advance was to invent the BGA. This was done by Motorola in about 1990 (under the name overmoulded plastic carrier or OMPC). That meant we could get a lot more pins onto a plastic package of the same size than we could by putting all the pins around the edge.
As address spaces went from 16 bits, to 32 bits, to 64 bits, the demand for pins was insatiable. With 64 bits of address and 64 bits of data, that's already 128 pins even before worrying about anything else. It became increasingly impractical to find enough pins to have clocked parallel interfaces where data was transferred only on clock edges. One minor development was the approach taken with DDR memories, where data was transferred on both edges of the clock (DDR stands for double data rate).
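The DDR trick is simple arithmetic: same clock, twice the edges, twice the transfers per pin. A quick sketch (the 200 MHz clock is just an assumed example figure):

```python
# Illustrative arithmetic only: transfers per second on a clocked parallel bus.

def transfers_per_second(clock_hz, edges_per_cycle):
    """SDR uses one clock edge per cycle; DDR uses both rising and falling."""
    return clock_hz * edges_per_cycle

clock = 200_000_000  # a hypothetical 200 MHz memory clock
sdr = transfers_per_second(clock, 1)  # single data rate
ddr = transfers_per_second(clock, 2)  # double data rate

print(f"SDR: {sdr / 1e6:.0f} MT/s, DDR: {ddr / 1e6:.0f} MT/s")
```

So DDR doubles the data moved per pin without needing a faster clock, postponing (but not solving) the pin shortage.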
The final piece of the puzzle was to abandon trying to have clocked parallel interfaces, and switch to serial interfaces. Instead of using 64 pins to transfer a value, it could be transferred on a single pin at a much higher data rate. Of course, long-distance networks had always used just a single signal (well, usually two, since you need a return path). For example, the first Ethernet ran serially over coaxial cable. I remember coding a device driver for a synchronous IBM box which ran over a twisted pair across to the Edinburgh University Computing Center.
So the next development was to go to serial interfaces, and make up for the fact that you only had one or two pins by running them very fast. The Optical Internetworking Forum (OIF) had defined some standard data rates for use with fiber, and everyone pretty much adopted the standard rates: 3.125G, 6G, 10G, 28G, 56G, and 112G (where G is actually gigabits per second). The higher of these rates really are astoundingly fast. Even a single 10G signal is 156 million 64-bit transfers per second on one pin.
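You can check that 10G figure, and see how the other standard rates compare, with a few lines of arithmetic (this is the raw bit rate, ignoring line-coding overhead such as 64b/66b):

```python
# How many 64-bit words per second fit down one serial lane at the
# standard OIF rates? Raw bit rate only; real links lose a little
# to line coding and protocol overhead.

RATES_GBPS = [3.125, 6, 10, 28, 56, 112]
WORD_BITS = 64

for rate in RATES_GBPS:
    words_per_sec = rate * 1e9 / WORD_BITS
    print(f"{rate:>7}G -> {words_per_sec / 1e6:,.2f} million 64-bit transfers/s")
```

At 10G that comes out to 156.25 million 64-bit transfers per second, matching the figure above; at 112G it is over 1.7 billion.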
At the end of each connection is an on-chip component called a SerDes which stands for serializer-deserializer. On the chip, signals are usually routed around using wide buses. To go from one chip to the next, that parallel wide-bus data has to be serialized at the transmitter, and deserialized at the receiver. Note, also, that serial interfaces are point-to-point. If you want to fan out to multiple chips then you need multiple serial interfaces.
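The core serialize/deserialize step can be modeled in a few lines. This is only a toy sketch of the concept: a real SerDes also handles clock recovery, equalization, and line coding, none of which appear here.

```python
# Toy model of the SerDes idea: wide parallel data on-chip, one bit at a
# time between chips. LSB-first bit order is an arbitrary choice here.

def serialize(word, width=64):
    """Parallel-to-serial: emit the word as a list of bits, LSB first."""
    return [(word >> i) & 1 for i in range(width)]

def deserialize(bits):
    """Serial-to-parallel: reassemble the word from the bit stream."""
    word = 0
    for i, bit in enumerate(bits):
        word |= bit << i
    return word

value = 0xDEADBEEFCAFEF00D
stream = serialize(value)          # 64 bits out of one "pin"
assert deserialize(stream) == value  # round-trip is lossless
```

Note that the round trip is exact: the receiver rebuilds precisely the word the transmitter sent, which is why a single fast lane can stand in for a 64-pin parallel bus.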
The current state-of-the-art is 56G or 112G. See my earlier post The World's First Working 7nm 112G Long Reach SerDes Silicon.
At the recent virtual TSMC OIP Ecosystem Forum, Cadence's Wendy Wu presented Not All 112G/56G SerDes Are Born Equal — Select the Right PAM4 SerDes for Your Application.
She pointed out the growing need for 56G and 112G driven by networking, AI, and 5G:
There are different 56G/112G applications and tradeoffs, depending on the distance between the transmitter and the receiver. As you can see in the image below, there are four distances: long reach (LR), medium reach (MR), very short reach (VSR), and extra short reach (XSR).
These all have different channel losses and limitations. For example, it would make no sense to use a long-reach connection to connect two die within the same package. LR is all about performance, but XSR (D2D) is all about power and beachfront (just like beaches in Malibu, there's only a certain amount of "edge" to the die where the signals have to get on and off). This means that one solution will not fit all.

Some solutions are analog and some are based on DSP approaches. DSP is more powerful (it can equalize 40dB+ of loss) but tends to require more area and power. However, DSP approaches can leverage the latest process generations such as N7 and N6. Analog approaches can equalize less than 20dB of loss, but have better density and lower power, especially in less leading-edge process nodes.
Cadence's 56G and 112G SerDes IP blocks are silicon-proven in TSMC's 16FF and N7 processes, and committed for N5.
And on the day of the TSMC Technology Symposium, we announced that the Ultralink D2D IP was certified on the TSMC N6 process. It is also taped out on N5 but the silicon is still in the fab. See my post Cadence Certified on TSMC N3, Ultralink on N6, and 3DFabric. For more about the D2D PHY IP on N7, see my post from last year Die-to-Die Interconnect: The UltraLink D2D PHY IP.
Sign up for Sunday Brunch, the weekly Breakfast Bytes email.