HBI, a New Standard to Connect Your Chiplets

11 Dec 2020 • 4 minute read

It is not widely known how involved Cadence is in establishing standards. Recently, in my post Cadence and Standards...and a New Codec for Your Phone, I wrote about this and about one particular standard, the new EVS (Enhanced Voice Services) codec. Today, I want to talk about another standard, this time one that is still in the works. It goes by the acronym HBI, for High Bandwidth Interconnect.

First, a little background. We all know that Moore's Law is slowing. There are lots of implications to this, but one has been the growth of what are catchily called More than Moore approaches to integration. Instead of always putting everything on a single die and then riding Moore's Law down the cost and performance track, we design a system with separate die and use advanced packaging to assemble them together. There are several motivations for this other than the slowing of Moore's Law: smaller die yield better, different die can be in different processes, not all parts of a design benefit from the most advanced node (especially analog), some designs are bigger than the maximum reticle size, and so on.

So if we design our system out of different die and then put them together in an advanced package, how do the die communicate? Sometimes these die are called chiplets. Other people reserve the term chiplet for die sold by third parties in the as-yet-nonexistent market for bare die.

High Bandwidth Memory (HBM)

But not quite nonexistent: there is a market for HBM. For example, AMD's latest microprocessors and gaming processors are built using multiple die and HBM. The picture above is from a Lisa Su keynote. The chips to the top right (the blue and green ones) have a big processor inside the big die, and the other die (4 in Rome, 2 in Matisse) are HBM.

HBM itself consists of a 3D stack of die connected using through-silicon vias (TSVs). There is a logic die on the bottom that handles all the interfaces, then 4-12 DRAM die on top of that.

Above is an AMD diagram of how they build their systems with HBM. In the center of the image you can see the connections between the CPU and the HBM stack. The protocol used across that interface is also called HBM, and is standardized by JEDEC (of which Cadence is a member), the focal point for all memory standards. Also note the PHYs on both the CPU and HBM stack. These are the parts of the circuit that directly drive the traces (physical layer) crossing the interconnect.

But what if the die you want to communicate with is not an HBM memory?

High Bandwidth Interconnect (HBI)

That is where the new proposed HBI standard comes in. HBI stands for High Bandwidth Interconnect and is a general standard for die-to-die (or d2d) communication. Cadence is involved with the standards body for this, the ODSA OpenHBI Initiative. The reason I pointed out the PHYs in the earlier AMD diagram is that the HBI standard will reuse the PHYs from HBM, both to leverage existing work and to simplify chiplets that communicate with both other chiplets and HBM memories. An objective of the initiative is that devices conforming to OpenHBI can also support JEDEC HBM devices.

There are several reasons for leveraging the existing HBM standard, such as:

It is a proven and mature standard
It is the highest-volume standards-based chiplet application
It is broadly deployed in GPUs, FPGAs, networking, AI, 5G, and many more applications
It is high performance and low energy, with an advanced roadmap going forward

The standard for HBI has not been finalized, so the current state of the parameters is confidential until something is published. But here is a slide from Xilinx's original proposal for HBI which, presumably, is fairly close to current discussions.

One advantage of die-to-die interconnect is that it doesn't need to deal with long, complex signal routes. The proposal is for a length of 3 mm. This considerably reduces the amount of equalization required at the receiver, and correspondingly reduces both area and power.

Another limitation with chiplets is that the communication has to take place at the edge. Just like building houses on Malibu beach, there is only a certain amount of beachfront. In fact, that same word is used for space on the edge of the chiplets. Servers and routers in hyperscale datacenters have a similar problem: they are limited by how many network connections fit along the width of the front of the server. The figure of merit for this is "beachfront bandwidth density," and OpenHBI looks to have over 1.5 Tb/s (aggregating Tx and Rx), with an energy budget of under 0.8 pJ/bit. Another subtlety about the edge of the chiplet is that OpenHBI needs to support both direct (signals in the same order) and rotated (signals in reverse order) connections.
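To get a feel for what those two numbers mean together, here is a small back-of-the-envelope sketch in Python. It treats the quoted 1.5 Tb/s as an aggregate data rate and multiplies it by the 0.8 pJ/bit energy figure to get link power. It also shows what "rotated" means for signal order. The function names and the lane names D0 to D3 are made up for illustration, and the figures are the draft targets quoted above, not final OpenHBI parameters.

```python
from typing import List, Sequence

# Illustrative only: draft targets quoted in the text, not final OpenHBI values.


def link_power_watts(bandwidth_bits_per_s: float, energy_pj_per_bit: float) -> float:
    """Link power = (bits per second) x (joules per bit)."""
    return bandwidth_bits_per_s * energy_pj_per_bit * 1e-12


def rotate_lanes(lanes: Sequence[str]) -> List[str]:
    """Return the signal order seen when the die edge is rotated (reversed)."""
    return list(reversed(lanes))


# 1.5 Tb/s moved at 0.8 pJ/bit
power = link_power_watts(1.5e12, 0.8)
print(f"Link power at these figures: {power:.2f} W")

# Hypothetical lane names, direct order versus rotated order
direct = ["D0", "D1", "D2", "D3"]
rotated = rotate_lanes(direct)
print("direct: ", direct)
print("rotated:", rotated)
```

At exactly those two figures the link would burn about 1.2 W. Since the targets are "over" 1.5 Tb/s and "under" 0.8 pJ/bit, a real design could land on either side of that number.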

The OpenHBI standard is not yet finalized, so any numbers mentioned in this post are subject to change. Like many standards, there will likely be further improvements in the future.

Cadence Products

Cadence already has two products roughly in this space that are available today: the Ultralink D2D interface (the current version has test silicon in 7nm and 6nm, and has taped out in 5nm), and the 112G XSR SerDes. Both have a bandwidth density of 500 Gbps/mm. Ultralink uses NRZ encoding and 112G uses PAM4 (with NRZ for backward compatibility at lower speeds). We also offer HBM2 and HBM2E IP blocks.
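A bandwidth density figure like 500 Gbps/mm translates directly into how much die edge a link needs. The short Python sketch below does that division. The 2 Tb/s target is a made-up example, not a number from any product or from the standard, and the function name is hypothetical.

```python
def beachfront_mm(target_gbps: float, density_gbps_per_mm: float) -> float:
    """Die-edge length needed = target bandwidth / bandwidth density."""
    return target_gbps / density_gbps_per_mm


# Hypothetical 2 Tb/s (2000 Gbps) die-to-die link at 500 Gbps/mm
needed = beachfront_mm(2000.0, 500.0)
print(f"{needed:.1f} mm of die edge")
```

So a hypothetical 2 Tb/s link would take up 4 mm of beachfront at that density.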
