Chiplet Summit: Challenges of Chiplet-Based Designs

3 Feb 2023 • 9 minute read

I wrote the first post, The Chiplet Summit, from the recent Chiplet Summit in San Jose. If you have not seen that, you should probably read it first.

A leitmotiv of the conference was:

Moore's Law is dead. All we have left is packaging.

As I said in the final summary paragraph of my earlier post:

The situation today is that single-company multi-chiplet designs are shipping in volume, tentative steps are being made with some chiplets to build ecosystems of partners around them, and the dream of a chiplet store is sufficiently far off as to remain a dream for the time being.

Today, I want to look at the technical issues that will require solutions to be able to do chiplet-based designs with chiplets from multiple companies who did not pre-plan making those specific chiplets work together. The analogy is how you can buy chips from different manufacturers and put them together on a PCB and have a working system, even though the companies that designed the chips never planned that specific system.

[Figure: chiplet vs. SoC]

2.5D vs 3D

First, a clarification. We are talking about putting chiplet-based designs together on an interposer (either silicon or organic). We are not talking about true 3D design where multiple die are stacked on top of each other. Designs like this are shipping (for example, Sony's image sensors have a three-die stack with logic, memory, and the sensor itself). However, stacking multiple die typically requires through-silicon vias (TSVs), so the die need to be very carefully designed so that everything aligns. I think it will be a very long time, if ever, before you can expect die from different vendors to stack in true 3D fashion. For now, any true 3D die stacks will be designed by a single company partitioning a large design into multiple die. There are also major thermal challenges, not just challenges aligning all the TSVs. See my post, Design Enablement of 2D/3D Thermal Analysis and 3-Die Stack, for some preliminary work that Cadence is doing jointly with imec on just these topics.

So for the rest of this post (and for the foreseeable future), "chiplet-based design" means 2.5D design.

Interchange Formats

If you are going to do a chiplet-based design, your design tools need to be able to read in something that describes the important aspects of the chiplets. There are two important standardization efforts.

First, TSMC announced 3Dblox at OIP last October. For details, see my post, TSMC OIP: 3DFabric Alliance and 3Dblox. 3Dblox is an open standard which (quoting from that post):

3Dblox provides generic language constructs capable of representing all current and future 3D-IC structures

Modularize the 3D-IC structures such that EDA tools and design flow can be simpler and efficient

Ensure standardized EDA tools and design flows are compliant to TSMC 3DFabric technology

[Figure: Cadence support for 3Dblox]

I won't say more about 3Dblox here. Read the earlier post for a deeper dive into some of the details. Cadence's tool portfolio supports 3Dblox (all the green dots in the above table).

[Figure: CDXML]

The second standard is called CDXML, which stands for Chip Data Exchange Markup Language. This standard was developed by OCP, the Open Compute Project Foundation. On the first day of the Chiplet Summit, it was announced that JEDEC is working with OCP on this standard, and it will be incorporated into JEP30, JEDEC's Part Model Guidelines.

At least for now, then, we have two open standards: 3Dblox and CDXML.

Communication Standards

I covered communication standards in my first post from the summit. But the summary is that there are two viable standards right now, Bundle-of-Wires (BoW), which is being used for designs in progress today, and UCIe (Universal Chiplet Interconnect Express), where IP is coming to market. See my post, UCIe PHY and Controller—To Die For. However, the UCIe standard was regarded by participants at the summit as "not yet completely ready," although with Intel, AMD, Arm, Google, Meta, Qualcomm, and more behind it, it is expected to be the eventual winner.

Known Good Die

[Figure: 2.5D chiplet-based design]

Packaging multiple chiplets into a single package is different from just using a single die. If you are using a single die, there is a tradeoff between the cost of a package and the cost of doing a good job of wafer sort. Testers are expensive, and so doing "too good" a job of testing the die before the wafer is diced is wasting money. Of course, packages cost money too, so you don't want to waste too many of them. But when you do waste a package because the die was bad, you are not wasting a die since it was already bad.

The economics are completely different with multiple die in a package. If a die escapes wafer sort and is bad, then when it is packaged with all the other die, you are not just wasting a single bad die (and the cost of the package), you are wasting all the good die in the same package too. Plus the cost of a package for multiple chiplets is a lot higher than a package for a single chiplet. There is thus a major premium on testing each die thoroughly before it enters the assembly process. These die are known as KGD for Known Good Die. For more details, see my post Known Good Die.
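To put rough numbers on this, here is a minimal sketch of the cost per working package. It assumes each die escapes wafer sort as bad independently of the others, and every cost and escape rate in it is a hypothetical value chosen for illustration, not a figure from the summit.

```python
def package_yield(escape_rates):
    """Probability that every die in the package is good, assuming
    independent test escapes (an illustrative simplification)."""
    y = 1.0
    for p in escape_rates:
        y *= (1.0 - p)
    return y


def cost_per_good_package(die_costs, escape_rates, package_cost):
    """Spend per assembled unit divided by the fraction of units that work."""
    spend = sum(die_costs) + package_cost
    return spend / package_yield(escape_rates)


# Hypothetical numbers: one die in a cheap package vs. four die in an
# expensive package, with loose (2%) and tight (0.1%) escape rates.
single = cost_per_good_package([100.0], [0.02], package_cost=5.0)
multi_loose = cost_per_good_package([100.0] * 4, [0.02] * 4, package_cost=50.0)
multi_kgd = cost_per_good_package([100.0] * 4, [0.001] * 4, package_cost=50.0)

print(f"single die, 2% escapes:  {single:.2f}")
print(f"4 die, 2% escapes:       {multi_loose:.2f}")
print(f"4 die, 0.1% escapes:     {multi_kgd:.2f}")
```

Under these assumptions, the package yield is the product of the individual die yields, so the penalty for a loose wafer sort compounds with every die added to the package.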

There are some things that can be done to optimize the packaging process, such as planning to be able to test a package with only some of the die already inserted. This allows the cheap die to be put in early and then tested, and then the expensive die (like a CPU or GPU in the most advanced node) to be put in at the end. This avoids the problem of a very expensive part being sacrificed due to the failure of a very cheap part.
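The benefit of putting the cheap die in first can be sketched in the same spirit. The function below is a simplified model with hypothetical parameters: it assumes independent cheap-die failures and an intermediate test that catches some fraction (`mid_test_coverage`) of bad partial assemblies before the expensive die is attached.

```python
def expected_expensive_die_loss(p_cheap_bad, n_cheap, expensive_cost,
                                mid_test_coverage):
    """Expected value of good expensive die scrapped per assembly attempt
    because a bad cheap die was already in the package.

    mid_test_coverage = 0.0 models assembling everything at once;
    1.0 models a perfect intermediate test (an idealization).
    """
    # Probability that at least one of the cheap die is bad.
    p_any_bad = 1.0 - (1.0 - p_cheap_bad) ** n_cheap
    # Only bad partial assemblies that slip past the intermediate test
    # go on to consume an expensive die.
    p_slips_through = p_any_bad * (1.0 - mid_test_coverage)
    return p_slips_through * expensive_cost


# Hypothetical numbers: four cheap die at 1% bad each, one 1000-unit CPU die.
all_at_once = expected_expensive_die_loss(0.01, 4, 1000.0, mid_test_coverage=0.0)
staged = expected_expensive_die_loss(0.01, 4, 1000.0, mid_test_coverage=0.95)

print(f"all at once: {all_at_once:.2f}")
print(f"staged:      {staged:.2f}")
```

The model ignores the cost of the extra test step itself, which in practice has to be weighed against the expensive die it saves.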

Testing

Testing multi-chiplet designs (and even true 3D designs) is covered by IEEE 1838-2019 - IEEE Standard for Test Access Architecture for Three-Dimensional Stacked Integrated Circuits. For more details, see my post, IEEE 1838: Taking Test into the Third Dimension. This covers how to test designs when all you have access to is the package pins.

Security

There are a lot of issues around security. You probably know that the modern way to handle security is with a hardware root of trust. For an example, see my post, OpenTitan: Secure Boot with a Silicon Root of Trust or Putting the Bad Guys in an Arm Lock. With a chiplet-based system, the first thing that you need to decide is whether you trust all the chiplets, or if there is a possibility that a bad guy has somehow compromised one or more of the chiplets that you are acquiring from quasi-strangers. The next thing you need to decide is whether to have one chiplet in charge of security (containing the secure enclave with the keys, etc.), which then validates that all the other chiplets are secure. If a lot of the chiplets contain microprocessors that need to be booted, then this can be handled centrally, or perhaps each chiplet has to handle its own secure boot.

As Scott Best of Rambus Security pointed out, a 5nm design is so complex it is close to impossible to design, never mind reverse-engineer. But a chiplet-based design is easier:

When you break it down into chiplets, the SiP is only as good as the least secure chiplet.

[Figure: challenge-response]

To make it worse, while it is pretty much impossible to monitor a lot of internal signals on a 5nm chip, monitoring the signals on the interposer in a multi-chiplet design is much more feasible. In practice, this means that communication between the chiplets of anything security-related needs to be encrypted. Of course, since these chiplets were never designed specifically to work together, this is not simple. The usual way to handle this is with some form of challenge-response, but this needs to be designed into each chiplet. In practice, there will need to be some sort of security standard for chiplets developed.
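For readers who have not met challenge-response before, here is a minimal sketch of the idea in software, using only Python's standard library. It is not taken from any chiplet standard: the `Chiplet` and `RootOfTrust` classes are hypothetical, and it assumes a symmetric key shared between the verifier and each genuine chiplet, whereas a real scheme might well use asymmetric cryptography. It also covers only the authentication handshake, not encryption of the traffic that follows.

```python
import hashlib
import hmac
import secrets


class Chiplet:
    """Hypothetical chiplet holding a secret key provisioned at manufacture."""

    def __init__(self, key: bytes):
        self._key = key

    def respond(self, challenge: bytes) -> bytes:
        # Prove knowledge of the key without ever sending the key itself
        # across the (observable) interposer.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class RootOfTrust:
    """Hypothetical verifier that shares a key with each genuine chiplet."""

    def __init__(self, key: bytes):
        self._key = key

    def authenticate(self, chiplet: Chiplet) -> bool:
        # A fresh random challenge each time means a recorded response
        # cannot simply be replayed later.
        challenge = secrets.token_bytes(32)
        expected = hmac.new(self._key, challenge, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking information via timing.
        return hmac.compare_digest(expected, chiplet.respond(challenge))


shared_key = secrets.token_bytes(32)
verifier = RootOfTrust(shared_key)
genuine = Chiplet(shared_key)
counterfeit = Chiplet(secrets.token_bytes(32))  # does not know the key

print(verifier.authenticate(genuine))
print(verifier.authenticate(counterfeit))
```

An eavesdropper on the interposer sees only random challenges and their keyed hashes, which is why the logic has to live inside every chiplet rather than being bolted on afterwards.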

Oh, and don't forget about side channels, such as Differential Power Analysis (DPA). If you don't know what that is, see my post, EDPS Cyber Security Workshop: "Anything Beats Attacking the Crypto Directly." Or glitching the power supply or the clock. See my post, Black Hat: Glitching Microcontrollers.

Finger Pointing

What happens when there is a failure? How do you find which chiplet is responsible, given that you probably don't understand all the internal details of all the chiplets you purchased?

Although some people regard this as a big problem, I'm not sure it is all that different from working out which IP block is responsible for an SoC failure or even which chip on a board is responsible for a board-level failure. One approach is to anticipate that this might happen and have ways of enabling and disabling various aspects of the system. In a microprocessor, these are known as "chicken bits" (Google insists I meant "chicken bites" and provided me with lots of recipes).
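As a rough illustration of how chicken bits help with isolation, the sketch below disables one feature at a time and records which ones make a failing test pass. The bit names and the `run_test` stand-in are hypothetical, invented purely for this example; real chicken bits are specific to each design.

```python
from enum import IntFlag


class ChickenBits(IntFlag):
    """Hypothetical feature-disable bits; names are illustrative only."""
    DISABLE_PREFETCHER = 0x1
    DISABLE_BRANCH_PREDICTOR = 0x2
    DISABLE_D2D_LINK_RETRY = 0x4


def isolate_suspects(run_test, all_bits=ChickenBits):
    """Disable one feature at a time and return the features whose
    disabling makes the otherwise-failing test pass."""
    suspects = []
    for bit in all_bits:
        if run_test(bit):
            suspects.append(bit)
    return suspects


def run_test(disabled: ChickenBits) -> bool:
    # Stand-in for a real system test: in this simulation the bug lives
    # in the prefetcher, so the test passes only when it is switched off.
    return bool(disabled & ChickenBits.DISABLE_PREFETCHER)


suspects = isolate_suspects(run_test)
print(suspects)
```

One-at-a-time disabling only finds failures with a single cause; a failure that needs two features interacting would require trying combinations of bits.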

Special Markets

Two almost random things that came up during the summit that I don't really have anywhere else to put.

Supercomputers, the very highest end of HPC, have almost always used COTS parts, "commercial off-the-shelf" parts like Intel/AMD CPUs, NVIDIA GPUs, FPGAs, and so on. As Lawrence Berkeley Laboratory's John Shalf said:

We know we cannot afford to spin our own chips from scratch.

So for him, chiplets are an opportunity. They can use commercial chiplets (COTC?) and integrate them tightly into systems.

The second random thing is automotive. The automotive industry is negative on chiplets since the mechanical issues associated with all the vibration in a car can lead to reliability problems. And remember, cars are expected to work for twenty years. On the other hand, autonomous driving is going to hit up against the reticle limit like everything else is doing, and so the industry may as well "bite the bullet" since they will need to use chiplets anyway. As it happens, the leader of the Cadence Academic Network in Europe, Anton Klotz, was at an automotive conference this week and received a similar message:

The volume of autonomous driving chips is not sufficient to justify the costs. By combining chiplets from different vendors on one interposer the total costs are expected to go down

Chiplets are much more energy efficient than PCBs; therefore this kind of integration is needed in order to increase the range of E-cars and still offer maximum performance

Cadence obviously has a whole portfolio of products to design chiplets. But it also has products focused on doing chiplet-based designs. See my posts:
