How long do you expect your electronics to last?
Depends on the device, I expect. A mobile phone may last a few years; a laptop may last five years or so; a television may last ten years or more. What about an automobile? Maybe fifteen years? (When I say a "device", I mean a thing like a car or a TV; when I mean a transistor, I say so.)
Now think of the circumstances in which we expect our electronics to work perfectly. A cell phone and a laptop have to be drop-resistant, and the more waterproof the better. A television has it the easiest: it only has to work at room temperature and isn’t likely to be dropped. What about a car? We expect it to work the same in Death Valley on the hottest day of the summer and in Minnesota on the coldest day of winter, a difference of probably 125°F or more. We can drive it for eight hours in a row when we take road trips, or let it sit idle for days at a time. It has to continue working even after we take this curve a little too sharply or misjudge the depth of that pothole.
Occasionally we’ll purchase a dud. The washing machine will die after the first week of service, the refrigerator will stop cooling, the transmission on the car will go poof. It happens. (That’s why there are so-called lemon laws!)
A graph called a bathtub curve describes three stages of failure: early failures, due to defects in parts (often referred to as "early mortality"); consistent or random failures, due to circumstances of usage; and wear-out failures, due to the end of life of the device. For every device you have ever had that has failed, you can probably point to the position on that graph where it failed.
The bathtub curve
Transistors follow this bathtub curve, too. They may suffer from early mortality. (If a part is critical, we actually run it at high temperature and then re-test it, a process known as "burn-in".) Then there is the useful life, followed by the wear-out. Transistors get old and fail, too.
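To make the shape of the curve concrete, here is a minimal sketch that builds a bathtub-shaped failure rate by adding three terms: a falling Weibull hazard for early mortality, a constant rate for random failures, and a rising Weibull hazard for wear-out. Combining Weibull terms this way is a common textbook approach, not something taken from this article, and every number below (shapes, scales, the constant rate) is a made-up illustrative value rather than measured data.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative bathtub-shaped failure rate; all parameters are assumptions."""
    early = weibull_hazard(t_years, shape=0.5, scale=20.0)   # shape < 1: rate falls with age
    random_rate = 0.01                                        # constant rate during useful life
    wearout = weibull_hazard(t_years, shape=5.0, scale=15.0)  # shape > 1: rate rises with age
    return early + random_rate + wearout


for t in (0.1, 1, 5, 10, 15):
    print(f"{t:>5} years: failure rate {bathtub_hazard(t):.4f} per year")
```

With these assumed values, the printed rate starts high, drops to a floor in the middle years, and climbs again near the end, which is the bathtub profile described above.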
Chips are tested after manufacture, of course, and confirmed to work. But we obviously can’t make sure every chip will last ten years by putting it on a tester for a decade; that has to be done during design by analyzing the reliability.
That analysis is done by some of Cadence’s customers, the reliability engineers.
When an analog component in a device fails, it’s usually for one of three reasons: the part is defective and its defect wasn’t discovered during the test phase of development; the part suffered thermal overstress (caused either by the ambient temperature or by the amount of power being processed); or the component has reached the end of its lifetime, however long that may be.
The neat thing is that Cadence has a product called the Legato Reliability Solution that addresses each of these concerns.
When you’re designing a device down to the integrated circuit (IC) level, you obviously can’t go around manufacturing test chips by trial and error until you get it right. You have to design the chip and then simulate the design, and part of that simulation is predicting where manufacturing errors may occur. This prediction was traditionally done by hand, but the Legato Reliability Solution includes an analog defect component that automates the process.
So that’s cool.
When it comes to designing ICs, the main thing to remember is that heat = bad. Thermal overstress can happen for two reasons: either the ambient temperature is too high, or there are places on the IC where too much power is flowing through too few transistors. Thermal overstress causes transistors to wear out faster. It’s not so much that they get too hot for a few minutes; it’s that the higher the temperature, the faster they wear out, and the harder it is to design the reliability in. The big issue is not so much that the transistors fail outright, but that their characteristics change, so the circuit has to be designed to still work correctly late in life, after those changes have happened.
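How much faster is "faster"? One widely used rule of thumb for temperature-driven wear-out is the Arrhenius model, which gives an acceleration factor between two temperatures. The sketch below is only an illustration of that general model: the article does not say which model any particular tool uses, and the activation energy of 0.7 eV and the two temperatures are assumed example values, not data for a real process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV per kelvin


def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev):
    """How many times faster wear-out proceeds at t_stress_c than at t_use_c.

    Temperatures are in degrees Celsius; the Arrhenius form itself is an
    assumption about the failure mechanism, not a universal law.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Assumed example: a part that normally sits at 55 C versus one running at 125 C.
factor = arrhenius_acceleration(t_use_c=55.0, t_stress_c=125.0, activation_energy_ev=0.7)
print(f"Wear-out is roughly {factor:.0f}x faster at the higher temperature")
```

The same relationship is the reason a burn-in at elevated temperature can expose weak parts quickly: under this model, a short time hot stands in for a much longer time at normal operating temperature.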
So the Legato Reliability Solution includes a component that builds a thermal model of your design, so you can see the interaction between electrical behavior and temperature and make the proper design decisions to mitigate the electro-thermal risks.
Just as we expect different lifetimes from different devices, the electronic components inside them must meet those expectations. Designers must find a way to accurately predict the effect of stress over the lifetime of a device; otherwise, the device can wear out and fail prematurely, or the characteristics of a transistor can change so much (in analog) that its part of the chip stops working correctly.
And not all transistors are created equal. A little transistor might be used once during an operation, and another bigger one might drive a big off-chip load. They may even run at different voltages. They’re going to be operating at different temperatures, even though they’re all transistors. So we need to account for that.
There are three ways to address this aging problem. The first is to improve the models: small errors in models now make a big difference over fifteen years. The second is for the simulations to take aging, temperature, and variation into consideration all together. The third is to make the stress simulations appropriate for the length of usage of the device. What works for an inexpensive cell phone that has to last for a few years may not work for a car that has to last decades.
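To see why small model errors matter more for long-lived products, here is a minimal sketch. It assumes a transistor parameter drifts as a power law of stress time (a simplified form often used for aging, chosen here only for illustration) and that two models agree exactly at a 1,000-hour calibration point but differ slightly in the exponent. The exponents, the 10 mV calibration drift, and the function names are all hypothetical.

```python
HOURS_PER_YEAR = 365.25 * 24


def extrapolated_drift_mv(years, exponent, cal_drift_mv=10.0, cal_hours=1000.0):
    """Drift predicted by a power law anchored at an assumed calibration point."""
    hours = years * HOURS_PER_YEAR
    return cal_drift_mv * (hours / cal_hours) ** exponent


def relative_error(years, true_exponent, model_exponent):
    """Fractional error of the model's prediction versus the 'true' drift."""
    true_drift = extrapolated_drift_mv(years, true_exponent)
    model_drift = extrapolated_drift_mv(years, model_exponent)
    return (model_drift - true_drift) / true_drift


# Assumed exponents: the model is off by 0.02 relative to the 'true' value of 0.20.
for years in (3, 15):
    err = relative_error(years, true_exponent=0.20, model_exponent=0.22)
    print(f"{years:>2} years: prediction off by {err:.1%}")
```

Under these assumptions the two models are indistinguishable at the calibration point, yet the gap between them keeps growing with time, so the same modeling error costs more margin in a fifteen-year car than in a three-year phone.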
We’re always trying to come up with more realistic ways of describing how things will be used.
The next time you use a device (and you’re using one right now, whether it’s your laptop or your phone), think about the unbelievable amount of work that went into designing it. How is it possible that the parts of your phone’s camera, for example (an image sensor, a processor chip, and a memory component), can work together reliably for the entire life of your phone?
Part of it has to do with the testing and simulation that the designers of your device performed before it was manufactured. Future products will be more reliable because now designers can rely on the Cadence Legato Reliability Solution.