Paul McLellan

Legato: Making the Bathtub Wider and Deeper

22 Apr 2020 • 5 minute read

Yesterday's post Automotive Reliability: The Bathtub Curve introduced automotive reliability and how transistors suffer from early life failure, hopefully nothing in mid-life, and then aging and wearout. This is summarized in the diagram below known as a bathtub curve.
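One common way to picture the bathtub shape numerically is as the sum of three failure rates: a falling one for early life, a flat one for mid-life, and a rising one for wearout. The sketch below does that with Weibull hazards. The shape, scale, and rate values are made up for illustration only and do not come from any real part.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t, early=(0.5, 50.0), random_rate=1e-4, wearout=(5.0, 20.0)):
    """Sum of early-life, constant, and wearout hazards (t in years).

    All default parameters are illustrative assumptions.
    """
    return (weibull_hazard(t, *early)       # shape < 1: failure rate falls with time
            + random_rate                    # constant mid-life failure rate
            + weibull_hazard(t, *wearout))   # shape > 1: failure rate rises with time


for years in (0.01, 1.0, 10.0, 25.0):
    print(f"t = {years:5.2f} years, hazard = {bathtub_hazard(years):.4g} per year")
```

With these numbers the rate is high at the very start, lowest around the ten-year mark, and high again at 25 years, which is the bathtub.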

So what can we do about it? Some things are up to the manufacturer. Others are up to the designer, with the help of the Cadence Legato Reliability Solution.

In the early life failure stage, there are two primary things that the manufacturer can do. The first is to ensure that the manufacturing test program catches everything, so that there are no test escapes that become field failures. Layout-aware test is one technique that can increase coverage. The second is burn-in. Many failures, and effects that may eventually cause failure, are accelerated at higher temperatures. Electromigration (metal migration) is a big one. So it is possible to put chips through the equivalent of years of use in a few days. So when I said that the bathtub is not to scale, it is really not to scale. The dark gray part may represent less than a week, the middle mid-gray part may represent a couple of decades, and then there is another period, hopefully measured in years, before a transistor wears out. We don't actually care how long it takes for them all to wear out, since the component will fail long before then.
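To get a feel for how a few days of burn-in can stand in for a much longer period of use, here is a minimal sketch using the Arrhenius model, a standard way to estimate temperature acceleration. The activation energy, the temperatures, and the burn-in duration are assumptions picked for illustration. Real values depend on the failure mechanism and the process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev):
    """Acceleration factor of stress temperature relative to use temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((activation_energy_ev / BOLTZMANN_EV_PER_K)
                    * (1.0 / t_use_k - 1.0 / t_stress_k))


# Assumed values: 0.7 eV activation energy, 55 C use, 150 C burn-in, 48 hours.
af = arrhenius_acceleration(t_use_c=55.0, t_stress_c=150.0, activation_energy_ev=0.7)
burn_in_hours = 48.0
equivalent_years = burn_in_hours * af / (24 * 365)
print(f"acceleration factor = {af:.0f}")
print(f"{burn_in_hours:.0f} h of burn-in is roughly {equivalent_years:.1f} years at use temperature")
```

The result is very sensitive to the assumed activation energy, which is one reason this kind of estimate is only a starting point.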

In the middle, the main thing that the vehicle manufacturer can do is to prevent thermal stress by designing systems to be adequately cooled and by using good power management techniques on the chip (in fact, on the whole electronic control unit).

Wearout cannot be postponed indefinitely, so the goal is to extend the useful life as long as possible, to the lifetime of the car at a minimum, and, especially, to ensure that no failures happen earlier than forecast. In fact, that is essential. Some analog problems require 15 years to develop, with no efficient way to accelerate the stress, meaning that they have to be addressed by accurate up-front analysis. This is the topic of the rest of this post.

Legato Reliability Solution

There are things that can be done at the design stage to analyze these effects and predict the reliability for the lifetime of the part, given its expected duty cycle and environment.

Cadence's tool for this space, the Legato Reliability Solution, focuses on analyzing the performance of an analog design for potential issues, and on improving the test program. There are three main components:

Analog Defect Analysis is really the analysis of the early life failure stage (the left end of the bath). It has two main thrusts. One is to improve the efficiency of the tests by making designs easier to test and potentially reducing the number of tests (tester cost) required to achieve the target defect coverage. The second is to simulate the analog test, including defects, to estimate the test coverage of defective parts and ensure that a given defect would be "caught" by the test program. In a little more detail, analog defect analysis first identifies what manufacturing defects are possible, and collapses redundant faults. Then it performs analog fault simulation. Finally, it calculates the test coverage. Using the Virtuoso environment and the Spectre Accelerated Parallel Simulator can speed up the process by as much as 100X.
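The three steps (list the defects, collapse redundant faults, simulate and compute coverage) can be sketched in a few lines. This is a toy illustration, not how Legato works internally: the `Fault` record, the rule that two faults of the same kind on the same nodes are equivalent, and the table of "simulated" voltages standing in for a real fault simulation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    name: str          # hypothetical label for the defect
    nodes: frozenset   # circuit nodes the defect shorts together or disconnects
    kind: str          # "short" or "open"


def collapse(faults):
    """Keep one representative per (kind, nodes) pair (assumed equivalence rule)."""
    representatives = {}
    for fault in faults:
        representatives.setdefault((fault.kind, fault.nodes), fault)
    return list(representatives.values())


def coverage(faults, simulate, limits):
    """Return (fraction detected, detected faults) for one test with pass limits."""
    low, high = limits
    detected = [f for f in faults if not low <= simulate(f) <= high]
    return len(detected) / len(faults), detected


faults = [
    Fault("M1 drain-source short", frozenset({"out", "gnd"}), "short"),
    Fault("C1 plate short", frozenset({"out", "gnd"}), "short"),  # same nodes as M1 fault
    Fault("R1 open", frozenset({"vin", "n1"}), "open"),
    Fault("M2 gate-source short", frozenset({"n1", "n2"}), "short"),
]

# Made-up output voltages standing in for an analog fault simulation of each defect.
fake_results = {"M1 drain-source short": 0.0, "R1 open": 0.05, "M2 gate-source short": 1.21}

unique = collapse(faults)
cov, detected = coverage(unique, lambda f: fake_results[f.name], limits=(1.14, 1.26))
print(f"{len(unique)} faults after collapsing, coverage = {cov:.0%}")
```

Here the M2 fault leaves the output inside the pass limits, so it is a test escape under this single test, which is exactly the kind of gap the analysis is meant to expose.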

Electro-Thermal Analysis is all about preventing thermal overstress at the "bottom of the bath", during normal operation throughout the vehicle lifetime. Automotive chips often have to operate under the hood at temperatures up to 155°C, making high-power dissipation more challenging. By dynamically simulating temperature rise and temperature protection circuits, designers can avoid thermal failures during a product's useful life. Going down a level again, this means extracting a thermal model for the die from the design data, and updating each instance's temperature during simulation based on self-heating and heat transfer from adjacent devices.
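The idea of updating instance temperatures from self-heating and from neighboring devices can be illustrated with a small steady-state loop. This is a simplified sketch, not the dynamic simulation described above: the thermal resistance matrix, the linear dependence of power on temperature (`alpha`), and all numeric values are assumptions.

```python
def solve_temperatures(p_nominal, r_thermal, t_ambient, alpha=0.004, t_ref=25.0,
                       tol=1e-6, max_iter=200):
    """Iterate device temperatures until they are consistent with device power.

    p_nominal[i]   : power of device i at t_ref, in W (assumed)
    r_thermal[i][j]: temperature rise of device i per watt in device j, in K/W;
                     the diagonal is self-heating, off-diagonal is coupling
    alpha          : assumed fractional power increase per degree C
    """
    n = len(p_nominal)
    temps = [t_ambient] * n
    for _ in range(max_iter):
        # Power of each device at its current temperature.
        power = [p * (1.0 + alpha * (t - t_ref)) for p, t in zip(p_nominal, temps)]
        # New temperature = ambient + self-heating + heat from adjacent devices.
        new_temps = [t_ambient + sum(r_thermal[i][j] * power[j] for j in range(n))
                     for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_temps, temps)) < tol:
            return new_temps, power
        temps = new_temps
    raise RuntimeError("no convergence, possible thermal runaway")


temps, power = solve_temperatures(
    p_nominal=[0.5, 0.1],                  # W, made-up values
    r_thermal=[[20.0, 5.0], [5.0, 20.0]],  # K/W, made-up values
    t_ambient=125.0,                       # deg C
)
for i, (t, p) in enumerate(zip(temps, power)):
    print(f"device {i}: {t:.1f} C at {p:.3f} W")
```

The low-power device still ends up well above ambient because of the heat it receives from its neighbor.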

Aging Analysis is the right-hand end of the bath, end of life. It is typically focused on wear-out due to "use" of the transistor. But aging is also accelerated by temperature and process variation (especially for FinFETs, which have a nasty self-insulating property due to their shape, compared to planar transistors). This holistic approach allows designers to achieve their design lifetime targets with less over-design. Advanced aging analysis works by taking the circuit netlist and device model, adding a reliability model, and then adding a mission profile. Cars are not run 24 hours per day (heart pacemakers are at the other extreme, not stopping for many years once they are turned on). The mission profile contains the temperature, on/off time, and burn-in parameters. Spectre Native Reliability and RelXpert are then used for self-heating analysis and to run Monte Carlo analysis. They produce device/age information, lifetime/degradation information, and aged simulation (how the device will perform at any point in its life, although towards the end is the most important).
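To show why the mission profile matters, here is a rough sketch that folds a profile of temperatures and powered-on hours into an equivalent stress time, then feeds it to a simple power-law degradation model. It is not the Spectre or RelXpert reliability model. The Arrhenius weighting, the power law, the choice to ignore off time entirely, and every number in the profile are assumptions for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def equivalent_hours(profile, t_ref_c, ea_ev):
    """Convert (temperature C, powered-on hours) phases to hours at t_ref_c."""
    t_ref_k = t_ref_c + 273.15
    total = 0.0
    for temp_c, hours in profile:
        factor = math.exp((ea_ev / BOLTZMANN_EV_PER_K)
                          * (1.0 / t_ref_k - 1.0 / (temp_c + 273.15)))
        total += factor * hours
    return total


def vth_shift_mv(stress_hours, a_mv=2.0, n=0.2):
    """Assumed power law: threshold-voltage shift in mV = a_mv * hours**n."""
    return a_mv * stress_hours ** n


# Hypothetical mission profile; time with the car switched off adds no stress here.
mission = [
    (150.0, 48.0),     # burn-in
    (40.0, 2000.0),    # mild operation
    (105.0, 5000.0),   # typical under-hood operation
    (140.0, 500.0),    # hot, heavily loaded operation
]

hours = equivalent_hours(mission, t_ref_c=105.0, ea_ev=0.5)
print(f"equivalent stress time at 105 C: {hours:.0f} h")
print(f"estimated threshold shift: {vth_shift_mv(hours):.1f} mV")
```

Comparing this against an always-on profile at the worst-case temperature gives a sense of how much over-design a realistic mission profile can avoid.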

A Long and Happy Life

By doing everything described, we can accelerate early life failure (thus making the useful life start earlier), lower the failure rate through the life of the product, and postpone wearout (thus extending the useful life later). That gives a modified bathtub curve more like this:

Roman Emperors

Did you know Roman Emperors' lifetimes were a sort of bathtub curve? I happened to see an article about it at the start of the year: Roman Emperors Were More Likely Than Gladiators to Die Gruesome Deaths.

Here's one quote:

Saleh used a statistical method typically performed by engineers to see how long it takes equipment to fail. Many devices, when analyzed this way, fall into a pattern known as a bathtub curve. There are multiple failures when the device first hits the market. Then, failures taper off for a while. After devices have been around long enough to start wearing out, failures spike again, Saleh explained.

Here are the statistics for early life failure and wear out:

[Emperors'] risk of death was the highest during the first year in power. But if a ruler managed to survive his first year and stayed alive for the next seven years, his odds of dying declined significantly. However, that grace period lasted only four years. Once an emperor reached his 12th year in power, his odds of dying soared again, Saleh reported. Emperors who died after 12 years in power were more like devices suffering from wear-out failures.

We need Legato for Roman Emperors... "legato" is Italian, but it comes from the Latin "ligare", to tie.

 
