Over the past weeks and months, we have all seen Toyota struggle to manage perception
around its "sudden acceleration" problems. The first fix
proposed was a replacement of the floor mats, on the argument that the mats
had been forcing the gas pedal down. Shortly after this first fix, Toyota
announced a recall to fix the mechanics of the gas
pedal, adding a small spacer to prevent the pedal from getting physically
stuck.
Even during the recall, a number of people came forward
stating that their pedal wasn't physically stuck, and that they were unable to
turn off the engine or shift the car into neutral. The implication is that
while the physical fixes may have helped, there might be a deeper underlying problem
in the electronics.
There are some good reference pages online (for example, a
report from Frost & Sullivan) that show how the automotive industry is
expanding its use of integrated electronics.
As far back as 2003, BMW announced that its 5- and 7-series cars already
had upwards of 100 microprocessors in them, managing functions from engine
control to opening the windows. Unless designed very carefully, these
systems could fail in a number of different ways. One failure mechanism is the
software itself getting locked in a tight loop ... who hasn't had to force a
reboot on their computer to get it out of such a situation? Another failure
mechanism could come from some form of electrical interference, either in the
wiring harnesses or directly in the microprocessor and sensor chips.
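A common guard against the tight-loop failure described above is a watchdog timer: healthy code "kicks" the timer on every pass, and an independent supervisor forces a safe state if the kicks stop. The following is only a minimal software sketch of that idea. The names `Watchdog`, `FakeClock`, and `supervise`, and the 50 ms timeout, are hypothetical and chosen for illustration; real automotive controllers typically use hardware watchdogs.

```python
class FakeClock:
    """Manually advanced clock (hypothetical helper) so the sketch is deterministic."""

    def __init__(self):
        self.t = 0.0

    def __call__(self):
        return self.t

    def advance(self, dt):
        self.t += dt


class Watchdog:
    """Software model of a watchdog timer (illustrative, not a real ECU API)."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        self._now = now              # injected clock function
        self._last_kick = now()

    def kick(self):
        """Called by healthy control code on every pass of its loop."""
        self._last_kick = self._now()

    def expired(self):
        """True when the control code has been silent longer than the timeout."""
        return self._now() - self._last_kick > self.timeout_s


def supervise(watchdog, on_timeout):
    """One step of an independent supervisor: act if the control loop is stuck."""
    if watchdog.expired():
        on_timeout()
        return True
    return False


clock = FakeClock()
wd = Watchdog(timeout_s=0.05, now=clock)   # 50 ms timeout is an assumed value
events = []

clock.advance(0.03)
wd.kick()                                  # loop is alive, timer is reset
alive = not supervise(wd, lambda: events.append("enter safe state"))

clock.advance(0.06)                        # loop hangs: no kick for 60 ms
tripped = supervise(wd, lambda: events.append("enter safe state"))
```

The important design point is that the supervisor does not depend on the code it is watching, so a hang in the control loop cannot also disable the recovery path.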
With safety as their #1 goal, automotive suppliers must perform rigorous
testing and validation to prove that their components will not fail
under harsh conditions, including wide operating temperature ranges, high
levels of humidity, varying voltages, electromagnetic interference,
mechanical stress ... the list is long.
Specifically for the chips that are used in automotive systems, there is an
absolute requirement that they are validated to operate correctly under all
possible conditions and scenarios. What does this mean for the automotive chip
design team? Many weeks of simulations to ensure that the functionality is
correct in all possible modes of operation, extended physical
verification to ensure that the chips do not fail in the high-stress
environments they operate in, and extended electrical checking to ensure
that timing, IR drop, electromigration, Joule heating, electrostatic
discharge, latch-up, signal integrity and ... and ... and ..., are all fully
verified.
EDA is a critical component of design, and so we must ensure that the EDA tools
and functionality used to perform such comprehensive verification and
validation keep up with the ever-advancing chip design
requirements and the ever-increasing focus on safety.
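To get a feel for why this verification takes weeks, it helps to count the runs. The sketch below crosses a few process, voltage, and temperature corners with the electrical checks named above. All corner values are assumptions picked for illustration, not figures from any foundry or standard, and `verification_runs` is a hypothetical helper. In practice not every check is run at every corner, so the full cross product is an upper bound that mainly shows how quickly the count grows.

```python
from itertools import product

# Illustrative values only (assumptions); real corner lists come from the
# foundry and the product specification.
PROCESS = ["slow", "typical", "fast"]
VOLTAGE_V = [0.9, 1.0, 1.1]
TEMPERATURE_C = [-40, 25, 125]

# The electrical checks listed in the text above.
CHECKS = [
    "timing",
    "ir_drop",
    "electromigration",
    "joule_heating",
    "esd",
    "latch_up",
    "signal_integrity",
]


def verification_runs(process, voltage, temperature, checks):
    """Return one run description per (process, voltage, temperature, check)."""
    return [
        {"process": p, "voltage_v": v, "temperature_c": t, "check": c}
        for p, v, t, c in product(process, voltage, temperature, checks)
    ]


runs = verification_runs(PROCESS, VOLTAGE_V, TEMPERATURE_C, CHECKS)
print(len(runs))  # 3 * 3 * 3 * 7 = 189
```

Adding a single extra value to any one axis multiplies the total, which is why tool capacity and runtime matter so much for this kind of sign-off.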
If you are interested in a similar view of the challenges facing the automotive
industry, you might want to take a look at the blog post
from Richard Goering on the Toyota Prius.
Is there anyone from the automotive design industry who would like to give
us the design team's view?
Safety engineering tends to depend on preventing failure from a -single- fault: any single fault (undervoltage, over-temp, component failure...) should not cause an unsafe condition. With 100+ processors, we have vastly increased the possibility of >1 failure. We need not only to "fail-safe" but also to DETECT failures so that they can be repaired before they stack up. This would also require useful diagnostics that give more than a hint at what the problem is. Keeping every system as simple as possible should also be a design goal, to limit interaction.
Sorry, I think simulation at its best (is it there yet?) is a far cry from what is needed to really assure your mother that it's safe to drive that electronically controlled vehicle in all situations. The best that simulation can do is pretend it's the real system. This first assumes 100% accuracy in every parameter. Even with that, all you can do is run the test vectors that you've thought of. What about the conditions you haven't considered? Everyone knows those are the real bugaboos. Some wild combination of temp, vibration, flipped Flash bits, subwoofers, distant storm, electrical noise, age, and solar flares just when the carry bit was a 1 during the 32-bit arithmetic shift and the car battery dipped to 10.8 volts - that's when the stack overflow went undetected. Those real-life, sporadic, virtually unpredictable circumstances can hardly be foreseen, let alone simulated. And imagine the combinations! Perhaps a little better shielding on some cables would have prevented the problems, but how much overbuilding is rational? After 30 years in the processor industry, I'm astounded we've had as little difficulty as we have. Perhaps there is a design flaw in there somewhere, and we all have to be vigilant to keep the bugs at bay, but nothing manmade or in nature is foolproof or perfect.