The news headlines have been full of reports about the 2010 Toyota Prius, which apparently has a software bug that causes braking problems on bumpy roads. What most reports don’t say is that some 2004 and 2005 Toyota Prius cars also had a software bug, and that Toyota engineers did an analysis of the problem that called for a new verification methodology.
The 2004/2005 problem was a bug that caused some Toyota Prius hybrid cars to stall or shut down while driving at highway speeds. Toyota identified a “programming error” in the computer systems of 23,900 Prius cars, and advised the owners to bring the cars to dealers for a software update. I found a very interesting presentation online, apparently written by three Toyota authors, that talks about this powertrain control bug and the verification problems behind it.
The presentation was given at a 2006 SCADA (Supervisory Control and Data
The presentation notes that automotive control systems have become large-scale control systems. They’re composed of modules that were designed and tuned by individual engineers over the years, and then integrated into an “incrementally developed legacy structure.” The presentation cites a lack of understanding of the whole structure. As complexity grew, hundreds of modules came to interact with each other, and the number of tests grew exponentially as new functionality was added.
Citing the need for model-based development, the presentation calls for:
So now it is five years later and we have another Toyota Prius software bug, along with several other highly publicized problems. I have no insight into Toyota’s verification methodology today, but I have to wonder what was learned from the 2004/2005 experience.
The central problem is that software verification has no formalized methodology. Engineers basically run ad-hoc, directed tests until the clock runs out. On the hardware side we have metric-driven verification, executable verification plans, and coverage. Cadence Incisive Software Extensions, along with the IBM Enterprise Verification Management System (EVMS) I wrote about earlier this week, are aimed at bringing some of these hardware verification capabilities into the software world.
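To make the hardware-side terms concrete, here is a minimal sketch of constrained-random testing driven by a functional coverage metric, applied to a piece of software. Everything in it is hypothetical and invented for illustration: the `brake_command` function, its 0.2 slip threshold, and the coverage bins are assumptions, not anything taken from Toyota, Cadence, or IBM. The point is only the shape of the method: random stimulus within legal ranges, a self-checking assertion, and a stopping rule based on coverage rather than on the clock.

```python
import random

# Hypothetical function under test: reduce brake force when the wheels slip.
def brake_command(pedal, wheel_slip):
    if not 0.0 <= pedal <= 1.0:
        raise ValueError("pedal position must be between 0.0 and 1.0")
    if wheel_slip > 0.2:  # assumed slip threshold, for illustration only
        return pedal * 0.5
    return pedal

# Functional coverage model: classify each stimulus into one bin.
def coverage_bin(pedal, wheel_slip):
    pedal_bin = "light" if pedal < 0.3 else "medium" if pedal < 0.7 else "hard"
    slip_bin = "slipping" if wheel_slip > 0.2 else "gripping"
    return (pedal_bin, slip_bin)

# The verification plan, in executable form: every bin must be hit.
ALL_BINS = {
    (p, s)
    for p in ("light", "medium", "hard")
    for s in ("gripping", "slipping")
}

def run_campaign(seed=0, max_tests=1000):
    """Generate constrained-random tests until all bins are covered."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    covered = set()
    tests = 0
    while covered != ALL_BINS and tests < max_tests:
        pedal = rng.uniform(0.0, 1.0)  # constraint: legal pedal range
        slip = rng.uniform(0.0, 0.5)   # constraint: assumed plausible slip range
        force = brake_command(pedal, slip)
        # Self-checking test: force is never negative and never exceeds the request.
        assert 0.0 <= force <= pedal
        covered.add(coverage_bin(pedal, slip))
        tests += 1
    return tests, len(covered) / len(ALL_BINS)

if __name__ == "__main__":
    tests, coverage = run_campaign()
    print(f"{tests} tests run, {coverage:.0%} of coverage bins hit")
```

The difference from ad-hoc directed testing is the stopping rule: the campaign ends when the coverage goal is met, and any unhit bin shows exactly which scenario has not been exercised.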
I think the Toyota 2004/2005 experience, and the analysis that followed, provides a lesson for system-on-chip (SoC) designers. Like Toyota, SoC designers are integrating many modules built by different people, and are often trying to upgrade and test an “incrementally developed legacy structure.” As complexity grows and modifications occur, understanding and verifying the whole structure becomes exponentially more difficult. A formalized, hierarchical approach to hardware and software verification is needed.
Meanwhile, as a 2006 Toyota Prius owner who’s been very happy with the product, I hope I’m not writing a blog post five years hence about a 2015 Toyota Prius bug!
We may have a semantic issue here with verification ("are we building it right") vs. validation ("did we build the right product"), but it seems to me that ASIC functional verification has many features generally lacking in SW verification. I'm not aware that executable metric-driven verification plans, functional coverage, and constrained-random test generation are typically used in the SW world. You are right, however, that many problems occur when hardware meets software. This is where advanced functional verification methodologies from the HW world can be very useful.
"The central problem is that software verification has no formalized methodology."
Richard, I don't think you meant that the way it sounds. Software validation has been formalized many times, by SEI through CMMI, and in many other ways. It is a much more mature practice than ASIC validation.
I think the issue is when hardware meets software, like in these embedded systems, you have people outside their area of expertise. ASIC designers don't have the rigor or methods for software validation to apply to the drivers and such that they are validating. And software engineers don't understand everything about the hardware. Combine that with the task of integrating legacy control systems, like you describe, and an issue is bound to occur.
As for the car, I go back to something I was told years ago when I was working on a system that decided when to fire an airbag. "The car is just a lawsuit on wheels!"