In honor of
Halloween, here are some horror stories about low power bugs. These are
real bugs at real customers that would have led to real dead chips.
#1: It was a dark and stormy night… OK, it was around lunch
time. But a customer had just spent three weeks coding a UPF file to
describe isolation between two power-down domains. The customer’s
synthesis tool required that the UPF implement either “from” isolation (from
a power domain) or “to” isolation (to a power domain), not both. So, to
prevent redundant isolation between separate domains enabled by the
same power net, he had to list specific nets between the domains and
mark them as no_isolation. Only, and here’s the scary bit, he had made
a mistake, and had marked as no_isolation a couple of pins that really went to
always-on logic. So, when the block was powered off, the always-on logic
would get nuked. *Shudder* Fortunately, one of my guys got static
checks operational in an hour or so (shameless plug – Conformal Low
Power). The static checks structurally proved that isolation was missing
between the powered-off block and the always-on logic. The dead chip was
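
For anyone who wants to see the shape of that kind of structural check, here is a toy sketch in Python. It is only an illustration under my own assumptions, not how Conformal Low Power works: the domain names, supply names, net names, and the `missing_isolation` helper are all hypothetical. The rule it encodes is the one from the story: a net driven from a switchable supply needs isolation whenever its load sits on a different supply, so exempting such a net with no_isolation is a bug.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Net:
    name: str
    driver: str  # power domain that drives the net (hypothetical names)
    load: str    # power domain that receives the net


def missing_isolation(nets, supply_of, switchable_supplies, no_isolation):
    """Return names of nets exempted from isolation that actually need it."""
    violations = []
    for net in nets:
        driver_supply = supply_of[net.driver]
        load_supply = supply_of[net.load]
        # The driver can be off while the load is still on only if the
        # driver's supply is switchable and the load uses another supply.
        needs_isolation = (
            driver_supply in switchable_supplies
            and load_supply != driver_supply
        )
        if needs_isolation and net.name in no_isolation:
            violations.append(net.name)
    return violations


# Two power-down domains share one switched supply; a third is always on.
supply_of = {"PD_A": "VDD_SW", "PD_B": "VDD_SW", "PD_AON": "VDD"}
nets = [
    Net("a_to_b_data", "PD_A", "PD_B"),     # same switched supply: exemption is fine
    Net("a_to_aon_req", "PD_A", "PD_AON"),  # wrongly exempted in this example
    Net("a_to_aon_ack", "PD_A", "PD_AON"),  # not exempted, so it gets isolated
]
no_isolation = {"a_to_b_data", "a_to_aon_req"}

violations = missing_isolation(nets, supply_of, {"VDD_SW"}, no_isolation)
print(violations)
```

The point of the sketch is that the check is purely structural: it needs no test vectors, only the connectivity and the power intent.
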
#2: The door creaked slowly open, revealing… OK, he was in a
cubicle. But a customer was doing power shutoff on a processor-based
design. The power controller was essentially a register bank. The
real-time OS would get an interrupt from the power-down timer, and would
execute the power-down routine. The routine would hit the bit enabling
isolation, and would then hit the bit disabling power to the power-down block.
All well and good, and pretty routine. The design was operational, and
had passed all the static checks – everything was perfectly isolated and looked
fine. But there was terror lurking – the processor’s secondary cache was
in the power-down block. The processor could execute the code to turn the
power off, but would never get another instruction – ever! *Shiver*
Fortunately, one of my guys got low power simulations operational with a
high-performance low power simulator (shameless plug – Incisive Design Team Simulator).
Since there wasn’t a significant overhead to the simulation, they ran a lot
more test cases. As a result, they caught the issue and moved the cache
to an always-on block. The processor lived to execute another day.
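
Here is a toy Python sketch of why only a dynamic check catches this one. It is my own simplified model, not the customer's design or the simulator's behavior: the domain names, the register bits `iso_en` and `pwr_off`, the three-instruction routine, and the `run_power_down` helper are all hypothetical. Every instruction fetch goes through the cache, so once the cache's domain loses power, nothing further executes.

```python
def run_power_down(cache_domain):
    """Run a toy power-down routine; return (instructions executed, registers)."""
    powered = {"PD_SW": True, "PD_AON": True}
    regs = {"iso_en": 0, "pwr_off": 0}  # the power controller's register bank
    program = ["set_iso_en", "set_pwr_off", "wait_for_wakeup"]
    executed = []
    for instr in program:
        # Each fetch goes through the cache; a dark cache returns nothing.
        if not powered[cache_domain]:
            break
        if instr == "set_iso_en":
            regs["iso_en"] = 1
        elif instr == "set_pwr_off":
            regs["pwr_off"] = 1
            powered["PD_SW"] = False  # the switchable block loses power
        executed.append(instr)
    return executed, regs


# Cache inside the power-down block: the routine never reaches its last step.
hung_trace, hung_regs = run_power_down("PD_SW")
# Cache moved to an always-on block: the routine runs to completion.
ok_trace, ok_regs = run_power_down("PD_AON")
print(hung_trace)
print(ok_trace)
```

Note that both runs leave the registers in the same, perfectly legal state, with isolation enabled before power is removed. The difference only shows up when you actually execute the sequence.
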
#3: There was a sound, and he turned around slowly to see the maniac
raise his… OK, it was his boss (hey, that is scary!). But a customer
had defined the power intent for a big chip, and had proven that the netlist and
power intent were correct. This was a big device, requiring feedthroughs
across a power-off domain. His implementation tool, however, didn’t
understand the power intent. So it didn’t recognize the always-on cells,
and didn’t know that they needed to be powered by special routes connected
to an always-on supply. As a result, the feedthrough buffers in a power-down
domain were powered like all the other cells in that domain – by a
switchable supply. *Eeeek* Fortunately, one of my guys got physical
netlist checks operational. These checks verify that the always-on power
pins specified in the LEF are actually powered by always-on power rails
(shameless plug – Conformal Low Power). The always-on nets remained
always on, and the chip made it.
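
A toy Python sketch of that last kind of check follows. Again, this is an illustration under my own assumptions, not the real tool or a real LEF parser: the cell names, the pin name `VDDC`, the rail names, and the `misconnected_always_on_pins` helper are hypothetical, and the `always_on_pins` table simply stands in for whatever the library says about always-on power pins.

```python
def misconnected_always_on_pins(instances, always_on_pins, always_on_rails):
    """Return (instance, pin, net) for always-on pins not tied to an always-on rail."""
    errors = []
    for inst_name, (cell, connections) in instances.items():
        # Cells with no always-on power pins have nothing to check.
        for pin in always_on_pins.get(cell, ()):
            net = connections.get(pin)  # None if the pin is left unconnected
            if net not in always_on_rails:
                errors.append((inst_name, pin, net))
    return errors


# Stand-in for library data: which power pins of each cell must stay on.
always_on_pins = {"AON_BUF": ["VDDC"]}
always_on_rails = {"VDD_AON"}

# Physical netlist: instance -> (cell type, power pin connections).
instances = {
    "ft_buf_0": ("AON_BUF", {"VDDC": "VDD_SW", "VSS": "VSS"}),   # wired like its neighbors
    "ft_buf_1": ("AON_BUF", {"VDDC": "VDD_AON", "VSS": "VSS"}),  # wired correctly
    "u_nand_7": ("NAND2", {"VDD": "VDD_SW", "VSS": "VSS"}),      # ordinary switchable cell
}

errors = misconnected_always_on_pins(instances, always_on_pins, always_on_rails)
print(errors)
```

The logical netlist looks identical in the good and bad cases, which is why this check has to run on the physical netlist with power connections included.
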
So are you terrified
yet? As Count Floyd said on SCTV, “pretty scary, huh kids?” Again,
these are real stories – nothing made up here. My point with these
stories is not to scare you, but to point out the need for a complete
verification strategy that includes both static and dynamic checks, and the need
to keep running these checks throughout the flow. There are some nasty
things that can happen when you do low power design, and these can quickly lead
to some very dead chips. I don’t know about you, but I find the thought
of that much scarier than goblins and monsters and kids in Sarah Palin masks.
Oh, and… BOO!