The recent Cadence Signoff Summit produced a wealth of information about the technology behind the Tempus Timing Signoff Solution, as well as some deep background about current challenges in static timing analysis. The summit provided closer looks at path-based analysis, timing closure for advanced nodes including FinFET processes, and statistical on-chip variation (SOCV). I blogged last week about SOCV, so this post will focus on path-based analysis and advanced-node timing analysis.
Ruben Molina, product marketing director for signoff at Cadence, offered a brief overview of Tempus, which was introduced in May 2013. He noted that timing signoff has become the latest bottleneck in the physical design cycle, taking possibly up to 40% of the overall design time. Tempus attacks this problem with massively parallelized computation that is scalable across hundreds of CPUs.
Tempus supports both multi-threading in multi-CPU machines and distributed processing across a network of computers. In addition to a 10X design closure speedup over existing solutions, it promises to support 100-million-plus cell designs across hundreds of timing views. Tempus provides physically-aware optimization, hierarchical or flat ECO generation, and integrated signal-integrity analysis.
PBA and GBA
Most timing analysis tools use graph-based analysis (GBA). It's more pessimistic than path-based analysis (PBA), but the latter is more computationally demanding, and that has limited the use of PBA. With its massively parallel approach, Tempus makes PBA affordable.
Ed Martinage, customer engagement architect at Cadence, discussed the differences between the two approaches. With GBA, he noted, "we take the network topology map into a graph, and we can do an efficient global timing analysis. But because we have different cells or nets on different paths, we have to use worst cases, which leads to pessimism." With GBA, a worst-case approach is taken for slew propagation, waveforms, and advanced OCV (AOCV) derating.
When multiple gate inputs fan into the same output, GBA performs a process called "slew merging." In the example shown at the left, two slews converge on the Z pin. GBA uses the slowest of the converging slews, which is 50ps at the output pin. Because GBA propagates the worst slew downstream, it is pessimistic when it comes to arrival times. PBA, in contrast, propagates only the actual slews that are on the path, and comes up with a more realistic slew of 20ps.
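To make the slew-merging difference concrete, here is a minimal Python sketch of the example above. It is only an illustration, not how Tempus implements either mode: the `Arc` class, the pin names `A` and `B`, and both function names are hypothetical, and only the 50ps and 20ps values come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One timing arc into output pin Z (hypothetical toy model)."""
    input_pin: str
    output_slew_ps: float  # slew this arc alone would produce at Z


def gba_slew(arcs):
    """Graph-based: merge all arcs and keep the worst (largest) slew at the pin."""
    return max(arc.output_slew_ps for arc in arcs)


def pba_slew(arcs, path_input_pin):
    """Path-based: use only the arc that lies on the path being analyzed."""
    for arc in arcs:
        if arc.input_pin == path_input_pin:
            return arc.output_slew_ps
    raise KeyError(path_input_pin)


# Two slews converge on Z; pin names A and B are assumed for illustration.
arcs = [Arc("A", 20.0), Arc("B", 50.0)]

print(gba_slew(arcs))       # 50.0 -> pessimistic for a path through A
print(pba_slew(arcs, "A"))  # 20.0 -> the slew actually seen on that path
```

A path entering through pin A is charged the 50ps slew under GBA even though its own arc produces 20ps, and that pessimism carries into every downstream delay calculation.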
Martinage noted that the reduced pessimism of PBA has power and area benefits as well as performance advantages. He showed some Tempus data that compared GBA- and PBA-based hold fixing on in-house customer designs. Over four designs, the area gain ranged from 0.4% to 5.7%, with the latter example providing a 6.6% reduction in the number of inserted buffers. Martinage also showed an example in which PBA provided a 12% decrease in leakage power on a high-speed core design.
PBA has not been the "go to" methodology for signoff because of its computational demands. It also requires "onion peeling," as Martinage put it, because paths underneath a path that is retimed with PBA may also have unnecessary pessimism. Tempus supports PBA for "mainstream" signoff because of its massive parallelism—up to 20 million paths/hour on 32 CPUs—and tight integration with a physically-aware signoff ECO flow.
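One way to picture "onion peeling" is as a loop that retimes paths in order of GBA slack and stops once no remaining path can be worse than the worst PBA result found so far. The Python sketch below rests on the assumption that PBA slack is never worse than GBA slack for the same path. The function name, the `retime` callback, and all slack numbers are hypothetical; this is not a description of the Tempus algorithm.

```python
def worst_pba_slack(paths, retime):
    """Peel paths from worst GBA slack upward.

    paths:  list of (name, gba_slack_ps) tuples
    retime: callable name -> pba_slack_ps, assumed >= the path's GBA slack
    Returns ((worst_name, worst_pba_slack_ps), [names retimed]).
    """
    worst = None
    retimed = []
    for name, gba_slack in sorted(paths, key=lambda p: p[1]):
        # Every remaining path has PBA slack >= its GBA slack >= worst so far,
        # so none of them can become the new worst path: stop peeling.
        if worst is not None and gba_slack >= worst[1]:
            break
        pba_slack = retime(name)
        retimed.append(name)
        if worst is None or pba_slack < worst[1]:
            worst = (name, pba_slack)
    return worst, retimed


# Made-up slacks in ps: GBA view versus what PBA would report per path.
gba_paths = [("p1", -30), ("p2", -25), ("p3", -10), ("p4", 5)]
pba_table = {"p1": -5, "p2": -20, "p3": -8, "p4": 10}

worst, retimed = worst_pba_slack(gba_paths, pba_table.__getitem__)
print(worst)    # ('p2', -20)
print(retimed)  # ['p1', 'p2']
```

In this toy data, retiming only the worst GBA path (p1) would suggest a slack of -5ps, but the path underneath it (p2) turns out to be the real limiter at -20ps. Every extra layer costs another path retime, which is where the computational load comes from.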
Timing analysis at 16nm/14nm and beyond
Igor Keller, senior R&D architect at Cadence, spoke about some of the challenges that appear at advanced nodes, most of which pertain to 28nm and 20nm as well as sub-20nm FinFET processes. As an EDA developer, he noted, his challenge is to maintain accuracy without increasing run times as circuits grow larger and more complex.
Advanced nodes present these challenges for static timing analysis:
Moreover, some designers are designing ultra-low voltage ICs (50% Vdd or less). This introduces modeling challenges, produces smaller currents, and causes slower transitions, leading to larger overlaps between the input and output of gates. Gates start to switch late and the output switches only at the "tail" of the input waveform. What becomes important, Keller said, is the tail of the transition, not the slew.
Keller talked about some of the technologies that Cadence has developed, including an Equivalent Waveform Model (EWM) that makes it possible to interpret a noisy waveform. With Waveform Propagation (WFP), real waveforms at receiver inputs are stored and used for a next-stage analysis. Further, Tempus timing library requirements include an 8-piece pin capacitance model, a normalized driver waveform with an active pre-driver model, and an Effective Current Source Model (ECSM) noise model for SI glitch analysis.
Below 20nm, double patterning is required. Tempus supports multi-valued SPEF files to capture timing variation due to double-patterning corners, and it can model double patterning through two special worst-case and best-case corners. Cadence is working on a more advanced approach that uses statistical timing.
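As a rough illustration of the worst/best corner idea, the Python sketch below takes one net whose capacitance has a value per double-patterning corner and derives an early and a late delay bound from it. Everything here is an assumption made for illustration: the corner names, the resistance and capacitance values, and the simple lumped RC delay model. Real signoff delay calculation and the Tempus corner setup are far more detailed.

```python
def corner_delays_ps(r_ohm, cap_ff_by_corner):
    """Lumped RC delay per corner. 1 ohm * 1 fF = 1 fs, so divide by 1000 for ps."""
    return {
        corner: r_ohm * c_ff / 1000.0
        for corner, c_ff in cap_ff_by_corner.items()
    }


def bounding_delays(delays_ps):
    """Return (early, late): the smallest and largest delay across corners."""
    return min(delays_ps.values()), max(delays_ps.values())


# Hypothetical net: mask misalignment shifts its coupling capacitance.
caps_ff = {"dpt_best": 8.0, "dpt_worst": 12.0}
delays = corner_delays_ps(1000.0, caps_ff)
early, late = bounding_delays(delays)

print(early, late)  # 8.0 12.0
```

In a conventional flow, setup checks would use the late bound and hold checks the early bound. Bounding like this is simple but pessimistic, which is the motivation for the statistical approach mentioned above.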
"A more accurate analysis requires more run time, and that means you are unhappy," Keller concluded. "To alleviate your unhappiness we use our massive parallelism to deliver run times similar to that of previous nodes, and we do simulation a lot more accurately." Here's hoping that the technology discussed at the Signoff Summit will produce happy customers for many years to come.
Related blog posts
Signoff Summit: An Update on OCV, AOCV, SOCV, and Statistical Timing
Tempus—Parallelized Computation Provides a Breakthrough in Static Timing Analysis
Electronic Design article
EDA Vendors Should Improve the Runtime of Path-Based Analysis