One of the new experiences that Cadence brought to DAC this year was a set of Experience Rooms in the booth. In these themed rooms, technical experts from a variety of companies shared use cases and design techniques on specific topics. On Monday, June 6, Shashi Visweswaraiah from NXP stopped by the Digital Experience Room to compare the results of static timing analysis (STA) and distributed STA (DSTA).
NXP uses Cadence’s Tempus Timing Signoff Solution in a use model that combines distributed STA (DSTA) with concurrent multi-mode, multi-corner (MMMC) analysis. In this model, a master reads input files such as the Verilog netlist, constraints, and libraries, then partitions and distributes information and commands to each of the clients. Each client performs delay calculation and signal integrity analysis, communicates with other clients, and runs multithreaded on up to 16 CPU cores.
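The master/client split described above can be pictured with a small, generic Python sketch. This is not Tempus code and uses no Tempus commands: the function names (`partition`, `client_analyze`, `master_run`), the toy instance data, and the use of threads in place of separate client machines are all assumptions made for illustration. It also leaves out the client-to-client communication that the real flow performs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: (instance name, delay in ns). Not from the talk.
INSTANCES = [("u%d" % i, 0.1 * (i % 5 + 1)) for i in range(20)]


def partition(items, n_clients):
    """Master step: split the work into n_clients roughly equal partitions."""
    return [items[i::n_clients] for i in range(n_clients)]


def client_analyze(part):
    """Client step: stand-in for delay calculation on one partition.

    Here it simply reports the largest delay seen in the partition.
    """
    return max(delay for _name, delay in part)


def master_run(items, n_clients=4):
    """Distribute partitions to clients, then merge the per-client results."""
    parts = partition(items, n_clients)
    # Threads stand in for client processes on other machines (assumption).
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        worst_per_client = list(pool.map(client_analyze, parts))
    return max(worst_per_client)


worst_delay = master_run(INSTANCES)
print("worst delay across all partitions: %.2f ns" % worst_delay)
```

The point of the sketch is only the shape of the flow: one process owns the inputs and the partitioning, and the expensive per-partition work runs in parallel before the results are merged.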
Visweswaraiah compared two designs on which NXP ran both STA and DSTA with concurrent MMMC. Design 1 consisted of 40 million instances and 1400 clocks on a 28nm process.
With a single view, DSTA had a runtime of 112 minutes on 40 CPUs. With three views, DSTA had a runtime of 136 minutes on the same 40 CPUs. In the end, Visweswaraiah noted, applying additional views required minimal overhead, and using DSTA versus STA yielded a 2X runtime improvement.
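To put the "minimal overhead" remark in numbers, the short Python calculation below works out what the two extra views cost in Design 1. The runtimes come from the figures quoted above; the variable names are illustrative only.

```python
# DSTA runtimes for Design 1 on 40 CPUs, as quoted in the talk.
single_view_min = 112   # 1 view
three_view_min = 136    # 3 views

extra_views = 3 - 1
overhead_min = three_view_min - single_view_min       # added wall-clock minutes
overhead_pct = 100 * overhead_min / single_view_min   # relative to the 1-view run
per_extra_view_min = overhead_min / extra_views

print("added runtime: %d min (%.1f%%)" % (overhead_min, overhead_pct))
print("per extra view: %.1f min" % per_extra_view_min)
```

Tripling the number of views added 24 minutes, roughly 21 percent, to the single-view runtime.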
Design 2 consisted of 6.5 million instances on 16nm technology. Here, DSTA made a big impact with regard to runtime and memory usage. With a single view, STA on 6 CPUs took 30 minutes, compared to 13 minutes with DSTA on 18 CPUs. With multiple views (16 views/mode), STA on 6 CPUs took 350 minutes versus 127 minutes using DSTA on 24 CPUs.
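The Design 2 figures can be compared in two ways: wall-clock speedup and total CPU time. The Python sketch below computes both from the numbers quoted above. The CPU-minutes figure rests on an assumption that every CPU is busy for the whole run, which the talk did not state, so treat it as a rough upper bound rather than a measured cost.

```python
# Design 2 runtimes as quoted in the talk: (minutes, CPUs).
runs = {
    "single view": {"sta": (30, 6), "dsta": (13, 18)},
    "16 views/mode": {"sta": (350, 6), "dsta": (127, 24)},
}


def speedup(sta_min, dsta_min):
    """Wall-clock speedup of DSTA over STA."""
    return sta_min / dsta_min


def cpu_minutes(minutes, cpus):
    """Rough CPU cost, assuming all CPUs are busy for the full run."""
    return minutes * cpus


summary = {}
for label, r in runs.items():
    sta_min, sta_cpus = r["sta"]
    dsta_min, dsta_cpus = r["dsta"]
    summary[label] = {
        "speedup": speedup(sta_min, dsta_min),
        "sta_cpu_min": cpu_minutes(sta_min, sta_cpus),
        "dsta_cpu_min": cpu_minutes(dsta_min, dsta_cpus),
    }
    print("%s: %.2fx faster, %d vs %d CPU-minutes" % (
        label,
        summary[label]["speedup"],
        summary[label]["sta_cpu_min"],
        summary[label]["dsta_cpu_min"],
    ))
```

On these numbers DSTA is about 2.3x faster with a single view and about 2.8x faster with 16 views per mode, while using more CPUs in total. That is the trade being made: more machines from the farm in exchange for a shorter turnaround.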
Using the DSTA commands in Tempus Timing Signoff Solution “was fairly straightforward once we worked with Cadence because we could plug it into the existing flow itself.” So, as Visweswaraiah noted, it’s not complicated to convert an STA script into a DSTA script.
On large designs, STA is less practical in terms of runtime and memory usage. NXP plans to move to DSTA for its next design, which has over 50 million instances. “The number of views per mode is exploding, so the ability to use existing resources on an LSF (load sharing facility) farm is paramount,” said Visweswaraiah.
He also had some best practices to share: