Quantus FS Field Solver for the FinFET Era

6 Sep 2017 • 4 minute read

For any parasitic extraction tool, there is always a tradeoff between performance and accuracy. If SPICE simulations were billions of times faster, we would use circuit simulation for large designs. In extraction, we typically use what are known as 2.5D algorithms, which rely on pattern matching to avoid running a full 3D field solver, since that is too slow for many types of design. Most people use a 3D-based extraction solution only for selected nets, e.g., timing-sensitive and critical nets. I would like to be able to tell you that the problem has been solved and we magically now have 3D field solvers that are faster than 2.5D extraction tools, but that isn't the case. It is getting a lot closer, though.

For FinFET and Sensitive Designs

As we go down to 7nm, 5nm, and below, the pattern-matching approach loses accuracy, and really isn't good enough for extracting standard cells prior to characterization. Everything is getting so small that it is not just the most obvious nearby features that affect capacitance, and therefore performance. A separate challenge is to keep the size of the netlist down by reducing the network of resistors and capacitors without affecting accuracy. So many corners are required to characterize a library cell today that overly large netlists can significantly increase the time required to characterize a full library or a memory. At 28nm, we needed "only" about 100 corners, but for 7nm that has grown to over 300 and will presumably increase again for subsequent nodes. To make the problem worse, the number of cells in a library has grown from about 1200 at 28nm to closer to 2000.
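To get a rough feel for how these two trends compound, the sketch below multiplies cells by corners to count cell-corner combinations. It uses the approximate figures quoted above (the text says "over 300" corners and "closer to 2000" cells, so the 7nm numbers are rounded), and the function name is purely illustrative. It also assumes every cell is characterized at every corner, which a real flow may not do.

```python
# Rough characterization workload, counted as cell-corner combinations.
# Inputs are the approximate figures quoted in the text; real libraries vary.

def characterization_jobs(cells: int, corners: int) -> int:
    """Number of cell/corner combinations, assuming every cell runs at every corner."""
    return cells * corners

jobs_28nm = characterization_jobs(cells=1200, corners=100)
jobs_7nm = characterization_jobs(cells=2000, corners=300)
growth = jobs_7nm / jobs_28nm

print(f"28nm: {jobs_28nm:,} cell-corner combinations")
print(f"7nm:  {jobs_7nm:,} cell-corner combinations")
print(f"Growth: {growth:.1f}X")
```

Under these assumptions the workload grows by roughly 5X between the two nodes, before accounting for any change in netlist size per cell.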

Customers today are forced to use 2.5D extraction tools for characterization of these designs since they can’t afford the performance hit of a 3D field solver. This is a potentially huge accuracy sacrifice in order to complete characterization in a timely manner and meet time-to-market goals.

For SRAM designs, how much SoC performance are you leaving on the table? A 3-5ps difference in timing between 2.5D and 3D extraction is huge for high-speed and ultra-low-power designs. It could leave 30-50MHz of performance on the table for a 1GHz design.

So the three big items for extraction of standard cells, memories, and IP library cells are:

Accuracy
Run time of the extraction tool
Netlist size (which drives runtime of the simulation tool)

Quantus FS

Cadence has been working with foundries on creating a field solver solution that has the required accuracy, a run time that is the same order of magnitude as 2.5D extraction, and keeps the netlist size under control. The product is called Quantus FS. Key features are:

Cloud ready—Scales to 1000s of cores
Low memory consumption leading to high capacity
Best-in-class accuracy versus foundry golden data
Smallest netlist size that speeds up characterization and simulation runtimes
Certified at all major foundries
Integrated into other Cadence tools (Innovus, Virtuoso, Liberate, Spectre...)

Performance

The performance of Quantus FS is good in the three dimensions called out earlier. The accuracy is closely matched to the foundries' golden data. The graph above on the left shows that Quantus FS is about three times as fast as a third-party field solver on a selection of cells from a 14nm standard-cell library. The graph on the right shows that the smaller netlists that Quantus FS produces result in 3X faster simulation runtimes compared to other third-party field solvers (all the netlists being run in Spectre).

Cloud Readiness

Running in the cloud is not just about the infrastructure to make this possible, but also having the code created in such a way that the tool can take advantage of as many cores as are available. The graph above shows the run time going from 320 cores to 1600 cores, working on a large SRAM with 4.5M nets in 16nm. There is a speedup of 4X going from 320 cores to 1600 cores, which is only a little less than the 5X theoretical maximum (5 x 320 = 1600).
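The scaling numbers above can be restated as a parallel efficiency: measured speedup divided by the ideal speedup implied by the core count. The short sketch below does that arithmetic; the function name is illustrative, and the 4X figure is taken as quoted rather than read off the graph.

```python
# Parallel efficiency relative to a baseline core count.
# Uses the figures quoted in the text: 320 -> 1600 cores, 4X measured speedup.

def parallel_efficiency(base_cores: int, scaled_cores: int, measured_speedup: float) -> float:
    """Measured speedup as a fraction of the ideal (linear) speedup."""
    ideal_speedup = scaled_cores / base_cores
    return measured_speedup / ideal_speedup

ideal = 1600 / 320
efficiency = parallel_efficiency(base_cores=320, scaled_cores=1600, measured_speedup=4.0)

print(f"Ideal speedup: {ideal:.0f}X")
print(f"Efficiency relative to 320 cores: {efficiency:.0%}")
```

A 4X speedup against a 5X ideal works out to 80% efficiency relative to the 320-core run. It says nothing about efficiency relative to a single core, which the text does not report.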

DRAM

One design used during the evaluation was a DRAM, with about 300K transistors. The design was run on 8 CPUs, resulting in 5X faster turnaround time than other field solvers, a reduction of 1.4X in netlist size, and a reduction of 1.7X in the number of resistors in the netlist, all at the same level of accuracy. The DRAM also includes 1-2um-tall vias, which are challenging to model accurately, but Quantus FS had very small residual errors.

Summary

The three criteria of goodness were:

Accuracy—Matches foundries' golden data
Run time of the extraction tool—5X faster than other field solvers, scalable into 1000+ cores in the cloud for large designs like memories
Netlist size (which drives runtime of the simulation tool)—Reduction of 30% versus other solvers
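The 30% figure here and the 1.4X netlist reduction quoted for the DRAM can be related with a one-line conversion. The sketch below assumes that "a reduction of 1.4X" means the netlist is 1/1.4 of the size produced by the other solvers; that reading is an interpretation, not something the text states explicitly, and the function name is illustrative.

```python
# Convert an "N-times smaller" reduction factor into a percentage reduction.
# Assumes "1.4X reduction" means new_size = old_size / 1.4 (an interpretation).

def reduction_factor_to_percent(factor: float) -> float:
    """Percentage by which a quantity shrinks when divided by `factor`."""
    return (1.0 - 1.0 / factor) * 100.0

netlist_pct = reduction_factor_to_percent(1.4)   # netlist size, DRAM example
resistor_pct = reduction_factor_to_percent(1.7)  # resistor count, DRAM example

print(f"1.4X smaller netlist   = {netlist_pct:.1f}% reduction")
print(f"1.7X fewer resistors   = {resistor_pct:.1f}% reduction")
```

Under that reading, 1.4X corresponds to a reduction of a little under 29%, which is consistent with the rounded 30% quoted in the summary.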

For More Information

More information is available on the Quantus QRC product page.
