We are continuing the physical verification blog series with more questions we hear from advanced-node customers. Paul McLellan wrote a nice blog post, Dracula, Vampire, Assura, PVS: A Brief History, on the history of physical verification tools. Christen Decoin posted Have DRC Tools Run Out of Steam? – Part 1, which outlined the DRC market history by era: DRC 1.0, the single-CPU run era; DRC 2.0, the multi-threaded CPU run era; and the need for DRC 3.0 – what the market needs in order to help designers regain the productivity lost with today's DRC tools.
We were recently talking with a customer who completed DRC signoff on a large 16nm design; their turnaround time using the market-leading tool was over 4 days on 100 to 200 CPUs. They did not bother adding even more CPUs because the scalability of the existing tool was poor, and more cores would not improve the turnaround time. Hence, it was not economically viable to add more cores and expect a faster runtime.
Due to this performance limitation, the customer’s strategy was to break up the runs by deck switches or sub-decks and run them in parallel. The sub-decks were run separately on 100-200 CPUs, and the runs took 24-plus hours instead of 4 days. This helped them reduce the turnaround time by half! However, this process removed the flexibility of checking full-chip DRC during design and hurt their productivity. Any engineering change order (ECO) can lead to significant delays, which can significantly raise the overall cost of the project. The market-leading tools have lacked innovation in DRC performance to meet the market needs and continue to patch 20-year-old engines to address advanced-node technology requirements.
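The scaling wall this customer hit can be sketched with Amdahl's law: even a small serial fraction in a DRC engine caps the achievable speedup, so adding cores beyond a few hundred barely moves the turnaround time. The 5% serial fraction below is a hypothetical number chosen for illustration, not a measured value from any tool.

```python
def amdahl_speedup(serial_fraction: float, n_cpus: int) -> float:
    """Best-case speedup on n_cpus when serial_fraction of the work
    cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)

if __name__ == "__main__":
    s = 0.05  # assume 5% of the run is inherently serial (hypothetical)
    for n in (8, 100, 200, 1000):
        print(f"{n:>5} CPUs -> {amdahl_speedup(s, n):5.1f}x speedup")
    # With 5% serial work: 100 CPUs -> ~16.8x, 1000 CPUs -> only ~19.6x.
    # Past a few hundred cores, extra hardware barely helps.
```

Under this (illustrative) assumption, going from 200 to 1,000 CPUs buys almost nothing, which matches the customer's conclusion that adding cores was not economically viable.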
Customers are always looking for alternative tools, whether to save license cost or for other reasons. In the DRC market, with no alternative solutions in sight other than the existing tools, CAD engineers set the benchmark criteria for another vendor’s DRC tool within the known constraints of the current market-leading DRC software (the incumbent).
These benchmark constraints are well defined: an 8-thread run for small blocks and a 16-32-thread run for a large block or full chip. These are reasonable benchmark criteria for tools of the multi-threaded DRC 2.0 era. But if a truly new DRC tool with massively scalable distributed processing technology were to become available, the current benchmark criteria would become counterproductive, defeating the objective of a massively parallel distributed processing tool.
The advantage of such a distributed technology is that it would be able to run full-chip DRC on several hundred or even thousands of CPU cores to enable overnight runtimes again. If you benchmark this type of distributed technology against a current multi-threaded solution on one host with 8 threads, the comparison will clearly favor the current solution. The multi-threaded infrastructure will be more memory-efficient in this configuration, and the distributed technology will not be able to demonstrate any performance improvement. This would be like setting up a race between a canoe and a powerboat, but not turning on the powerboat's engine – asking the captain of the powerboat to race the canoe with paddles.
Designers who are looking to get performance back from their DRC solution will need to adapt and be open to new metrics to evaluate a massively scalable distributed processing DRC solution, as defined for the DRC 3.0 era. When such a tool becomes available, those metrics should be easy to define, especially these days when tapeout schedules are increasingly compressed and the cost of hardware resources is not high. Another option could be cloud computing: will it be a good alternative for a company that needs to quickly ramp up the CPUs needed for a fast DRC run?