This is the third post in a series on computation in EDA and adjacent markets. The previous two were Computation and the Data Explosion and Computational Hardware.
All EDA algorithms are computationally intractable, in the sense that it is impossible to calculate the true optimum in any reasonable time. This means that all EDA algorithms are full of heuristics, rules of thumb that usually work (although occasionally they don't do so well). Increasingly, the rules of thumb, or at least the choice of which rules to use, are decided using machine learning. Under the hood of many EDA tools there are many approaches that could be tried, and often only one of them works, or works best. With enough runtime (and EDA tools sometimes literally run for a week or more), the tool will eventually find the one that works, but using a more intelligent approach to go straight to the best solution speeds up the design and so shortens the time to market.
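One way to picture "learning which rule of thumb to use" is as a bandit problem: try the candidate heuristics, keep statistics on which ones succeed, and increasingly go straight to the best one. The sketch below is purely illustrative; the strategy names and success rates are invented, and real tools choose among far more complex internal algorithms.

```python
import random

# Minimal epsilon-greedy sketch of learning which heuristic to try first.
# The strategy names and success probabilities below are hypothetical.
random.seed(42)

TRUE_SUCCESS = {"greedy": 0.30, "simulated_annealing": 0.55, "partition_first": 0.80}

counts = {name: 0 for name in TRUE_SUCCESS}
wins = {name: 0 for name in TRUE_SUCCESS}

def pick_strategy(epsilon=0.1):
    """Mostly exploit the best-known heuristic, occasionally explore others."""
    if random.random() < epsilon or not any(counts.values()):
        return random.choice(list(TRUE_SUCCESS))
    return max(counts, key=lambda s: wins[s] / counts[s] if counts[s] else 0.0)

for _ in range(2000):
    s = pick_strategy()
    counts[s] += 1
    if random.random() < TRUE_SUCCESS[s]:   # simulated tool run
        wins[s] += 1

rates = {s: wins[s] / counts[s] for s in counts if counts[s]}
print("estimated success rates:", rates)
```

After enough trials, the estimated rates separate and the bandit spends most of its runs on the strategy that actually works best, which is exactly the runtime saving described above.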
At the next level up, the design-flow level, there is another area where machine learning can be used. A lot of semiconductor design consists of running a tool, looking at the output, tweaking a few parameters, and then running the tool again. The expertise of which parameters to tweak is in the designer's head, based on their experience over the years, along with their experience in previous runs of the tool on the design. This is another area that can be optimized with machine learning, both by training on a broad portfolio of similar designs, and also by incrementally learning over successive runs of the same tool on the same design. This uses the effectively unlimited compute power available from the cloud to substitute for the limited number of expert designers available to a project, meaning fewer humans in the loop. The ultimate goal is no humans in the loop at all: the EDA system handles all the iteration automatically and delivers a result without explicit iteration driven by the design team.
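The run-inspect-tweak-rerun loop can be sketched as black-box optimization: treat the tool as a function from parameters to a quality-of-results score, and keep whatever perturbation improves the best-known settings. Everything here is a toy stand-in; `run_tool` and the parameter names (`effort`, `target_density`) are invented for illustration, not actual tool options.

```python
import random

random.seed(0)

def run_tool(params):
    """Toy stand-in for a long EDA run: returns a quality score to maximize.
    Pretend the (unknown) best settings are effort=0.7, target_density=0.6."""
    effort, density = params["effort"], params["target_density"]
    return 1.0 - (effort - 0.7) ** 2 - (density - 0.6) ** 2

best = {"effort": 0.5, "target_density": 0.5}
best_score = run_tool(best)

# Each iteration mimics a designer tweaking a few knobs between runs:
# perturb the best-known parameters slightly and keep any improvement.
for _ in range(200):
    trial = {k: min(1.0, max(0.0, v + random.gauss(0, 0.05)))
             for k, v in best.items()}
    score = run_tool(trial)
    if score > best_score:
        best, best_score = trial, score

print(f"best score after 200 runs: {best_score:.3f}")
```

A real flow would replace the random perturbation with a model trained on previous designs and previous runs, so far fewer actual tool invocations are needed, but the learn-over-successive-runs structure is the same.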
Because of the unparalleled size of designs, and the large numbers of different algorithms involved, EDA teams have always been experts on computational software. The scale on which they work is beyond other industries. Facebook has 2.5 billion users, so it is a big task to analyze the relationship graph. But a chip has hundreds of billions of pieces of interconnect, so analyzing the voltages and currents to ensure the chip will work correctly is at least a couple of orders of magnitude more complex. What other industries consider aggressively large are often trivially small examples in an EDA context. Under the hood, many algorithms—from the voltage and current analysis just mentioned, to circuit simulation, to placement, to synthesis—are efficient manipulation of very large sparse matrices. It turns out that many of the approaches used in machine learning are, to a first approximation, the same algorithms in a new context.
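The reason sparse-matrix manipulation scales to billions of elements is that only the nonzero entries are stored and touched. A minimal sketch of the idea, using the common compressed sparse row (CSR) layout on a tiny 4x4 example (real EDA and ML codes use heavily tuned libraries for this, of course):

```python
# CSR representation of the matrix
#   [[4, 0, 0, 1],
#    [0, 3, 0, 0],
#    [0, 0, 2, 0],
#    [1, 0, 0, 5]]
data    = [4, 1, 3, 2, 1, 5]   # nonzero values, row by row
indices = [0, 3, 1, 2, 0, 3]   # column index of each nonzero
indptr  = [0, 2, 3, 4, 6]      # where each row starts in data/indices

def spmv(data, indices, indptr, x):
    """Sparse matrix-vector product y = A @ x, touching only the nonzeros,
    so the cost is O(nonzeros) rather than O(n^2)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

print(spmv(data, indices, indptr, [1.0, 1.0, 1.0, 1.0]))  # → [5.0, 3.0, 2.0, 6.0]
```

A circuit with hundreds of billions of interconnect elements produces a matrix of that dimension, but each row has only a handful of nonzeros (each node connects to few neighbors), which is what makes the analysis tractable at all.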
The graph above shows a lot of domains where computation is important. The x-axis indicates increasing computational difficulty; the y-axis indicates the amount of data involved. So, for example, bitcoin mining is computationally hard (by design) but doesn't involve a lot of data. Artificial intelligence typically involves enormous labeled datasets for the training—for example, ImageNet contains 14M annotated images. It turns out, in fact, that much of the computation required for deep learning training and inference is very similar to many EDA algorithms.
The table above shows many of the algorithms used under the hood in EDA tools. The challenge is to implement these algorithms efficiently for the enormous size of modern semiconductor designs. This requires concurrent implementations, running either in cloud data centers or in on-premises data centers.
Some algorithms are naturally concurrent. For example, doing a design rule check of a chip can check all the rules in parallel since they are mostly independent. Or different parts of the chip can be checked in parallel (with care taken over how to join up the patches at their boundaries). Other algorithms, such as simulation, are much more difficult to parallelize since there are global values (time, in simulation) that all the parts of the simulation need to use. This limits the amount of parallelization that is possible before communicating the global values costs more time and compute power than is gained from adding more cores.
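The naturally concurrent case can be sketched in a few lines: since each design rule is an independent check over the layout, the rules can simply be farmed out to a pool of workers and the violation lists collected at the end. The toy "layout" and rules below are invented for illustration; a real DRC engine works on an enormous layout database with thousands of rules.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy layout: wires on one layer as (x_start, x_end, width) tuples.
wires = [(0, 10, 3), (12, 20, 2), (21, 30, 4)]

def check_min_width(wires, min_width=3):
    """Flag wires narrower than the minimum width rule."""
    return [w for w in wires if w[2] < min_width]

def check_min_spacing(wires, min_space=2):
    """Flag adjacent wire pairs closer than the minimum spacing rule."""
    violations = []
    for (_, end_a, _), (start_b, _, _) in zip(wires, wires[1:]):
        if start_b - end_a < min_space:
            violations.append((end_a, start_b))
    return violations

rules = [check_min_width, check_min_spacing]

# Each rule is independent, so they can all run concurrently;
# no worker needs results from any other worker.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda rule: rule(wires), rules))

print("violations per rule:", results)
```

The contrast with simulation is that here no worker ever waits on another; in a parallel simulator, every partition must agree on the current simulation time at each step, and that synchronization is what eventually eats the gains from adding cores.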
Cadence's expertise in computational software for semiconductor design has enabled it to expand into the nearby domain of system analysis. For details, see my posts Bringing Clarity to System Analysis and Celsius: Thermal and Electrical Analysis Together at Last. These system analysis tools scale to use huge numbers of cores to do finite element analysis, computational fluid dynamics, and electrical analysis on an enormous number of grid elements.
Sign up for Sunday Brunch, the weekly Breakfast Bytes email.