# CS + Math

In most domains, a million is a large number. In EDA, a hundred billion is normal, and predictions are for trillion-transistor chips by 2030. If you design a trillion-transistor chip in 1000 days (a bit under three years), that's a billion transistors a day, or over ten thousand per second. That takes a lot of computer science and mathematics, or CS+math.
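Sanity-checking those rates takes only a few lines (the trillion transistors and 1000 days are the round numbers from above, not real project figures):

```python
# Back-of-the-envelope check of the design rates quoted above
transistors = 1_000_000_000_000   # a trillion-transistor chip
days = 1000                       # roughly the design schedule

per_day = transistors / days
per_second = per_day / (24 * 60 * 60)

print(f"{per_day:,.0f} transistors per day")       # a billion a day
print(f"{per_second:,.0f} transistors per second") # over ten thousand a second
```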

## The Mathematical Ideal

If you studied physics in school, you will have encountered things like frictionless pulleys, inextensible wires, and incompressible fluids. Objects that are perfectly black. Of course, in the real world, there are no frictionless pulleys. No paint absorbs absolutely all light (although Vantablack is pretty close).

It's not just physics either: economics has the idea of a perfectly competitive market, where all companies sell identical products, companies are able to enter or exit without barriers, and buyers have perfect information.

Of course, one reaction to all these non-existent ideal concepts is to say that we can't do physics or economics because it's a waste of time to work with things that cannot exist. But most of the time, the world is "linear," and the values we derive from working with these idealized objects are very close to the real-world values when pulleys have friction, and black bodies reflect a little light. For the perfectly competitive market, perhaps the closest we get to that is the DRAM market in semiconductors.

It doesn't always work like this, though. Sometimes the world is not linear but chaotic. You may have heard the story about Edward Lorenz. He created a primitive weather model, but it ran really slowly on the computers of the time (1961). He takes up the story:

> At one point I decided to repeat some of the computations in order to examine what was happening in greater detail. I stopped the computer, typed in a line of numbers that it had printed out a while earlier, and set it running again. I went down the hall for a cup of coffee and returned after about an hour, during which time the computer had simulated about two months of weather. The numbers being printed were nothing like the old ones. I immediately suspected a weak vacuum tube or some other computer trouble, which was not uncommon, but before calling for service I decided to see just where the mistake had occurred, knowing that this could speed up the servicing process. Instead of a sudden break, I found that the new values at first repeated the old ones, but soon afterward differed by one and then several units in the last decimal place, and then began to differ in the next to the last place and then in the place before that. In fact, the differences more or less steadily doubled in size every four days or so, until all resemblance with the original output disappeared somewhere in the second month. This was enough to tell me what had happened: the numbers that I had typed in were not the exact original numbers, but were the rounded-off values that had appeared in the original printout. The initial round-off errors were the culprits; they were steadily amplifying until they dominated the solution.

This was the first known example of what has become known as the "butterfly effect", where a butterfly flapping its wings in Paris can affect a tornado in Nebraska.
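You don't need Lorenz's weather model to see the effect. The logistic map, a standard toy chaotic system, shows the same rounding-driven divergence (an illustrative sketch, not his actual computation):

```python
# Minimal illustration of Lorenz's rounding experience, using the
# chaotic logistic map x -> 4x(1-x) in place of his weather model.
def trajectory(x, steps):
    xs = []
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        xs.append(x)
    return xs

exact   = trajectory(0.123456789, 60)
rounded = trajectory(0.123457, 60)   # the same start, rounded off

# The tiny initial difference roughly doubles each step until the
# two runs bear no resemblance to each other.
for step in (0, 10, 30, 50):
    print(f"step {step:2d}: diff = {abs(exact[step] - rounded[step]):.6f}")
```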

## Electronic Design Automation

In EDA, one of my favorite ideas in the vein of frictionless pulleys is force-directed placement. We assume all standard cells are points and represent the wires connecting them as springs. There needs to be some additional force pushing the cells apart, or the springs will all pull them to the center of the region. But, in principle, you can calculate where all the cells will end up. Are standard cells points? Of course not. Is a spring that can go on a diagonal a perfect representation of a rectilinear interconnect? No. But you can use an algorithm like this to get a good starting point and then start to worry about how cells cannot really overlap, and wires have to go horizontally and vertically. By the way, we have much better placement algorithms than force-directed placement these days, but I still like the elegance of the idea. Plus I can explain it in a paragraph.
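A toy version of the idea fits in a few lines: springs pull connected cells together, a weak repulsive force keeps them apart, and we iterate until things settle. The two-pin nets and hand-picked constants here are hypothetical; real placers are far more sophisticated.

```python
# Toy force-directed placement: cells are points, nets are springs.
def place(cells, nets, iterations=200, spring=0.05, repel=0.01):
    pos = {name: list(p) for name, p in cells}  # cell -> [x, y]
    for _ in range(iterations):
        force = {c: [0.0, 0.0] for c in pos}
        # Spring attraction along each (two-pin) net
        for a, b in nets:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring * dx; force[a][1] += spring * dy
            force[b][0] -= spring * dx; force[b][1] -= spring * dy
        # Pairwise repulsion, so everything doesn't collapse to one point
        names = list(pos)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                d2 = dx * dx + dy * dy + 1e-9
                fx, fy = repel * dx / d2, repel * dy / d2
                force[a][0] -= fx; force[a][1] -= fy
                force[b][0] += fx; force[b][1] += fy
        # Move each cell a small step along its net force
        for c in pos:
            pos[c][0] += force[c][0]
            pos[c][1] += force[c][1]
    return pos

cells = [("u1", [0.0, 0.0]), ("u2", [1.0, 0.0]), ("u3", [0.0, 1.0])]
nets = [("u1", "u2"), ("u2", "u3")]
print(place(cells, nets))
```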

Force-directed placement is a good example of how EDA is so often a combination of computer science and mathematics, or CS+math. Almost all algorithms in EDA are mathematically intractable, in the sense that they are NP-complete or exponential, so it is simply not possible to get the optimal result. Instead, EDA programmers aim to get an acceptably good result (and also to avoid ever producing catastrophically bad results). As with the frictionless pulleys, we can see the underlying mathematical model, but we can't solve it exactly; we need to use what are known as "heuristics".

I Googled "heuristic," and the dictionary definition that I got was:

> proceeding to a solution by trial and error or by rules that are only loosely defined

That is a terrible definition! I much prefer Wikipedia's:

> A heuristic technique is any approach to problem solving or self-discovery that employs a practical method that is not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching an immediate, short-term goal or approximation

Or CS+math for short.
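To make "heuristic" concrete, here is the textbook nearest-neighbor heuristic for the (NP-hard) traveling salesman problem: fast, simple, and not guaranteed optimal, which is exactly the tradeoff described above. The points are made-up.

```python
# Nearest-neighbor heuristic for TSP: from each city, greedily visit
# the closest unvisited city. Not optimal, but acceptably good and fast.
def nearest_neighbor_tour(points):
    unvisited = list(range(1, len(points)))
    tour = [0]  # start at city 0
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited,
                  key=lambda i: (points[i][0] - last[0]) ** 2
                              + (points[i][1] - last[1]) ** 2)
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

points = [(0, 0), (5, 5), (1, 0), (4, 5), (2, 1)]
print(nearest_neighbor_tour(points))  # [0, 2, 4, 3, 1]
```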

Neural networks, especially predictive language models, have made tremendous progress in the last few years. I wrote about this recently in my post ChatGPT: "A New Technology Adjusts Your Thinking About Computing" and then I got ChatGPT to write my blog post for the following day, Using Clarity 3D Solver to Analyze 3D Packaging. Under the hood, these models rely on a neural network with weights (in some cases literally billions of them). These weights are calculated during what is known as the training phase and then used during what is known as the inference phase, when the neural net is used to do whatever it was set up to do—classify images, write prose, recommend TV shows, or whatever. However, the training phase is not exact; it is a statistical process that comes close to getting the "right" answer, like those frictionless pulleys getting close to the real values for real-world pulleys. In most cases, with lots of training data, it seems to converge rather than suffer from butterfly-effect behavior.
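The training/inference split can be sketched with a one-weight "network" fit by gradient descent on made-up data. This is nothing like a real language model, but the two phases are the same in kind.

```python
# Toy training phase: fit a single weight w so that w*x matches y.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data, y = 2x

w = 0.0
lr = 0.05
for _ in range(200):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

# Inference phase: the learned weight is just used, not changed.
def predict(x):
    return w * x

print(w, predict(10.0))
```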

## Computational Software

Cadence uses the term "computational software" for much of what we do. This highlights the fact that there is a lot of commonality between circuit simulation, neural network training, placement, fluid dynamics, molecular modeling, and any of the computationally demanding aspects of what we do. Another area of commonality is that, historically, we worked on algorithms that ran well on a single server and waited for Moore's Law to make things faster. During the period of happy scaling, when processors increased in speed by 50% per year, this worked well. But processors are no longer increasing in speed (much). Instead, we can have a lot of them. At the level of cloud services like AWS or Microsoft Azure, we can have as many as we want. So the focus is much more on algorithms that work well when parallelized onto many cores, hundreds or even thousands of them.

Using many cores potentially has another advantage: lots of memory to store the data. EDA operates on staggeringly large amounts of data. When you hear that the latest GPU contains a hundred billion transistors, that means the EDA tools have had to deal with that much data, or in the case of physical layout, perhaps literally trillions of polygons.

As I said at the start of this post, that takes a lot of CS+math.

**Sign up for Sunday Brunch, the weekly Breakfast Bytes email.**
