This is the second part of a three-part series of posts on computation. The first part, Computation and the Data Explosion, appeared yesterday.
Nothing is really as complex as a modern SoC. The AMD Zen 2 chiplet contains nearly 4 billion transistors, and there are eight of them in a high-end system. Its I/O die is over twice the size. There are probably well over a trillion polygons at the layout level, all of which have to be manufactured perfectly for the design to function. I don't know how many moving parts there are in other computation-heavy fields, such as drug discovery or protein folding, but I find it hard to imagine they operate at the level of trillions of things, with that number doubling every couple of years, which is what EDA has to contend with.
EDA has always involved a lot of computation. Of course, “a lot” has changed over the years—when I started working in EDA, 10,000 gates was the largest chip we needed to deal with—but semiconductor design always faces the problem of designing the next generation of systems using the current generation of compute infrastructure.
Compute infrastructure has changed a lot. When I started at VLSI Technology, we had exactly three computers: a VAX-11/780 that drove several graphics terminals, and two Apollo DN100 workstations (that's one in the image at the start of this post). The VAX was roughly a 1 MIPS machine (and yes, the S is correct, it is not a plural) shared among all the designers in the company. I don't know how many designers that was, but my badge number was 130, so there must have been over 100 people there. We hadn't ported our EDA software to the workstations yet, so those were reserved for the software engineers tasked with making that happen. In that era, one VAX was shared among all the IC designers. The big chip we were working on when I arrived in the US was Bagpipe, a chip to go in the future Macintosh. To give you an idea of how primitive the tools still were in that era, the first silicon didn't function because it had a power-ground short. We had no circuit extractor, nor what we called netcompare but which is usually called LVS (layout versus schematic) today. After about six months in the company, the task of creating the necessary circuit extractor fell to me. Plus, since we extracted no parasitics, our simulation tool took no account of timing: you had to guess the likely critical timing path and run SPICE to check that it was fast enough.
In the next era, pretty much every designer and software engineer had their own workstation beside their desk (or on it, depending on the era), with additional file servers around. These were faster machines, 10 MIPS and up. In this era, every engineer had their own machine, and perhaps access to a few more on demand.
When I finally left VLSI/Compass and went to Ambit, we built a test farm containing 40 Sun workstations and 20 HP workstations. The idea of a server farm was still so weird that Sun wouldn't sell us workstations without graphics displays, which we just piled up in a storeroom. Our test farm was the first server farm I know of, although I'm sure there were others; ideas like that tend to be in the air. Of course, it didn't take long for manufacturers to start building "blade servers" so that server farms could be constructed more cheaply and densely than putting actual workstations in cases on shelves. In that era, we used a lot of servers for running regression tests in parallel, but each EDA tool was still running on a single CPU.
Then processors ran into the power wall and stopped getting faster at around 3 GHz. Instead, they went multicore. The hardware designers, in my opinion, completely underestimated the difficulty of making most algorithms parallel. It had been a research goal since the invention of the microprocessor to build powerful systems out of a large number of cheap components. But for most applications, slow networks and Amdahl's law limited them to using four or eight cores. Paradoxically, using huge numbers of cores could be slower than using a small number, as the software burned lots of cycles heroically, but unsuccessfully, trying to make use of them. In this era, a typical engineer still had a single multicore workstation, but the compute power was gradually moving from the engineers' desks to farms behind the walls. Over a 10- to 15-year period, more and more parallel algorithms came into existence.
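The limit Amdahl's law imposes is easy to see with a few lines of arithmetic. If a fraction p of an algorithm's work parallelizes perfectly and the rest is serial, n cores give a speedup of 1/((1-p) + p/n), which can never exceed 1/(1-p) no matter how many cores you throw at it. A minimal Python sketch (the 90%-parallel figure is just an illustrative assumption, not a number from this post):

```python
# Amdahl's law: speedup of a task where fraction p of the work is
# parallelizable, run on n cores. The serial fraction (1 - p) caps
# the speedup at 1/(1 - p) regardless of core count.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Even an algorithm that is 90% parallel tops out well below the
# core count: the asymptotic limit here is 1 / 0.1 = 10x.
for n in (2, 4, 8, 64, 1024):
    print(f"{n:5d} cores: {amdahl_speedup(0.9, n):.2f}x")
```

Running this shows the paradox in the paragraph above: going from 64 cores to 1024 buys almost nothing, and any per-core coordination overhead can easily eat the small remaining gain.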
Now we are in the era of large on-premises server farms and cloud data centers. There is effectively infinite compute power available, so the aim of EDA tools is to trade compute power, which is plentiful, for engineering manpower, which is limited. Even if a design team doesn't have financial constraints, and they always have at least some, it is often impossible to hire enough engineers fast enough. But with the cloud, it is easy to hire enough machines.
The availability of the cloud (and of 100K+ CPU on-premises data centers) is making analysis of today's large electronic systems tractable. But it is the improvement in the algorithms, especially in the way they scale to cloud levels of parallelism, that is making the biggest difference.
Part 3 will be next week.
Sign up for Sunday Brunch, the weekly Breakfast Bytes email.