CDNLive China 2019

28 Aug 2019 • 6 minute read

A couple of weeks ago it was CDNLive China in Shanghai. (To read about how I got into town from Shanghai Pudong Airport, see my post Maglev Trains.) Once again, I was giving away copies of A Year of Breakfasts 2018 and signing people up for Sunday Brunch, the weekly email that gives you a teaser and a link to all the posts from the preceding week. If you are not signed up already, then just click here and type in your email.

The conference was opened by Sherry Xu, Cadence's GM for China and Southeast Asia. She welcomed everyone to the day and then introduced Anirudh Devgan, Cadence's President.

Anirudh Keynote

Anirudh opened by pointing out that Cadence has a huge presence in China, not just customer-facing but R&D, too. In fact, almost 10% of Cadence employees are based in China and it is a critical market for Cadence.

Intelligent systems of today and tomorrow are about intelligence performance (AI and deep learning providing a unique user experience), device performance (software throughput, compliance to EM and thermal, etc.), and silicon performance (analog and digital compute acceleration, silicon reliability, advanced semiconductor nodes).

As an example, Anirudh related meeting with an automotive company. The average cost of a car is $30K, and about 100M are sold annually worldwide. So that's a $3T market. The company was trying to decide whether to do their own silicon at 7nm or buy from another company. They estimated the cost of a 7nm design at $50M and the cost to manufacture at about $50 per chip. So with 1M cars, that's about $50M for the chips and $50M to do the design, or $100M in total. But to buy the chips in would cost $300M. So if you have a market that can support 1M chips, then it makes sense to design your own chip and do overall system optimization.
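The make-versus-buy arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above: the $300 per-chip purchase price is inferred from the $300M figure for 1M chips, the function names are mine, and real pricing would not be perfectly linear in volume.

```python
def in_house_cost(units, design_cost=50e6, unit_cost=50.0):
    """Total cost of designing your own chip: one-time design plus manufacturing."""
    return design_cost + unit_cost * units


def buy_cost(units, price_per_chip=300.0):
    """Total cost of buying chips in (price per chip inferred from $300M / 1M chips)."""
    return price_per_chip * units


def break_even_units(design_cost=50e6, unit_cost=50.0, price_per_chip=300.0):
    """Volume at which in-house design and buying cost the same, assuming linear pricing."""
    return design_cost / (price_per_chip - unit_cost)


units = 1_000_000
print(f"In-house: ${in_house_cost(units) / 1e6:.0f}M")
print(f"Buy in:   ${buy_cost(units) / 1e6:.0f}M")
print(f"Break-even volume: {break_even_units():,.0f} chips")
```

Under these assumptions the in-house route costs $100M against $300M at 1M units, and the crossover sits well below that volume, which is consistent with Anirudh's conclusion.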

He moved on to Cadence's Intelligent System Design strategy. Building up from the bottom there is Design Excellence, Cadence's core business of EDA tools and IP. But the next layer up is System Innovation, since electronics must increasingly be analyzed in the environment in which it operates, including mechanical, thermal, antennas, and more. There's a lot more in a cell phone than just a big SoC. At the top is Pervasive Intelligence. The development of deep learning over the last five or ten years is altering everything. Most system companies are incorporating AI algorithmic know-how into their systems, and Cadence is incorporating deep learning into its tools.

Anirudh pointed out that the core business is more than 90% of Cadence revenue and the company is doing very well. He highlighted a couple of areas:

World-class PPA results with Cadence digital flow resulting in production deployment in 17 of the top 20 semiconductor and system companies. Full flow delivers 10-20% better PPA with the best turnaround time. The Genus and Innovus solutions are merged using the iSpatial technology (for details on this, see my post Genus and Innovus: Compus and iSpatial) further improving the unified physical optimization flow.
The Spectre X simulator provides an up-to-10X speedup while maintaining the golden accuracy, as well as up to 5X increased capacity. The Spectre family is the most widely used family of simulators in the industry: "This is the best-in-class simulator in the industry." For more details, see my post Spectre X: Same Accuracy, New Speed.

Anirudh switched gears to system innovation and system analysis. EDA is $8-10B worldwide, depending on how much IP you throw in. System analysis is a $4.5B market and growing in double digits. It is used by semiconductor companies but also system companies and other verticals. Cadence put a full team together to address this market and the first product is Clarity. EM was picked as the focus since there is a lot of demand from automotive and 5G. Further, the existing solutions are pretty old, technology from the 1990s, that either can't simulate big designs at all or take too long. We have a lot of experience solving sparse matrices from our chip experience, and there we have to handle 10B elements. For more details, see my post Bringing Clarity to System Analysis.

For example, Anirudh said, a system company gave us a phone with 30M elements. That is considered big, but from an EDA standpoint, this is not hard. Clarity is the first product that can do that and also runs massively parallel. We are seeing big speedups such as 7.2X for LPDDR5, 9.5X on a complex connector, and 12.3X on fanout wafer-level package-on-package.

Cadence's Palladium and Protium hardware platforms provide emulation and FPGA prototyping. The Palladium platform is used in most of the top companies for RTL debug and hardware development. But increasingly, those companies want to do more software development before hardware is available. The Protium X1 platform is perfect for this since it works with the Palladium platform but is targeted at software development. It is 3-5X faster than the Palladium platform, with the same front end, and also cheaper. It is multi-user with a granularity of a single FPGA for very high utilization. For more details, see my post Protium X1: FPGA Prototyping for the Enterprise.

Anirudh went up another level to machine learning (ML). As he said:

This is a big area so I’ll talk about some things we are doing in our own products. We break out AI into three areas. First, lots of companies are doing AI chips, with 50 startups and a lot of the big companies, too. One part we call ML Enablement, hardware and software co-design, Tensilica for ML. In our own products, we have ML inside and ML outside. Inside, the interface to the user is same but using ML under the hood. For example, in the JasperGold platform. It is transparent to the user, they get a faster tool or better PPA. We have several of those. ML outside means the flow is changed, fewer iterations. Self-driving methodology is ML outside. Improving the performance of the engine is ML inside.

Anirudh wrapped up with a summary of Intelligent System Design:

Foundational EDA and IP
Investment in system analysis
AI and machine learning

Shao-jun Wei Keynote

Shao-jun Wei is a professor at Tsinghua University. He presented at DAC a couple of years ago on the status of the semiconductor industry in China, which I covered in my post China's IC Industry: Today and Tomorrow. This time his topic was Software-Defined Chips: Architecture Innovation for Intelligent Computing. He was talking in Chinese with English slides, but I think I have the main points.

Runtime reconfigurable hardware has the potential to get near ASIC performance without sacrificing programmability. There are two parts to this:

A processor reconfigurable at runtime
A programming language that optimizes both hardware and software at runtime

He calls this CGRA, or coarse-grained reconfigurable architecture. The chip changes dynamically in real time with changes in the software. A high diversity of applications makes an ASIC unattractive. The software may be very large but the hardware is always limited. So the software should be partitioned into modules and executed one by one, with the hardware changing dynamically to match each module. The sweet spot is algorithm and hardware co-design.
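As a purely conceptual toy model of that idea (not Shao-jun's architecture or any real CGRA toolchain), the sketch below runs software modules one at a time on a single fabric and reconfigures the fabric whenever the next module needs a different configuration. The class names, module names, and configuration labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str    # hypothetical software module name
    config: str  # hypothetical label for the hardware configuration it needs


class ReconfigurableFabric:
    """Toy stand-in for limited hardware that is reconfigured at runtime."""

    def __init__(self):
        self.current_config = None
        self.reconfigurations = 0

    def run(self, module):
        # Reconfigure only when the incoming module needs a different setup.
        if module.config != self.current_config:
            self.current_config = module.config
            self.reconfigurations += 1
        return f"{module.name} on {self.current_config}"


def execute(modules):
    """Run the partitioned software one module at a time on a single fabric."""
    fabric = ReconfigurableFabric()
    trace = [fabric.run(m) for m in modules]
    return trace, fabric.reconfigurations


modules = [
    Module("fft", "dsp"),
    Module("fir", "dsp"),
    Module("conv", "mac_array"),
]
trace, count = execute(modules)
print(trace)
print("reconfigurations:", count)
```

The point of the model is only that the hardware stays fixed in size while its configuration follows the software; a real design would also have to account for the time and energy cost of each reconfiguration.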

So C/C++ is the input language. The flow runs through C syntax check, code analysis, code transformation, and code optimization, then HLS (high-level synthesis) in the middle of the flow.

We see this with AI chip development. The zero stage is Intel CPU, Xilinx FPGA, NVIDIA GPU, etc. The first stage is Google TPU and MIT Eyeriss. The second stage is TsingMicro and Wave Computing's DPU. The third stage is intelligence. So this runs from classic, to domain-specific, to reconfigurable, to intelligent.

Looking to the future, Shao-jun pointed out that once chips are intelligent, differentiation can be strengthened through an ability to learn, an ability to change structure, and an ability to continuously improve.
