Something I've been writing about for literally years is the trend toward system companies designing their own silicon. It started in the mobile industry. Today, all the leading smartphone vendors design their own application processors and, in some cases, additional chips. Automotive companies have started to follow. For example, see my post, HOT CHIPS: The Tesla Full Self-Driving Computer. Then the hyperscalers started to build some of their own processors and networking chips. For example, see my post, HOT CHIPS: The AWS Nitro Project.
At the recent DesignCon, there was a panel session held in the exhibit hall at the ChipHead theater, titled Bespoke Silicon: How System Companies Are Driving Chip Design. The panel was moderated by Rich Goldman, who last showed up in Breakfast Bytes in my post An Illuminating Chat with Lumerical's CTO. Since then, Lumerical has been acquired by ANSYS.
The panel consisted of:
Rich: What is bespoke silicon?
John: It is a bit like bespoke suits, something we haven’t had to worry about for a couple of years. Hyperscalers are a good example.
Rich: Who is doing it?
Prashant: Who isn’t? Lots of application specific silicon. Automotive, IoT applications, 5G. All areas where people realize general purpose may not be an exact fit.
Rich: Let me name names. Amazon, Microsoft, Meta, Google. Automotive like Tesla. Apple.
Rob: The economics of a bespoke chip are different from a market chip. For a market chip, it is 'Can I make a profit selling the chip?' But for a system company, it is 'How much more money can I make enabled by this chip?'
John: There are obviously PPA benefits, but it's not just about silicon, it's a bespoke system. The real value comes when you tie it into the software. That bespoke suit needs a shirt and a tie too!
Rob: It can even change where the hardware-software boundary goes.
Rich: Why did Microsoft do bespoke silicon?
Prashant: Just go back 5 years, all industry verticals were wondering how to burst out into the cloud, and their on-prem infrastructure was running out of steam. Now things have shifted across all industry verticals. There is a huge industry compute requirement. Hyperscalers need to look at cost-performance and security. If I am going to move 80% of a workload into the cloud, what are the economics going to look like? Google, AWS, and Microsoft have all done it. Once you start to look at how you can optimize for cost, you have to look at software all the way down. So it is not just optimizing cost-performance, but also optimizing end to end.
John: Power can also be limited.
Prashant: Yes, datacenters are severely limited in what you can put per rack. A lot comes down to the custom silicon being application-specific, so you can offload the central CPU, and efficiency improves.
Rich: Microsoft spends a zillion dollars per year on electricity and security.
Prashant: Everyone is making these investments.
John: When we talk about power, thermal is a big byproduct. Not only is thermal management critical with bespoke silicon, but with on-die sensors you can manage thermal much better by bringing it up into software. Emulation is part of the bespoke revolution.
Rob: There is a fixed power budget so how do you remain inside it? Layered on top of that is, even if Moore’s Law is slowing down, the compute needs are not slowing down, so things need to be distributed across multiple chips. A lot of physics and system considerations.
Rich: What are things?
Rob: 3DIC, 2.5D, not just a chip. For layering, am I doing wafer-on-wafer? At this point, there isn’t any one-size-fits-all. Different approaches are more or less optimal for different systems.
John: Why are we using 3DIC and advanced packaging? Translating data is one. The world is no longer digital, and at these speeds signal integrity and crosstalk matter. Photonics and co-packaged silicon photonics are of growing importance. Also, a key challenge is how do we find enough designers?
Rich: Other issues like strain, etc. are not usually things designers understand.
John: Thermal and mechanical stress and strain are no longer independent. If you do reticle size AI processors, the power integrity is also an immense problem. I talk about these problems being separate, but they are not, so it needs to be looked at in a multiphysics platform. This also brings organizational challenges since the teams can’t be separate. Signal integrity, packaging, board...all need to work together.
Rob: It goes up more. Designing microarchitecture for a CPU needs to plan for how power can be delivered. We can even go further up to workload dependencies.
Rich: We all spent most of our careers in the IC industry, but now we are talking about packaging and other stuff. How’s that going to work?
Rob: An interesting thing with bespoke silicon is that you get to find out all these things that the silicon companies have done for years that now the system companies are having to deal with.
John: In the valley, Bangalore, Shanghai, the biggest challenge is to get the team together to be able to get started. This has also created a lot of demand for high-end design services companies. And demand for chieftains of design to move companies to start new teams. We need easier-to-use tools that identify what action you need to take to improve your design.
Rich: When I look at companies doing bespoke silicon, they are all system companies, not silicon companies moving up. Is hiring as difficult as John said?
Prashant: Yes, since I joined Microsoft, the silicon teams have been trying to hire. We acquired a team from Qualcomm, but the rest was hard. For Microsoft, we were only doing Xbox before and a few custom datacenter applications, but now it has exploded. We had to seed many areas for stuff we were never doing before.
Rich: The cultural challenges must be immense. All these walls have to be torn down.
Rob: System companies never had that building before so they can avoid building up the walls.
Rich: Has there been an impact on Arm?
Rob: Over the last 5 or so years we’ve carved out a special infrastructure group to deliver what the hyperscalers are looking for. It was “just a high performance CPU, how hard can it be?” but there are differences in systems that go into hyperscaler datacenters with different memory hierarchies and we are certainly seeing all of that.
Rich: I recently saw an ad from Airbnb looking for chip designers. Let’s talk about chiplets.
Rob: At the simplest level it is a piece of silicon designed to fit in with others in the same package. It is something designed from the ground up to communicate with other chips in the same package. So things like PHY decisions are made for short distance. Chiplets were originally motivated by the fact that large die don’t fit in the reticle. But if you are just getting started, your first design should probably not be chiplet-based while you learn what you are doing. But with chiplets, you will discover you can get the memory closer, and get more cores.
Prashant: Also, you can reuse IP and build IP in different process nodes.
Rich: Stacking stuff on top of each other is conceptually simple but implementationally challenging.
John: The opportunity of taking silicon from different vendors is a huge challenge. One aspect is standardization, with TSMC, Samsung, and Intel foundries all involved in a chiplet announcement.
Rich: UCIe you are talking about?
Rich: So you are supporting Intel?
Rob: Of course we are!
Q: What about other technologies than chiplets? Monolithic? ReRam?
Rob: It is still a roll of the dice whether monolithic stacked structures will take over to get density once Moore’s Law stops. So it is the technology of the future, not the present. Logic-on-logic will not appear in the next decade.
Q: What about IP?
Rob: A lot of interest in multiple nodes. SRAM is not scaling as fast as logic so maybe good to back off. From an IP standpoint, things like UCIe may drive new pieces of IP.
John: We will need silicon lifecycle management, so on-die identification.
Q: Will chiplets come from multiple vendors?
John: Yes, I believe so. But it needs bespoke silicon, seize the opportunity.
Q: How do failures get handled?
Prashant: You are the general contractor, so you have to blame yourself when something goes wrong. The industry will have to figure this out. Managing that is an interesting challenge.
Rich: Rob, are we going to see Arm chiplets?
Rob: We are an IP company, so I think I can say this is a question that will be answered above my pay grade.
With that, Rich and the audience thanked the panel.
For your 2023 planning, save the date for DesignCon 2023, which is January 31st to February 2nd.