Last week was TSMC's Open Innovation Platform Innovation Forum (aka OIP). Dave Keller welcomed everyone and then introduced Cliff Hou who gave the update on everything technical. Here's what he said. Or rather, here's what I think he said. I will give my usual caveat at the start of posts like this: TSMC does not allow photography, video, or recording the presentations, and they don't provide the slides. So this is entirely from the notes I took during the presentation.
I realize that this post is fairly dry and dense with numbers and dates, but since TSMC is far and away the largest foundry, these details are important, and they are not available anywhere else. You can try and find out from TSMC's website something specific like when N6 risk production is planned to start...but you will discover the information is not there.
Whenever processes are compared below I say things like "NX is 15% faster than NY, 30% less power". This means that you can get a speed increase of 15% at the same power, or a power reduction of 30% at the same performance. It does not mean you can get both at the same time. Obviously, you can also blend the two and take a part of the improved process as a performance increase and a part as power reduction.
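To make the blending concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the tradeoff between the two endpoints is linear; the real curve depends on the process and the design, and TSMC gave no such formula. The function name `blend` and the `perf_share` parameter are hypothetical.

```python
def blend(speed_gain, power_saving, perf_share):
    """Split a process improvement between speed and power.

    perf_share = 1.0 takes the whole improvement as speed,
    perf_share = 0.0 takes the whole improvement as power reduction.
    Linear interpolation between the two endpoints is an assumption
    made for illustration, not something TSMC stated.
    """
    if not 0.0 <= perf_share <= 1.0:
        raise ValueError("perf_share must be between 0 and 1")
    return speed_gain * perf_share, power_saving * (1.0 - perf_share)


# "15% faster, 30% less power", taking half of the benefit each way
speed, power = blend(0.15, 0.30, 0.5)
print(f"{speed:.1%} faster and {power:.1%} less power")
```

The two extremes, `perf_share=1.0` and `perf_share=0.0`, reproduce the two headline numbers; anything in between is only a rough guide.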
The N5 (5nm) value proposition is that it is optimized for both mobile and HPC, with innovative scaling features. Risk production started in March 2019. It will be followed by N5P (P for performance), with risk production starting a year after N5, so about March 2020.
Comparing N5 to N7, it is 15% faster, 30% lower power. Logic density is 1.8X, SRAM scaling is 0.75, analog scaling around 0.85.
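Since logic, SRAM, and analog scale differently, the overall die shrink depends on the mix. The sketch below estimates it in Python, assuming that "1.8X logic density" means logic takes 1/1.8 of its N7 area and that the SRAM and analog scaling numbers are area factors. The 60/30/10 area split is a hypothetical example, not a figure from the talk.

```python
def n5_area_fraction(logic, sram, analog):
    """Return the N5 die area as a fraction of the N7 die area.

    logic, sram and analog are the fractions of the N7 die occupied by
    each kind of circuit (they must sum to 1).
    """
    if abs(logic + sram + analog - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    logic_area = logic / 1.8      # 1.8X density means 1/1.8 of the area
    sram_area = sram * 0.75       # reading "scaling 0.75" as an area factor
    analog_area = analog * 0.85   # likewise for analog
    return logic_area + sram_area + analog_area


# Hypothetical N7 die: 60% logic, 30% SRAM, 10% analog
print(round(n5_area_fraction(0.6, 0.3, 0.1), 3))
```

Under those assumptions the die shrinks to roughly 64% of its N7 area, noticeably less than the headline logic number alone would suggest.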
The key feature of N5 is full-fledged EUV adoption, which reduces cycle time because there are fewer steps than when everything is multi-patterned. It is enhanced with a transistor with a high-mobility channel for both performance and power. There are other architectural features that enable logic and SRAM density scaling. There are low-resistance contacts and vias. The I/O transistor can be either 1.5V or 1.2V.
N5 HPC has an extreme Vt LVT device that is 25% faster than N7. The HPC standard-cell library has optimized metal and a via pillar array to further boost performance by 10%. There is a special device offering to enable 112Gbps SerDes. For decoupling capacitors (decap), there is a super high density MIM (SHDMIM) that provides 4X more decap than HDMIM (MIM stands for metal-insulator-metal, since these capacitors are built in the BEOL in the metal stack). The increase in decap results in an extra 4% speed boost at the same voltage.
For analog, TSMC have taken a new approach with N5 Restricted Design Rules (RDR). They have provided an analog cell library that goes beyond just providing transistors in the PDK. This improves the manufacturing window. The analog cells are transistors in abuttable layout templates with predefined cell heights and pre-drawn layout patterns for the m0 layer and below. Each transistor is surrounded by predefined and validated technology. It achieves much better SPICE to silicon correlation than the non-RDR approach. To build an analog block, you build the schematic using these cells instead of transistors, then route the metal layers on top after placement. Finally, you add the guard-ring and boundary cells for a final DRC-clean design. Cliff had some nice pictures of some blocks designed this way for LPDDR4, PHY, PLL, DAC, and so on. But since I wasn't allowed to take pictures I can't show you.
N5 IP status is that all the foundation for mobile and HPC has been silicon validated. Key mobile IP has been silicon validated. HPC is having first tapeouts.
Compared to 16FF+, N7 is 30% faster, with a 55% reduction in power, a 3.3X increase in logic density, and 0.38X the SRAM area. Mass production started in April 2018. It was the fastest technology ramp in TSMC's history, beating even the N10 technology ramp. All IP is available.
N6 is a die cost reduction from N7. It increases the use of EUV to reduce process complexity (and improve cycle time), and it improves logic density by 18%, making use of CPODE (continuous poly on diffusion edge). Risk production starts in Q1 2020. There is a yield improvement from the smaller die size and the reduction in layers.
If you simply want to take an N7 design and get the benefits of N6, you can just do that. It has compatible design rules, SPICE, and IP libraries with N7. Alternatively, you can use improved N6 logic blocks along with reusing N7 IP for everything else (in particular SRAM). N6 logic density is 1.18X versus N7 (using an Arm Cortex-A72 as a test vehicle).
Next, Cliff invited Dhiraj Mallick of Cerebras Systems to come and talk about the wafer-scale deep learning "chip". I put chip in quotes since this is the largest square die it is possible to manufacture on a 300mm wafer. As Dhiraj said "we announced this last month at HOT CHIPS". As it happens, I was there and I wrote about it a few days later. Most of what he said at OIP was a repeat of what Cerebras said at HOT CHIPS, so read my post HOT CHIPS: The Biggest Chip in the World. The chip was manufactured in TSMC 16nm. There was very tight coupling of the engineers in TSMC's Fab 14 with Cerebras's physical design team, where they tried various new ideas and built test chips to qualify the process. The big challenge was to build reliable connections from die to die across the scribe lines since the manufacturing process still required a reticle to be repeated across the wafer, even though it was going to be left whole and not cut up later.
Cliff came back to talk about low power and RF.
22ULL low Vdd (note: 22ULL low Vdd is a process name, not the same as 22ULL) is a solution with Vdd down to 0.6V for IoT applications. Power drops 20% going from 28HPC+ to 22ULL, and then another 45% going to 22ULL low Vdd. The design flow is the same but there are large variation challenges for library characterization and timing signoff, so a deep understanding of variation is required during synthesis and physical design.
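The two power reductions compound rather than add. A quick Python check, assuming the 45% is measured relative to 22ULL rather than to 28HPC+ (which is how the numbers were presented, as far as the notes show); the helper name `remaining_power` is hypothetical:

```python
def remaining_power(*reductions):
    """Compound a chain of fractional power reductions.

    Each reduction is assumed to be relative to the previous step,
    so the remaining power is the product of the (1 - r) factors.
    """
    power = 1.0
    for r in reductions:
        power *= 1.0 - r
    return power


# 28HPC+ -> 22ULL (20% down) -> 22ULL low Vdd (another 45% down)
low_vdd = remaining_power(0.20, 0.45)
print(f"{low_vdd:.0%} of 28HPC+ power, a {1 - low_vdd:.0%} total reduction")
```

On that reading, 22ULL low Vdd ends up at 0.8 x 0.55 = 0.44 of the 28HPC+ power, a total reduction of 56%, not 65%.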
For RF transceivers and WiFi there is 28HPC/RF and 22ULP/ULL/RF already in production. 16FFC/RF enhancement 1 is ready today, and N7/RF will come in 2H 2020. There is a big power reduction from 28 to 16 and another big one to 7. 16FFC/RF enhancement 1 has new RF transistors with higher Ft and Fmax, and design kits have been developed for them. It goes up to 300GHz.
For mmWave, 28 and 22 are in production, and there will be 16FFC/RF enhancement 1. There is also a second enhancement, 16FFC/RF enhancement 2, with new transistors that take Fmax up to 400GHz. It will be ready in Q2 2020 (SPICE and PDK in Q1).
For RF SOI there is 0.13SOI in production today, with a coming enhancement 1 ready in Q1 2020. After that there will be N40SOI. With 0.13SOI you can go to 120GHz, with enhancement 1 to 150GHz, and with N40SOI to 220GHz. N40RF is SOI with air gap.
Next Cliff moved on to packaging. TSMC has two basic technologies called InFO (integrated fanout) and CoWoS (chip on wafer on substrate).
CoWoS is targeted at very large designs. Currently they can do designs 1.5X the reticle size, in 2020 that will go to 2X and in 2021 to 3X reticle size.
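To put those multiples into square millimeters, here is a small Python sketch. It assumes the usual maximum lithography exposure field of 26mm x 33mm (858mm²) as "the reticle size"; that figure is an assumption here and was not part of the talk.

```python
RETICLE_MM = (26.0, 33.0)  # assumed maximum exposure field, not from the talk


def interposer_area_mm2(multiple, reticle=RETICLE_MM):
    """Area in mm^2 of a design that is `multiple` times the reticle size."""
    width, height = reticle
    return multiple * width * height


for when, multiple in [("today", 1.5), ("2020", 2.0), ("2021", 3.0)]:
    print(f"{when}: {multiple}X reticle = {interposer_area_mm2(multiple):.0f} mm^2")
```

With that assumption, 1.5X is about 1,287mm², 2X is about 1,716mm², and 3X is about 2,574mm².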
They can do wafer on wafer (WoW) and chip on wafer (CoW). Obviously WoW requires the die size to be identical and the yield to be very high (since any bad die takes out its opposite number too). For CoW they can only use known good die (KGD) that have already been tested.
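The reason WoW needs very high yield can be sketched in a few lines of Python. This assumes defects on the two wafers are independent and ignores any losses from the bonding step itself; the yield numbers are hypothetical, chosen only to show the trend.

```python
def wow_outcomes(yield_top, yield_bottom):
    """Per-site outcomes for wafer-on-wafer stacking.

    Returns (probability the stack is good,
             expected number of good die thrown away per site).
    A stack is good only if both die are good; a good die is wasted
    whenever its opposite number is bad.
    """
    good_stack = yield_top * yield_bottom
    wasted_good_die = (yield_top * (1 - yield_bottom)
                       + (1 - yield_top) * yield_bottom)
    return good_stack, wasted_good_die


for y in (0.9, 0.7):
    good, wasted = wow_outcomes(y, y)
    print(f"die yield {y:.0%}: good stacks {good:.0%}, "
          f"good die wasted per site {wasted:.2f}")
```

With CoW and known good die, the top-die factor is effectively 1 (ignoring test escapes), so the stack yield is no longer the product of two die yields.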
There is a whole 3D-IC design flow to stack die during verification, perform IR/EM analysis, thermal, signal and power integrity, and more.
On the day of OIP, TSMC had announced a 4GHz CoWoS HPC chiplet-based design, proven in silicon, built around an Arm Cortex-A72 and using a low-swing 0.3V I/O design that achieved 8Gb/s per pin, giving a total of 320GB/s.
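Those two bandwidth numbers imply a pin count. The Python arithmetic below takes the units as written (Gb/s as gigabits, GB/s as gigabytes) and assumes every data pin runs at the full rate; the resulting pin count is an inference, not a number from the announcement.

```python
def data_pins(total_gbytes_per_s, per_pin_gbits_per_s):
    """Number of data pins needed to reach a total bandwidth."""
    total_gbits_per_s = total_gbytes_per_s * 8  # 8 bits per byte
    return total_gbits_per_s / per_pin_gbits_per_s


print(data_pins(320, 8))
```

Under those assumptions, 320GB/s at 8Gb/s per pin works out to 320 data pins.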