At the recent TSMC Technology Symposium, a series of speakers gave details of the various TSMC processes. Since the rules of the technology symposium are that you can take notes but not record the presentation, nor photograph anything (and they don't hand out slides), the day is a bit like drinking from a firehose. Here's the important stuff I managed to note.
16FF+ yield is ahead of TSMC's targets despite this being the fastest ramp in TSMC history. Volume shipments started in Q3 2015.
16FFC is the new lower-cost version of 16FF+. It has fewer masks and an optical shrink, resulting in costs lower by 10-20% (per die). Operating voltage can go as low as 0.5V and, under some circumstances, lower. It is in volume production as of this quarter (Q1 2016). It has the same rules as 16FF+ for easier IP porting, although there are also different standard cell libraries available with fewer tracks. TSMC expects 100 tapeouts from 40 customers during 2016.
N10 risk production is early Q2. Apparently the first customer tapeout took place the day before the symposium. Key differences (from N16) are an MD2 local interconnect layer that compensates for the 1D rules in the BEOL, and a spacer-based BEOL process (as opposed to LELE in 16), which reduces variability but also means that the two different colors in the double patterning have different characteristics.
To go into more detail, N10 uses spacer technology to manufacture the lower levels of metal. In N16, double patterning was done using litho-etch-litho-etch (LELE), which means that coloring could actually be done very late in the design process. LELE has some variability issues since the two patterns are not self-aligned. The spacer approach results in self-aligned layers, but the layers have different characteristics. This means that the color assignment needs to be done earlier in the design process (as early as the schematic for analog). This is known as "full coloring". For digital, the routing is done on pre-defined color tracks, a sort of underlying grid. Since the two colors are different, they have different R/C and EM rules. Even dummy fill needs to be color aware. Cadence's Virtuoso and Innovus technologies already fully support all this.
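To make "pre-defined color tracks" a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the alternating even/odd coloring, the names `track_color` and `segment_rc`, and the per-color R/C numbers are hypothetical and are not TSMC data or any tool's API. The point is only that once a wire lands on a track, its color (and therefore which electrical parameters apply) is already decided.

```python
from dataclasses import dataclass

# Hypothetical per-color parameters (illustrative values only, not foundry data).
# The two colors come from different steps of the spacer process, so each
# gets its own resistance and capacitance per micron.
COLOR_PARAMS = {
    "A": {"r_per_um": 1.0, "c_per_um": 1.0},
    "B": {"r_per_um": 1.25, "c_per_um": 0.875},
}


def track_color(track_index: int) -> str:
    """Assumed pre-defined coloring: even tracks are color A, odd tracks color B."""
    return "A" if track_index % 2 == 0 else "B"


@dataclass
class Segment:
    track: int        # index of the routing track the wire sits on
    length_um: float  # wire length in microns


def segment_rc(seg: Segment) -> tuple[float, float]:
    """Return (resistance, capacitance) using the parameters of the track's color."""
    params = COLOR_PARAMS[track_color(seg.track)]
    return seg.length_um * params["r_per_um"], seg.length_um * params["c_per_um"]


if __name__ == "__main__":
    for seg in (Segment(track=0, length_um=8.0), Segment(track=1, length_um=8.0)):
        print(seg.track, track_color(seg.track), segment_rc(seg))
```

Two wires of the same length on adjacent tracks come out with different R and C, which is why extraction, EM checking, and even dummy fill all have to know the color.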
N7 offers 15-20% better performance than 10FF, or 35-40% lower power at the same performance, with 1.63X the density. Currently getting 30% yield for 128Mb SRAM, which is ahead of their plan. Qual is planned for Q1 2017. Risk production Q1 2017. Standard cells are 6X density and 2X performance per watt, GPIO cells are about 3X. The N7 SRAM for mobile is 4X area scaling and about 4X power reduction.
HPC on N7 is a different process. It is 10-15% faster than mobile. Better clock tree. Can do 28/56Gbps SerDes for datacenter Ethernet. There are taller standard cells and larger vias. The normal I/O voltage is 1.5V but can do 1.8V for legacy interfaces such as USB and HDMI.
Both the mobile and HPC versions of N7 will be available at the same time. N7 uses 95% of the same tools (equipment) as N10, so capacity can fairly easily be migrated. Moshe Gavrielov, CEO of Xilinx, said in the opening keynote that they are skipping N10 and going straight to the high-performance version of N7. EDA and foundation IP are scheduled for Q3 2016. CoWoS for N7 is scheduled for 2018.
In a density analysis done using a "typical" mobile chip with SRAM, analog, and digital, the scale factors are 1X for 16FF+ (the reference), about 0.6X for N10, and about 0.43X for N7. A test design using the ARM Cortex-A72 has a 35% speed increase versus 16FF+ or a 65% power reduction (at the same performance).
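As a quick sanity check on those scale factors, here is a small Python sketch that turns them into density gains. It assumes the factors are relative die area for the same design (my reading of the numbers, not something stated on the slides); `density_gain` is just a helper name I made up.

```python
# Scale factors from the density analysis, assumed to be relative area
# for the same "typical" mobile chip (16FF+ is the reference).
AREA_SCALE = {"16FF+": 1.0, "N10": 0.6, "N7": 0.43}


def density_gain(node: str, reference: str = "16FF+") -> float:
    """How many times denser `node` is than `reference`, given relative areas."""
    return AREA_SCALE[reference] / AREA_SCALE[node]


if __name__ == "__main__":
    print(f"N10 vs 16FF+: {density_gain('N10'):.2f}X")
    print(f"N7 vs 16FF+:  {density_gain('N7'):.2f}X")
    print(f"N7 vs N10:    {density_gain('N7', reference='N10'):.2f}X")
```

Under that assumption, N10 is roughly 1.67X denser than 16FF+, N7 is roughly 2.33X denser than 16FF+, and N7 is roughly 1.40X denser than N10 for this mixed SRAM/analog/digital chip.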
All the above remarks about spacer technology for N10 apply to N7, too. But it gets more complex still. For N7, the lower levels of metal are built as continuous one-dimensional grids (either vertical or horizontal, but not both), and then a cut mask is used to divide them up, since that gives much better end-to-end spacing. In fact, this is done in N10 too, but only in standard cells, so it is hidden from most users. The routers need to be aware of this.
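The "continuous line plus cut mask" idea is easy to picture in code. The following Python sketch is purely illustrative: it models one metal track as a 1D interval and each cut as a smaller interval removed from it. The function name `apply_cuts` and the coordinates are hypothetical, and real cut-mask rules (minimum cut size, cut-to-cut spacing, and so on) are ignored.

```python
def apply_cuts(line_start, line_end, cuts):
    """Split a continuous 1D metal line into segments by removing cut intervals.

    `cuts` is a list of (cut_start, cut_end) pairs along the same axis.
    Returns the remaining metal as a list of (start, end) segments.
    """
    segments = []
    pos = line_start  # left edge of the metal not yet emitted
    for cut_start, cut_end in sorted(cuts):
        # Clamp the cut to the extent of the line; skip cuts that miss it.
        cut_start = max(cut_start, line_start)
        cut_end = min(cut_end, line_end)
        if cut_start >= cut_end or cut_end <= pos:
            continue
        if cut_start > pos:
            segments.append((pos, cut_start))
        pos = cut_end
    if pos < line_end:
        segments.append((pos, line_end))
    return segments


if __name__ == "__main__":
    # One track from 0 to 100 with two cuts leaves three wire segments.
    print(apply_cuts(0, 100, [(30, 34), (70, 74)]))
```

The end-to-end spacing between neighboring segments is set by the width of the cut rather than by how closely two separately printed line ends can approach each other, which is the benefit described above. It also shows why a router has to think in terms of tracks and cuts rather than freely drawn shapes.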
InFO is TSMC's 3D packaging technology that doesn't require an interposer (as is required for CoWoS, their older 3D technology). Compared to interposer-based 3D, it is 20% thinner and 20% faster, and has 10% better thermal performance. It is also cheaper. However, for the largest, highest-performance designs, the CoWoS technology is required since there are size limitations with InFO. There is more info on InFO in my recent blog post.
After N7 FinFET, there is a lot of research going on. They have demonstrated a best-in-class (10X better) Ge n+ contact (the p-type is easy but n-type has always had issues). Work ongoing with III-V FinFET (Indium-Gallium-Arsenide). Nanowire transistors (gate all round). Can get better performance and power. InAs narrow wire surrounded by a Hi-K metal stack. TunnelFET (T-FET) can achieve very steep turn-on characteristics (low leakage).
Litho: With self-aligned quad patterning (SAQP), they can get 193i down to less than 30nm pitch with an overlay accuracy of <4nm, giving very precise line-end cuts.
Directed self-assembly (DSA): They have demonstrated pitch down to 20nm. Hole defects, which are the biggest challenge, are at fewer than five per wafer and still improving.
EUV: Planned in N7 and beyond. EUV has better depth of focus and energy latitude so should yield better. 2D fidelity is better so can enable more complex designs that would be unmanufacturable with 193i. For more on TSMC's experience with EUV, see my blog from IEDM in December.
New memory technology: Two areas, eRRAM (resistive RAM) as an eFlash replacement, and eMRAM (magnetic RAM) as a replacement for eFlash, SRAM, and eDRAM. Achieved functional yield with good endurance.
CMOS Image Sensor (CIS): Near infra-red in development to enable cameras that can see in the dark (needed for ADAS and autonomous vehicles). Can also be used in phones/cameras to do the focusing (in the dark) and then take the actual picture with the visible light CIS.
Today, availability is 90nm, 55nm, 40nm. InFO for 28HPC+, 16FFC. The focus process for ADAS is 16FFC with high-reliability qual planned for Q3 2016 (high temperature, low failure rate). There will be a new set of foundation IP for EM and functional safety requirements. Certified Q3 2016 with third-party IP in 2017.
Between 2015 and 2017, the automotive requirements for ADAS are that GPU performance needs to improve 20X, from 400 GFLOPS to 8000 GFLOPS; the number of sensors goes from 10 to 30; and the number of ECUs (actuators) goes from 30 to 90.
Low-end devices use 55ULP and 40ULP. Mid-range devices are on 28HPC+. High end is 16FFC. Requirement for very low voltage. 16FFC can go to 0.5V and maybe even lower depending on the IP required. Need to validate tool accuracy at very low voltages. There will be a new ultra-low voltage standard cell library and a new SRAM in Q3 2016.