Andy Bechtolsheim: 85 Slides in 25 Minutes, Even the Keynote Went at 400Gbps

29 Apr 2019 • 5 minute read

Andy Bechtolsheim likes to go fast. He famously had to rush off to a meeting but wrote Sergey and Larry a $100K check to fund Google anyway, with no paperwork. In his keynote at CDNLive, he presented The Road to 400G Networking. He had 85 slides and 25 minutes, so he had to go fast.

As it happens, I saw an earlier version of this presentation called Datacenter Networking Market Transitions, which I wrote about in my post Andy Bechtolsheim Keynote on the Future of Networking.

But things were just getting underway a couple of years ago. In Q1 2017, at the time of that earlier presentation, cloud datacenter revenue was $22B. In Q4 of last year it was more like $50B, as you can see from Andy's first slide below.

It's All About the Cloud

The light blue bars in the chart are incremental revenue from the same quarter of the prior year, so Q4 2018 was $18B larger than Q4 2017. In fact, 2018 was the year that cloud server shipments (in units) passed server shipments to enterprise on-premises datacenters. There are just a small number of suppliers (the biggest being Amazon Web Services (AWS), Microsoft Azure, Google, and IBM) but they each have over a million servers.

Andy's perspective is that what drives this is increasing returns to scale: large cloud datacenters are more cost-effective, with shared infrastructure (buildings, cooling), shared staffing costs, and locations where electricity is cheap. I asked Linley Gwennap at the recent microprocessor conference whether he knew how much aggregate compute power is in mobile versus cloud. He hadn't done the calculation, but clearly cloud is the dominant way to deliver compute power, especially outside the US, where most internet access is from smartphones. Of course, connecting all the infrastructure up requires networking, which is where Andy and Arista Networks come in.

This is the year that the 400G transition will start, driven by the availability of 112G SerDes (see my post The World's First Working 7nm 112G Long Reach SerDes Silicon). Because 400G is 4X the size of 100G (of course), Andy says it will only take until 2021 for it to be dominant in bandwidth terms. The transitions now go very fast. In 2000, when 1G was launched, it was expensive, and it took a decade for prices to come down and for it to be adopted. But in cloud, it's all about scale. If 400G is too expensive, everyone will just stick with the cheaper solution. But the moment it is cheaper, everyone wants to adopt it immediately. At 100G, most cloud providers switched in one year. It's similar to process-node transitions, where if the big mobile companies want wafers in the new node, they want all the wafers because there are literally billions of smartphones per year.
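To see why a 4X speed step pulls the bandwidth crossover so far forward, here is a back-of-the-envelope sketch in Python. It is my illustration, not something from Andy's slides: the function name and the port-share values are hypothetical, and it assumes a fleet that contains only 100G and 400G ports.

```python
def bandwidth_share_400g(port_share_400g: float) -> float:
    """Fraction of total bandwidth carried by 400G ports.

    Assumption (hypothetical): the fleet contains only 100G and 400G ports.
    """
    if not 0.0 <= port_share_400g <= 1.0:
        raise ValueError("port share must be between 0 and 1")
    bw_400g = 400 * port_share_400g          # bandwidth from 400G ports
    bw_100g = 100 * (1.0 - port_share_400g)  # bandwidth from 100G ports
    return bw_400g / (bw_400g + bw_100g)


if __name__ == "__main__":
    # Illustrative port shares only, not market data.
    for share in (0.05, 0.10, 0.20, 0.50):
        pct = bandwidth_share_400g(share)
        print(f"{share:.0%} of ports are 400G -> {pct:.0%} of bandwidth")
```

Under these assumptions, 400G carries half of the bandwidth once it accounts for just one port in five, which is why dominance "in bandwidth terms" can arrive well before dominance in port count.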

Another change is the transition from proprietary solutions to using merchant silicon. Merchant chips deliver faster time to market, leading performance, and leverage the entire industry R&D. So by 2016 or so, the entire network stack had switched to using merchant silicon.

The secret of the networking industry is that they haven't been very aggressive process-wise. Most merchant silicon is using 28nm, which Intel had in 2011. But that means that Moore's Law isn't slowing for them. As Andy put it, "there are advantages in coming from behind." They can go from 28nm, to 16nm, to 7nm, to 5nm. All these are available today, so if 3nm turns out to be a long time coming, they can still get a 30X smaller die by using 5nm, which is in risk production already. Each generation of silicon enables more buffers and bigger routing tables.
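The 30X figure is roughly what you get if you assume, as a simplification, that die area scales with the square of the node name. That is my assumption for illustration only (node names are labels rather than measured dimensions, and real designs do not shrink ideally), and the function name is hypothetical.

```python
def ideal_area_shrink(from_nm: float, to_nm: float) -> float:
    """Idealized area reduction factor.

    Assumption (simplification): area scales with the square of the node name.
    """
    if from_nm <= 0 or to_nm <= 0:
        raise ValueError("node sizes must be positive")
    return (from_nm / to_nm) ** 2


if __name__ == "__main__":
    # The node sequence mentioned in the talk, starting from 28nm.
    for node in (16, 7, 5):
        factor = ideal_area_shrink(28, node)
        print(f"28nm -> {node}nm: about {factor:.1f}X smaller (idealized)")
```

With that idealized rule, going from 28nm to 5nm gives (28/5) squared, or a little over 31X, in line with the 30X Andy quoted.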

But there are economic clouds on the horizon after the 1 billion percent gain in density from 1971 to 2018. Since 7nm fabs cost $10B to $15B, it takes longer to recapture the investment. Now only TSMC, Intel, and Samsung have built 7nm fabs. GLOBALFOUNDRIES refocused on specialty processes such as FD-SOI at non-EUV nodes. If you want more color on that, see my post GLOBALFOUNDRIES Executive Team Explains the Pivot.

In networking, it's all about open standards and so people don't want proprietary features. The single biggest issue is time to market, and the single biggest challenge is signal integrity. Lip-Bu had just announced Clarity (see my post Bringing Clarity to System Analysis):

"But I made these slides before I saw today's announcement."

Futures

So what's beyond 400G? It's all about cost, in dollars per gigabit per second. The optics that people are most excited about are 400G-16QAM DSP plus a coherent laser. The spec was for 100km but may get up to 1000km reach, which would allow customers to use the same optics for both datacenter and long-haul links. 800G optics can support 2x400G or 8x100G.

100G lambda optics run at 100G per wavelength, so 800G can be achieved by doubling the baud rate. The short reach from the gearbox to the optical drivers is not a problem. We'll see that in the 2-3 year timeframe. 800G adoption will be driven by its cost-saving potential: the same number of components running at 2X the speed.

Beyond 100G SerDes, you might expect 200G. But channel loss means you can only get a couple of inches. The best use case connects the silicon directly to the optics, or at least uses co-packaged optics. It will take a couple of years to solve, and in the meantime 100G will remain mainstream since it's the only thing that works with conventional PCB technology.

1600G Ethernet has a new issue. The time per packet is just 333 picoseconds, so it needs wide Ethernet ports. But that, in turn, means fewer ports per chip, and so it's impossible to scale the optics cost-effectively. Maybe in 2024-25.
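For a feel for that number, the sketch below computes how often a packet arrives on a fully loaded link. The frame sizes are my assumptions, not from the talk, which did not state the basis for the 333 picosecond figure; the function name is hypothetical.

```python
def packet_interval_ps(link_gbps: float, bytes_on_wire: int) -> float:
    """Time between back-to-back packets on a saturated link, in picoseconds."""
    if link_gbps <= 0 or bytes_on_wire <= 0:
        raise ValueError("link speed and packet size must be positive")
    bits = bytes_on_wire * 8
    # 1 Gbps moves 1 bit per nanosecond, so bits / Gbps is a time in ns.
    return bits / link_gbps * 1000.0


if __name__ == "__main__":
    # Assumed sizes: 64 bytes is the minimum Ethernet frame; 84 bytes adds
    # the 8-byte preamble and the 12-byte inter-frame gap.
    for gbps in (100, 400, 1600):
        for size in (64, 84):
            interval = packet_interval_ps(gbps, size)
            print(f"{gbps}G, {size} bytes on the wire: {interval:.0f} ps per packet")
```

With these assumptions, a 1600G link sees a minimum-size packet every 320 to 420 picoseconds, the same order as the figure Andy quoted and sixteen times shorter than at 100G.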

The Driver

This is very important for the service providers in mobile, since they are converting capex and opex into revenue, but can't just charge more money for 5G and 4G. They need lower costs.

So in the short term, it's just that fatter pipes are lower cost. 400G costs more than 100G, but not four times as much. That's the driver.
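That argument comes down to cost per gigabit per second. Here is a minimal sketch with made-up prices: the function name and both price figures are hypothetical, chosen only to show the shape of the comparison.

```python
def cost_per_gbps(port_cost: float, port_gbps: float) -> float:
    """Cost of one gigabit per second of capacity on a port."""
    if port_gbps <= 0:
        raise ValueError("port speed must be positive")
    return port_cost / port_gbps


if __name__ == "__main__":
    # Hypothetical, normalized prices (not real market data).
    cost_100g = 1.0   # a 100G port costs 1 unit
    cost_400g = 2.5   # a 400G port costs more, but less than 4 units

    per_gbps_100 = cost_per_gbps(cost_100g, 100)
    per_gbps_400 = cost_per_gbps(cost_400g, 400)
    print(f"100G: {per_gbps_100:.5f} units per Gbps")
    print(f"400G: {per_gbps_400:.5f} units per Gbps")
    print(f"400G is cheaper per Gbps: {per_gbps_400 < per_gbps_100}")
```

Any 400G price below four times the 100G price gives a lower cost per gigabit, and the further below, the stronger the pull to switch.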
