Paul McLellan

HOT CHIPS: The AWS Nitro Project

2 Oct 2019 • 6 minute read

In 2015, Amazon acquired the Israeli company Annapurna Labs. Since Annapurna was in stealth mode, doing something to do with Arm processors, nobody really knew why. At the time, press reports called them "secretive chip maker Annapurna". Last year, at CDNLive Israel, the CEO of Annapurna, Hrvoje "Billi" Bilic, gave one of the keynotes (see my post CDNLive Israel 2018) and revealed a few details, but it was still unclear why the company was so critical to Amazon. Since then, AWS has revealed some details at their user conferences. But the first deep dive I have seen was at HOT CHIPS.

At HOT CHIPS in August, AWS's Anthony Liguori gave a lot of details in his talk "The Nitro Project: Next-Generation AWS Infrastructure". He started with some statistics. All the other pieces of AWS infrastructure are built on top of EC2. There are over 60 availability zones (datacenters or groups of datacenters), many of which have over 100,000 servers; there are millions of servers worldwide. AWS launched Nitro in November 2017, although some of the groundwork started back in 2013. All new launches in EC2 since 2017 are built on Nitro.

Nitro is the thing that powers everything we do.

AWS had originally built their cloud on commodity hardware, then later added some Annapurna chips. But it was time to think big. As Anthony put it:

After ten years of Amazon Elastic Compute Cloud (EC2), if we applied all our learnings, what would a hypervisor look like?

He went on to give a tutorial on virtualization, which I will skip. If you need more background, see my post How Does Virtualization Work? One challenge is that Intel's pre-2004 processors didn't meet all the Popek and Goldberg requirements for virtualization; for example, you could read privileged registers in user mode without the hypervisor gaining control. When EC2 launched, they used the Xen hypervisor, which does what Anthony called "paravirtualization": rewriting the guest operating system to make direct hypercalls instead of executing privileged instructions.
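
To make the distinction concrete, here is a toy Python model (my illustration, not how Xen or Nitro actually work): in full virtualization a privileged operation traps to the hypervisor, which emulates it; in paravirtualization the guest is modified to call the hypervisor directly.

```python
# Toy model of full virtualization (trap-and-emulate) vs.
# paravirtualization (direct hypercalls). Purely illustrative:
# real hypervisors work with CPU privilege levels, not exceptions.

class TrapToHypervisor(Exception):
    """Stands in for the hardware trap a privileged instruction raises."""

class Hypervisor:
    def __init__(self):
        self.shadow_cr3 = 0  # an emulated privileged register

    def write_cr3(self, value):
        self.shadow_cr3 = value  # the hypervisor mediates the real update

hv = Hypervisor()

# Full virtualization: the unmodified guest attempts the privileged
# instruction, the "hardware" traps, and the hypervisor emulates it.
try:
    raise TrapToHypervisor(0x1000)  # guest executes a privileged write
except TrapToHypervisor as trap:
    hv.write_cr3(trap.args[0])

# Paravirtualization: the guest kernel is rewritten to call the
# hypervisor directly, so no trap is needed.
def hypercall_write_cr3(value):
    hv.write_cr3(value)

hypercall_write_cr3(0x2000)
print(hex(hv.shadow_cr3))  # 0x2000
```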

The early days of Nitro were before they started working with Annapurna Labs. An EC2 instance in January 2013, pre-Nitro, looked like the diagram below, without the chip in the dotted-orange box in the lower right. This image, and all the others in this post, are from Anthony's presentation.

Later that year, in what Anthony called "early Nitro", they added Nitro chips to enhance the networking; that is the chip in the lower right. This boosted their networking throughput from 100K packets per second to 1 million packets per second. There was also a big reduction in tail latency. Often, networking performance doesn't depend on the average latency but on the worst latency, known as tail latency. It is only a rough analogy, but if you put a truck on the freeway going at 25mph, it doesn't really matter what the average speed is; that truck is going to cause congestion.
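
To make tail latency concrete, here is a short sketch with made-up numbers (not AWS data) showing how the p99.9 can be far worse than the mean:

```python
import random
import statistics

# Hypothetical latency samples in microseconds: mostly fast, with a
# few slow outliers -- the "truck on the freeway".
random.seed(0)
samples = [random.gauss(50, 5) for _ in range(100_000)]
samples += [random.uniform(1_000, 5_000) for _ in range(200)]  # rare stalls
samples.sort()

def percentile(sorted_vals, p):
    # Nearest-rank percentile over a pre-sorted list.
    idx = min(len(sorted_vals) - 1, int(p / 100 * len(sorted_vals)))
    return sorted_vals[idx]

print(f"mean : {statistics.mean(samples):7.1f} us")
print(f"p50  : {percentile(samples, 50):7.1f} us")
print(f"p99.9: {percentile(samples, 99.9):7.1f} us")
# The outliers barely move the mean, but they dominate the p99.9.
```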

[Diagram: the C4 instance architecture]

After that, they started working with Annapurna Labs, and used the 32-bit Arm Cortex-A15 processor. They continued to improve in-the-wire networking performance, and they also introduced storage virtualization and a remote block device. Up to and including C3, they had to hold back about 20% of the cores for the device models. In C4, shown in the picture above and dating from January 2015, they reduced that to 10%, immediately making more cores available to customers. Then, as Anthony said:

We liked Annapurna so much that we acquired them shortly after we launched C4. We started to work with them to build truly custom silicon.

The next step was I3, a couple of years later, in February 2017. This used the next generation of Annapurna technology, built after they became part of AWS, based on the Arm Cortex-A57. One very important feature in the cloud is encryption, and this new chip allowed them to encrypt remote block storage at line rate. They also removed the restriction to a single card: they can now use four separate Nitro cards. These controllers allow them to do PCI passthrough, provide normalized performance even with drives from different vendors, handle encryption, and manage the underlying drives. This is the platform they used to introduce ENA, the Elastic Network Adapter. So at this point they had offloaded local storage, remote storage, and networking, and the question was "what next?"
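
Anthony didn't say which cipher the cards use, but transparent block-device encryption is conventionally done with AES-XTS, where a per-block tweak ties each ciphertext to its position on the disk. A minimal sketch using the Python cryptography package (illustrative only, not AWS's implementation):

```python
# pip install cryptography
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = os.urandom(64)   # AES-256-XTS uses a double-length (512-bit) key
BLOCK_SIZE = 512       # a typical logical block size

def xts_tweak(block_number: int) -> bytes:
    # The tweak binds the ciphertext to its position on the disk, so
    # identical plaintext blocks encrypt differently.
    return block_number.to_bytes(16, "little")

def encrypt_block(plaintext: bytes, block_number: int) -> bytes:
    enc = Cipher(algorithms.AES(KEY), modes.XTS(xts_tweak(block_number))).encryptor()
    return enc.update(plaintext) + enc.finalize()

def decrypt_block(ciphertext: bytes, block_number: int) -> bytes:
    dec = Cipher(algorithms.AES(KEY), modes.XTS(xts_tweak(block_number))).decryptor()
    return dec.update(ciphertext) + dec.finalize()

block = os.urandom(BLOCK_SIZE)
assert decrypt_block(encrypt_block(block, 7), 7) == block
```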

[Diagram: the C5 instance architecture, with the Nitro hypervisor and Nitro cards]

The next step was to move the data plane and the control plane onto the Nitro cards, using the Arm cores inside. This gave the C5 in November 2017. But then there is not much left for the Xen hypervisor to do, so AWS introduced the Nitro hypervisor, which is really only responsible for dividing up memory and cores among the guests. As you can see from the above diagram, most of the functionality has migrated into the Nitro chips, apart from the server processor itself.
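
As a loose userspace analogy for that kind of static partitioning (not the Nitro hypervisor itself), Linux lets you pin a process to a fixed set of cores, after which it never competes for the others:

```python
import os

# Pin the current process to cores 0 and 1 (Linux only). A minimal
# hypervisor does something similar at a lower level: each guest gets
# a fixed slice of the cores and memory up front, leaving the
# hypervisor almost nothing to do at runtime.
os.sched_setaffinity(0, {0, 1})
print("allowed cores:", sorted(os.sched_getaffinity(0)))
```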

Nitro consists of three parts, two hardware and one software:

  • Nitro cards (of which there are 4 types):
    • Elastic Network Adapter (ENA) PCIe controller, Virtual Private Cloud (VPC) data plane
    • NVMe (non-volatile memory express) PCIe controller, transparent encryption
    • NVMe PCIe controller, Elastic Block Storage (EBS) data plane
    • System control, root of trust
  • The Nitro security chip, integrated into the motherboard, which protects hardware resources and provides the hardware root of trust.
  • The Nitro lightweight hypervisor, which provides bare-metal-like performance since it only does memory and CPU allocation.

Anthony pointed out that throughout his talk he had said Intel, to keep things simple. But they have AMD instances too, and they have also launched their own Arm SoC, called Graviton, for Arm-based servers. The focus of his talk was not so much the processor as the rest of the system.

We can support any processor that supports PCI.

One big advantage of this approach is reduced jitter. True real-time systems are very hard to get right with a hypervisor. The graph he showed compares i3.metal (red), which has only a tiny uptick right at the end where something takes a few extra cycles, with one of the last Xen-based systems (yellow). The dotted green line is the service-level agreement (SLA) that packets will be handled within 150us. The yellow line climbing at the right shows that some packets take milliseconds. The C5 instance (blue) looks almost the same as bare metal, just a little higher due to interrupt delivery delay in a virtualized system, but with no uptick at the right end at all.
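
Put differently, what breaches the SLA is the fraction of packets in the tail, not the mean. A quick sketch (hypothetical trace, not the data from the graph):

```python
# Check a latency trace against a 150 us SLA -- the tail, not the
# mean, is what violates it.
SLA_US = 150.0

def sla_report(latencies_us):
    violations = [x for x in latencies_us if x > SLA_US]
    mean = sum(latencies_us) / len(latencies_us)
    print(f"mean      : {mean:.1f} us")
    print(f"worst     : {max(latencies_us):.1f} us")
    print(f"violations: {len(violations)} of {len(latencies_us)} "
          f"({100 * len(violations) / len(latencies_us):.3f}%)")

# A trace can have a healthy mean and still breach the SLA.
trace = [60.0] * 9_990 + [3_000.0] * 10  # ten millisecond-scale stragglers
sla_report(trace)  # mean ~62.9 us, yet 0.100% of packets miss the SLA
```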

Not every customer can bring everything they have into the public cloud, so AWS announced Outposts, where the same Nitro hardware is offered to customers to run in their own datacenters or back offices. It uses the same underlying hardware and software that AWS uses itself, and it is accessed through the standard AWS API and console.

In the Q&A, Anthony was asked about side-channel attacks (such as Spectre and Meltdown). He said that they never allow two instances to occupy the same core simultaneously; many of the side-channel issues of the last couple of years involved sharing the L1 cache, which AWS has never done. He said they have also done a lot of mitigation for things like Rowhammer, which doesn't really work against the ECC memory that they use. "We are on top of the latest research for things like this."

[Image: the Inferentia chip]

He was asked if they do anything for GPU virtualization. He said that as of today they only support dedicated GPUs. "It is a hard problem since GPU interconnects are so fast and no GPUs are designed to be multi-tenant. At last year's re:Invent we announced Inferentia." Inferentia is Amazon/Annapurna's neural network accelerator chip (see image).

Another question was on congestion management. Anthony said that they always want to make sure that congestion is not a problem for customers who need high performance. "Our network is big and not oversubscribed, so it's not a problem at the high end." At the lower end, when there are a lot of instances on a single machine, they take a more statistical approach, with a lot of limits (such as how many packets each instance can send).
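
He didn't describe the mechanism, but per-instance packet limits like this are commonly implemented as token buckets, which allow short bursts while capping the sustained rate. A minimal sketch (my illustration, not AWS's implementation):

```python
import time

class TokenBucket:
    """Cap a sustained packet rate while allowing short bursts."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate = rate_pps      # tokens added per second
        self.capacity = burst     # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, up to the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: drop or queue the packet

# e.g. limit an instance to 100k packets/s with bursts of up to 1,000
bucket = TokenBucket(rate_pps=100_000, burst=1_000)
sent = sum(bucket.allow() for _ in range(5_000))
print(f"{sent} of 5000 packets admitted in a tight burst")
```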


Sign up for Sunday Brunch, the weekly Breakfast Bytes email.