
Author

Paul McLellan

Xcelium Is 50% Faster on AWS's New Arm Server Chip

9 Dec 2019 • 3 minute read

At re:Invent, Amazon AWS announced Graviton 2, its second-generation Arm server chip. Jeff Barr, AWS Chief Evangelist, provides the details in the AWS News Blog:

Today I would like to give you a sneak peek at the next generation of Arm-based EC2 instances. These instances are built on AWS Nitro System and will be powered by the new Graviton2 processor. This is a custom AWS design that is built using a 7nm (nanometer) manufacturing process. It is based on 64-bit Arm Neoverse cores, and can deliver up to 7X the performance of the A1 instances, including twice the floating point performance. Additional memory channels and double-sized per-core caches speed memory access by up to 5X.

They didn't say so in the blog post, but from the logo on the package in the announcement slide we can see that the chip was designed by their Annapurna Labs subsidiary, based in Israel, like the rest of the Nitro Project.

For some more details on AWS's Nitro Project and their chip strategy, see my post HOT CHIPS: The AWS Nitro Project.

For some experience of running EDA on Arm servers, see my post EDA in the Cloud: Astera Labs, AWS, Arm, and Cadence Report.

Graviton 2 Performance

There are three new instance types based on Graviton 2:

  • General Purpose (M6g and M6gd) – 1-64 vCPUs and up to 256GiB of memory
  • Compute-Optimized (C6g and C6gd) – 1-64 vCPUs and up to 128GiB of memory
  • Memory-Optimized (R6g and R6gd) – 1-64 vCPUs and up to 512GiB of memory

The instances with a "d" have NVMe local storage.
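One thing the maximum figures above imply (my own arithmetic, not anything AWS stated) is that each family pairs a fixed amount of memory with each vCPU: 4 GiB per vCPU for general purpose, 2 GiB for compute-optimized, and 8 GiB for memory-optimized. A quick sketch:

```python
# Hypothetical sketch, derived only from the maximum specs quoted above
# (64 vCPUs paired with 256, 128, or 512 GiB): assuming memory scales
# linearly with vCPU count, each family has a fixed GiB-per-vCPU ratio.
FAMILIES = {
    "M6g (general purpose)":   {"max_vcpus": 64, "max_mem_gib": 256},
    "C6g (compute-optimized)": {"max_vcpus": 64, "max_mem_gib": 128},
    "R6g (memory-optimized)":  {"max_vcpus": 64, "max_mem_gib": 512},
}

for name, spec in FAMILIES.items():
    ratio = spec["max_mem_gib"] / spec["max_vcpus"]
    print(f"{name}: {ratio:g} GiB per vCPU")
```

So a simulation farm that is memory-bound per core would lean toward the R6g family, while license-limited, CPU-bound jobs fit the C6g family.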

Later in the post, they compare these Graviton 2 instances to the M5 instance. I had no idea what an M5 instance was, so I went and looked. From the AWS website:

M5 and M5d instances feature the Intel Xeon Platinum 8000 series (Skylake-SP) processor with a sustained all core Turbo CPU clock speed of up to 3.1 GHz, and deliver up to 20% improvement in price/performance compared to M4 instances. M5 instances provide support for the Intel Advanced Vector Extensions 512 (AVX-512) instruction set, offering up to 2X the FLOPS per core compared to the previous-generation M4 instances. 

(They also have M5 instances based on Intel's Cascade Lake and on AMD Epyc 7000.)

For us in the EDA and semiconductor businesses, the most interesting comparison is:

EDA simulation with Cadence Xcellium: +54%

OK, we'll forgive them for spelling Xcelium wrong because...they mentioned us. Woohoo. Plus 54%, what's not to like?

It turns out that the Xcelium simulation speedup is the biggest speedup of any of the benchmarks that they mention. Here are the rest:

  • SPECjvm® 2008: +43% (estimated)
  • SPEC CPU® 2017 integer: +44% (estimated)
  • SPEC CPU 2017 floating point: +24% (estimated)
  • HTTPS load balancing with Nginx: +24%
  • Memcached: +43% performance, at lower latency
  • X.264 video encoding: +26%
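To put those percentages in wall-clock terms (my own back-of-the-envelope arithmetic, not from the AWS post): a "+54%" throughput gain means the same job finishes in roughly 1/1.54 ≈ 65% of the time it took on an M5 instance.

```python
# Back-of-the-envelope conversion, assuming "+X%" means X% higher
# throughput versus the M5 baseline: the equivalent wall-clock time
# for the same job is 1 / (1 + X/100) of the original.
def runtime_fraction(speedup_pct: float) -> float:
    """Fraction of the baseline runtime left after a +X% throughput gain."""
    return 1.0 / (1.0 + speedup_pct / 100.0)

for name, pct in [("Xcelium simulation", 54),
                  ("SPEC CPU 2017 integer", 44),
                  ("X.264 video encoding", 26)]:
    print(f"{name}: +{pct}% -> about {runtime_fraction(pct):.0%} of M5 runtime")
```

For a long regression run, shaving a third off every simulation job adds up quickly.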

Those speedups seem surprisingly large. In fact, they obviously impressed AWS themselves, since they have decided to use these Graviton 2-based instances to provide some of the services that underlie AWS itself:

Based on these results, we are planning to use these instances to power Amazon EMR, Elastic Load Balancing, Amazon ElastiCache, and other AWS services.

I don't think there will be more information for a couple of months. Jeff Barr's post ends:

I will have more information to share with you in 2020.

Although AWS didn't announce any pricing for these instances, based on the pricing of the current Graviton 1 A1 instances, they should offer customers a lot of compute power at a lower price than the existing fifth-generation x86 instances. I'd need to see more benchmark data, but if this is truly an Arm server chip that is the fastest processor on the market today (well, not really on the market, since you can't buy one), this could be a very big deal.

It is also a validation of Cadence's Intelligent System Design approach: AWS is clearly using its own chip-design capability to differentiate itself from its cloud competitors on the price/performance it can deliver, just like the leading smartphone companies that all design their own application processors, and the leading automotive companies now doing their own chips.


Sign up for Sunday Brunch, the weekly Breakfast Bytes email.


© 2023 Cadence Design Systems, Inc. All Rights Reserved.
