Paul McLellan

Persistent Memory at Twitter

3 Feb 2020 • 3 minute read

A couple of weeks ago was the Persistent Memory Summit 2020. See my post Persistent Memory: We Have Cleared the Tower for an overview. This week I am going to cover two presentations, one from Twitter and one from Oracle. There were a number of other presentations of proof-of-concept systems, but these two actually have systems with persistent memory in production use (in Twitter's case, still fairly minimally). Look for Oracle later in the week.

Once again, one reason that I think persistent memory is important is that as software teams work out how to use it, the approaches and standards that result will get used on SoCs. Further, as cloud instances with huge amounts of persistent memory become available, this is likely to create opportunities for handling EDA problems differently, especially being able to recover from crashes of servers without needing to start from scratch.

Twitter

Yao Yue of Twitter presented Pelikan with ADP, which stands for "application direct persistence"; Pelikan is Twitter's cache (I think all Twitter components are named after birds). In any case, it means having applications that are aware of the existence of the persistent memory, which is the most difficult case to handle but also the one with the most potential upside.

It was not the main focus of her presentation, but to give you an idea of the scale they operate at, she started with some interesting statistics about Twitter's compute fabric:

  • Over 400 clusters in production (single-tenant), with many thousands of hosts and tens of thousands of instances of Pelikan
  • Job size is 2-6 cores, 4-48GB
  • Queries per second (QPS) max 50M on a single cluster
  • Service-level objective (SLO) p999<5ms
  • Data structures: key-value, counter, list, hash, sorted map, and more

She did point out that they are more concerned with throughput than latency:

We're a web company. Nobody is in a hurry to go anywhere. They are there to kill time!

The big problem they are trying to address with persistent memory is that they cache a lot of data and

...whenever we lose a warm cache, it causes problems for the backend.

They use Optane in two different modes: Memory Mode and App Direct. In Memory Mode, they just use Optane as a big, affordable, volatile memory (even though it isn't actually volatile). In App Direct, they make use of the persistence.
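To make the App Direct case a little more concrete, here is a minimal sketch of the usual programming model using PMDK's libpmem: map a file on a DAX-mounted persistent-memory filesystem directly into the address space, store to it like ordinary memory, and then explicitly flush the stores to the persistence domain. This is only an illustration of the model, not Pelikan's code; the file path, region size, and layout are my own assumptions.

    #include <libpmem.h>
    #include <stdio.h>
    #include <string.h>

    #define POOL_SIZE (64UL * 1024 * 1024)   /* 64MB example region */

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;

        /* Map a file on a DAX filesystem straight into the address
         * space; loads and stores bypass the page cache entirely. */
        char *base = pmem_map_file("/mnt/pmem/cache.pool", POOL_SIZE,
                                   PMEM_FILE_CREATE, 0666,
                                   &mapped_len, &is_pmem);
        if (base == NULL) {
            perror("pmem_map_file");
            return 1;
        }

        /* Write with an ordinary store... */
        strcpy(base, "cached value");

        /* ...then explicitly flush it to the persistence domain
         * (cache-line flush plus fence on real pmem, msync() otherwise). */
        if (is_pmem)
            pmem_persist(base, strlen(base) + 1);
        else
            pmem_msync(base, strlen(base) + 1);

        pmem_unmap(base, mapped_len);
        return 0;
    }

Memory Mode, by contrast, requires none of this: the Optane modules simply appear as a large pool of ordinary (if slower) memory, with DRAM acting as a cache in front of it, and the application is none the wiser.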

Using persistent memory just for capacity is important, too, though:

Memory mode is a very good gateway drug. Software engineers don't know what is going on with the hardware at all, we just don't pay attention. So I don't think using persistent memory without the persistence is worthless, as some of the earlier speakers said.

I'm not going to go through all the details of the implementation. She had lots of graphs from benchmarks that you can see in the presentation (see the end of this post for the link).

The big difference, of course, is restarting after shutdown or failure. In their current implementation, a full warmup of the caches takes from minutes to days. Restarting takes 20 minutes by default, and restarting a large cluster takes days. With persistent memory:

  • Single instance, 100GB of data: complete rebuild in 4 minutes
  • Concurrent (18 instances per host): complete rebuild in 5 minutes
  • Can potentially speed up maintenance by 1-2 orders of magnitude
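The reason restart gets so much cheaper is that the cached items themselves survive in the persistent pool across the restart; only the volatile bookkeeping has to be rebuilt. Below is a hedged sketch of what such a restart path can look like; the header layout, magic value, and function name are invented for illustration and are not Pelikan's actual metadata or recovery logic.

    #include <libpmem.h>
    #include <stdint.h>
    #include <stdio.h>

    #define POOL_MAGIC 0x50454C49u     /* hypothetical marker value */

    struct pool_header {
        uint32_t magic;                /* set once the pool is initialized */
    };

    /* Hypothetical restart path: reattach to a previously persisted pool
     * instead of warming the whole cache from the backend. */
    int cache_restart(const char *path, size_t size)
    {
        size_t mapped_len;
        int is_pmem;
        struct pool_header *hdr = pmem_map_file(path, size, PMEM_FILE_CREATE,
                                                0666, &mapped_len, &is_pmem);
        if (hdr == NULL)
            return -1;

        if (hdr->magic == POOL_MAGIC) {
            /* Warm data is already in place: rebuild only the DRAM-side
             * indexes over the persisted items (minutes, not days). */
            printf("reattached to warm pool; rebuilding indexes\n");
        } else {
            /* First run (or unrecognized pool): initialize and start cold. */
            hdr->magic = POOL_MAGIC;
            if (is_pmem)
                pmem_persist(hdr, sizeof(*hdr));
            else
                pmem_msync(hdr, sizeof(*hdr));
            printf("initialized new pool; warming from the backend\n");
        }
        return 0;
    }

The design point the sketch tries to capture is that validating and reattaching to an existing pool is cheap, while the cold path simply falls back to the old behavior of warming from the backend.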

Conclusion

The conclusions were that the changes to add ADP were modest and worked with all data structures. The serving performance in normal use was comparable to DRAM for the Twitter workloads. The recovery performance was good. The main bottleneck is still the network.

Are they using it?

We've been running in canary mode with one server since before the holidays. If you tweet enough, you'll get it!

They are studying the canary to assess two things: do they see the same performance in production, and how does the larger heap (from the larger persistent memories) affect hit-rate. Plus, presumably, making sure that the gains in recovery performance also "persist".

More Information

All the presentations are available in the SNIA Educational Library, including the Twitter presentation.

 

Sign up for Sunday Brunch, the weekly Breakfast Bytes email.