SANTA CLARA, Calif.—The promise of flash memory storage
systems—despite robust market success in recent years—has yet to be realized.
And the unique technical challenges and changing market requirements in that
segment require a new approach to storage processing architectures.
That was the word from Chris Rowen, CTO of the Cadence IP Group, who
delivered a keynote address to the Flash Memory Summit here Thursday (Aug. 7).
Rowen described a unique design opportunity for flash storage systems
at a time of increasing processor specialization: Over the past
two decades, the need to offload specialty functions from the microprocessor
has flowered to include graphics, audio, video, network processing, and more.
It's now time to explore a fresh approach to storage processor architectures, he said.
"There are enough unique characteristics about
the kinds of computation needed in a flash-management system that justifies new
structures, new instruction sets, new storage models, new ways for the
processors to interact with these very high bandwidth interfaces," such as
PCIe, NVMe, SATA, DDR controllers, and the like.
Designers have certain subsystem needs, including the storage processor, interface
controllers and PHYs, and software.
They also require:
The architect also demands faster command queue processing, more memory bandwidth,
and reduced latencies, particularly DDR latencies, he added.
The architect needs "to be able to take a tool that says 'I want to be able to
select or describe all of the key features of my processor' ... and from that, in
minutes, generate the complete hardware design, RTL, test environment, EDA
scripts for physical implementation, all without manual intervention," Rowen
told the audience.
"... has been the key to the wide proliferation of these data plane processors
because it becomes dramatically cheaper to make a data plane processor and to
make a highly tuned processor for these environments," Rowen said.
Rowen, who studied at Stanford, worked on the RISC architecture, and helped found both
MIPS and Tensilica (the latter of which Cadence bought in early 2013), showed the
audience a reference platform development. As part of it, Cadence has
developed a storage processor instruction set based on the Xtensa processor family
and implemented in Tensilica Instruction Extension (TIE) code.
This storage processing unit (SPU) includes:
Rowen noted that some of the key tasks on storage data structures (creating
table hashes, doing lookups in linked lists, inserting and deleting elements
from linked lists, parsing command and packet structures) are slow, often painfully
sequential tasks that generally defy significant architectural speedup on
"... architecture routinely shows a 3X to 4X performance advantage over a good RISC
processor for these kinds of structures," Rowen said. Implementing the
processor's core logic takes just 0.1 to 0.2 mm² of
silicon, Rowen added.
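
To see why the data-structure tasks Rowen listed tend to be sequential, consider a minimal sketch in C of a chained hash table that maps a logical block number to a physical flash page. It is only an illustration, not Cadence's design or its TIE instructions: the names `map_entry`, `map_table`, `hash_block`, `map_insert`, and `map_lookup`, the table size, and the modulo hash are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define BUCKETS 8u  /* hypothetical table size, chosen only for illustration */

/* Hypothetical mapping entry: logical block number -> physical flash page. */
struct map_entry {
    uint32_t logical;
    uint32_t physical;
    struct map_entry *next;  /* next entry chained in the same bucket */
};

struct map_table {
    struct map_entry *bucket[BUCKETS];
};

/* Illustrative hash: a plain modulo, not a production-quality function. */
static unsigned hash_block(uint32_t logical)
{
    return (unsigned)(logical % BUCKETS);
}

/* Insert at the head of the bucket's chain. */
static void map_insert(struct map_table *t, struct map_entry *e)
{
    unsigned b = hash_block(e->logical);
    e->next = t->bucket[b];
    t->bucket[b] = e;
}

/* Walk the chain until the key matches or the chain ends. */
static const struct map_entry *map_lookup(const struct map_table *t,
                                          uint32_t logical)
{
    const struct map_entry *e = t->bucket[hash_block(logical)];
    while (e != NULL && e->logical != logical) {
        /* The address of the next entry is not known until this load
           completes, so the iterations cannot be overlapped. */
        e = e->next;
    }
    return e;
}
```

Each step of the lookup depends on the pointer loaded in the previous step, so a wider or more deeply pipelined general-purpose core has little independent work to run in parallel. That dependency chain is the kind of bottleneck the article describes.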
He closed by saying:
"We really are just at the beginning of a kind
of revolution of smart, cost-effective, but highly scalable flash storage
systems. [This approach to storage processing architectures] unleashes the
creative potential of the architect and unleashes more of the bandwidth and
transaction rate potential of the flash devices."