Paul McLellan

The Latest Addition to the Tensilica Family Is a Baby Neural Network Engine

20 Apr 2022 • 2 minute read

Increasingly, AI processing is being done on-device rather than being uploaded to the cloud for inference. This also requires "always-on" operation to keep the human interaction simple, without the need for a physical action such as pushing a button or picking up the device. In 2021, only 20% of data was processed at the edge, but by 2030 this is forecast to grow to 80%. Along with this growth comes a fast-growing revenue picture: roughly 37% CAGR, reaching $50B by 2025.

It is not efficient to do all this processing on the device's application processor. It makes sense to offload the DSP functions, such as codecs and far-field microphone processing, onto a dedicated offload DSP, and to further offload the AI operators to a dedicated neural network engine (NNE). Offloading to the DSP improves energy efficiency by about 20X, and offloading the AI improves it by roughly a further 10X.
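
As a rough illustration of what those factors imply, the back-of-envelope snippet below normalizes the energy for the AI workload across the three options. The 20X and 10X figures come from the paragraph above; treating them as compounding (roughly 200X overall versus the application processor) is an assumption made here for illustration, not a figure from the post.

```cpp
#include <cstdio>

// Back-of-envelope only. The 20X (DSP offload) and 10X (NNE offload) factors
// are the ones quoted above; assuming they compound is this sketch's own
// simplification, not a Cadence figure.
int main() {
  const double dsp_gain = 20.0;  // DSP offload vs. the application processor
  const double nne_gain = 10.0;  // NNE offload vs. running the AI on the DSP

  const double energy_app = 1.0;                    // normalized baseline
  const double energy_dsp = energy_app / dsp_gain;  // ~0.05 of the baseline
  const double energy_nne = energy_dsp / nne_gain;  // ~0.005 of the baseline

  std::printf("application processor: %.3f\n", energy_app);
  std::printf("offload DSP          : %.3f\n", energy_dsp);
  std::printf("DSP + NNE            : %.3f\n", energy_nne);
  return 0;
}
```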

Announcing the NNE110

Today, at the Linley Processor Conference, Cadence revealed details of the NNE110, an energy-efficient AI engine that was originally announced last September (see my post On-Device Artificial Intelligence the Tensilica Way). It operates alongside the Tensilica HiFi DSPs, as shown in the block diagram below.

Some features of the NNE110 are:

  • Integrated DMA and 128-bit AXI support for efficient data transfer
  • Scalable MAC engine from 32 to 128 8x8 MACs
  • Hardware support for sparsity acceleration (suppressing zero weight multiplications)
  • Built-in lossless weight compression and decompression
  • Native support for most common layers: FC, Conv, DS-conv, LSTM, pooling, and more
  • Full software support with the NN Library, TensorFlow Lite for Microcontrollers (TFLM), the Xtensa Audio Framework (XAF), and Xtensa SystemC (XTSC) (see the sketch after this list)

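Since the software list mentions TensorFlow Lite for Microcontrollers, here is a minimal sketch of what running a small quantized model through TFLM typically looks like. This is a generic TFLM outline, not NNE110-specific code: the model array g_model, the arena size, and the operator list are placeholders for illustration, and the Tensilica NN Library / XAF integration layer is not shown.

```cpp
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// g_model is assumed to be a TFLite flatbuffer converted to a C array
// (e.g. with xxd); it is a placeholder here, not something shipped with the NNE110.
extern const unsigned char g_model[];

namespace {
// Scratch memory for tensors; size this for the actual model.
constexpr int kTensorArenaSize = 16 * 1024;
alignas(16) uint8_t tensor_arena[kTensorArenaSize];
}  // namespace

int RunOnce(const int8_t* input_samples, int num_samples,
            int8_t* scores, int num_scores) {
  const tflite::Model* model = tflite::GetModel(g_model);
  if (model->version() != TFLITE_SCHEMA_VERSION) return -1;

  // Register only the layer types the model needs (FC, conv, pooling, ...).
  static tflite::MicroMutableOpResolver<4> resolver;
  resolver.AddFullyConnected();
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                              kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  // Copy the (already quantized) input in, run inference, copy scores out.
  TfLiteTensor* input = interpreter.input(0);
  for (int i = 0; i < num_samples; ++i) input->data.int8[i] = input_samples[i];

  if (interpreter.Invoke() != kTfLiteOk) return -1;

  TfLiteTensor* output = interpreter.output(0);
  for (int i = 0; i < num_scores; ++i) scores[i] = output->data.int8[i];
  return 0;
}
```

On an actual NNE110-based system, the intent of the feature list is that the heavy operators in such a model run on the NNE rather than in software on the DSP; how that dispatch is configured is not covered here.
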
Smart Speaker Example

In the diagram below, the blue boxes are signal-processing operations handled on the DSP, and the red box is the AI operation handled on the NNE110. On the left, the green circles are microphones, and the speakers are indeed speakers.

Noise suppression has traditionally been handled with active filtering in the DSP, but there has been a lot of recent research into using neural networks for this, both LSTM-based (long short-term memory) and CNN-based (convolutional neural network) approaches. Implementing the LSTM approach on a HiFi 5 plus the NNE110, compared to the HiFi 5 alone, cuts the cycle count by about 12X (and hence gives a big reduction in latency) and cuts the energy by about 15X.
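
To make the partitioning concrete, here is a minimal sketch of how one audio frame might flow through such a pipeline. The stage names (EchoCancel, Beamform, DenoiseWithLstm) and the frame size are hypothetical, invented for illustration; a real design would use the Tensilica NN Library, XAF components, and an actual trained network rather than these pass-through stand-ins.

```cpp
#include <array>
#include <cstdint>

constexpr int kFrameSamples = 256;  // illustrative frame size, not from the post
using Frame = std::array<int16_t, kFrameSamples>;

// Hypothetical stage names, for illustration only; the pass-through bodies
// stand in for real DSP and neural-network implementations.

// Signal-processing stages that would run on the HiFi DSP (blue boxes).
Frame EchoCancel(const Frame& mic, const Frame& /*speaker_ref*/) { return mic; }
Frame Beamform(const Frame& a, const Frame& /*b*/) { return a; }

// LSTM-based noise suppression that would be offloaded to the NNE110 (red box).
Frame DenoiseWithLstm(const Frame& in) { return in; }

// One frame through the smart speaker pipeline: DSP front-end first,
// then the AI operator on the neural network engine.
Frame ProcessFrame(const Frame& mic_a, const Frame& mic_b,
                   const Frame& speaker_ref) {
  Frame a = EchoCancel(mic_a, speaker_ref);  // DSP
  Frame b = EchoCancel(mic_b, speaker_ref);  // DSP
  Frame beamformed = Beamform(a, b);         // DSP
  return DenoiseWithLstm(beamformed);        // NNE110
}
```

The point of the split is the one the numbers above make: the network inference is the expensive part, so moving it from the DSP to the NNE is where the cycle and energy savings come from.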

Summary

I discussed always-on (AON) devices at the end of last year in my post, Always-On Devices. That post did not discuss the NNE110, obviously, since we are only announcing details today. But the combination of a Tensilica DSP with the NNE110 is ideal for AON applications and, indeed, for any very low-energy AI application.


Sign up for Sunday Brunch, the weekly Breakfast Bytes email.
