Paul McLellan
20 Apr 2022

The Latest Addition to the Tensilica Family Is a Baby Neural Network Engine

Increasingly, AI inference is being done on-device rather than by uploading data to the cloud. This also requires the device to be "always-on" to make the human interaction simpler, without the need to perform some physical action such as pushing a button or picking up the device. In 2021, only 20% of data was processed at the edge, but by 2030 this is forecast to grow to 80%. Along with this growth comes a fast-growing revenue picture, with a 37% CAGR taking the market to $50B by 2025.

It is not efficient to do all this processing on the device's application processor. It makes sense to offload the DSP functions, such as codecs, far-field microphone processing, and so forth, onto a dedicated DSP. It further makes sense to offload AI operators to a dedicated neural network engine (NNE). Offloading to the DSP improves energy efficiency by about 20X, and offloading the AI to the NNE improves it by about a further 10X.
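As a rough illustration of how these two figures might combine, here is a minimal Python sketch. It assumes the "further 10X" multiplies with the 20X for an AI workload that moves from the application processor to the DSP and then to the NNE, which is one reading of the numbers above rather than a published result. The function name and the normalized baseline of 1.0 are hypothetical.

```python
def energy_after_offload(baseline_energy, *savings_factors):
    """Divide a baseline energy by each savings factor in turn."""
    energy = baseline_energy
    for factor in savings_factors:
        energy /= factor
    return energy


# Hypothetical: energy on the application processor, normalized to 1.0.
baseline = 1.0

# Assumption: about 20X saving from moving the work to the DSP.
on_dsp = energy_after_offload(baseline, 20)

# Assumption: the further 10X from the NNE compounds with the 20X.
on_nne = energy_after_offload(baseline, 20, 10)

print(f"DSP: {on_dsp:.3f} of baseline, DSP + NNE: {on_nne:.4f} of baseline")
```

Under that multiplicative assumption, the AI portion of the workload would use roughly 1/200 of the application-processor energy.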

Announcing the NNE110

Today, at the Linley Processor Conference, Cadence revealed details of the NNE110, an energy-efficient AI engine. It was originally announced last September. See my post On-Device Artificial Intelligence the Tensilica Way. Today, it operates with the Tensilica HiFi DSPs as in the block diagram below.

Some features of the NNE110 are:

  • Integrated DMA and 128-bit AXI support for efficient data transfer
  • Scalable MAC engine from 32 to 128 8x8 MACs
  • Hardware support for sparsity acceleration (suppressing zero-weight multiplications)
  • Built-in lossless weight compression and decompression
  • Native support for most common layers: FC, Conv, DS-conv, LSTM, pooling, and more
  • Full software support with the NN Library, TensorFlow Lite for Microcontrollers (TFLM), Xtensa Audio Framework (XAF), and Xtensa SystemC (XTSC)
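
The sparsity acceleration in the list above suppresses multiplications by zero weights. A minimal Python sketch of that idea follows. It is a software illustration only, not a description of how the NNE110 hardware implements it, and the function name and sample values are hypothetical.

```python
def zero_skipping_dot(weights, activations):
    """Dot product that skips multiply-accumulates for zero weights.

    Returns the accumulated sum and the number of MACs actually issued.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")

    acc = 0
    macs_issued = 0
    for w, a in zip(weights, activations):
        if w == 0:
            # A zero weight contributes nothing, so the multiply is skipped.
            continue
        acc += w * a
        macs_issued += 1
    return acc, macs_issued


# Hypothetical 8-bit weights with half of the entries pruned to zero.
weights = [0, 3, 0, -2]
activations = [5, 1, 7, 4]

result, macs = zero_skipping_dot(weights, activations)
print(result, macs, len(weights))
```

In this example only two of the four multiply-accumulates are issued, and the result is the same as the dense dot product. The fraction of work saved depends on how many weights in the network are zero.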

Smart Speaker Example

In the diagram below, the blue boxes are signal processing operations handled on the DSP, and the red box represents the AI operations handled on the NNE110. On the left, the green circles are microphones, and the speakers are indeed speakers.

Noise suppression has traditionally been handled with active filtering in the DSP, but there has been a lot of recent research into using AI networks to do this, including both LSTM-based (long short-term memory) and CNN-based (convolutional neural network) approaches. Implementing the LSTM approach on HiFi 5 plus NNE110, compared to HiFi 5 alone, takes about 12X fewer cycles (and hence gives a big reduction in latency) and uses about 15X less energy.

Summary

I discussed AON devices at the end of last year in my post, Always-On Devices. That post did not discuss the NNE110, obviously, since we are only announcing details today. But the combination of a Tensilica DSP with NNE110 is ideal for AON applications and, indeed, for any very low-energy AI application.

 
