
Paul McLellan

NXP Glows in Tensilica HiFi

29 Sep 2020 • 2 minute read

One trend that many people have remarked on is that neural network inference is moving to edge devices. This means that instead of running on big multicore cloud servers with attached GPUs or FPGAs, the network has to run on something like your smartphone or, for the Internet of Things (IoT), on some sort of MCU with even less processing power. Typically these are pre-trained networks, with the training done in the cloud in the usual way. The challenge is then twofold: how to reduce the neural network so that it isn't so resource-heavy, since edge devices have limited compute power and memory, and the practicalities of how you take your pre-trained network and "compile" it so that it runs on the processors available in the device.
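To make the first challenge concrete, the most common way to shrink a network for an MCU is post-training quantization, which Glow supports through profile-guided quantization. Here is a minimal C sketch of the affine int8 scheme (q = round(x / scale) + zero_point); the scale and zero-point values are illustrative assumptions, not taken from any real model or from Glow itself.

```c
/* A minimal sketch of one common "reduction" step: post-training int8
 * quantization. The scale and zero-point below are illustrative only. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Map a float value into int8: q = round(x / scale) + zero_point,
 * clamped to [-128, 127]. */
static int8_t quantize(float x, float scale, int32_t zero_point) {
    int32_t q = (int32_t)lroundf(x / scale) + zero_point;
    if (q < -128) q = -128;
    if (q > 127)  q = 127;
    return (int8_t)q;
}

/* Recover an approximate float: x ~= (q - zero_point) * scale. */
static float dequantize(int8_t q, float scale, int32_t zero_point) {
    return (float)((int32_t)q - zero_point) * scale;
}

int main(void) {
    /* Suppose profiling showed activations roughly in [-6.0, 6.0]. */
    const float scale = 12.0f / 255.0f;
    const int32_t zero_point = 0;
    float x = 1.234f;
    int8_t q = quantize(x, scale, zero_point);
    printf("%f -> %d -> %f\n", x, q, dequantize(q, scale, zero_point));
    return 0;
}
```

Dropping from 32-bit floats to 8-bit integers cuts weight storage by 4X and lets the DSP work on the data with its integer SIMD units, which is a large part of why quantized inference suits edge devices.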

One approach, recently reported on in NXP Bets on Neural Network Compiler Glow to Push ML to Edge Devices, is to use Glow. As it says there:

In 2018, Facebook released Glow as an open-source, community project to herald a "community-driven approach to AI infrastructure." Cadence, Esperanto, Intel, Marvell, and Qualcomm Technologies immediately hopped on board, pledging their support to Glow in future silicon hardware.

The nickname "Glow" is derived from "graph lowering compiler" because it creates code for a number of hardware accelerators that each have their own memory configurations.
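To illustrate what "lowering" means, here is a hand-written C sketch of the shape of code a fully-connected node might lower to: the single high-level node becomes explicit matrix-multiply and bias-add loops that a backend can then map onto its own kernels (for example, a DSP NN library). The shapes and names are illustrative; real Glow lowering happens on its internal graph IR, not on C source.

```c
/* Illustrative only: a FullyConnected node "lowered" into explicit loops.
 * out[b][o] = sum_i in[b][i] * w[i][o] + bias[o] */
#include <stddef.h>

static void fully_connected_lowered(const float *in, const float *w,
                                    const float *bias, float *out,
                                    size_t batch, size_t in_dim,
                                    size_t out_dim) {
    for (size_t b = 0; b < batch; b++) {
        for (size_t o = 0; o < out_dim; o++) {
            float acc = bias[o];                 /* bias-add, fused in */
            for (size_t i = 0; i < in_dim; i++)  /* matmul inner loop */
                acc += in[b * in_dim + i] * w[i * out_dim + o];
            out[b * out_dim + o] = acc;
        }
    }
}

int main(void) {
    float in[2] = {1.0f, 2.0f}, w[2] = {0.5f, 0.25f}, bias[1] = {0.1f};
    float out[1];
    fully_connected_lowered(in, w, bias, out, 1, 2, 1);
    return (int)out[0]; /* 1*0.5 + 2*0.25 + 0.1 = 1.1 */
}
```

Once the graph is in this lowered form, the compiler can plan memory placement for a target's specific memory configuration, which is exactly what a memory-constrained MCU needs.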

NXP recently released eIQ, its ML software support for Glow. NXP claims this collaboration is an industry first: a neural network compiler implemented with high performance and a low memory footprint on select NXP MCUs, namely the i.MX RT crossover devices. Under the hood on the chip is a Tensilica HiFi 4 DSP that can be used to execute the inference. Since the chip has both an Arm processor and a HiFi DSP, the performance of inference on the two cores can be compared directly. Using the HiFi DSP, the inference runs 25X faster.

The diagram above shows the flow: the model is trained in the cloud (or in a datacenter) and exported in the standard ONNX (Open Neural Network Exchange) format; the Glow AOT neural network compiler then generates code for the i.MX RT chip, which contains both an Arm Cortex-M and a HiFi 4, either of which can be the target of Glow. For the Cortex-M, Glow targets the CMSIS-NN library; for the HiFi 4, the Cadence HiFi NN library.
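To give a feel for what the AOT output looks like from the application's side, here is a hedged C sketch of calling a compiled Glow bundle on the MCU. It assumes a model compiled into a bundle named lenet, with an input at generated offset LENET_data and an output at LENET_softmax; the macro and function names are auto-generated from the model, so treat everything here as an assumption rather than the exact eIQ API.

```c
/* Sketch of calling a Glow AOT bundle from application code.
 * ASSUMPTIONS: the bundle is named "lenet", so the auto-generated
 * lenet.h provides the lenet() entry point, the *_MEM_SIZE macros,
 * and the LENET_data / LENET_softmax offsets used below. Real names
 * depend on your model and Glow version. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include "lenet.h" /* auto-generated bundle header (assumed name) */

/* Statically allocated memory regions, sized by the generated macros. */
GLOW_MEM_ALIGN(LENET_MEM_ALIGN)
static uint8_t constant_weights[LENET_CONSTANT_MEM_SIZE]; /* from the generated weights file */
GLOW_MEM_ALIGN(LENET_MEM_ALIGN)
static uint8_t mutable_weights[LENET_MUTABLE_MEM_SIZE];   /* inputs and outputs */
GLOW_MEM_ALIGN(LENET_MEM_ALIGN)
static uint8_t activations[LENET_ACTIVATIONS_MEM_SIZE];   /* scratch space */

int run_inference(const float *image, size_t image_bytes) {
    /* Copy the input tensor to its generated offset in the mutable region. */
    memcpy(mutable_weights + LENET_data, image, image_bytes);
    /* One call executes the entire compiled graph. */
    int rc = lenet(constant_weights, mutable_weights, activations);
    /* The output tensor lives at its own generated offset. */
    const float *probs = (const float *)(mutable_weights + LENET_softmax);
    (void)probs; /* e.g., take the argmax over probs to pick a class */
    return rc;
}
```

Which kernel library the generated code calls into, CMSIS-NN or the HiFi NN library, is decided when Glow compiles the bundle, as described above; the application-side calling pattern stays the same.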

TensorFlow

A few months ago at the Linley Processor Conference, Yipeng Liu presented Efficient Machine Learning on HiFi DSPs Using TensorFlow. TensorFlow is another neural network framework, originally developed by Google but, like Glow, open source. As it happens, her presentation used the same NXP device as the target. You can watch her presentation here, or read my blog post reporting on it at the time, HiFi DSPs - Not Just for Music Anymore.

Learn More

See the NXP white paper How the Glow Compiler Optimizes Neural Networks for Low-Power NXP MCUs.

See the Tensilica HiFi Audio DSP page.


Sign up for Sunday Brunch, the weekly Breakfast Bytes email.