Various factors are fueling the continued growth of True Wireless Stereo (TWS) earbud shipments; an important one is the jettisoning of the audio jack on leading smartphones. Growth is expected in all segments, spanning base models to mid-tier and premium models.
It is important that OEMs and SoC providers of TWS earbuds prepare to offer the quality and features the market expects of next-generation products. Here, we explore the architecture of the signal path and the approach to scaling it across different product tiers.
The signal processing requirements across these tiers will vary significantly. This poses a conundrum for an OEM or SoC provider: “How do I address these varied requirements? Do they entail different architectures, each with its own system and software framework? How many vendors do I need to work with to create, say, three SKUs: Good-Enough, Better, and Best? How do I get all this done with the limited resources available? Or do I need to simplify my problem and limit my TAM by betting on a single segment?”
It’s about scaling.
Let’s illustrate this with the following hypothetical requirements for the three SKUs, as Table 1 shows:
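[Table 1, partially recoverable: rows include music playback time; codecs (LC3, SBC, OPUS); sound event detection (glass break, emergency vehicle approaching, and more); power button (No → Always-on).]
Table 1: Product Requirements for three SKUs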
A basic requirement of all three SKUs is long music playback and talk time with support for the applicable codecs. Beyond that, the requirements diverge in feature availability and sophistication. The table captures the trend that consumers want to interact with their earbuds through voice commands and would appreciate being able to talk somewhat naturally when issuing them, even if in a limited way. Hence, the SKUs support different voice commands with different vocabulary sizes, and the premium SKU supports limited Automatic Speech Recognition (ASR). Cost tolerance and battery capacity also vary across our imaginary SKUs. Considering these requirements, how do we arrange an architecture that will scale across the SKUs?
Let’s start by targeting the “Good Enough” TWS earbuds design. As the last row of the table shows, the power button is eliminated, making the earbuds an always-listening accessory for all SKUs. Take them out of the charging case and they are all ears from then on; everyone deserves hands-free convenience.
Figure 1 shows a single DSP in an always-on configuration performing sensor fusion and voice processing continuously.
Figure 1: Always-on DSP Configuration
The DSP keeps the rest of the system in power-down mode while it listens for commands from the microphone or watches the other sensors. When the user wants music playback or call processing, the DSP wakes up the rest of the system and enters active mode. In this illustration the DSP is a HiFi 1, which also serves as the main application processor because it can run control code efficiently. This eliminates the need for a separate CPU, leading to a smaller design and lower power.
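The two modes described above can be pictured as a small state machine. The following is only an illustrative Python model, not firmware and not a Cadence API: the event names (`WAKE_WORD`, `SENSOR_TRIGGER`, `SESSION_END`) are hypothetical, and the return to always-on mode when a session ends is an assumption rather than something the text states.

```python
from enum import Enum, auto


class Mode(Enum):
    ALWAYS_ON = auto()  # DSP listens; the rest of the system is powered down
    ACTIVE = auto()     # whole system awake for music playback or a call


class Event(Enum):
    # Hypothetical event names, chosen for illustration only.
    WAKE_WORD = auto()       # a voice command was detected on the microphone
    SENSOR_TRIGGER = auto()  # another sensor reported a user action
    SESSION_END = auto()     # playback or call finished (assumed transition)


def next_mode(mode: Mode, event: Event) -> Mode:
    """Return the mode the DSP is in after handling one event."""
    if mode is Mode.ALWAYS_ON and event in (Event.WAKE_WORD, Event.SENSOR_TRIGGER):
        return Mode.ACTIVE
    if mode is Mode.ACTIVE and event is Event.SESSION_END:
        return Mode.ALWAYS_ON
    return mode  # any other event leaves the mode unchanged


def rest_of_system_powered(mode: Mode) -> bool:
    """Only active mode keeps the blocks outside the DSP powered."""
    return mode is Mode.ACTIVE
```

In this sketch the DSP is the only block that sees events in always-on mode, which mirrors the point that no separate CPU is needed to decide when to wake the system.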
In always-on mode, the DSP may run at a low clock frequency to further conserve energy. When entering active mode (sometimes called turbo mode), HiFi 1 can raise its clock frequency to accommodate the higher workload of music playback or call processing. It may raise the frequency even further if it is also performing noise suppression or active noise cancellation (ANC). Noise reduction is important for maintaining good audio quality, and this meets the noise reduction requirement in Table 1.
The clock frequency can be raised higher still to support simple voice commands beyond wake-word detection. Dynamic voltage and frequency scaling (DVFS) can raise or lower the frequency depending on the use cases currently active, which keeps energy consumption to the necessary minimum. HiFi 1 can be synthesized to run at up to 1 GHz, depending on the process technology, to accommodate various levels of concurrent workloads, many of which may be AI based or hybrid DSP/AI.
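To make the DVFS idea concrete, here is a minimal Python sketch of picking the lowest operating point that covers the currently active use cases. Everything numeric in it is an assumption for illustration: the operating points, the per-use-case loads in MHz, and the names `OPERATING_POINTS`, `LOAD_MHZ`, and `select_operating_point` are hypothetical, and real figures would come from profiling on the target silicon.

```python
# Hypothetical operating points: (clock in MHz, core voltage in volts),
# sorted from lowest to highest clock. Values are illustrative only.
OPERATING_POINTS = [(50, 0.6), (200, 0.7), (400, 0.8), (1000, 1.0)]

# Hypothetical load of each use case in MHz. Not measured data.
LOAD_MHZ = {
    "wake_word": 20,
    "music_playback": 80,
    "call_processing": 120,
    "noise_suppression": 150,
    "voice_commands": 100,
}


def select_operating_point(active_use_cases):
    """Return the lowest (clock, voltage) pair that covers the summed load."""
    required_mhz = sum(LOAD_MHZ[name] for name in active_use_cases)
    for clock_mhz, voltage in OPERATING_POINTS:
        if clock_mhz >= required_mhz:
            return clock_mhz, voltage
    raise ValueError(f"load of {required_mhz} MHz exceeds the highest operating point")


if __name__ == "__main__":
    print(select_operating_point({"wake_word"}))
    print(select_operating_point({"wake_word", "music_playback", "noise_suppression"}))
```

Summing the loads is a deliberate simplification; it ignores scheduling overhead and memory stalls, so a real policy would likely add headroom before choosing a point.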
This brings us to AI. Many modern applications, such as voice commands and noise suppression, are AI based. How do we run them on an always-on DSP? Despite being compact, HiFi 1 supports neural network (NN) workloads through dedicated NN instructions and architectural support, and it runs them with high energy efficiency. Benchmarks such as OK Google keyword spotting and TFLM person detect show significant energy savings over other HiFi DSPs; therefore, many NN applications can run in always-on mode.
An additional cost-optimization step might be to subsume the BT controller functionality into the DSP, especially if the DSP is efficient at running control code. This eliminates yet another block, further reducing area, power, and energy, and leaves HiFi 1 as the only processor in the system, which is the smallest configuration. Because the memory (ROM/RAM) of the controller or CPU subsystem is eliminated along with it, the PPA benefit is significant. Where cost and battery size matter most, these measures help achieve those goals.
Fewer processors also mean fewer tools and fewer disparate environments, making software development a lot easier. So, let’s talk software.
Software should not be an afterthought, as it is typically the long pole in the tent. For this reason, HiFi 1 maintains software compatibility with current and previous generation HiFi DSPs. Product managers will appreciate that the large array of software developed by Cadence/Tensilica® partners for the other HiFi DSPs is immediately available on HiFi 1. These software applications, developed by domain experts, are optimized to run on HiFi DSPs using the Cadence Neural Network library (NNlib) and DSP library (NDSP lib). Fast time to market for OEMs through partner applications is a cornerstone of HiFi success.
We have embarked on what we hope will be a scalable signal processing architecture for TWS earbuds. For the Good-Enough earbuds, we added features beyond current-generation base models and fleshed out an architecture that economically supports always-on and active modes of operation to maximize battery life.
Having pared the TWS SoC design down to a single processing entity, HiFi 1 in this illustration, one might be tempted to go home and exclaim, “Honey, I shrunk the TWS earbud SoC!”
Next time, we will look at what it takes to scale this architecture for the mid- and high-end product segments. Stay tuned.