Endpoint MCU Implementation of Voice User Interface

Article 5 of Bringing Intelligence to the Edge Series: Integrating voice user interface technology into microcontroller units for offline, edge-based voice recognition is set to redefine the landscape of home automation and smart industrial applications.

01 Aug, 2023. 6 minutes read


Overview of Voice User Interface Implementation

A voice user interface (VUI) is a speech recognition technology that lets users interact with a computer, smartphone, or other everyday device through voice commands. What sets a VUI apart is that voice is the primary mode of interaction, in contrast with a traditional keyboard, mouse, display, or touch screen.

The new, easy-to-use Renesas hardware platform for VUI solutions is based on the high-performance Renesas Advanced (RA) family of 32-bit microcontroller units (MCUs).

The RA family delivers key advantages over competing Arm Cortex®-M MCUs by providing stronger embedded security, superior CoreMark® performance, and ultra-low-power operation. Platform Security Architecture (PSA) certification gives customers the confidence to quickly deploy secure IoT endpoint and edge devices, as well as smart factory equipment for Industry 4.0.

The RA family currently includes three product series: RA6, RA4, and RA2. Each series has its own feature set, targeting different applications and market needs. The RA6 Series offers the widest integration of communication interfaces, with integrated Ethernet and TFT display drivers. Flash memory densities range from 256KB to 2MB. The RA6 Series offers up to 240MHz performance running on the Cortex-M4 or Cortex-M33 core with TrustZone®. It also supports full security integration, which makes these devices well suited to security-sensitive applications.

The RA4 Series balances the requirement for low power consumption with the demand for connectivity. It offers flash memory densities from 256KB to 1MB and a wide range of communication interfaces. The core is a Cortex-M4 or a Cortex-M33 with TrustZone and additional security IP integration. These devices provide a CPU frequency of up to 100MHz.

The RA2 Series is ideal for designs where the low power requirements of an application matter most. To achieve the best low-power performance, special power-down modes are provided, making these devices well suited to battery-powered applications. The RA2 Series provides up to 256KB of embedded flash and a wide single-voltage supply range of 1.6 to 5.5V. These devices use the Cortex-M23 core running at up to 48MHz.

The Renesas Flexible Software Package (FSP) is an enhanced software package designed to provide easy-to-use, scalable, high-quality software for embedded system designs using Renesas RA Family microcontrollers (Figure 2).

Fig. 2: Renesas Flexible Software Package. Source: Renesas Electronics

It uses an open software ecosystem and gives developers the flexibility to use bare-metal programming, Azure RTOS, FreeRTOS, another preferred RTOS, legacy code, and third-party ecosystem solutions. The flexible open architecture of the FSP, combined with the wide choice of third-party solutions in the Arm ecosystem, broadens the options for application development. Developers can choose the software model that best suits their needs while using Renesas's Arm-based silicon, and can shorten implementation time in complex areas such as connectivity and security.

Voice Recognition Engine

Built on the Renesas ecosystem, Cyberon DSpotter (Figure 3) is a local voice trigger and command recognition solution with robust noise reduction that uses very few resources while delivering high recognition accuracy.

Fig. 3: Cyberon DSpotter voice recognition engine. Source: Renesas Electronics

It supports multiple languages as well as many connectivity functions and security capabilities depending on the selected MCU. The major features are listed below:

Local voice recognition algorithm, no network connectivity needed
Phoneme-based modeling
Quick command customization—removes the need to collect speech data in advance
Optimization by model adaptation with only a small amount of speech data
Global language support: 44+ languages worldwide
Small footprint and cost-effective (single DMIC + RA6E1 or RA4E1)
DSMT Tool: wake-word and command customization, performance tuning, and testing; no prior neural network knowledge needed
Separation of recognition engine and command model, switching commands dynamically
Low-power, high-efficiency RA MCU with strong security function
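
The separation of recognition engine and command model listed above can be pictured with a small sketch. This is not the DSpotter API: the class names, the phrase strings, and the action IDs below are all hypothetical, and the sketch assumes that some recognizer has already turned audio into a phrase. It only illustrates a two-stage flow (wake word first, then one command) in which the command table can be swapped at run time while the engine object stays the same.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CommandModel:
    """Hypothetical command table: recognized phrase -> action ID."""
    name: str
    commands: Dict[str, int]


@dataclass
class VoiceUi:
    """Toy two-stage recognizer: wait for the wake word, then accept one command."""
    wake_word: str
    model: CommandModel
    listening: bool = False  # True after the wake word, until the next phrase

    def swap_model(self, model: CommandModel) -> None:
        # The engine object is unchanged; only the command table is replaced.
        self.model = model
        self.listening = False

    def on_phrase(self, phrase: str) -> Optional[int]:
        """Feed one recognized phrase; return an action ID, or None if nothing fires."""
        if not self.listening:
            # Stage 1: everything except the wake word is ignored.
            self.listening = phrase == self.wake_word
            return None
        # Stage 2: exactly one command attempt per wake word.
        self.listening = False
        return self.model.commands.get(phrase)
```

With a hypothetical `lights` model loaded, the phrase "light on" does nothing until the wake word has been heard. After `swap_model()` loads a different table, the same phrase is no longer recognized and the new table's commands are.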

Results

Hit rate was measured with voice commands mixed with different types of noise, at levels chosen to produce distinct signal-to-noise ratios (SNRs). The test bench shown in Figure 4 was used, and the results are summarized in Table 1.

Fig. 4: Evaluation test bench. Source: Renesas Electronics
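
To make the SNR levels concrete, here is a minimal sketch of how a noise recording could be scaled to reach a target SNR against a speech recording. It assumes digital mixing of two sample buffers and the usual definition SNR(dB) = 10 * log10(P_speech / P_noise); the article does not describe how the test bench actually sets its levels, and the function names are illustrative only.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power of a sample buffer (mean of squared samples)."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """SNR in dB between a speech buffer and a noise buffer."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def noise_gain_for_snr(speech: Sequence[float],
                       noise: Sequence[float],
                       target_snr_db: float) -> float:
    """Amplitude gain to apply to the noise so the mix has the target SNR."""
    ratio = 10.0 ** (target_snr_db / 10.0)  # required P_speech / P_noise
    return math.sqrt(mean_power(speech) / (mean_power(noise) * ratio))
```

Multiplying every noise sample by the returned gain and then adding the result to the speech samples gives a mix at the requested SNR, for example 10dB or 5dB as in Table 1.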

SNR	Background noise	Distance	Hit-Rate	Alexa Requirements
(Clean)	none	1.5m	100.00%	90%
(Clean)	none	3m	100.00%	90%
10dB	Babble	1.5m	98.55%	80%
10dB	Babble	3m	98.84%	80%
10dB	Music	1.5m	98.26%	80%
10dB	Music	3m	98.55%	80%
10dB	TV	1.5m	98.84%	80%
10dB	TV	3m	98.55%	80%
5dB	Babble	1.5m	98.84%	80%
5dB	Babble	3m	96.24%	80%
5dB	Music	1.5m	98.84%	80%
5dB	Music	3m	97.08%	80%
5dB	TV	1.5m	93.37%	80%
5dB	TV	3m	90.72%	80%

Table 1: Hit-rate results
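
The table can also be checked programmatically. The sketch below copies the rows of Table 1 and reads the "Alexa Requirements" column as the minimum required hit rate, which is an interpretation of the column header rather than something the article states. The type and function names are illustrative.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    snr: str            # "Clean", "10dB", or "5dB"
    noise: str          # background noise type
    distance_m: float   # speaker-to-microphone distance in meters
    hit_rate: float     # measured hit rate, percent
    required: float     # assumed minimum hit rate, percent


ROWS: List[Row] = [
    Row("Clean", "none", 1.5, 100.00, 90.0),
    Row("Clean", "none", 3.0, 100.00, 90.0),
    Row("10dB", "Babble", 1.5, 98.55, 80.0),
    Row("10dB", "Babble", 3.0, 98.84, 80.0),
    Row("10dB", "Music", 1.5, 98.26, 80.0),
    Row("10dB", "Music", 3.0, 98.55, 80.0),
    Row("10dB", "TV", 1.5, 98.84, 80.0),
    Row("10dB", "TV", 3.0, 98.55, 80.0),
    Row("5dB", "Babble", 1.5, 98.84, 80.0),
    Row("5dB", "Babble", 3.0, 96.24, 80.0),
    Row("5dB", "Music", 1.5, 98.84, 80.0),
    Row("5dB", "Music", 3.0, 97.08, 80.0),
    Row("5dB", "TV", 1.5, 93.37, 80.0),
    Row("5dB", "TV", 3.0, 90.72, 80.0),
]


def all_pass(rows: List[Row]) -> bool:
    """True if every measured hit rate meets its required minimum."""
    return all(r.hit_rate >= r.required for r in rows)


def lowest_hit_rate(rows: List[Row]) -> Row:
    """Row with the lowest measured hit rate."""
    return min(rows, key=lambda r: r.hit_rate)
```

Under that reading, every condition meets its threshold, and the hardest case is TV noise at 5dB SNR and 3m, where the hit rate is 90.72% against a required 80%.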

Conclusions

This article has presented a complete implementation of an endpoint voice command recognition system that can run on a simple MCU. The reference design enables local voice recognition without a network connection, so developers can start building, within minutes, an enhanced VUI that recognizes voice commands and triggers the corresponding operations.

This article is based on an e-magazine: Bringing Intelligence to the Edge by Mouser Electronics and Renesas Electronics Corporation. It has been substantially edited by the Wevolver team and Electrical Engineer Ravi Y Rao. It is the fifth article in the Bringing Intelligence to the Edge series. Future articles will introduce readers to more of the trends and technologies shaping the future of Edge AI.

The introductory article unveils the "Bringing Intelligence to the Edge" series, exploring the transformative potential of AI at the Edge.

The first article examines the challenges and trade-offs of integrating AI into IoT devices, emphasizing the importance of balancing performance, ROI, feasibility, and data considerations for successful implementation.

The second article delves into the transformative role of Endpoint AI and embedded vision in tech applications, discussing its potential, challenges, and the advancements in processing data at the source.

The third article delves into the intricacies of TinyML, emphasizing its potential in edge computing and highlighting the four crucial metrics - accuracy, power consumption, latency, and memory requirements - that influence its development and optimization.

The fourth article delves into the realm of data science and AI-driven real-time analytics, showcasing how AI's precision and efficiency in processing big data in real-time are transforming industries by recognizing patterns and inconsistencies.

The fifth article delves into the integration of voice user interface technology into microcontroller units, emphasizing its transformative potential.

The sixth article delves into the profound impact of edge AI on system optimization, maintenance, and anomaly detection across diverse industries.

About the sponsor: Mouser Electronics

Mouser Electronics is a worldwide leading authorized distributor of semiconductors and electronic components for over 1,200 manufacturer brands. They specialize in the rapid introduction of new products and technologies for design engineers and buyers. Their extensive product offering includes semiconductors, interconnects, passives, and electromechanical components.
