For over a decade there has been a surge of interest in Artificial Intelligence (AI). This interest is largely driven by the explosion in the amount of generated data, which enables the development of accurate Machine Learning (ML) models. Generated data volumes will continue to grow at an exponential pace due to rising internet adoption and the proliferation of Internet of Things (IoT) devices.
At the same time, the growth of AI is propelled by the rising availability of cheap storage and computing resources, which enable ML frameworks to process large data volumes effectively for complex tasks such as the automated analysis of visual scenes and other computer vision applications. Computer vision systems, which perceive their environment and act on visual data, are among the most prominent examples of AI.
For nearly a decade, most AI systems were cloud-based: they leveraged the capacity and scalability of cloud infrastructures to analyze large numbers of data points by means of computationally expensive ML algorithms. Nevertheless, cloud-based systems have pronounced limitations for certain classes of AI applications, including most computer vision applications.
For instance, they can hardly offer real-time performance, as transferring data to the cloud involves high-latency wide area networks. There are also cases where enterprises are not willing to share data outside their local networks for privacy and data protection reasons. Moreover, executing ML models on high-end CPUs (Central Processing Units) and GPUs (Graphics Processing Units) requires significant compute cycles and has a poor environmental footprint.
Power efficiency is therefore a very important requirement for computer vision systems, which are highly computationally intensive. This is also why computer vision systems are usually benchmarked in terms of their energy performance, typically measured in Frames Per Second per Watt (FPS/Watt).
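As a simple illustration of the metric, FPS/Watt is throughput divided by average power draw. The numbers below are hypothetical, not measurements of any specific device:

```python
def fps_per_watt(frames_processed: int, seconds: float, avg_power_watts: float) -> float:
    """Energy-efficiency metric: throughput (FPS) divided by average power draw."""
    fps = frames_processed / seconds
    return fps / avg_power_watts

# Hypothetical example: 300 frames in 10 s at an average draw of 2.5 W.
efficiency = fps_per_watt(frames_processed=300, seconds=10.0, avg_power_watts=2.5)
print(f"{efficiency:.1f} FPS/Watt")  # 30 FPS / 2.5 W = 12.0 FPS/Watt
```

The same metric lets systems with very different absolute throughput and power budgets be compared on equal footing.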
In recent years, these limitations of cloud AI systems have been driving a shift of AI functions from the cloud to edge systems. Edge AI systems collect, manage, and process data close to the field, i.e., within local clusters and field devices. In the case of computer vision applications, this shift has given rise to embedded vision AI systems, which deploy and execute complex machine learning models and other AI functions within embedded devices. Embedded vision processing is a fast-growing computer vision technology, which is encapsulated in computer chips and eventually embedded into various devices.
Benefits of Embedded Vision AI
Embedded vision systems come with a host of benefits for both users and AI systems operations. These benefits include:
- Power Efficiency and Higher Sustainability: Embedded vision is much “greener” than conventional cloud-based computer vision. This is because local data processing minimizes I/O (Input-Output) and data transfer operations, which reduces CO2 emissions. Apart from being a general requirement, power efficiency is critical for many computer vision applications, where AI tasks must be run on devices with quite low energy autonomy.
- Low Latency and Real-Time Performance: Embedded vision systems collect and process data through a high-speed local area network, or even directly on the data source. This reduces application latency and enables real-time performance. The latter is crucial for a significant number of applications in areas like security, smart homes, and industry. For instance, in the case of security applications, it enables the identification of suspicious activity in real time. Likewise, in industrial applications embedded vision AI facilitates real-time detection of defects in products or production processes.
- Cost Savings and Economical Performance: The processing of visual signals on embedded devices reduces the amount of data that needs to be transferred to the cloud. Rather than transferring raw data (e.g., video, images) to the cloud, embedded vision systems convey the outcome of their processing only. In this way they economize on network bandwidth and cloud storage costs.
- Privacy and Data Protection: Embedded vision applications obviate the need for transferring data outside an organization to a cloud provider. This is particularly important for applications that process privacy-sensitive data, such as security and healthcare applications. By keeping and processing data locally, embedded vision technology boosts privacy and data protection. It also facilitates compliance with relevant regulations, such as the General Data Protection Regulation (GDPR) for European organizations.
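To make the bandwidth-savings argument above concrete, here is a rough back-of-the-envelope sketch. The bitrate, per-frame result size, and frame rate are illustrative assumptions, not figures from any particular deployment:

```python
# Back-of-the-envelope comparison: streaming raw video to the cloud versus
# sending only per-frame inference results. All figures are illustrative
# assumptions, not measurements.

RAW_VIDEO_MBPS = 4.0           # assumed bitrate of a 1080p camera stream (megabits/s)
RESULT_BYTES_PER_FRAME = 200   # assumed size of a compact detection summary
FPS = 30                       # assumed frame rate
SECONDS_PER_DAY = 24 * 3600

raw_gb_per_day = RAW_VIDEO_MBPS / 8 * SECONDS_PER_DAY / 1024          # MB/s -> GB/day
results_gb_per_day = RESULT_BYTES_PER_FRAME * FPS * SECONDS_PER_DAY / 1024**3

print(f"raw video:    {raw_gb_per_day:.1f} GB/day")
print(f"results only: {results_gb_per_day:.3f} GB/day")
```

Under these assumptions, transmitting only the processing outcome cuts the daily upload volume by roughly two orders of magnitude, which is where the network bandwidth and cloud storage savings come from.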
Embedded AI Challenges
The development and deployment of embedded vision applications is not an easy task. Specifically, system developers, deployers, and operators are confronted with the following technological challenges:
- Compute Capacity and Embedded Performance Limitations: The high computational cost of AI processing makes it challenging for embedded vision applications to run in real time on embedded systems. Most embedded devices have quite limited computational capacity, along with energy constraints. Thus, there is a need for developing computationally efficient ML functions and models.
- ML Model Size and Storage Capacity Limitations: State-of-the-art ML models have many parameters, features, and interdependencies. This is, for example, the case with deep neural networks, which comprise many computational layers with hundreds of parameters each. Storing sophisticated ML models therefore requires considerable capacity, pushing many embedded devices to their storage limits.
- Model and Device Heterogeneity: There is a variety of embedded vision applications, which feature different requirements. Therefore, embedded vision AI is deployed over a multitude of devices with heterogeneous characteristics. Moreover, it leverages a wide range of ML models with different features, while new ML algorithms are constantly emerging. This heterogeneity of models and devices makes it quite difficult to standardize the application development and deployment process, and hence challenging to deploy embedded vision AI cost-effectively.
- Energy Efficiency Challenges: Most embedded devices have limited energy autonomy, even when resorting to external energy sources (e.g., batteries). Embedded vision stakeholders are therefore in need of novel and effective ways to develop power-efficient applications.
- Data Collection and Availability Challenges: The training of ML models for embedded vision applications hinges on the availability of datasets suitable for embedded machine learning. In many cases such datasets are hardly available and need to be collected using the embedded visual signal processing device (e.g., an embedded HD (High Definition) camera).
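The storage pressure described in the model-size challenge above can be estimated with simple arithmetic: a model's weight footprint is roughly its parameter count times the bytes per parameter. The parameter counts below are hypothetical round numbers for illustration:

```python
def model_size_mb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Rough storage footprint of a model's weights (ignores metadata and overhead)."""
    return num_parameters * bytes_per_parameter / 1024**2

# Hypothetical parameter counts for illustration.
for name, params in [("small CNN", 1_000_000), ("ResNet-50-class model", 25_000_000)]:
    fp32 = model_size_mb(params, 4)   # 32-bit floats
    fp16 = model_size_mb(params, 2)   # 16-bit floats halve the footprint
    print(f"{name}: {fp32:.1f} MB (FP32) vs {fp16:.1f} MB (FP16)")
```

Even a mid-sized network at tens of millions of parameters approaches 100 MB in FP32, which is substantial next to the flash and RAM budgets of typical embedded devices.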
The Renesas AI Accelerator and Tools Ecosystem
Renesas offers an ecosystem of novel embedded vision AI platforms and tools, which help developers and deployers address the above-listed challenges. Specifically, Renesas offers:
- The DRP-AI accelerator, which is a special-purpose hardware module that boosts embedded AI processing for vision applications. The accelerator consists of two sub-modules, namely the AI-MAC (multiply-accumulate processor) and the DRP (reconfigurable processor). Leveraging the DRP-AI accelerator, embedded vision developers can execute ML/DL operations at very high speed. To this end, they assign the AI-MAC to operations on the convolution layers of the ML model and use the DRP for other types of complex processing, such as data preprocessing and pooling.
- The DRP-AI translator, a software module that generates executables optimized for DRP-AI. In this direction, the translator employs techniques like graph optimization and FP16 quantization to reduce the model size. For example, post-training IEEE FP16 quantization significantly reduces the size of deep neural network models (e.g., by up to 50%) without any essential loss in model accuracy. Moreover, the translator supports the standardized ONNX (Open Neural Network Exchange) format, which facilitates access to different hardware devices for deployment and acceleration. Overall, the translator module minimizes memory requirements and improves the computing efficiency of embedded vision deployments.
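The roughly 50% size reduction from FP16 quantization can be illustrated in principle with NumPy. This is only a sketch of post-training FP16 conversion (casting trained FP32 weights to FP16), not the DRP-AI translator's actual implementation, and the weight tensor is random stand-in data:

```python
import numpy as np

# Stand-in for a layer's trained FP32 weights (random values for illustration).
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal((256, 256)).astype(np.float32)

# Post-training FP16 quantization in its simplest form: cast the weights.
weights_fp16 = weights_fp32.astype(np.float16)

size_ratio = weights_fp16.nbytes / weights_fp32.nbytes
max_abs_err = float(np.max(np.abs(weights_fp32 - weights_fp16.astype(np.float32))))

print(f"size ratio: {size_ratio:.2f}")  # 0.50 -- half the storage
print(f"max abs rounding error: {max_abs_err:.5f}")
```

The storage halving is exact (2 bytes per value instead of 4), while the rounding error for well-scaled weights stays small, which is why accuracy is largely preserved.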
Moreover, Renesas offers a complete software development environment, which runs on Personal Computers (PCs). This environment enables developers to access the functionalities of the DRP-AI translator towards converting the trained AI model to object code suitable for the DRP-AI accelerator.
Leveraging the capabilities of the DRP-AI accelerator, Renesas solutions achieve exceptional power efficiency, which sets them apart from state-of-the-art computer vision solutions based on higher-power GPUs. Specifically, Renesas solutions perform vision AI computations using one third of the power consumed by state-of-the-art GPU-based solutions performing the same computations. Renesas therefore provides solutions with the highest FPS/Watt on the market.
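The stated figure implies the efficiency advantage directly: equal throughput at one third of the power yields three times the FPS/Watt. A minimal sketch with hypothetical throughput and power numbers:

```python
# If two systems deliver the same throughput (FPS) but one draws one third of
# the power, its FPS/Watt is 3x higher. The figures below are hypothetical.
fps = 30.0
gpu_power_w = 15.0
accel_power_w = gpu_power_w / 3  # the "one third of the power" claim

gpu_eff = fps / gpu_power_w      # 2.0 FPS/Watt
accel_eff = fps / accel_power_w  # 6.0 FPS/Watt
print(f"efficiency gain: {accel_eff / gpu_eff:.0f}x")  # 3x
```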
Overall, Renesas delivers an integrated, high-performance, and power-efficient embedded vision AI solution, which meets stringent application requirements and real-time constraints. The combination of the DRP-AI accelerator and translator modules provides an innovative embedded vision infrastructure that goes far beyond what current AI technology can support. In this way, Renesas paves the way for a future generation of high-performance, energy-efficient, and cost-effective computer vision applications at the edge of the network. Read more on the Renesas DRP-AI here.
About the sponsor: Renesas
At Renesas we continuously strive to drive innovation with a comprehensive portfolio of microcontrollers, analog and power devices. Our mission is to develop a safer, healthier, greener, and smarter world by providing intelligence to our four focus growth segments: Automotive, Industrial, Infrastructure, and IoT. These segments are all vital to our daily lives, meaning our products and solutions are embedded everywhere.