ST Edge AI Core

Definitions, Glossary, and Trademark


ST Edge AI Core Technology 2.2.0




Glossary

This article defines the main terms used across the articles related to the ST Edge AI solution.

Activations buffer

The activations buffer designates a memory region or a set of memory regions used by the inference engine to execute the deployed model. This buffer temporarily stores intermediate data and activations generated during the model’s inference process, ensuring efficient data flow and computation. See also: Weights/Params/Kernels buffer
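
For illustration, on a microcontroller target the activations buffer is often reserved as a static RAM array and passed to the runtime at initialization. The sketch below uses hypothetical names (MY_NET_ACTIVATIONS_SIZE, my_net_init); the actual macros and functions are defined by the code generated for each target.

```c
/* Illustrative sketch only: the macro and function names below are
 * hypothetical placeholders, not the API generated by ST Edge AI Core. */
#include <stddef.h>
#include <stdint.h>

#define MY_NET_ACTIVATIONS_SIZE  (32 * 1024)  /* size normally provided by the code generator */

/* Scratch RAM region reused for all intermediate tensors during inference. */
static uint8_t activations[MY_NET_ACTIVATIONS_SIZE];

/* Stand-in for the runtime initialization entry point. */
static int my_net_init(uint8_t *act_buf, size_t act_size)
{
    (void)act_buf;
    (void)act_size;
    return 0;
}

int app_ai_init(void)
{
    /* Tell the inference engine where its working memory lives. */
    return my_net_init(activations, sizeof(activations));
}
```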

Ahead-of-time (AOT) compilation

Ahead-of-time (AOT) compilation compiles a program from a high-level programming language into machine code or native code before execution. This is in contrast to just-in-time (JIT) compilation, where code is compiled during runtime.

AI Runtime environment

The term “AI runtime environment” is used in relation to the inference engine. It indicates the required software and/or hardware components, also called the AI stack, needed to execute the deployed model. This environment includes all the necessary libraries, frameworks, drivers, and hardware accelerators that work together to ensure that the model runs efficiently and accurately. The runtime environment is crucial for optimizing performance, managing resources, and providing the necessary support for various operations involved in model inference.

Compiler/Code generator

A compiler is an application, like ST Edge AI Core, that converts a high-level representation of a deep neural network or model into a low-level representation (such as C code or a bitstream), enabling its execution on the target hardware. It is based on ahead-of-time (AOT) compilation, which allows the deployment to be optimized for code size, startup time, and runtime overhead.

Embedded runtime library

The embedded runtime library designates the optimized library implementing the operators/kernels for a given target/toolchain.

Embedded Inference Client API

The “Embedded inference client API” refers to a set of programming interfaces designed to enable embedded systems to perform inference tasks using pretrained machine learning models. These APIs allow developers to integrate machine learning capabilities into embedded devices, such as microcontrollers or edge devices.
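
Such an API typically follows a create/run/release pattern. The sketch below illustrates that pattern with hypothetical names (net_create, net_run, net_release); the actual entry points and signatures are defined by the code generated for each target.

```c
/* Minimal sketch of a typical create/run/release pattern.
 * All names below are hypothetical, not the generated client API. */
#include <stddef.h>

typedef struct { int unused; } net_handle;    /* opaque network instance (stubbed here) */

static net_handle g_instance;

/* Stand-ins for the generated entry points. */
static net_handle *net_create(void) { return &g_instance; }
static int net_run(net_handle *h, const float *in, float *out)
{
    (void)h; (void)in; (void)out;
    return 0;
}
static void net_release(net_handle *h) { (void)h; }

int classify(const float *input, float *output)
{
    net_handle *h = net_create();             /* allocate/initialize the network */
    if (h == NULL)
        return -1;
    int status = net_run(h, input, output);   /* one inference pass */
    net_release(h);                           /* free the runtime resources */
    return status;
}
```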

Feature maps

Feature maps are a fundamental concept in machine learning, particularly in convolutional neural networks (CNNs). They represent the output of a convolutional layer, obtained by applying a set of filters (kernels) to the input data. Each filter detects specific features or patterns in the input data, such as edges, textures, or other relevant characteristics. Feature maps are generally stored in the activations buffer. See also: Activations buffer.
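
As a generic illustration (not code produced by ST Edge AI Core), the sketch below computes a single feature map by sliding one 3x3 filter over a single-channel input with "valid" padding; a real convolutional layer produces one feature map per filter.

```c
/* Generic illustration: one output feature map produced by sliding a
 * single 3x3 filter (kernel) over a single-channel input. */
#define IN_H   8
#define IN_W   8
#define K      3
#define OUT_H  (IN_H - K + 1)
#define OUT_W  (IN_W - K + 1)

void conv2d_single(const float in[IN_H][IN_W],
                   const float kernel[K][K],
                   float feature_map[OUT_H][OUT_W])
{
    for (int y = 0; y < OUT_H; y++) {
        for (int x = 0; x < OUT_W; x++) {
            float acc = 0.0f;
            for (int ky = 0; ky < K; ky++)
                for (int kx = 0; kx < K; kx++)
                    acc += in[y + ky][x + kx] * kernel[ky][kx];
            feature_map[y][x] = acc;   /* one element of the feature map */
        }
    }
}
```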

Quantization scheme

A quantization scheme refers to the method of reducing the precision of the weights and/or activations from floating-point (typically 32-bit) to lower bit-width representations such as 8-bit integers. This reduces the model size and computational requirements, making the model more efficient to run on devices like mobile phones, embedded systems, and edge devices.
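
A common example is the 8-bit affine scheme, where a real value r is approximated as scale * (q - zero_point). The helper functions below are a generic illustration of that mapping, not code produced by ST Edge AI Core.

```c
/* Generic illustration of 8-bit affine (scale/zero-point) quantization. */
#include <math.h>
#include <stdint.h>

/* Quantize: q = clamp(round(r / scale) + zero_point, -128, 127) */
int8_t quantize_s8(float r, float scale, int32_t zero_point)
{
    int32_t q = (int32_t)lrintf(r / scale) + zero_point;
    if (q < -128) q = -128;
    if (q > 127)  q = 127;
    return (int8_t)q;
}

/* Dequantize: r = scale * (q - zero_point) */
float dequantize_s8(int8_t q, float scale, int32_t zero_point)
{
    return scale * (float)((int32_t)q - zero_point);
}
```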

Inference engine

The inference engine is a set of software components and/or hardware IP that allows the execution of the deployed/compiled model. It interprets the model’s instructions and performs the necessary computations to produce the desired outputs. The inference engine can leverage specialized hardware accelerators, such as NPUs (neural processing units), to enhance performance and efficiency, especially for deep learning tasks. It is an essential part of the deployment pipeline, ensuring that the model runs correctly and efficiently on the target hardware. See also: AI Runtime environment

ISPU

The ISPU (intelligent sensor processing unit) is a digital signal processor (DSP) integrated into select STMicroelectronics MEMS sensors. It is optimized with respect to a general-purpose MCU and can be used to run complex AI algorithms directly in the sensor (see the “ST Edge AI Core for ISPU” article).

MLC

The Machine Learning Core (MLC) is an AI engine integrated into a variety of STMicroelectronics MEMS sensors. It can run multiple decision trees for classification tasks using motion data (accelerometer, gyroscope) or, depending on the device, data coming from external sources such as a magnetometer or a vertical analog front end (see the “ST Edge AI Core for MLC” article).

Neural ART accelerator™

The Neural ART accelerator™ designates a branded family of specialized hardware components designed to accelerate the execution of deep neural networks in constrained environments. These components are optimized for performance and efficiency, making them ideal for applications where resources such as power, memory, and processing capabilities are limited. They enable faster inference times and lower latency, which are critical for real-time applications like edge computing, IoT devices, and mobile platforms. See also: NPU

NPU

An NPU, or neural processing unit, is a specialized hardware component designed to accelerate the execution of deep neural networks.

NPU compiler

See ST Neural-ART compiler.

ST Edge AI Core

ST Edge AI Core designates the core technology and associated components. They are used to deploy/compile a pretrained deep learning (DL) or machine learning (ML) model, enabling its execution on an ST target.

ST Edge AI Core CLI

The ST Edge AI Core CLI designates the stedgeai application, which provides a command-line interface to the ST Edge AI Core technology. It also acts as a driver or front end that calls the specialized compilers (or back-end engines) associated with the different ST targets.

ST Neural-ART compiler

The ST Neural-ART compiler designates the compiler (or back end) used to deploy a model to a platform based on the ST Neural-ART NPU.

ST Neural-ART NPU

The ST Neural-ART NPU designates the hardware component or IP, based on the Neural ART accelerator™ technology, that is embedded in ST devices.

Weights/Params/Kernels buffer

The weights buffer designates a memory region or a set of memory regions used to store the parameters of the deployed model. This buffer holds the model’s weights and biases, which are essential for making accurate predictions during inference.
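
On microcontroller targets, the parameters are typically emitted as constant data so that the linker can place them in read-only (flash) memory. The array below is a hypothetical illustration of that layout, not actual generated output.

```c
/* Illustrative only: a hypothetical layout for a generated weights buffer. */
#include <stdint.h>

/* 'const' lets the linker place the parameters in read-only (flash) memory,
 * keeping scarce RAM free for the activations buffer. */
const int8_t my_net_weights[] = {
    12, -5, 127, -128, 34, 0, -77, 9,   /* arbitrary example values */
};

const uint32_t my_net_weights_size = (uint32_t)sizeof(my_net_weights);
```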

X-CUBE-AI

X-CUBE-AI is an STM32Cube Expansion Package, part of the STM32Cube.AI ecosystem. It extends STM32CubeMX capabilities with the automatic conversion of pretrained artificial intelligence algorithms, including neural network and classical machine learning models. It also integrates the generated optimized library into the user’s project.

Trademarks

The STMicroelectronics trademark list contains STMicroelectronics’ word trademarks with key information about how to use them.