How-to integrate generated code
for ISPU target, based on ST Edge AI Core Technology 2.2.0
r1.2
Template
The template for integrating the code generated by ST Edge AI Core on ISPU can be obtained from ST’s GitHub repository for ISPU: https://github.com/STMicroelectronics/st-mems-ispu.
The ispu folder of the template may be copied by itself (no dependencies other than the CLI toolchain or IDE environment that must be installed) and used as a starting project to integrate any model converted using ST Edge AI Core.
To set up the development environment for ISPU, refer to the GitHub repository linked above.
Model implementation generation
The template project is structured so that ST Edge AI Core can automatically populate it with the necessary files in the right places. In order to achieve that, run the following command:
stedgeai generate --target ispu --no-workspace --no-report -m <nn_model_file> --output <ispu_folder>
where nn_model_file is the file containing the model of the neural network to convert and ispu_folder is a copy of the ispu folder of the template. Of course, additional options can be added to the generate command above as needed.
Alternatively, MEMS Studio’s ISPU Model Converter can be used to generate the network code, specifying ispu_folder as the output directory. ST Edge AI Developer Cloud can also be used to generate and download the converted code.
Integration of the model
Once the code has been generated, the project will build, but it will output a number of warnings. Their purpose is to alert to the fact that, in order to complete the integration of the neural network model in the ISPU code, a few modifications specific to the model and the use case are necessary in ispu/src/main.c:
Implement the logic necessary to fill the input data buffer and run the inference when ready.
void __attribute__ ((signal)) algo_00(void)
{
#warning "Fill the input data buffers contained in input_buffers."
#warning "Each buffer in input_buffers must be cast to the appropriate type before accessing it."
#warning "The network inference must be run when the input data buffers are ready."
    stai_return_code res = stai_network_run(net, STAI_MODE_SYNC);
Note that each element of input_buffers is a pointer to void and contains the memory address of one input buffer. In order to access an input buffer to fill it with data, it is recommended to apply a cast to reinterpret it as a multidimensional array of the correct shape and type for that input.
For example, if the first input is of type float and has shape (1, 52, 3), the following cast can be applied:
float (*input)[52][3] = (float (*)[52][3])input_buffers[0];
Note that the first dimension (in this case equal to 1) does not need to be expressed explicitly in the variable definition or the casting expression.
The format and shape of the inputs can be retrieved from the macros defined in ispu/inc/network.h. Continuing with the example, these would be the macros:
#define STAI_NETWORK_IN_1_FLAGS (STAI_FLAG_INPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_LAST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_IN_1_FORMAT (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_IN_1_SHAPE {1,52,3}
#define STAI_NETWORK_IN_1_BATCH (1)
#define STAI_NETWORK_IN_1_HEIGHT (52)
#define STAI_NETWORK_IN_1_CHANNEL (3)
Please note that, in this case, the input is in channel-last format, as confirmed by the STAI_FLAG_CHANNEL_LAST flag in the STAI_NETWORK_IN_1_FLAGS macro, but, based on the model and the options specified when generating the network, it may be in channel-first format, in which case the shape would be (1, 3, 52) and the macros as follows:
#define STAI_NETWORK_IN_1_FLAGS (STAI_FLAG_INPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_FIRST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_IN_1_FORMAT (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_IN_1_SHAPE {1,3,52}
#define STAI_NETWORK_IN_1_BATCH (1)
#define STAI_NETWORK_IN_1_HEIGHT (52)
#define STAI_NETWORK_IN_1_CHANNEL (3)
and the reinterpretation of the buffer would become:
float (*input)[3][52] = (float (*)[3][52])input_buffers[0];
In a nutshell, the C array shape must be, in any case, based on the STAI_NETWORK_IN_X_SHAPE macro, where X is the input number. The array must then be indexed based on the format (channel-last vs channel-first) in order to fill it with the input values.
For channel-last:
input[0][height_index][channel_index] = input_value;
For channel-first:
input[0][channel_index][height_index] = input_value;
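The two layouts can be compared with a small host-side sketch. It is not part of the generated code: IN_HEIGHT, IN_CHANNEL, fill_channel_last, and fill_channel_first are hypothetical names, the dimensions are hard-coded to mirror the example macros, and any plain float array can stand in for the buffer that input_buffers[0] points to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants mirroring STAI_NETWORK_IN_1_HEIGHT and
 * STAI_NETWORK_IN_1_CHANNEL from the example above. */
#define IN_HEIGHT 52
#define IN_CHANNEL 3

/* Store one value in a channel-last buffer of shape (1, 52, 3). */
static void fill_channel_last(void *buffer, size_t h, size_t c, float value)
{
    float (*input)[IN_HEIGHT][IN_CHANNEL] =
        (float (*)[IN_HEIGHT][IN_CHANNEL])buffer;

    assert(h < IN_HEIGHT && c < IN_CHANNEL);
    input[0][h][c] = value; /* batch index is always 0 */
}

/* Store one value in a channel-first buffer of shape (1, 3, 52). */
static void fill_channel_first(void *buffer, size_t h, size_t c, float value)
{
    float (*input)[IN_CHANNEL][IN_HEIGHT] =
        (float (*)[IN_CHANNEL][IN_HEIGHT])buffer;

    assert(h < IN_HEIGHT && c < IN_CHANNEL);
    input[0][c][h] = value; /* same element, transposed position */
}
```

For the same (h, c) pair, the value lands at flat offset h * 3 + c in the channel-last case and at c * 52 + h in the channel-first case, which is why the indexing order has to match the flags reported in network.h.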
In this case, since the batch size is the first dimension and it is equal to 1, the first index is always 0.
How to compute input_value for the specific model must be known by the developer integrating it. For example, in the case used above for explanation, the input array could represent a buffer of 52 accelerometer samples, each composed of 3 float values (x-axis, y-axis, and z-axis). Considering the accelerometer configured at 16 g full-scale, where the sensitivity is 0.488 mg/LSB, and the input in channel-last format, the code could be something like this (the resulting values are in mg; for values in g, the factor would be 0.000488f):
input[0][sample_index][0] = cast_sint16_t(ISPU_ARAW_X) * 0.488f;
input[0][sample_index][1] = cast_sint16_t(ISPU_ARAW_Y) * 0.488f;
input[0][sample_index][2] = cast_sint16_t(ISPU_ARAW_Z) * 0.488f;
This would be executed every time a new accelerometer sample is generated by the sensor, the sample thus being added to the buffer. When the buffer is full, the model inference can be executed with the call to the stai_network_run function.
(Optional) Implement logic to handle errors returned by the inference function call stai_network_run.
stai_return_code res = stai_network_run(net, STAI_MODE_SYNC);
if (res >= STAI_ERROR_GENERIC) {
#warning "Handle inference error as deemed appropriate."
}
Error codes are defined in inc/ai/stai.h.
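The buffering and error-handling logic described above can be sketched on a host machine as follows. This is only an illustration under stated assumptions: run_inference_stub, push_sample, WINDOW_LEN, AXES, and ACC_SCALE are hypothetical names, the stub stands in for stai_network_run, and a plain int replaces stai_return_code. On the device, the raw values would be read from ISPU_ARAW_X, ISPU_ARAW_Y, and ISPU_ARAW_Z.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_LEN 52     /* samples per inference (hypothetical name) */
#define AXES 3
#define ACC_SCALE 0.488f  /* scaling factor used in the example above */

static float window[1][WINDOW_LEN][AXES]; /* stands in for input_buffers[0] */
static uint16_t sample_index;
static int inference_count;

/* Stand-in for stai_network_run(): returns 0 on success. */
static int run_inference_stub(void)
{
    inference_count++;
    return 0;
}

/* Add one raw sample; returns 1 if an inference ran, -1 on error, else 0. */
static int push_sample(int16_t x, int16_t y, int16_t z)
{
    assert(sample_index < WINDOW_LEN);
    window[0][sample_index][0] = (float)x * ACC_SCALE;
    window[0][sample_index][1] = (float)y * ACC_SCALE;
    window[0][sample_index][2] = (float)z * ACC_SCALE;
    sample_index++;

    if (sample_index < WINDOW_LEN) {
        return 0; /* buffer not full yet */
    }

    sample_index = 0; /* start refilling the window from the beginning */
    if (run_inference_stub() != 0) {
        return -1; /* e.g. skip the output update for this window */
    }
    return 1;
}
```

This sketch uses non-overlapping windows: the buffer is refilled from scratch after each inference. A model trained on overlapping windows would need a different buffer management scheme.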
Retrieve the neural network results from the output data buffer and generate the interrupt as needed.
#warning "Get the inference results from the output data buffers contained in output_buffers."
#warning "Each buffer in output_buffers must be cast to the appropriate type before accessing it."
// interrupt generation
int_status = int_status | 0x1u;
Note that each element of output_buffers is a pointer to void and contains the memory address of one output buffer. In order to access an output buffer to retrieve the data, it is recommended to apply a cast to reinterpret it as a multidimensional array of the correct shape and type for that output.
For example, if the first output is of type float and has shape (1, 4), the following cast can be applied:
float (*output)[4] = (float (*)[4])output_buffers[0];
Note that the first dimension (in this case equal to 1) does not need to be expressed explicitly in the variable definition or the casting expression.
The format and shape of the outputs can be retrieved from the macros defined in ispu/inc/network.h. Continuing with the example, these would be the macros:
#define STAI_NETWORK_OUT_1_FLAGS (STAI_FLAG_OUTPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_LAST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_OUT_1_FORMAT (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_OUT_1_SHAPE {1,4}
#define STAI_NETWORK_OUT_1_BATCH (1)
#define STAI_NETWORK_OUT_1_CHANNEL (4)
Please note that, in this case, there is no height or width dimension, but, in general, the same considerations made for the inputs about channel-last vs channel-first format also apply to the outputs.
The C array shape must be, in any case, based on the STAI_NETWORK_OUT_X_SHAPE macro, where X is the output number. The array must then be indexed based on the format (channel-last vs channel-first) in order to retrieve the output values. In this case:
output[0][channel_index]
In this case, since the batch size is the first dimension and it is equal to 1, the first index is always 0.
How to utilize the output values is up to the developer integrating the model. The options are:
- Use them for further processing inside the ISPU.
- Copy them to the ISPU output registers.
Still using the same example, the following code could be used to copy the values to the output registers:
cast_float_t(ISPU_DOUT_00) = output[0][0];
cast_float_t(ISPU_DOUT_02) = output[0][1];
cast_float_t(ISPU_DOUT_04) = output[0][2];
cast_float_t(ISPU_DOUT_06) = output[0][3];
For example, the network could be implementing a classification model and each value of the output array would thus represent the detected probability (number between 0 and 1) for one of 4 classes that the model is able to recognize.
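The register indices in the example advance by two because, assuming each ISPU output register is 16 bits wide, a 32-bit float occupies two consecutive registers. The host-side sketch below illustrates that layout. The names dout_regs, DOUT_REG_COUNT, write_float_to_dout, read_float_from_dout, and copy_outputs are hypothetical: they mimic the register bank with a plain array and are not part of the ISPU SDK.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DOUT_REG_COUNT 32 /* assumed number of 16-bit output registers */

static uint16_t dout_regs[DOUT_REG_COUNT]; /* mock of the output registers */

/* Copy a float into two consecutive 16-bit registers starting at reg. */
static void write_float_to_dout(unsigned reg, float value)
{
    assert(reg + 1 < DOUT_REG_COUNT);
    memcpy(&dout_regs[reg], &value, sizeof value);
}

/* Read back a float the way a host would after fetching both registers. */
static float read_float_from_dout(unsigned reg)
{
    float value;

    assert(reg + 1 < DOUT_REG_COUNT);
    memcpy(&value, &dout_regs[reg], sizeof value);
    return value;
}

/* Copy all output values: value i goes to registers 2*i and 2*i + 1. */
static void copy_outputs(const float *values, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        write_float_to_dout(2u * i, values[i]);
    }
}
```

With four float outputs, the values occupy mock registers 0 to 7, matching the ISPU_DOUT_00, ISPU_DOUT_02, ISPU_DOUT_04, and ISPU_DOUT_06 starting points used above.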
The interrupt can be generated as needed. Some options are:
- Generate the interrupt at every execution of the algo_00 function (that is, every time a new sample of the sensor is processed by the ISPU).
- Generate the interrupt every time a new inference has been run and the output values have been copied to the output registers.
- In the case of a classification model, generate the interrupt only when the detected class changes, so that the host can go as long as possible without being woken up by the sensor.
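The last option can be sketched as follows for the 4-class example. The helper names argmax and should_raise_interrupt and the NUM_CLASSES constant are hypothetical; on the device, a nonzero return value would correspond to setting bit 0 of int_status as in the template.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 4

/* Index of the largest score; ties resolve to the lowest index. */
static size_t argmax(const float *scores, size_t count)
{
    size_t best = 0;

    assert(count > 0);
    for (size_t i = 1; i < count; i++) {
        if (scores[i] > scores[best]) {
            best = i;
        }
    }
    return best;
}

/* Returns 1 only when the detected class differs from the previous one. */
static int should_raise_interrupt(float (*output)[NUM_CLASSES])
{
    static int last_class = -1; /* -1: no inference seen yet */
    int current = (int)argmax(output[0], NUM_CLASSES);

    if (current == last_class) {
        return 0;
    }
    last_class = current;
    return 1;
}
```

Note that this sketch also reports a change on the very first inference, since no previous class exists at that point; whether that is desirable depends on the use case.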
As always, ispu/conf.txt, ispu/meta.txt, and ispu/shub.txt should also be modified as required. For more information on these files, please refer to the regular (not specific for ST Edge AI) ISPU template.
Note: the template supports one single model generated with the default name “network”. Integrating multiple networks or one network with a different name would require modifying the code to a greater extent than what was shown here.
For more information on the code generated by ST Edge AI Core and how to use it, please refer to the other articles in this documentation.