How-to integrate generated code

for ISPU target, based on ST Edge AI Core Technology 2.2.0

r1.2

ISPU target

Template

The template for integrating the code generated by ST Edge AI Core on ISPU can be obtained from ST’s GitHub repository for ISPU: https://github.com/STMicroelectronics/st-mems-ispu.

The ispu folder of the template may be copied by itself (no dependencies other than the CLI toolchain or IDE environment that must be installed) and used as a starting project to integrate any model converted using ST Edge AI Core.

For instructions on setting up the development environment for ISPU, refer to the GitHub repository linked above.

Model implementation generation

The template project is structured so that ST Edge AI Core can automatically populate it with the necessary files in the right places. In order to achieve that, run the following command:

stedgeai generate --target ispu --no-workspace --no-report -m <nn_model_file> --output <ispu_folder>

where nn_model_file is the file containing the model of the neural network to convert and ispu_folder is a copy of the ispu folder of the template. Of course, additional options can be added to the generate command above as needed.

Alternatively, MEMS Studio’s ISPU Model Converter can be used to generate the network code, specifying ispu_folder as the output directory. ST Edge AI Developer Cloud can also be used to generate and download the converted code.

Integration of the model

Once the code has been generated, the project will build, but it will output a number of warnings. These warnings flag the modifications, specific to the model and the use case, that are still needed in ispu/src/main.c to complete the integration of the neural network model in the ISPU code:

Implement the logic necessary to fill the input data buffer and run the inference when ready.
```
void __attribute__ ((signal)) algo_00(void)
{
    #warning "Fill the input data buffers contained in input_buffers."
    #warning "Each buffer in input_buffers must be cast to the appropriate type before accessing it."

    #warning "The network inference must be run when the input data buffers are ready."
    stai_return_code res = stai_network_run(net, STAI_MODE_SYNC);
```
Note that each element of input_buffers is a pointer to void and contains the memory address of one input buffer. In order to access an input buffer to fill it with data, it is recommended to apply a cast to reinterpret it as a multidimensional array of the correct shape and type for that input.

For example, if the first input is of type float and has shape (1, 52, 3), the following cast can be applied:
```
float (*input)[52][3] = (float (*)[52][3])input_buffers[0];
```
Note that the first dimension (in this case equal to 1) does not need to be expressed explicitly in the variable definition or the casting expression.
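
The cast can be tried out on a host machine with a minimal sketch like the one below, assuming a float input of shape (1, 52, 3). The names storage and fill_input are hypothetical and not part of the generated code, and the local input_buffers array only stands in for the one provided by the template.

```c
#include <stddef.h>

#define IN_HEIGHT  52  /* assumed, mirrors STAI_NETWORK_IN_1_HEIGHT */
#define IN_CHANNEL 3   /* assumed, mirrors STAI_NETWORK_IN_1_CHANNEL */

/* Hypothetical stand-in for the input memory provided by the template. */
static float storage[1 * IN_HEIGHT * IN_CHANNEL];

/* Hypothetical stand-in for the template's array of input pointers. */
static void *input_buffers[1] = { storage };

/* Writes one value through a typed view of the first input buffer. */
static void fill_input(size_t height_index, size_t channel_index, float value)
{
    /* Pointer to an array of shape [52][3]; input[0] is the only batch. */
    float (*input)[IN_HEIGHT][IN_CHANNEL] =
        (float (*)[IN_HEIGHT][IN_CHANNEL])input_buffers[0];

    input[0][height_index][channel_index] = value;
}
```

The typed pointer adds no storage of its own: it only tells the compiler how to compute offsets into the memory that input_buffers[0] already points to.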

The format and shape of the inputs can be retrieved from the macros defined in ispu/inc/network.h. Continuing with the example, these would be the macros:
```
#define STAI_NETWORK_IN_1_FLAGS       (STAI_FLAG_INPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_LAST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_IN_1_FORMAT      (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_IN_1_SHAPE       {1,52,3}
#define STAI_NETWORK_IN_1_BATCH       (1)
#define STAI_NETWORK_IN_1_HEIGHT      (52)
#define STAI_NETWORK_IN_1_CHANNEL     (3)
```
Note that, in this case, the input is in channel-last format, as confirmed by the STAI_FLAG_CHANNEL_LAST flag in the STAI_NETWORK_IN_1_FLAGS macro. Depending on the model and the options specified when generating the network, the input may instead be in channel-first format, in which case the shape would be (1, 3, 52) and the macros would be as follows:
```
#define STAI_NETWORK_IN_1_FLAGS       (STAI_FLAG_INPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_FIRST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_IN_1_FORMAT      (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_IN_1_SHAPE       {1,3,52}
#define STAI_NETWORK_IN_1_BATCH       (1)
#define STAI_NETWORK_IN_1_HEIGHT      (52)
#define STAI_NETWORK_IN_1_CHANNEL     (3)
```
and the reinterpretation of the buffer would become:
```
float (*input)[3][52] = (float (*)[3][52])input_buffers[0];
```
In a nutshell, the C array shape must be, in any case, based on the STAI_NETWORK_IN_X_SHAPE macro, where X is the input number. The array must then be indexed based on the format (channel-last vs channel-first) in order to fill it with the input values.

For channel-last:
```
input[0][height_index][channel_index] = input_value;
```
For channel-first:
```
input[0][channel_index][height_index] = input_value;
```
In this case, since the batch size is the first dimension and it is equal to 1, the first index is always 0.
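
To make the difference between the two layouts concrete, the sketch below computes where an element lands in the underlying flat buffer for each format, assuming the (52, 3) example used above. The helper names are hypothetical and are shown only for illustration.

```c
#include <stddef.h>

#define HEIGHT  52  /* assumed, mirrors STAI_NETWORK_IN_1_HEIGHT */
#define CHANNEL 3   /* assumed, mirrors STAI_NETWORK_IN_1_CHANNEL */

/* Flat offset for shape {1,52,3}: channels of one sample are adjacent. */
static size_t offset_channel_last(size_t height_index, size_t channel_index)
{
    return height_index * CHANNEL + channel_index;
}

/* Flat offset for shape {1,3,52}: all samples of one channel are adjacent. */
static size_t offset_channel_first(size_t height_index, size_t channel_index)
{
    return channel_index * HEIGHT + height_index;
}
```

With the multidimensional cast, the compiler performs this arithmetic automatically, which is why the array must be declared with the shape given by the STAI_NETWORK_IN_X_SHAPE macro.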

How to compute input_value for the specific model must be known by the developer integrating it. For example, in the case used above, the input array could represent a buffer of 52 accelerometer samples, each composed of 3 float values (x-axis, y-axis, and z-axis) expressed in g. With the accelerometer configured at 16 g full-scale and the input in channel-last format, the code could look like this:
```
/* 16 g full-scale: 0.488 mg/LSB, that is, 0.000488 g/LSB */
input[0][sample_index][0] = cast_sint16_t(ISPU_ARAW_X) * 0.000488f;
input[0][sample_index][1] = cast_sint16_t(ISPU_ARAW_Y) * 0.000488f;
input[0][sample_index][2] = cast_sint16_t(ISPU_ARAW_Z) * 0.000488f;
```
This would be executed every time a new accelerometer sample is generated by the sensor, the sample thus being added to the buffer. When the buffer is full, the model inference can be executed with the call to the stai_network_run function.
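
The buffering logic described above could be sketched as follows. This is a host-side illustration under stated assumptions, not the template's code: window stands in for the first input buffer, push_sample is a hypothetical helper, and the samples are assumed to be already converted to the unit expected by the model. Non-overlapping windows are also an assumption; a model trained on overlapping windows would need a different policy.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_SAMPLES 52  /* assumed, mirrors STAI_NETWORK_IN_1_HEIGHT */
#define NUM_AXES    3   /* assumed, mirrors STAI_NETWORK_IN_1_CHANNEL */

/* Hypothetical stand-in for the channel-last input buffer. */
static float window[NUM_SAMPLES][NUM_AXES];

/* Position of the next sample to be written in the window. */
static size_t sample_index;

/* Stores one converted sample; returns true when the window is full
 * and the inference should be run. */
static bool push_sample(float x, float y, float z)
{
    window[sample_index][0] = x;
    window[sample_index][1] = y;
    window[sample_index][2] = z;
    sample_index++;

    if (sample_index < NUM_SAMPLES) {
        return false;
    }

    sample_index = 0; /* start a new, non-overlapping window */
    return true;
}
```

In algo_00, the return value would gate the call to stai_network_run, so that the inference runs once every 52 samples rather than at every sample.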

(Optional) Implement a logic to handle errors returned by the inference function call stai_network_run.

```
    stai_return_code res = stai_network_run(net, STAI_MODE_SYNC);
    if (res >= STAI_ERROR_GENERIC) {
        #warning "Handle inference error as deemed appropriate."
    }
```

Error codes are defined in inc/ai/stai.h.
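
One possible way to handle errors is to record them and skip the use of the outputs for that inference. The sketch below illustrates this policy on a host machine. All names in it are hypothetical stand-ins, including demo_return_code and the numeric values of DEMO_SUCCESS and DEMO_ERROR_GENERIC, which are arbitrary; in the ISPU project the real type and codes come from inc/ai/stai.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins so that the sketch compiles on its own.
 * The values are arbitrary and do not reflect inc/ai/stai.h. */
typedef int32_t demo_return_code;
#define DEMO_SUCCESS       0
#define DEMO_ERROR_GENERIC 1

/* Diagnostic state that could later be exposed to the host. */
static uint32_t error_count;
static demo_return_code last_error = DEMO_SUCCESS;

/* Returns true when the outputs of the inference may be used. */
static bool check_inference(demo_return_code res)
{
    if (res >= DEMO_ERROR_GENERIC) {
        last_error = res; /* keep the most recent error code */
        error_count++;
        return false;     /* caller should skip reading the outputs */
    }
    return true;
}
```

Whether to skip the outputs, report the error code to the host, or ignore errors entirely depends on the use case.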

Retrieve the neural network results from the output data buffer and generate the interrupt as needed.
```
    #warning "Get the inference results from the output data buffers contained in output_buffers."
    #warning "Each buffer in output_buffers must be cast to the appropriate type before accessing it."

    // interrupt generation
    int_status = int_status | 0x1u;
```
Note that each element of output_buffers is a pointer to void and contains the memory address of one output buffer. In order to access an output buffer to retrieve the data, it is recommended to apply a cast to reinterpret it as a multidimensional array of the correct shape and type for that output.

For example, if the first output is of type float and has shape (1, 4), the following cast can be applied:
```
float (*output)[4] = (float (*)[4])output_buffers[0];
```
Note that the first dimension (in this case equal to 1) does not need to be expressed explicitly in the variable definition or the casting expression.

The format and shape of the outputs can be retrieved from the macros defined in ispu/inc/network.h. Continuing with the example, these would be the macros:
```
#define STAI_NETWORK_OUT_1_FLAGS       (STAI_FLAG_OUTPUTS|STAI_FLAG_PREALLOCATED|STAI_FLAG_CHANNEL_LAST|STAI_FLAG_HAS_BATCH)
#define STAI_NETWORK_OUT_1_FORMAT      (STAI_FORMAT_FLOAT32)
#define STAI_NETWORK_OUT_1_SHAPE       {1,4}
#define STAI_NETWORK_OUT_1_BATCH       (1)
#define STAI_NETWORK_OUT_1_CHANNEL     (4)
```
Please note that, in this case, there is no height or width dimension, but, in general, the same considerations made for the inputs about channel-last vs channel-first format also apply to the outputs.

The C array shape must be, in any case, based on the STAI_NETWORK_OUT_X_SHAPE macro, where X is the output number. The array must then be indexed based on the format (channel-last vs channel-first) in order to retrieve the output values. In this case:
```
output[0][channel_index]
```
In this case, since the batch size is the first dimension and it is equal to 1, the first index is always 0.

How to utilize the output values is up to the developer integrating the model. The options are:
- Use them for further processing inside the ISPU.
- Copy them to the ISPU output registers.
Still using the same example, the following code could be used to copy the values to the output registers:
```
cast_float_t(ISPU_DOUT_00) = output[0][0];
cast_float_t(ISPU_DOUT_02) = output[0][1];
cast_float_t(ISPU_DOUT_04) = output[0][2];
cast_float_t(ISPU_DOUT_06) = output[0][3];
```
For example, the network could be implementing a classification model and each value of the output array would thus represent the detected probability (number between 0 and 1) for one of 4 classes that the model is able to recognize.
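
For such a classification model, the detected class is typically the index of the largest output value. A minimal sketch, assuming 4 float outputs as in the example; argmax is a hypothetical helper and is not provided by the generated code:

```c
#include <stddef.h>

#define NUM_CLASSES 4  /* assumed, mirrors STAI_NETWORK_OUT_1_CHANNEL */

/* Returns the index of the largest value; ties resolve to the lowest index. */
static size_t argmax(const float probs[NUM_CLASSES])
{
    size_t best = 0;

    for (size_t i = 1; i < NUM_CLASSES; i++) {
        if (probs[i] > probs[best]) {
            best = i;
        }
    }
    return best;
}
```

With the output pointer declared as in the cast shown earlier, the helper could be called as argmax(output[0]).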

The interrupt can be generated as needed. Some options are:
- Generate the interrupt at every execution of the algo_00 function (that is, every time a new sample of the sensor is processed by the ISPU).
- Generate the interrupt every time a new inference has been run and the output values have been copied to the output registers.
- In the case of a classification model, generate the interrupt only when the detected class changes, so that the host can go as long as possible without being woken up by the sensor.
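
The last option could be sketched as follows. This is an illustration only: int_status_demo is a hypothetical stand-in for the template's int_status, and update_class and NO_CLASS are names introduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_CLASS (-1)  /* marks that no class has been detected yet */

/* Class reported at the previous interrupt. */
static int previous_class = NO_CLASS;

/* Hypothetical stand-in for the template's int_status. */
static uint8_t int_status_demo;

/* Raises the interrupt flag only when the detected class changes;
 * returns true if the flag was raised. */
static bool update_class(int detected_class)
{
    if (detected_class == previous_class) {
        return false; /* same class as before: leave the host asleep */
    }

    previous_class = detected_class;
    int_status_demo |= 0x1u; /* same bit as in the template snippet */
    return true;
}
```

Because previous_class starts at NO_CLASS, the first inference always generates an interrupt, which gives the host an initial state to work from.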

As always, ispu/conf.txt, ispu/meta.txt, and ispu/shub.txt should also be modified as required. For more information on these files, please refer to the regular (not specific for ST Edge AI) ISPU template.

Note: the template supports one single model generated with the default name “network”. Integrating multiple networks or one network with a different name would require modifying the code to a greater extent than what was shown here.

For more information on the code generated by ST Edge AI Core and how to use it, please refer to the other articles in this documentation.


Information in this document is provided solely in connection with ST products. The contents of this document are subject to change without prior notice. © Copyright STMicroelectronics 2025. All rights reserved. www.st.com