Embedded Inference Client ST Edge AI Client APIs (Migration Guidelines)

ST Edge AI Core Technology 2.2.0

r1.0

Introduction

This article describes the main guidelines for migrating an AI client application from the former Legacy Client APIs to the new ST Edge AI embedded inference client APIs. A simple client application is presented and commented step by step.

Overview

The ST Edge AI embedded inference client APIs have been designed to be simple, portable and unified across the majority of ST devices.

New Features of ST Edge AI Embedded Client APIs

New STAI_XX macros to build and link a model using only compile-time initializations.
New stai runtime with its own local context. It must be initialized at the beginning of the client application and de-initialized at the end, to allocate and release the resources of the AI middleware.
Support for multiple network contexts, through explicit management of the network context allocation.

Migrate a Legacy Client APIs App to ST Edge AI Embedded APIs

This section walks through the steps required to initialize, set up, and run a network model. For each step, the legacy code is presented first, followed by the equivalent code using the new ST Edge AI Client APIs.

- Step 1: Initial Declarations and headers inclusion

This is a typical example of the beginning of a Legacy Client APIs application:

/**
  ******************************************************************************
  * @file    ai_main_app.c
  * @author  AST Embedded Analytics Research Platform
  * @date    2024-05-23T09:23:23+0200
  ******************************************************************************
  * Copyright (c) 2024 STMicroelectronics.
  * All rights reserved.
  *
  * This software is licensed under terms that can be found in the LICENSE file
  * in the root directory of this software component.
  * If no LICENSE file comes with this software, it is provided AS-IS.
  ******************************************************************************
  */
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#include "ai_platform.h"         /* include ai macros, types and data structures */
#include "network_inputs.h"      /* where input1 buffer is defined */
#include "network.h"             /* include Legacy Client APIs generated network model files */
#include "network_data_params.h"

#define LOG_PRINT(fmt, ...) \
  { printf(fmt, ##__VA_ARGS__); fflush(stdout); }


int main(int argc, char* argv[])
{
  ai_handle network = AI_HANDLE_NULL;
  ai_error error;

  LOG_PRINT("name:%s\n", AI_NETWORK_MODEL_NAME);

  LOG_PRINT("n_inputs:%d\n", AI_NETWORK_IN_NUM);
  LOG_PRINT("n_outputs:%d\n", AI_NETWORK_OUT_NUM);

  LOG_PRINT("activations:%d\n", AI_NETWORK_DATA_ACTIVATIONS_SIZE);
  LOG_PRINT("weights:%d\n", AI_NETWORK_DATA_WEIGHTS_SIZE);
  LOG_PRINT("runtime_name:X-CUBE-AI.AI\n");

And this is the equivalent section using ST Edge AI Embedded Client APIs:

/**
  ******************************************************************************
  * @file    stai_main_app.c
  * @author  AST Embedded Analytics Research Platform
  * @date    2024-05-23T10:34:45+0200
  ******************************************************************************
  * Copyright (c) 2024 STMicroelectronics.
  * All rights reserved.
  *
  * This software is licensed under terms that can be found in the LICENSE file
  * in the root directory of this software component.
  * If no LICENSE file comes with this software, it is provided AS-IS.
  ******************************************************************************
  */
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#include "stai.h"             /* include ST Edge AI macros, types and data structures */
#include "network_inputs.h"   /* where input1 buffer is defined */
#include "network.h"          /* include ST Edge AI generated network model */


#define LOG_PRINT(fmt, ...) \
  { printf(fmt, ##__VA_ARGS__); fflush(stdout); }


int main(int argc, char* argv[])
{
  stai_return_code return_code = STAI_SUCCESS;

  LOG_PRINT("name:%s\n", STAI_NETWORK_MODEL_NAME);

  LOG_PRINT("n_inputs:%d\n", STAI_NETWORK_IN_NUM);
  LOG_PRINT("n_outputs:%d\n", STAI_NETWORK_OUT_NUM);

  LOG_PRINT("activations:%d\n", STAI_NETWORK_ACTIVATIONS_SIZE_BYTES);
  LOG_PRINT("weights:%d\n", STAI_NETWORK_WEIGHTS_SIZE_BYTES);
  LOG_PRINT("runtime_name:STM.AI\n");

Changes summary:

The stai.h header is included instead of ai_platform.h.
Including the network_data_params.h header is no longer required, since its defines have been moved into network.h.
ST Edge AI macros are used instead of the former Legacy Client API ones. As a general rule, ST Edge AI macros are prefixed by STAI_ instead of AI_.
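
The prefix rule covers most macros but not all of them: in the snippets above, AI_NETWORK_DATA_ACTIVATIONS_SIZE becomes STAI_NETWORK_ACTIVATIONS_SIZE_BYTES, and AI_NETWORK_DATA_WEIGHTS_SIZE becomes STAI_NETWORK_WEIGHTS_SIZE_BYTES. During an incremental port, a small compatibility header could keep not-yet-migrated code compiling. The following is only a sketch: the header name and the placeholder values are assumptions made for illustration, and in a real project the STAI_ macros come from the generated network.h.

```c
/* legacy_compat.h (hypothetical file name): maps the legacy macro names used
 * in Step 1 onto their ST Edge AI equivalents. */
#ifndef LEGACY_COMPAT_H
#define LEGACY_COMPAT_H

/* Placeholder values, ASSUMED only so that this sketch compiles on its own.
 * In a real project, remove this block and include the generated "network.h". */
#ifndef STAI_NETWORK_MODEL_NAME
#define STAI_NETWORK_MODEL_NAME              "network"
#define STAI_NETWORK_IN_NUM                  (1)
#define STAI_NETWORK_OUT_NUM                 (1)
#define STAI_NETWORK_ACTIVATIONS_SIZE_BYTES  (1024)
#define STAI_NETWORK_WEIGHTS_SIZE_BYTES      (4096)
#endif

/* Same name, new prefix */
#define AI_NETWORK_MODEL_NAME             STAI_NETWORK_MODEL_NAME
#define AI_NETWORK_IN_NUM                 STAI_NETWORK_IN_NUM
#define AI_NETWORK_OUT_NUM                STAI_NETWORK_OUT_NUM

/* Renamed beyond the prefix: the legacy _DATA_ infix is gone and the new
 * names end in _SIZE_BYTES. */
#define AI_NETWORK_DATA_ACTIVATIONS_SIZE  STAI_NETWORK_ACTIVATIONS_SIZE_BYTES
#define AI_NETWORK_DATA_WEIGHTS_SIZE      STAI_NETWORK_WEIGHTS_SIZE_BYTES

#endif /* LEGACY_COMPAT_H */
```

Such a shim is best treated as temporary: once every call site uses the STAI_ names, the header can be deleted.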

- Step 2: Runtime, model initialization (including activations allocation)

This code section defines and initializes the network context and allocates the required activation buffers. The example allocates two memory heaps because the model was generated with the multi-heap option enabled.

Legacy Client APIs application code snippet:


  /*  Get activations buffer pointers array using generated macro */
  ai_handle* activation_buffers = AI_NETWORK_DATA_ACTIVATIONS_TABLE_GET();

  /*  Allocate and set activation buffer #1  */
  AI_ALIGNED(4)
  ai_u8 activation1[AI_NETWORK_ACTIVATION_1_SIZE] = {0};
  activation_buffers[0] = (ai_handle)(activation1);

  /*  Allocate and set activation buffer #2  */
  AI_ALIGNED(4)
  ai_u8 activation2[AI_NETWORK_ACTIVATION_2_SIZE] = {0};
  activation_buffers[1] = (ai_handle)(activation2);

  /*  Get weights buffer pointers array using generated macro */
  ai_handle* weights_buffers = AI_NETWORK_DATA_WEIGHTS_TABLE_GET();

  /* A single network context is created and initialized  */
  error = ai_network_create_and_init(&network, activation_buffers, weights_buffers);
  if (error.type != AI_ERROR_NONE) {
    LOG_PRINT("  ## Test Failed executing init: type(0x%x) code(0x%x).\n\n", error.type, error.code)
    return -1;
  }

ST Edge AI Embedded Client APIs application equivalent code snippet:

  /*  !New!: Declare and allocate memory for private network context instance */
  STAI_NETWORK_CONTEXT_DECLARE(network, STAI_NETWORK_CONTEXT_SIZE)

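  /* !New!: Runtime initialization */
  return_code = stai_runtime_init();
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai runtime init: 0x%x.\n\n", return_code)
    return -1;
  }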

  /*  Initialize network context  */
  return_code = stai_network_init(network);
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai init: 0x%x.\n\n", return_code)
    return -1;
  }

  /*  Declare activations buffer pointers array  */
  stai_ptr activation_buffers[STAI_NETWORK_ACTIVATIONS_NUM] = {0};

  /*  Allocate and set activation buffer #1  */
  STAI_ALIGNED(STAI_NETWORK_ACTIVATION_1_ALIGNMENT)
  uint8_t activation1[STAI_NETWORK_ACTIVATION_1_SIZE] = {0};
  activation_buffers[0] = (stai_ptr)(activation1);

  /*  Allocate and set activation buffer #2  */
  STAI_ALIGNED(STAI_NETWORK_ACTIVATION_2_ALIGNMENT)
  uint8_t activation2[STAI_NETWORK_ACTIVATION_2_SIZE] = {0};
  activation_buffers[1] = (stai_ptr)(activation2);

  /*  !New!: Set network activations buffers  */
  return_code = stai_network_set_activations(network, activation_buffers, STAI_NETWORK_ACTIVATIONS_NUM);
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai set activations: 0x%x.\n\n", return_code)
    return -1;
  }

  /* NOTE: This step is no longer required in ST Edge AI Client APIs, since weights buffers are generated and bound
     directly to the C model */
#if 0
  /*  !New!: Declare weights buffer pointers array  */
  stai_ptr weight_buffers[STAI_NETWORK_WEIGHTS_NUM] = {0};
  stai_size n_weights = 0;
  return_code = stai_network_get_weights(network, weight_buffers, &n_weights);
  if ((return_code == STAI_SUCCESS) && (n_weights==STAI_NETWORK_WEIGHTS_NUM)) {
    return_code = stai_network_set_weights(network, weight_buffers, n_weights);
  } else {
    LOG_PRINT("  ## Test Failed executing stai set weights: 0x%x.\n\n", return_code)
    return -1;
  }
#endif

Changes needed for this section:

Context allocation is now explicit: the application must always allocate the context buffer with the provided STAI_ macros, then initialize it with stai_network_init().
The ST Edge AI C runtime must be initialized once with stai_runtime_init() to bootstrap the AI middleware.
Context initialization, activations setting, and weights setting are now handled by three separate C APIs, which together replace the former ai_network_create_and_init() API.
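
Applications that relied on the single ai_network_create_and_init() call may prefer to keep one entry point for bring-up. The sketch below shows one possible way to group the calls of Step 2 behind a wrapper that stops at the first failure. Everything in the "stand-ins" part is an assumption made so that the example compiles without the ST headers: the typedefs, the parameter types (inferred from the calls shown above), and the stub bodies. The names app_network_bringup() and app_network_bringup_demo() are hypothetical. Weights are not set here because, as noted in the snippet above, they are bound directly to the C model.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- Stand-ins: ASSUMED, only so that this sketch compiles on its own. ----
 * In a real project, delete this part and include "stai.h" and the generated
 * "network.h". The parameter types are guesses inferred from the calls shown
 * in Step 2, and the stub bodies do no real work. */
typedef int   stai_return_code;
typedef void *stai_ptr;
#define STAI_SUCCESS 0
#define STUB_FAILURE 1  /* hypothetical error value used by the stubs */

stai_return_code stai_runtime_init(void) { return STAI_SUCCESS; }

stai_return_code stai_network_init(void *network)
{
  return (network != NULL) ? STAI_SUCCESS : STUB_FAILURE;
}

stai_return_code stai_network_set_activations(void *network,
                                              const stai_ptr *buffers,
                                              size_t n_buffers)
{
  (void)network;
  return (buffers != NULL && n_buffers > 0) ? STAI_SUCCESS : STUB_FAILURE;
}
/* ---- End of stand-ins ---- */

/* Hypothetical wrapper: runs the bring-up calls in the order used in Step 2
 * and stops at the first failure, returning its code. */
stai_return_code app_network_bringup(void *network,
                                     const stai_ptr *activations,
                                     size_t n_activations)
{
  stai_return_code rc = stai_runtime_init();
  if (rc != STAI_SUCCESS) { return rc; }

  rc = stai_network_init(network);
  if (rc != STAI_SUCCESS) { return rc; }

  return stai_network_set_activations(network, activations, n_activations);
}

/* Demo with placeholder buffers; returns 0 when the whole sequence succeeds. */
int app_network_bringup_demo(void)
{
  static uint8_t context[64];       /* placeholder for the declared context */
  static uint8_t activation1[128];  /* placeholder sizes */
  static uint8_t activation2[128];
  const stai_ptr activations[2] = { activation1, activation2 };

  return (app_network_bringup(context, activations, 2) == STAI_SUCCESS) ? 0 : -1;
}
```

Returning the first failing code, rather than a boolean, lets the caller log it in the same 0x%x format used by the snippets in this article.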

- Step 3: Allocation and setting of the network I/O buffers

Legacy Client APIs application code snippet:

  /* Inputs Buffer Setup */
  /* Retrieve pointers to the model's input tensors */
  ai_buffer* ai_input = ai_network_inputs_get(network, NULL);
  if (ai_input) {
    ai_input[0].data = AI_HANDLE_PTR(input1); /* defined in network_inputs.h */
  }

  /* Outputs Buffers Setup */
  /* Allocate and declare output buffer #1  */
  AI_ALIGNED(4)
  float output1[AI_NETWORK_OUT_1_SIZE];
  
  /* Retrieve pointers to the model's output tensors */
  ai_buffer* ai_output = ai_network_outputs_get(network, NULL);
  if (ai_output) {
    ai_output[0].data = AI_HANDLE_PTR(output1);
  }

  /* No specific I/O setters are needed, since ai_input and ai_output are passed as arguments to ai_network_run() */

ST Edge AI Embedded Client APIs application equivalent code snippet:

  /* Inputs Buffer Setup */
  /* C-Table declaring inputs buffer pointers array and set inputs addresses */
  stai_ptr input_buffers[STAI_NETWORK_IN_NUM] = {
    (stai_ptr)input1 /* defined in network_inputs.h */
  };

  /*  !New!: Set network inputs buffers  */
  return_code = stai_network_set_inputs(network, input_buffers, STAI_NETWORK_IN_NUM);
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai set inputs: 0x%x.\n\n", return_code)
    return -1;
  }

  /* Outputs Buffers Setup */
  /*  Allocate and declare output buffer #1  */
  STAI_ALIGNED(STAI_NETWORK_OUT_1_ALIGNMENT)
  float output1[STAI_NETWORK_OUT_1_SIZE];
  
  /*  Declare outputs buffer pointers array and set outputs addresses */
  stai_ptr output_buffers[STAI_NETWORK_OUT_NUM] = {
    (stai_ptr)output1
  };

  /*  Set network outputs buffers  */
  return_code = stai_network_set_outputs(network, output_buffers, STAI_NETWORK_OUT_NUM);
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai set outputs: 0x%x.\n\n", return_code)
    return -1;
  }

Changes summary:

Buffer allocation is similar in the two snippets. What changes is that the ai_buffer C struct is removed from the ST Edge AI Client APIs and replaced by a C table of pointers.
The ST Edge AI Client APIs provide explicit setters for the input and output buffer pointers. In the Legacy Client APIs, arrays of ai_buffer representing the I/O buffers are instead passed as arguments to the ai_network_run() API.
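
The snippets also differ in how alignment is requested: the legacy code hard-codes AI_ALIGNED(4), while the new code uses generated macros such as STAI_NETWORK_OUT_1_ALIGNMENT. When buffers come from somewhere other than a static declaration (a heap, a linker section, a shared pool), a debug-time check of the address can catch a misaligned buffer before it is handed to a setter. The sketch below is a generic C illustration under stated assumptions: is_aligned() and DEMO_OUT_1_ALIGNMENT are hypothetical names, and C11 _Alignas is used as a stand-in for STAI_ALIGNED, whose expansion is toolchain-specific and not shown in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder, ASSUMED for this sketch; a real application would use the
 * generated STAI_NETWORK_OUT_1_ALIGNMENT value instead. */
#define DEMO_OUT_1_ALIGNMENT 8

/* Returns 1 when the address in `ptr` is a multiple of `alignment` bytes,
 * 0 otherwise. `alignment` is expected to be a non-zero power of two. */
int is_aligned(const void *ptr, size_t alignment)
{
  return ((uintptr_t)ptr & (uintptr_t)(alignment - 1u)) == 0u;
}

/* Demo output buffer; _Alignas stands in for STAI_ALIGNED(...) here. */
_Alignas(DEMO_OUT_1_ALIGNMENT) float demo_output[10];
```

A typical use would be an assert(is_aligned(output1, STAI_NETWORK_OUT_1_ALIGNMENT)) placed just before the call to stai_network_set_outputs().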

- Step 4: Logging info about generated model

Legacy Client APIs application code snippet:

  ai_network_report report;
  const ai_bool res = ai_network_get_report(network, &report);
  if (!res) {
    LOG_PRINT("  ## Test Failed executing ai get network report.\n\n")
    return -1;
  }

  LOG_PRINT("* Runtime version   : %d.%d.%d\n",
    report.runtime_version.major, report.runtime_version.minor, report.runtime_version.micro)
  LOG_PRINT("* Tool version      : %d.%d.%d\n",
    report.tool_version.major, report.tool_version.minor, report.tool_version.micro)
  LOG_PRINT("* APIs version      : %d.%d.%d\n",
    report.api_version.major, report.api_version.minor, report.api_version.micro)
  LOG_PRINT("* Network nodes     : %d\n", report.n_nodes)
  LOG_PRINT("* Network macc      : %d\n", report.n_macc)
  LOG_PRINT("* Network inputs    : %d\n", report.n_inputs)
  LOG_PRINT("* Network outputs   : %d\n", report.n_outputs)
  ...

ST Edge AI Embedded Client APIs application equivalent code snippet:

  stai_network_info info;
  return_code = stai_network_get_info(network, &info);
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai get network info: 0x%x.\n\n", return_code)
    return -1;
  }

  LOG_PRINT("* Runtime version   : %d.%d.%d\n",
    info.runtime_version.major, info.runtime_version.minor, info.runtime_version.micro)
  LOG_PRINT("* Tool version      : %d.%d.%d\n",
    info.tool_version.major, info.tool_version.minor, info.tool_version.micro)
  LOG_PRINT("* APIs version      : %d.%d.%d\n",
    info.api_version.major, info.api_version.minor, info.api_version.micro)
  LOG_PRINT("* Network nodes     : %d\n", info.n_nodes)
  LOG_PRINT("* Network macc      : %d\n", info.n_macc)
  LOG_PRINT("* Network inputs    : %d\n", info.n_inputs)
  LOG_PRINT("* Network outputs   : %d\n", info.n_outputs)
  ...

Changes summary:

The two APIs are similar: ai_network_get_report() returns an ai_bool, whereas stai_network_get_info() returns a stai_return_code like all the stai APIs.
stai_network_get_info() reports the network I/O properties as an array of stai_tensor, whereas the former Legacy Client APIs convey the same kind of information through the ai_buffer C struct.
stai_network_get_info() now also reports information about activations, states, and weights, again using the stai_tensor C struct.

- Step 5: Run of the inference (sync mode)

Legacy Client APIs application code snippet:

  /*  Execute network model inference on sample test (ALWAYS synchronous mode)  */
  LOG_PRINT("Starting inference\n");

  /* The run API supports only sync mode: it is always blocking */
  ai_i32 batch = ai_network_run(network, ai_input, ai_output);

  LOG_PRINT("Completed inference\n");
  if (batch != 1) {
    LOG_PRINT("  ## Test Failed executing ai network run: %d.\n\n", batch)
    return -1;
  }

ST Edge AI Embedded Client APIs application equivalent code snippet:

  /*  Execute network model inference on sample test (synchronous mode)  */
  LOG_PRINT("Starting inference\n");

  /* The run API now supports both sync and async modes */
  return_code = stai_network_run(network, STAI_MODE_SYNC);

  LOG_PRINT("Completed inference\n");
  if (return_code != STAI_SUCCESS) {
    LOG_PRINT("  ## Test Failed executing stai network run: 0x%x.\n\n", return_code)
    return -1;
  }

Changes summary:

stai_network_run() now takes only two parameters: the network handle and the mode. In the ST Edge AI Client APIs, the network I/O information is supplied through the setter APIs stai_network_set_inputs() and stai_network_set_outputs().
The return code is now consistent with all the other stai APIs.
stai_network_run() now supports two modes: STAI_MODE_SYNC (the default), which implements a blocking API, and STAI_MODE_ASYNC (not yet supported), which implements a non-blocking API.
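
Because every stai call returns the same kind of code, the check-and-log block repeated after each call in this article can be factored into one helper. The following is a minimal sketch of that idea, not part of the ST Edge AI APIs: app_check() is a hypothetical name, and the typedef and the value of STAI_SUCCESS are stand-ins assumed so that the example compiles without stai.h.

```c
#include <stdio.h>

/* Stand-ins, ASSUMED only so that this sketch compiles on its own.
 * In a real project, remove them and include "stai.h". */
typedef int stai_return_code;
#define STAI_SUCCESS 0

/* Hypothetical helper: logs a failure in the same format as the snippets
 * above. Returns 0 when `rc` is STAI_SUCCESS, -1 otherwise. */
int app_check(stai_return_code rc, const char *step)
{
  if (rc != STAI_SUCCESS) {
    printf("  ## Test Failed executing %s: 0x%x.\n\n", step, (unsigned int)rc);
    fflush(stdout);
    return -1;
  }
  return 0;
}

/* Intended use inside main(), with the real API:
 *
 *   if (app_check(stai_network_run(network, STAI_MODE_SYNC),
 *                 "stai network run") != 0) {
 *     return -1;
 *   }
 */
```

The same helper would not work for the legacy calls, which return an ai_error struct, an ai_bool, or a batch count depending on the API.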

- Step 6: Dump of the output results

Legacy Client APIs application code snippet:

  LOG_PRINT("__START_OUTPUT1 __\n");
  for(ai_i32 o = 0; o < AI_NETWORK_OUT_1_SIZE; o++) {
    const ai_float value = output1[o];
    if (o != 0 && o % 10 == 0) {
      LOG_PRINT("\n");
    }
    LOG_PRINT("%f, ", value);
  }
  LOG_PRINT("\n__END_OUTPUT1 __\n");

ST Edge AI Embedded Client APIs application equivalent code snippet:


  LOG_PRINT("__START_OUTPUT1 __\n");
  for(int32_t o = 0; o < STAI_NETWORK_OUT_1_SIZE; o++) {
    const float value = output1[o];
    if (o != 0 && o % 10 == 0) {
      LOG_PRINT("\n");
    }
    LOG_PRINT("%f, ", value);
  }
  LOG_PRINT("\n__END_OUTPUT1 __\n");

Changes summary: the two routines are essentially the same, except:

AI_NETWORK_OUT_1_SIZE has been renamed to STAI_NETWORK_OUT_1_SIZE, like all the macros that now use the STAI_ prefix.
The ST Edge AI Client APIs now use types from the <stdint.h> header instead of the ai_ prefixed types from the ai_platform.h header.
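
Since the dump loop no longer depends on any ai_ type, it can be moved into a small reusable function, which is convenient for models with more than one output. The sketch below uses only <stdint.h> and <stdio.h>; dump_output_f32() and VALUES_PER_ROW are hypothetical names introduced for illustration, and the function assumes float output data as in the example above.

```c
#include <stdint.h>
#include <stdio.h>

#define VALUES_PER_ROW 10  /* same row length as the loops above */

/* Hypothetical helper: prints `count` float values between START/END markers,
 * VALUES_PER_ROW per row. Returns the number of rows printed. */
int32_t dump_output_f32(const char *tag, const float *data, int32_t count)
{
  int32_t rows = (count > 0) ? 1 : 0;

  printf("__START_%s __\n", tag);
  for (int32_t o = 0; o < count; o++) {
    if (o != 0 && o % VALUES_PER_ROW == 0) {
      printf("\n");
      rows++;
    }
    printf("%f, ", (double)data[o]);
  }
  printf("\n__END_%s __\n", tag);
  fflush(stdout);
  return rows;
}
```

With this helper, the loop in the snippet above would reduce to a call such as dump_output_f32("OUTPUT1", output1, STAI_NETWORK_OUT_1_SIZE).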

- Step 7: De-initialization of the model

Legacy Client APIs application code snippet:

  /*  Network de-initialization  */
  const ai_handle ret = ai_network_destroy(network);

  return (ret == NULL) ? 0 : -1;
}

ST Edge AI Embedded Client APIs application equivalent code snippet:

  /*  Network de-initialization  */
  return_code = stai_network_deinit(network);

  /* Runtime de-initialization */
  return_code = stai_runtime_deinit();

  return (return_code == STAI_SUCCESS) ? 0 : -1;
}

Changes summary:

ai_network_destroy() has been replaced by stai_network_deinit(). The return code is now always of type stai_return_code, as in all the ST Edge AI Client APIs.
The ST Edge AI Client APIs now also provide a dedicated API, stai_runtime_deinit(), to de-initialize the AI middleware.
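
In the ST Edge AI snippet above, the value returned by stai_network_deinit() is overwritten by the one from stai_runtime_deinit(), so a failure of the first call would go unreported. If that matters for the application, one possible approach is to keep the first failing code while still running both de-initialization calls. The sketch below illustrates the idea; keep_first_error() is a hypothetical name, and the typedef and the value of STAI_SUCCESS are stand-ins assumed so that the example compiles without stai.h.

```c
/* Stand-ins, ASSUMED only so that this sketch compiles on its own.
 * In a real project, remove them and include "stai.h". */
typedef int stai_return_code;
#define STAI_SUCCESS 0

/* Hypothetical helper: returns `previous` if it already holds a failure,
 * otherwise `current`, so that a later success cannot mask an earlier error. */
stai_return_code keep_first_error(stai_return_code previous,
                                  stai_return_code current)
{
  return (previous != STAI_SUCCESS) ? previous : current;
}

/* Intended use at the end of main(), with the real API:
 *
 *   return_code = stai_network_deinit(network);
 *   return_code = keep_first_error(return_code, stai_runtime_deinit());
 *   return (return_code == STAI_SUCCESS) ? 0 : -1;
 */
```

Both calls are still executed in this arrangement, so the runtime is released even when the network de-initialization reports an error.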

Information in this document is provided solely in connection with ST products. The contents of this document are subject to change without prior notice. © Copyright STMicroelectronics 2025. All rights reserved. www.st.com