How to change the I/O data type or layout (NHWC vs NCHW)
ST Edge AI Core Technology 2.2.0
r1.0
Purpose
The ST Edge AI Core CLI offers two advanced options that allow overriding the original input and/or output data format. These options facilitate the deployment of a model in a software stack that may have data format constraints, such as:
- The output data layout/type of a camera module pipeline
- The input data format expected by a specific postprocessing stage
Original data layout is preserved
By contract, the ST Edge AI Core CLI maintains the same input format (respectively, output format) as the original model, so that the same input data can be reused in the converted/deployed model without alteration. The deployed kernels are implemented primarily channel-last (NHWC) for performance reasons. As a result, for an imported model that is channel-first, a transpose or reshape operation is added where necessary. It preserves the data layout of the I/O tensors so that the converted model works correctly with the data. For instance, TensorFlow commonly uses NHWC, whereas PyTorch typically uses NCHW for its operations.
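As a concrete illustration (not taken from the tool's documentation): for a tensor of shape N×H×W×C, the element with indices (n, h, w, c) is stored at linear offset ((n*H + h)*W + w)*C + c in the NHWC layout, while the same logical element in the NCHW layout is stored at ((n*C + c)*H + h)*W + w. The inserted transpose only reorders the data between these two addressing schemes; the values themselves are unchanged.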
Warning
Be aware that for ONNX quantized models or Deep Quantized Neural Network (DQNN) models, the I/O data type of the imported model (commonly float32) is not preserved: it is optimized to align with the activation type used by the quantization scheme (int8 or a binary type). This optimization avoids unnecessary quantize/dequantize operations (float32 to/from int8).
To override this default behavior, the `--inputs-ch-position chlast` option can be used to change the data layout so that the data generated by a camera module pipeline can be presented directly. Additionally, the `--input-data-type uint8` option can be used to insert a converter (uint8 to int8, or uint8 to float32) allowing direct use of the data from the image sensor pipeline.
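As an illustration, a minimal command-line sketch combining both options; the executable name, model file name, and `--target` value below are placeholders to adapt to your installation:

```
# Sketch: generate code for a quantized ONNX model whose input is fed
# directly from an NHWC, uint8 camera pipeline (file/target names are
# placeholders).
stedgeai generate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --input-data-type uint8
```

With `--input-data-type uint8`, the inserted converter typically reduces to an element-wise offset (for example, int8 = uint8 - 128 when the quantization scale is kept); this is an assumption about the usual int8 quantization scheme rather than a guarantee about the generated code.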
Tips
- For the validation of the deployed model, the data must be presented correctly by the user when building the representative dataset, according to the generated and expected input shape (see the sketch after this list).
- In cases where a transpose operation has already been inserted in the imported ONNX model to make it NHWC, the `--inputs-ch-position chlast` option should not be used.
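For instance, a hedged sketch of a validation run under the same options (the `validate` mode and the placeholders below follow the same assumptions as the earlier example); the representative inputs must then be shaped and typed as generated, NHWC and uint8 in this case:

```
# Sketch: validate the deployed model with the overridden I/O format
# (file/target names are placeholders).
stedgeai validate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --input-data-type uint8
```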
Deploying an NCHW ONNX QDQ model
The following figures illustrate a typical case where the `--inputs-ch-position chlast` and `--input-data-type uint8` options are used to fit the application constraints (image sensor pipeline) for a typical quantized ONNX model.
Generally, the generated C kernels are channel-last (NHWC data format). To preserve the data arrangement of the I/O tensors, a transpose operator is added if:
- The model is channel-first
- The number of input channels is greater than 1
This default behavior can be changed with the `--inputs-ch-position` and `--outputs-ch-position` options.
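For instance, a sketch applying the override on both sides (only the `chlast` value appears in this article; file and target names are placeholders):

```
# Sketch: keep both input and output tensors channel-last so that no
# boundary transpose is required by the application code.
stedgeai generate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --outputs-ch-position chlast
```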
TFLite model considerations
If needed, both options can also be used for TFLite models; note that TensorFlow Lite primarily uses the channel-last (NHWC) layout.