How to change the I/O data type or layout (NHWC vs NCHW)
ST Edge AI Core Technology 2.2.0
r1.0
Purpose
The ST Edge AI Core CLI offers two advanced options that allow overriding the original input and/or output data format. These options facilitate the deployment of a model in a software stack that may have data format constraints, such as:
- The output data layout/type of a camera module pipeline
- The input data format expected by a specific postprocessing stage
Original data layout is preserved
By contract, the ST Edge AI Core CLI maintains the same input format (respectively, output format) as the original model, so that the same input data can be reused in the converted/deployed model without alteration. The deployed kernels are implemented primarily channel-last (NHWC) for performance reasons. As a result, for an imported model that is channel-first, a transpose or reshape operation is added where necessary. It preserves the data layout of the I/O tensors so that the converted model works correctly with the data. For instance, TensorFlow commonly uses NHWC, whereas PyTorch typically uses NCHW for its operations.
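As a concrete illustration (not taken from the tool's documentation): for a tensor of shape N×H×W×C, the element with indices (n, h, w, c) is stored at linear offset ((n*H + h)*W + w)*C + c in the NHWC layout, while the same logical element in the NCHW layout is stored at ((n*C + c)*H + h)*W + w. The inserted transpose only reorders the data between these two addressing schemes; the values themselves are unchanged.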
Warning
Be aware that for ONNX quantized models or Deep Quantized Neural Network (DQNN) models, the I/O data type of the imported model (commonly float32) is not preserved: it is optimized to align with the activation type used by the quantization scheme (int8 or a binary type). This optimization avoids unnecessary quantize/dequantize operations (float32 to/from int8).
To override this default behavior, the `--inputs-ch-position chlast` option can be used to change the data layout so that the data generated by a camera module pipeline can be presented directly. Additionally, the `--input-data-type uint8` option can be used to insert a converter (uint8 to int8, or uint8 to float32) allowing direct use of the data from the image sensor pipeline.
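As an illustration, a minimal command-line sketch combining both options; the executable name, model file name, and `--target` value below are placeholders to adapt to your installation:

```
# Sketch: generate code for a quantized ONNX model whose input is fed
# directly from an NHWC, uint8 camera pipeline (file/target names are
# placeholders).
stedgeai generate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --input-data-type uint8
```

With `--input-data-type uint8`, the inserted converter typically reduces to an element-wise offset (for example, int8 = uint8 - 128 when the quantization scale is kept); this is an assumption about the usual int8 quantization scheme rather than a guarantee about the generated code.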
Tips
- For the validation of the deployed model, the data must be presented correctly by the user when building the representative dataset, according to the generated and expected input shape (see the sketch after this list).
- In cases where a transpose operation has already been inserted in the imported ONNX model to make it NHWC, the `--inputs-ch-position chlast` option should not be used.
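For instance, a hedged sketch of a validation run under the same options (the `validate` mode and the placeholders below follow the same assumptions as the earlier example); the representative inputs must then be shaped and typed as generated, NHWC and uint8 in this case:

```
# Sketch: validate the deployed model with the overridden I/O format
# (file/target names are placeholders).
stedgeai validate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --input-data-type uint8
```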
Deploying an NCHW ONNX QDQ model
The following figures illustrate a typical case where the `--inputs-ch-position chlast` and `--input-data-type uint8` options are used to fit the application constraints (image sensor pipeline) for a typical quantized ONNX model.
Generally, the generated C kernels are channel-last (NHWC data format). To preserve the data arrangement of the I/O tensors, a transpose operator is added if:
- The model is channel-first
- The number of input channels is greater than 1
This default behavior can be changed with the `--inputs-ch-position` and `--outputs-ch-position` options.
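For instance, a sketch applying the override on both sides (only the `chlast` value appears in this article; file and target names are placeholders):

```
# Sketch: keep both input and output tensors channel-last so that no
# boundary transpose is required by the application code.
stedgeai generate --model network_quant.onnx --target stm32n6 \
    --inputs-ch-position chlast \
    --outputs-ch-position chlast
```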
TFLite model considerations
If needed, both options can also be used for TFLite models; note that TensorFlow Lite primarily uses the channel-last (NHWC) layout.