backends/nxp/README.md

…networks, as well as the ability to adapt and scale to new model architectures, topologies and layer types introduced
to AI workloads. ML application development with the eIQ Neutron NPU is fully supported by the
[eIQ machine learning software development environment](https://www.nxp.com/design/design-center/software/eiq-ml-development-environment/eiq-toolkit-for-end-to-end-model-development-and-deployment:EIQ-TOOLKIT).
The eIQ AI SW Stack provides a streamlined development experience for developers and end-users of NXP products.
eIQ extensions connect broader AI ecosystems to the edge, such as the NVIDIA TAO extension, which enables developers
to bring AI models trained and fine-tuned with TAO to NXP-powered edge devices.

## Supported NXP platforms
…improvements. NXP and the ExecuTorch community are actively developing this codebase.

## Neutron Backend implementation and SW architecture
The Neutron Backend uses the eIQ Neutron Converter as its ML compiler to compile the delegated subgraph into Neutron microcode.
The Neutron Converter accepts the ML model in LiteRT format; for the **eIQ Neutron N3** class, the Neutron Backend therefore
uses the LiteRT flatbuffers format as the IR between ExecuTorch and the Neutron Converter ML compiler.
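
For orientation, the flow above can be sketched ahead of time with the standard ExecuTorch export APIs. In this sketch the NXP-specific import paths, the `generate_neutron_compile_spec` helper, and the `"imxrt700"` target are assumptions based on this prototype's layout, so verify them against the current code base:

```python
import torch

from executorch.exir import to_edge_transform_and_lower

# Assumed locations of the Neutron partitioner and compile-spec helper.
from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner
from executorch.backends.nxp.nxp_backend import generate_neutron_compile_spec

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 32, 32),)

# 1. Export the model to an ExportedProgram (ATen dialect).
exported = torch.export.export(model, example_inputs)

# 2. Partition the graph; the delegated subgraph is handed to the Neutron
#    Converter, which compiles it to Neutron microcode ("imxrt700" is an
#    illustrative target name).
edge = to_edge_transform_and_lower(
    exported,
    partitioner=[NeutronPartitioner(generate_neutron_compile_spec("imxrt700"))],
)

# 3. Serialize the ExecuTorch program (.pte) for deployment on device.
with open("model.pte", "wb") as f:
    f.write(edge.to_executorch().buffer)
```

A production flow would typically quantize the model before lowering, since the Neutron NPU targets quantized networks; see the backend's examples for the complete pipeline.
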
## Layout
* `backend/ir/` - TFLite/LiteRT-based IR to represent the Edge subgraph, taken from the onnx2tflite code base and extended to
support Edge Dialect to LiteRT conversion.
* `backend/ir/converter` - the Neutron Backend's conversion from Edge (ATen) Dialect to LiteRT/TFLite. The subfolder
`node_converters` is structured as a single module per Edge operator.
* `backend/ir/lib` - automatically generated handlers from the LiteRT flatbuffers schema (see the sketch after this list).
* `backend/ir/tflite_generator` and `backend/ir/tflite_optimizer` handle the serialization
of the in-memory-built subgraph for delegation into the LiteRT/TFLite flatbuffers
representation. Code taken from the onnx2tflite tool.
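
The handlers in `backend/ir/lib` are the kind of per-table Python modules that the `flatc` compiler emits from a flatbuffers schema. A minimal sketch of such a regeneration step follows; the schema location and output directory are illustrative, not this repo's actual build recipe:

```python
# Regenerate Python flatbuffers handlers from a LiteRT/TFLite schema.
# Requires the `flatc` compiler on PATH; paths below are assumptions.
import subprocess
from pathlib import Path

SCHEMA = Path("schema.fbs")       # LiteRT/TFLite flatbuffers schema file
OUT_DIR = Path("backend/ir/lib")  # destination of the generated modules

def regenerate_handlers() -> None:
    # `flatc --python` emits one module per table/struct in the schema.
    subprocess.run(
        ["flatc", "--python", "-o", str(OUT_DIR), str(SCHEMA)],
        check=True,
    )

if __name__ == "__main__":
    regenerate_handlers()
```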