Commit 72d7ba8

GregoryComer authored
Bulk cherry-pick doc updates (#15338)
### Summary

Bulk pick doc updates using https://github.com/pytorch/executorch/blob/main/scripts/pick_doc_commits.py. I also manually included eb05056, which has some example changes, since it's a doc update PR. I sanity checked this by running `git diff HEAD upstream/main -- docs/source`, and the diff didn't show any surprises. There were some changes on main related to larger PRs that haven't been picked (the NXP visualizer, for example).

Co-authored-by: roman-janik-nxp <[email protected]>
Co-authored-by: Mergen Nachin <[email protected]>
Co-authored-by: robert-kalmar <[email protected]>
Co-authored-by: Sicheng Stephen Jia <[email protected]>
Co-authored-by: Siddartha Pothapragada <[email protected]>
Co-authored-by: Scott Roy <[email protected]>
Co-authored-by: Jack <[email protected]>
Co-authored-by: Anthony Shoumikhin <[email protected]>
Co-authored-by: Manuel Candales <[email protected]>
Co-authored-by: lucylq <[email protected]>
Co-authored-by: Abhinayk <[email protected]>
Co-authored-by: JP <[email protected]>
1 parent 4b18555 · commit 72d7ba8
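For anyone reproducing this kind of bulk pick, the sanity check mentioned in the summary can be scripted. A minimal sketch, assuming a checkout with an `upstream` remote pointing at pytorch/executorch; the `subprocess` wrapper is illustrative, not part of the repo's tooling:

```python
import subprocess

# Diff the docs tree on the current branch against upstream/main.
# An empty diff means every doc change on main has been picked over.
result = subprocess.run(
    ["git", "diff", "HEAD", "upstream/main", "--", "docs/source"],
    capture_output=True,
    text=True,
    check=True,
)
if result.stdout:
    print("Unpicked doc changes remain:\n" + result.stdout)
else:
    print("docs/source matches upstream/main")
```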

File tree

87 files changed, +3358 −1332 lines changed

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -62,7 +62,6 @@ xcuserdata/
 /include/
 /share/
 /version.py
-*.csv
 *_etdump
 
 # Android

CONTRIBUTING.md

Lines changed: 4 additions & 4 deletions
@@ -24,17 +24,17 @@ For Apple, please refer to the [iOS documentation](docs/source/using-executorch-
 executorch
 ├── <a href="backends">backends</a> - Backend delegate implementations for various hardware targets. Each backend uses partitioner to split the graph into subgraphs that can be executed on specific hardware, quantizer to optimize model precision, and runtime components to execute the graph on target hardware. For details refer to the <a href="docs/source/backend-delegates-integration.md">backend documentation</a> and the <a href="docs/source/using-executorch-export.md">Export and Lowering tutorial</a> for more information.
 │   ├── <a href="backends/apple">apple</a> - Apple-specific backends.
-│   │   ├── <a href="backends/apple/coreml">coreml</a> - CoreML backend for Apple devices. See <a href="docs/source/backends-coreml.md">doc</a>.
-│   │   └── <a href="backends/apple/mps">mps</a> - Metal Performance Shaders backend for Apple devices. See <a href="docs/source/backends-mps.md">doc</a>.
+│   │   ├── <a href="backends/apple/coreml">coreml</a> - CoreML backend for Apple devices. See <a href="docs/source/backends/coreml/coreml-overview.md">doc</a>.
+│   │   └── <a href="backends/apple/mps">mps</a> - Metal Performance Shaders backend for Apple devices. See <a href="docs/source/backends/mps/mps-overview.md">doc</a>.
 │   ├── <a href="backends/arm">arm</a> - ARM architecture backends. See <a href="docs/source/backends-arm-ethos-u.md">doc</a>.
 │   ├── <a href="backends/cadence">cadence</a> - Cadence-specific backends. See <a href="docs/source/backends-cadence.md">doc</a>.
 │   ├── <a href="backends/example">example</a> - Example backend implementations.
 │   ├── <a href="backends/mediatek">mediatek</a> - MediaTek-specific backends. See <a href="docs/source/backends-mediatek.md">doc</a>.
 │   ├── <a href="backends/openvino">openvino</a> - OpenVINO backend for Intel hardware.
 │   ├── <a href="backends/qualcomm">qualcomm</a> - Qualcomm-specific backends. See <a href="docs/source/backends-qualcomm.md">doc</a>.
 │   ├── <a href="backends/transforms">transforms</a> - Transformations for backend optimization.
-│   ├── <a href="backends/vulkan">vulkan</a> - Vulkan backend for cross-platform GPU support. See <a href="docs/source/backends-vulkan.md">doc</a>.
-│   └── <a href="backends/xnnpack">xnnpack</a> - XNNPACK backend for optimized neural network operations. See <a href="docs/source/backends-xnnpack.md">doc</a>.
+│   ├── <a href="backends/vulkan">vulkan</a> - Vulkan backend for cross-platform GPU support. See <a href="docs/source/backends/vulkan/vulkan-overview.md">doc</a>.
+│   └── <a href="backends/xnnpack">xnnpack</a> - XNNPACK backend for optimized neural network operations. See <a href="docs/source/backends/xnnpack/xnnpack-overview.md">doc</a>.
 ├── <a href="codegen">codegen</a> - Tooling to autogenerate bindings between kernels and the runtime.
 ├── <a href="configurations">configurations</a> - Configuration files.
 ├── <a href="devtools">devtools</a> - Model profiling, debugging, and inspection. Please refer to the <a href="docs/source/devtools-overview.md">tools documentation</a> for more information.
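The backends entry above describes each backend's partitioner/quantizer/runtime split. As a concrete illustration of that export-and-lower flow (not part of the diff), here is a minimal sketch targeting the XNNPACK backend; the toy module, input shape, and output file name are assumptions:

```python
import torch
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

class TinyModel(torch.nn.Module):  # illustrative toy module
    def forward(self, x):
        return torch.nn.functional.relu(x)

# Export to an ATen graph, then let the partitioner carve out the
# subgraphs XNNPACK supports and lower them to the delegate.
exported = torch.export.export(TinyModel().eval(), (torch.randn(1, 8),))
program = to_edge_transform_and_lower(
    exported, partitioner=[XnnpackPartitioner()]
).to_executorch()

with open("model.pte", "wb") as f:
    f.write(program.buffer)
```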

README-wheel.md

Lines changed: 2 additions & 2 deletions
@@ -11,8 +11,8 @@ The `executorch` pip package is in beta.
 The prebuilt `executorch.runtime` module included in this package provides a way
 to run ExecuTorch `.pte` files, with some restrictions:
 * Only [core ATen operators](docs/source/ir-ops-set-definition.md) are linked into the prebuilt module
-* Only the [XNNPACK backend delegate](docs/source/backends-xnnpack.md) is linked into the prebuilt module.
-* \[macOS only] [Core ML](docs/source/backends-coreml.md) and [MPS](docs/source/backends-mps.md) backend
+* Only the [XNNPACK backend delegate](docs/source/backends/xnnpack/xnnpack-overview.md) is linked into the prebuilt module.
+* \[macOS only] [Core ML](docs/source/backends/coreml/coreml-overview.md) and [MPS](docs/source/backends/mps/mps-overview.md) backend
 are also linked into the prebuilt module.
 
 Please visit the [ExecuTorch website](https://pytorch.org/executorch) for
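A minimal sketch of exercising the prebuilt `executorch.runtime` module described above, assuming a `model.pte` exported with the XNNPACK delegate (the file name and input shape are illustrative):

```python
import torch
from executorch.runtime import Runtime

# Load a .pte with the prebuilt runtime; only core ATen ops and the
# XNNPACK delegate (plus Core ML/MPS on macOS) are linked in.
runtime = Runtime.get()
program = runtime.load_program("model.pte")
method = program.load_method("forward")
outputs = method.execute([torch.randn(1, 8)])
print(outputs)
```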

README.md

Lines changed: 4 additions & 8 deletions
@@ -104,16 +104,14 @@ outputs = method.execute([torch.randn(1, 3, 224, 224)])
 
 Module module("model.pte");
 auto tensor = make_tensor_ptr({2, 2}, {1.0f, 2.0f, 3.0f, 4.0f});
-auto outputs = module.forward(tensor);
+auto outputs = module.forward({tensor});
 ```
 
 **[Swift (iOS)](https://docs.pytorch.org/executorch/main/ios-section.html)**
 ```swift
-import ExecuTorch
-
 let module = Module(filePath: "model.pte")
-let input = Tensor<Float>([1.0, 2.0, 3.0, 4.0], shape: [2, 2])
-let outputs = try module.forward(input)
+let input = Tensor<Float>([1.0, 2.0, 3.0, 4.0])
+let outputs: [Value] = try module.forward([input])
 ```
 
 **[Kotlin (Android)](https://docs.pytorch.org/executorch/main/android-section.html)**

@@ -153,8 +151,6 @@ runner->generate("Hello, how are you?", config);
 
 **[Swift (iOS)](https://docs.pytorch.org/executorch/main/llm/run-on-ios.html)**
 ```swift
-import ExecuTorchLLM
-
 let runner = TextRunner(modelPath: "llama.pte", tokenizerPath: "tiktoken.bin")
 try runner.generate("Hello, how are you?", Config {
   $0.sequenceLength = 128

@@ -202,7 +198,7 @@ ExecuTorch powers on-device AI at scale across Meta's family of apps, VR/AR devi
 
 **LLMs:** [Llama 3.2/3.1/3](examples/models/llama/README.md), [Qwen 3](examples/models/qwen3/README.md), [Phi-4-mini](examples/models/phi_4_mini/README.md), [LiquidAI LFM2](examples/models/lfm2/README.md)
 
-**Multimodal:** [Llava](examples/models/llava/README.md) (vision-language), [Voxtral](examples/models/voxtral/README.md) (audio-language)
+**Multimodal:** [Llava](examples/models/llava/README.md) (vision-language), [Voxtral](examples/models/voxtral/README.md) (audio-language), [Gemma](examples/models/gemma3) (vision-language)
 
 **Vision/Speech:** [MobileNetV2](https://github.com/meta-pytorch/executorch-examples/tree/main/mv2), [DeepLabV3](https://github.com/meta-pytorch/executorch-examples/tree/main/dl3), [Whisper](https://github.com/meta-pytorch/executorch-examples/tree/main/whisper/android/WhisperApp)
backends/apple/coreml/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # ExecuTorch Core ML Delegate
 
 This subtree contains the Core ML Delegate implementation for ExecuTorch.
-Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices. To learn how to use the CoreML delegate, see the [documentation](https://github.com/pytorch/executorch/blob/main/docs/source/backends-coreml.md).
+Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices. To learn how to use the CoreML delegate, see the [documentation](https://github.com/pytorch/executorch/blob/main/docs/source/backends/coreml/coreml-overview.md).
 
 ## Layout
 - `compiler/` : Lowers a module to Core ML backend.
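For context, lowering a model to this delegate mirrors the flow in the linked documentation. A minimal sketch, assuming the `CoreMLPartitioner` import path used by the Core ML docs; the toy module and output file name are illustrative:

```python
import torch
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge_transform_and_lower

class TinyModel(torch.nn.Module):  # illustrative toy module
    def forward(self, x):
        return torch.sigmoid(x)

# Partition the Core ML-supported subgraphs and lower them to the delegate.
exported = torch.export.export(TinyModel().eval(), (torch.randn(1, 8),))
program = to_edge_transform_and_lower(
    exported, partitioner=[CoreMLPartitioner()]
).to_executorch()

with open("model_coreml.pte", "wb") as f:
    f.write(program.buffer)
```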

backends/cadence/build_cadence_fusionG3.sh

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ set -euo pipefail
 
 unset CMAKE_PREFIX_PATH
 unset XTENSA_CORE
-export XTENSA_CORE=FCV_FG3GP
+export XTENSA_CORE=VANILLA_G3
 git submodule sync
 git submodule update --init
 ./backends/cadence/install_requirements.sh

backends/cadence/build_cadence_hifi4.sh

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ set -euo pipefail
 
 unset CMAKE_PREFIX_PATH
 unset XTENSA_CORE
-export XTENSA_CORE=nxp_rt600_RI23_11_newlib
+export XTENSA_CORE=VANILLA_HIFI
 git submodule sync
 git submodule update --init
 ./backends/cadence/install_requirements.sh

backends/nxp/README.md

Lines changed: 9 additions & 9 deletions
@@ -5,14 +5,14 @@ This subtree contains the ExecuTorch Backend implementation for the
 
 The eIQ® Neutron NPU is a highly scalable accelerator core architecture providing machine learning (ML) acceleration,
 able to support common and critical tasks for edge AI such as anomaly detection, speech recognition,
-image classification, object detection, facial recognition, image segmentation, and generative AI use cases like 
+image classification, object detection, facial recognition, image segmentation, and generative AI use cases like
 large and small language models (LLMs & SLMs) and text-to-speech (TTS).
-The architecture provides power and performance optimized NPUs integrated with NXP's broad portfolio of 
+The architecture provides power and performance optimized NPUs integrated with NXP's broad portfolio of
 microcontrollers and applications processors.
 
-The eIQ Neutron NPUs offer support for a wide variety of neural network types such as CNN, RNN, TCN and Transformer 
+The eIQ Neutron NPUs offer support for a wide variety of neural network types such as CNN, RNN, TCN and Transformer
 networks, as well as the ability to adapt and scale to new model architectures, topologies and layer types introduced
-to AI workloads. ML application development with the eIQ Neutron NPU is fully supported by the 
+to AI workloads. ML application development with the eIQ Neutron NPU is fully supported by the
 [eIQ machine learning software development environment](https://www.nxp.com/design/design-center/software/eiq-ml-development-environment/eiq-toolkit-for-end-to-end-model-development-and-deployment:EIQ-TOOLKIT).
 The eIQ AI SW Stack provides a streamlined development experience for developers and end-users of NXP products.
 

@@ -22,7 +22,7 @@ At this moment following eIQ® Neutron NPU variants and NXP platforms are suppor
 
 * **eIQ Neutron N3-64**, available on [i.MX RT700](https://www.nxp.com/products/i.MX-RT700)
 
-In the future the NXP eIQ Neutron Backend will be extended to support [i.MX 9 Application Processors](https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-9-processors:IMX9-PROCESSORS) 
+In the future the NXP eIQ Neutron Backend will be extended to support [i.MX 9 Application Processors](https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-9-processors:IMX9-PROCESSORS)
 with eIQ Neutron NPU, like the [i.MX 95](https://www.nxp.com/products/iMX95).
 
 

@@ -33,7 +33,7 @@ The eIQ Neutron NPU Backend should be considered as prototype quality at this mo
 improvements. NXP and the ExecuTorch community is actively developing this codebase.
 
 ## Neutron Backend implementation and SW architecture
-Neutron Backend uses the eIQ Neutron Converter as ML compiler to compile the delegated subgraph to Neutron microcode. 
+Neutron Backend uses the eIQ Neutron Converter as ML compiler to compile the delegated subgraph to Neutron microcode.
 The Neutron Converter accepts the ML model in LiteRT format, for the **eIQ Neutron N3** class therefore the Neutron Backend
 uses the LiteRT flatbuffers format as IR between the ExecuTorch and Neutron Converter ML compiler.
 

@@ -44,10 +44,10 @@ uses the LiteRT flatbuffers format as IR between the ExecuTorch and Neutron Conv
 `node_conveters` is structured as single module for each Edge operator.
 * `backend/ir/lib` - automatically generated handlers from LiteRT flatbuffers schema.
 * `backend/ir/tflite_generator` and `backend/ir/tflite_optimizer` handle the serialization
-of the in-memory built subgraph for delegation into LiteRT/TFLite flatbuffers 
+of the in-memory built subgraph for delegation into LiteRT/TFLite flatbuffers
 representation. Code taken from the onnx2tflite tool.
-* `edge_passes` - Various passes operating on Edge dialect level. 
-* `quantizer` - Neutron Backend quantizer implementation. 
+* `edge_passes` - Various passes operating on Edge dialect level.
+* `quantizer` - Neutron Backend quantizer implementation.
 * `runtime` - Neutron Backend runtime implementation. For running compiled on device.
 * `tests/` - Unit tests for Neutron backend.
 * `tests/converter/node_converter` - Operator level unit tests.
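To make the flow above concrete: the Neutron Backend slots into the standard ExecuTorch export-and-lower pipeline via its partitioner. A minimal sketch; the `NeutronPartitioner` and `generate_neutron_compile_spec` import paths and the "imxrt700" target name follow the `examples/nxp` scripts but should be treated as assumptions, and a real deployment would quantize the model first:

```python
import torch
from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner
from executorch.backends.nxp.nxp_backend import generate_neutron_compile_spec
from executorch.exir import to_edge_transform_and_lower

class TinyModel(torch.nn.Module):  # illustrative toy module
    def forward(self, x):
        return torch.nn.functional.relu(x)

exported = torch.export.export(TinyModel().eval(), (torch.randn(1, 8),))

# Compile spec selects the Neutron NPU variant; "imxrt700" matches the
# eIQ Neutron N3-64 on i.MX RT700 mentioned above (assumed target name).
spec = generate_neutron_compile_spec("imxrt700")
program = to_edge_transform_and_lower(
    exported, partitioner=[NeutronPartitioner(spec)]
).to_executorch()
```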
