Problem Description
After fine-tuning and modifying the structure of the MobileBERT model, I successfully exported it as a `.pte` file using the XNNPACK backend. When tested with `xnn_executor_runner`, the model produced the expected output shape `[1, 12]`. However, when invoked from the Android application, the model still outputs the original MobileBERT shape (e.g., `[1, 512]`), so the desktop and mobile environments are inconsistent. The Android application is modified from the ExecutorchDemo sample.
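
For context, the structural change is a replacement of the classification head so that it emits 12 logits. A minimal sketch, assuming the Hugging Face `MobileBertForSequenceClassification` implementation (the actual fine-tuning code may differ):

```python
# Sketch of the described head change, assuming the Hugging Face
# MobileBERT implementation (an assumption; the real training code
# may construct the head differently).
import torch.nn as nn
from transformers import MobileBertForSequenceClassification

model = MobileBertForSequenceClassification.from_pretrained(
    "google/mobilebert-uncased", num_labels=12
)
# Equivalent manual replacement of the classification head:
# model.classifier = nn.Linear(model.config.hidden_size, 12)
model.eval()
```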
Steps to Reproduce

1. Model Modification and Export
   - Modified the classification head of MobileBERT (adjusted the output dimension to 12 classes).
   - Used the `aot_compiler` toolchain to export the model as `mobilebert_xnnpack_fp32.pte` (a sketch of the underlying export flow follows this list).
   - Verified the output shape using the following command:

     ```bash
     ./cmake-out/backends/xnnpack/xnn_executor_runner --model_path=mobilebert_xnnpack_fp32.pte
     # Output 0: tensor(sizes=[1, 12])
     ```
2. Android Integration
   - Added the `.pte` file to the `assets` directory of the Android project.
   - Loaded the model using standard JNI calls:

     ```java
     // Load the model from the app's assets
     mModule = Module.load(assetFilePath(this, MODEL_FILE));
     // Run inference with the two input tensors
     outputTensor = mModule.forward(
         new EValue[] {
             EValue.from(inputs[0]), // input_ids
             EValue.from(inputs[1]), // attention_mask
         }
     )[1].toTensor();
     ```

   - During inference, the output shape remains the original `[1, 512]`.
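
For reference, the export step corresponds roughly to the flow below, which the `aot_compiler` script wraps. This is a minimal sketch assuming the current `executorch` export API; the checkpoint and the sequence length of 128 are placeholders, not the exact values used here.

```python
# Minimal sketch of the XNNPACK lowering flow behind aot_compiler.
# Placeholders: the pretrained checkpoint and sequence length (128)
# stand in for the actual fine-tuned 12-class model and inputs.
import torch
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge
from transformers import MobileBertForSequenceClassification

model = MobileBertForSequenceClassification.from_pretrained(
    "google/mobilebert-uncased", num_labels=12
).eval()
model.config.return_dict = False  # tuple outputs trace more cleanly

example_inputs = (
    torch.ones(1, 128, dtype=torch.long),  # input_ids
    torch.ones(1, 128, dtype=torch.long),  # attention_mask
)

ep = torch.export.export(model, example_inputs)
edge = to_edge(ep).to_backend(XnnpackPartitioner())
prog = edge.to_executorch()

with open("mobilebert_xnnpack_fp32.pte", "wb") as f:
    f.write(prog.buffer)
```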
Expected Behavior
The Android application should produce an output of shape `[1, 12]`, consistent with the desktop `xnn_executor_runner` results.
Actual Behavior
The Android output retains the original model shape `[1, 512]`, indicating that the modification has not taken effect.
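
As an additional cross-check, the same `.pte` file can be exercised through the ExecuTorch Python bindings, independent of both the C++ runner and the Android app. The import path below is from the `executorch` repo and may vary across versions:

```python
# Cross-check the .pte outside both runners via the ExecuTorch Python
# bindings (import path may differ across executorch versions; the
# sequence length of 128 is a placeholder).
import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("mobilebert_xnnpack_fp32.pte")
outputs = module.forward([
    torch.ones(1, 128, dtype=torch.long),  # input_ids
    torch.ones(1, 128, dtype=torch.long),  # attention_mask
])
print([tuple(o.shape) for o in outputs])  # (1, 12) expected per the desktop run
```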