Problem Description
After fine-tuning and modifying the structure of the MobileBERT model, I successfully exported it as a `.pte` file using the XNNPACK backend. When tested with `xnn_executor_runner`, the model produced the expected output shape `[1, 12]`. However, when invoked from the Android application, the model still outputs the original MobileBERT shape (e.g., `[1, 512]`). The desktop and mobile environments are therefore inconsistent.
The Android application is based on the ExecutorchDemo example, with modifications.
Steps to Reproduce
**Model Modification and Export**

- Modified the classification head of MobileBERT (e.g., adjusted the output dimension to 12 classes); a sketch of this step is shown after the command below.
- Used the `aot_compiler` toolchain to export the model as `mobilebert_xnnpack_fp32.pte`.
- Verified the output format using the following command:
  ```sh
  ./cmake-out/backends/xnnpack/xnn_executor_runner --model_path=mobilebert_xnnpack_fp32.pte
  # Output 0: tensor(sizes=[1, 12])
  ```
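For reference, here is a minimal sketch of the modification and export flow, assuming the fine-tuned model is a HuggingFace `MobileBertForSequenceClassification`; the checkpoint name, sequence length, and `LogitsOnly` wrapper are illustrative stand-ins, not the exact setup from this report:

```python
# Minimal sketch of the head modification + XNNPACK export, under the
# assumptions stated above (checkpoint name and seq length are illustrative).
import torch
from transformers import MobileBertForSequenceClassification
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower


class LogitsOnly(torch.nn.Module):
    """Hypothetical wrapper so the exported graph returns a single logits tensor."""

    def __init__(self, model: torch.nn.Module):
        super().__init__()
        self.model = model

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        return self.model(input_ids=input_ids, attention_mask=attention_mask).logits


# num_labels=12 swaps in a 12-class classification head
model = MobileBertForSequenceClassification.from_pretrained(
    "google/mobilebert-uncased", num_labels=12
)
model.eval()

# Example inputs fix the shapes seen by the exported graph
example_inputs = (
    torch.zeros(1, 128, dtype=torch.long),  # input_ids
    torch.ones(1, 128, dtype=torch.long),   # attention_mask
)

exported = torch.export.export(LogitsOnly(model), example_inputs)
program = to_edge_transform_and_lower(
    exported, partitioner=[XnnpackPartitioner()]
).to_executorch()

with open("mobilebert_xnnpack_fp32.pte", "wb") as f:
    f.write(program.buffer)
```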
**Android Integration**

- Added the `.pte` file to the `assets` directory of the Android project.
- Loaded the model using standard JNI calls:
  ```java
  // Load the model from the app's assets
  mModule = Module.load(assetFilePath(this, MODEL_FILE));

  // Run inference and take the output tensor
  outputTensor = mModule.forward(
      new EValue[] {
          EValue.from(inputs[0]), // input_ids
          EValue.from(inputs[1]), // attention_mask
      }
  )[1].toTensor();
  ```
- During inference, the output shape remains the original `[1, 512]` (see the cross-check sketch after this list).
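As an additional cross-check that the exported file itself is correct, the same `.pte` can be loaded through ExecuTorch's Python bindings; a sketch, assuming the pybindings extension is installed and using illustrative input shapes:

```python
# Desktop cross-check of the exported .pte via the ExecuTorch pybindings;
# file path and sequence length are illustrative.
import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("mobilebert_xnnpack_fp32.pte")
outputs = module.forward((
    torch.zeros(1, 128, dtype=torch.long),  # input_ids
    torch.ones(1, 128, dtype=torch.long),   # attention_mask
))
# A correctly rebuilt model should report torch.Size([1, 12]) here
print(outputs[0].shape)
```

If this prints `[1, 12]` on the desktop while the device still reports `[1, 512]`, one thing worth ruling out is that the APK is packaging a stale copy of the original model from `assets` (e.g., a cached build artifact).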
Expected Behavior
The Android application should produce an output shape of `[1, 12]`, consistent with the desktop `xnn_executor_runner` results.
Actual Behavior
The Android output retains the original model shape `[1, 512]`, indicating that the modification has not taken effect.