Fix CoreML EP issue with external weight path handling.#28062
Fix CoreML EP issue with external weight path handling.#28062
Conversation
GitHub issue #28005: CoreML EP crashes with SIGBUS when loading a model with external data via CreateSession from a file path. Root Cause In tensorprotoutils.cc:362, TensorProtoWithExternalDataToTensorProto() passes model_path directly to ReadExternalDataForTensor(), but ReadExternalDataForTensor() expects a directory (it joins the path with the external data filename). This causes path construction like /path/to/model.onnx/model.onnx_data instead of /path/to/model.onnx_data. The CPU EP's UnpackInitializerData() at line ~2572 correctly uses model_path.parent_path(). The CoreML EP is the only caller that passes a full model file path (via graph_viewer_.ModelPath() in model_builder.cc:791), triggering the bug. Changes Fix (tensorprotoutils.cc:364): Changed ReadExternalDataForTensor(ten_proto, model_path, ...) to ReadExternalDataForTensor(ten_proto, model_path.parent_path(), ...), matching the pattern used by UnpackInitializerData(). Test (coreml_basic_test.cc:444): Added CoreMLExecutionProviderTest.ExternalDataInitializer — creates a model with external data, saves it to disk, and loads it from a file path with CoreML EP to verify the fix. The test passes with the fix applied.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes CoreML EP loading of ONNX models that store initializer tensors in external data files by ensuring external data paths are resolved relative to the model directory (not the model file path), and adds a regression test for the reported crash.
Changes:
- Adjust external data loading to pass the model’s directory (
parent_path()) toReadExternalDataForTensor. - Add a CoreML EP regression test that writes an external initializer data file and loads the model via a file path.
- Add supporting test includes/utilities for temporary directories and file I/O.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| onnxruntime/test/providers/coreml/coreml_basic_test.cc | Adds a regression test that creates a model with external initializer data and verifies CoreML EP can load/run it from a file path. |
| onnxruntime/core/framework/tensorprotoutils.cc | Fixes external data path resolution by passing the model directory to ReadExternalDataForTensor. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This reverts commit 3b99cb7.
tianleiwu
left a comment
There was a problem hiding this comment.
Thanks for the focused fix. I found one cross-platform test issue: the new regression test executes CoreML inference on all platforms, but non-Apple CoreML builds use stubs whose Predict() intentionally fails. Please gate the run/output verification while keeping the load/initialize path covered for the external-data regression.
tianleiwu
left a comment
There was a problem hiding this comment.
Updating my previous review after checking the CoreML support expectation and CI results. CoreML is expected to run on Apple/macOS, and the CoreML macOS build-and-test lanes for this PR passed. The earlier non-Apple stub concern is therefore non-blocking for this PR. From the CoreML path fix and regression coverage perspective, this looks good to me.
Added a fix. It's developer friendly to be able to validate/debug most of the CoreML code on any device, so having a test that fails in that scenario because it's incorrectly trying to run the model is a distraction. |
…th fix Update vcpkg-registry baseline to pick up onnxruntime port-version 4, which patches TensorProtoWithExternalDataToTensorProto to use model_path.parent_path() instead of model_path when resolving external data files. This fixes CoreML EP crash/silent CPU fallback on macOS when loading models with .onnx_data sidecars. Backport of microsoft/onnxruntime#28062. Upstream issue: microsoft/onnxruntime#28005. Made-with: Cursor
…th fix Update vcpkg-registry baseline to pick up onnxruntime port-version 4, which patches TensorProtoWithExternalDataToTensorProto to use model_path.parent_path() instead of model_path when resolving external data files. This fixes CoreML EP crash/silent CPU fallback on macOS when loading models with .onnx_data sidecars. Backport of microsoft/onnxruntime#28062. Upstream issue: microsoft/onnxruntime#28005. Made-with: Cursor
…th fix Update vcpkg-registry baseline to pick up onnxruntime port-version 4, which patches TensorProtoWithExternalDataToTensorProto to use model_path.parent_path() instead of model_path when resolving external data files. This fixes CoreML EP crash/silent CPU fallback on macOS when loading models with .onnx_data sidecars. Backport of microsoft/onnxruntime#28062. Upstream issue: microsoft/onnxruntime#28005. Made-with: Cursor
…th fix Update vcpkg-registry baseline to pick up onnxruntime port-version 4, which patches TensorProtoWithExternalDataToTensorProto to use model_path.parent_path() instead of model_path when resolving external data files. This fixes CoreML EP crash/silent CPU fallback on macOS when loading models with .onnx_data sidecars. Also pin sanity-check action references to commit SHA to fix sparse-checkout issue in on-pr workflow. Backport of microsoft/onnxruntime#28062. Upstream issue: microsoft/onnxruntime#28005. Made-with: Cursor
…th fix Patched ONNX Runtime 1.24.2 CoreML EP to fix !model_path.empty() crash when loading models with external data files on macOS/iOS. Backport of microsoft/onnxruntime#28062. - Pin onnxruntime deps to >= 1.24.2#5 (registry port-version with fix) - Bump package version to 0.14.1 - Pin workflow action refs to commit SHA (sparse-checkout fix) Made-with: Cursor
…th fix (#1696) Patched ONNX Runtime 1.24.2 CoreML EP to fix !model_path.empty() crash when loading models with external data files on macOS/iOS. Backport of microsoft/onnxruntime#28062. - Pin onnxruntime deps to >= 1.24.2#5 (registry port-version with fix) - Bump package version to 0.14.1 - Pin workflow action refs to commit SHA (sparse-checkout fix) Made-with: Cursor Co-authored-by: Mariusz Reichert <reichert.programming@gmail.com>
Description
GitHub issue #28005: CoreML EP crashes with SIGBUS when loading a model with external data via CreateSession from a file path.
Root Cause
In tensorprotoutils.cc:362, TensorProtoWithExternalDataToTensorProto() passes model_path directly to ReadExternalDataForTensor(), but ReadExternalDataForTensor() expects a directory (it joins the path with the external data filename). This causes path construction like /path/to/model.onnx/model.onnx_data instead of /path/to/model.onnx_data.
The CPU EP's UnpackInitializerData() at line ~2572 correctly uses model_path.parent_path(). The CoreML EP is the only caller that passes a full model file path (via graph_viewer_.ModelPath() in model_builder.cc:791), triggering the bug.
Changes
Fix (tensorprotoutils.cc:364): Changed ReadExternalDataForTensor(ten_proto, model_path, ...) to ReadExternalDataForTensor(ten_proto, model_path.parent_path(), ...), matching the pattern used by UnpackInitializerData().
Test (coreml_basic_test.cc:444): Added CoreMLExecutionProviderTest.ExternalDataInitializer — creates a model with external data, saves it to disk, and loads it from a file path with CoreML EP to verify the fix. The test passes with the fix applied.
Motivation and Context
#28005