fix: CoreML EP crash when loading model with external data file#28043
fix: CoreML EP crash when loading model with external data file#28043Scottcjn wants to merge 1 commit intomicrosoft:mainfrom
Conversation
…path TensorProtoWithExternalDataToTensorProto() passes model_path (full file path like "/path/to/model.onnx") directly to ReadExternalDataForTensor(), which expects a directory. This causes GetExternalDataInfo() to construct invalid paths like "/path/to/model.onnx/model.onnx_data" instead of "/path/to/model.onnx_data". The bug only affects the CoreML EP because it's the only production caller that passes a non-empty model_path. CPU EP uses UnpackInitializerData() which correctly calls model_path.parent_path(). Fix: use model_path.parent_path(), matching the established pattern at line ~2572. Fixes microsoft#28005 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 4 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Fixes a CoreML EP crash when loading ONNX models that reference external data files by correcting the base path passed to external-data loading utilities.
Changes:
- Pass
model_path.parent_path()(model directory) toReadExternalDataForTensor()instead of the full model file path. - Add explanatory comments clarifying why the parent directory is required.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // ReadExternalDataForTensor expects the parent directory, matching the | ||
| // pattern used by UnpackInitializerData() at line ~2572. |
There was a problem hiding this comment.
The new comment hard-codes a reference to a specific line number in UnpackInitializerData ("line ~2572"). Line-number references tend to go stale quickly and make the comment misleading over time. Consider rewording to reference the function/behavior instead (e.g., that UnpackInitializerData passes model_path.parent_path()) without citing a line number.
| // ReadExternalDataForTensor expects the parent directory, matching the | |
| // pattern used by UnpackInitializerData() at line ~2572. | |
| // ReadExternalDataForTensor expects the parent directory, matching how | |
| // UnpackInitializerData() passes model_path.parent_path(). |
| // Load the external data into memory. | ||
| // model_path is the full model file path (e.g., "/path/to/model.onnx"). | ||
| // ReadExternalDataForTensor expects the parent directory, matching the | ||
| // pattern used by UnpackInitializerData() at line ~2572. | ||
| std::vector<uint8_t> unpacked_data; | ||
| ORT_RETURN_IF_ERROR(ReadExternalDataForTensor(ten_proto, model_path, unpacked_data)); | ||
| ORT_RETURN_IF_ERROR(ReadExternalDataForTensor(ten_proto, model_path.parent_path(), unpacked_data)); |
There was a problem hiding this comment.
This fix changes how external-data paths are resolved for TensorProtoWithExternalDataToTensorProto when a model is loaded from a file path, but there doesn't appear to be a unit test covering this specific scenario (external data in file + model_path points to the .onnx file, not a directory). Please add a regression test that creates a temp directory containing a dummy model path and an external data file, then verifies TensorProtoWithExternalDataToTensorProto successfully inlines the external data when given the model file path.
|
@copilot apply changes based on the comments in this thread |
|
Closing — this is superseded by upstream's defensive fix in Upstream's version (now on std::filesystem::path external_data_path = model_path;
std::error_code ec;
if (std::filesystem::is_regular_file(model_path, ec)) {
external_data_path = model_path.parent_path();
} else if (ec) {
ec.clear();
}
ORT_RETURN_IF_ERROR(ReadExternalDataForTensor(ten_proto, external_data_path, unpacked_data));That covers the CoreML EP crash I was fixing (file path → Thanks for the review cycles. |
Fixes #28005
CoreML EP crashes with SIGBUS when loading models that use external data files (.onnx_data). CPU EP works fine which is what tipped me off that it was provider-specific.
The bug is in TensorProtoWithExternalDataToTensorProto() -- it passes model_path straight to ReadExternalDataForTensor() but model_path is the full file path like /path/to/model.onnx not the directory. So GetExternalDataInfo() builds a garbage path like /path/to/model.onnx/model.onnx_data.
Fix is model_path.parent_path() instead. Same thing UnpackInitializerData() already does at line ~2572. Only affects CoreML EP since thats the only caller passing a non-empty path here.