Fix validation for external data paths for models loaded from bytes#27430
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a security vulnerability in the validation of external data paths when ONNX models are loaded from bytes (in-memory). Previously, when a model was loaded from bytes without an explicit external data folder path being set, path traversal attacks using ".." sequences were not properly validated, potentially allowing access to arbitrary files on the filesystem.
Changes:
- Added path traversal validation for in-memory models with empty base directory
- Added comprehensive test coverage for the security fix
- Extended unit tests to cover the new validation path
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| onnxruntime/core/framework/tensorprotoutils.cc | Adds validation logic to reject paths containing ".." when base_dir is empty (model loaded from bytes) |
| onnxruntime/test/framework/tensorutils_test.cc | Adds unit test case for path traversal validation with empty base directory |
| onnxruntime/test/shared_lib/test_inference.cc | Adds two integration tests for in-memory model loading with malicious external data paths |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Here is AI analysis: Low: Empty-base-dir validation rejects safe filenames that contain
|
…27430) ### Description This PR fixes the validation of external data paths when ONNX models are loaded from bytes (in-memory). Previously, when a model was loaded from bytes without an explicit external data folder path being set, path using ".." sequences were not properly validated, potentially allowing access to arbitrary files on the filesystem. ### Motivation and Context Address a security concern
This cherry-picks the following commits for the release: | Commit ID | PR Number | Commit Title | |-----------|-----------|-------------| | decd177 | #27090 | Fix GatherND division by zero when batch dimensions mismatch | | 55f8234 | #27360 | Fix QMoE CPU Operator | | df9146f | #27403 | [MLAS] Adding DynamicQGemm function pointers and ukernel interface | | 0f93853 | #27318 | [js/web] Use embedded WASM module in Blob URL workers when wasmBinary is provided | | b2a6e69 | #27364 | QMoE CPU Performance Update (Up to 4x on 4-bit) | | f501e1d | #27413 | Fix refcount bug in map input conversion that caused shutdown segfault | | b32b205 | #27421 | Fix error where bytes is not assigned for dynamic qgemm pack b size | | 426b006 | #27397 | Fix DllImportResolver | | 0982844 | #27412 | MatmulNBits prepacking scales fix | | 9afb0d2 | #27430 | Fix validation for external data paths for models loaded from bytes | | 71d2cd0 | #27401 | Enable Python 3.14 CI and Upgrade Dependencies | | 79e0676 | #27419 | fix: out of bounds access for resize operation | | 82eb99c | #27459 | Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D | | 355278a | #27444 | Fix GatherCopyData Integer Truncation Leading to Heap Out-of-Bounds Read/Write | | cf96123 | #27411 | [web] fix usage of wasmBinary together with a blob URL for .mjs | | 1131a86 | #27399 | [web] remove the unhelpful "Unknown CPU vendor" warning. | | ffbbc4f | #27316 | Build Windows ARM64X binaries as part of packaging pipeline | --------- Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com> Co-authored-by: patryk-kaiser-ARM <patryk.kaiser@arm.com> Co-authored-by: don <70039285+0-don@users.noreply.github.com> Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com> Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com> Co-authored-by: Lukas Folle <126877803+lukas-folle-snkeos@users.noreply.github.com> Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com> Co-authored-by: Chaya <cha182350@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Erik <erscor@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
|
I am not sure this does not re-introduce the issue of symlinks escaping the directory. Plus, we always want the paths to be relative within the external data reference. |
Description
This PR fixes the validation of external data paths when ONNX models are loaded from bytes (in-memory). Previously, when a model was loaded from bytes without an explicit external data folder path being set, path using ".." sequences were not properly validated, potentially allowing access to arbitrary files on the filesystem.
Motivation and Context
Address a security concern