Skip to content

Fix validation for external data paths for models loaded from bytes#27430

Merged
adrianlizarraga merged 3 commits intomainfrom
adrianl/ExtDataValidation_InMemoryModel_PathEscape
Feb 24, 2026
Merged

Fix validation for external data paths for models loaded from bytes#27430
adrianlizarraga merged 3 commits intomainfrom
adrianl/ExtDataValidation_InMemoryModel_PathEscape

Conversation

@adrianlizarraga
Copy link
Copy Markdown
Contributor

@adrianlizarraga adrianlizarraga commented Feb 24, 2026

Description

This PR fixes the validation of external data paths when ONNX models are loaded from bytes (in-memory). Previously, when a model was loaded from bytes without an explicit external data folder path being set, path using ".." sequences were not properly validated, potentially allowing access to arbitrary files on the filesystem.

Motivation and Context

Address a security concern

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a security vulnerability in the validation of external data paths when ONNX models are loaded from bytes (in-memory). Previously, when a model was loaded from bytes without an explicit external data folder path being set, path traversal attacks using ".." sequences were not properly validated, potentially allowing access to arbitrary files on the filesystem.

Changes:

  • Added path traversal validation for in-memory models with empty base directory
  • Added comprehensive test coverage for the security fix
  • Extended unit tests to cover the new validation path

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
onnxruntime/core/framework/tensorprotoutils.cc Adds validation logic to reject paths containing ".." when base_dir is empty (model loaded from bytes)
onnxruntime/test/framework/tensorutils_test.cc Adds unit test case for path traversal validation with empty base directory
onnxruntime/test/shared_lib/test_inference.cc Adds two integration tests for in-memory model loading with malicious external data paths

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@adrianlizarraga adrianlizarraga marked this pull request as ready for review February 24, 2026 06:04
@tianleiwu
Copy link
Copy Markdown
Contributor

Here is AI analysis:

Low: Empty-base-dir validation rejects safe filenames that contain .. (valid in Linux/MacOS)

  • File: onnxruntime/core/framework/tensorprotoutils.cc:383
  • Issue: The new check uses substring matching:
    • norm_location.native().find("..") != npos
    • This rejects paths like weights..bin (or foo..bar/data.bin) even though they do not include a .. path component and do not traverse directories.
  • Impact: Behavior regression for valid external data locations when models are loaded from bytes with no external-data folder override.
  • Recommendation: Check path components for exact .. segments instead of substring search. For example, iterate norm_location components and fail only when a component is exactly ...

Test Coverage Gaps

  • The new tests cover traversal patterns (../data.bin, a/../../data.bin) and normalization (a/../data.bin) but do not cover non-traversal filenames containing .. text.
  • Suggested addition in onnxruntime/test/framework/tensorutils_test.cc:
    • ASSERT_STATUS_OK(utils::ValidateExternalDataPath("", "weights..bin"));

Overall

  • The security intent is good and the main traversal case is addressed.
  • The current implementation is overly broad in one spot and can be tightened without weakening security.

@adrianlizarraga adrianlizarraga enabled auto-merge (squash) February 24, 2026 19:27
@adrianlizarraga adrianlizarraga merged commit 9afb0d2 into main Feb 24, 2026
90 checks passed
@adrianlizarraga adrianlizarraga deleted the adrianl/ExtDataValidation_InMemoryModel_PathEscape branch February 24, 2026 20:12
tianleiwu pushed a commit that referenced this pull request Feb 26, 2026
…27430)

### Description
This PR fixes the validation of external data paths when ONNX models are
loaded from bytes (in-memory). Previously, when a model was loaded from
bytes without an explicit external data folder path being set, path
using ".." sequences were not properly validated, potentially allowing
access to arbitrary files on the filesystem.


### Motivation and Context
Address a security concern
tianleiwu added a commit that referenced this pull request Feb 27, 2026
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| decd177 | #27090 | Fix GatherND division by zero when batch
dimensions mismatch |
| 55f8234 | #27360 | Fix QMoE CPU Operator |
| df9146f | #27403 | [MLAS] Adding DynamicQGemm function pointers and
ukernel interface |
| 0f93853 | #27318 | [js/web] Use embedded WASM module in Blob URL
workers when wasmBinary is provided |
| b2a6e69 | #27364 | QMoE CPU Performance Update (Up to 4x on 4-bit)
|
| f501e1d | #27413 | Fix refcount bug in map input conversion that
caused shutdown segfault |
| b32b205 | #27421 | Fix error where bytes is not assigned for
dynamic qgemm pack b size |
| 426b006 | #27397 | Fix DllImportResolver |
| 0982844 | #27412 | MatmulNBits prepacking scales fix |
| 9afb0d2 | #27430 | Fix validation for external data paths for
models loaded from bytes |
| 71d2cd0 | #27401 | Enable Python 3.14 CI and Upgrade Dependencies |
| 79e0676 | #27419 | fix: out of bounds access for resize operation |
| 82eb99c | #27459 | Fix SkipLayerNorm fusion incorrectly applied
when gamma/beta are not 1D |
| 355278a | #27444 | Fix GatherCopyData Integer Truncation Leading to
Heap Out-of-Bounds Read/Write |
| cf96123 | #27411 | [web] fix usage of wasmBinary together with a
blob URL for .mjs |
| 1131a86 | #27399 | [web] remove the unhelpful "Unknown CPU vendor"
warning. |
| ffbbc4f | #27316 | Build Windows ARM64X binaries as part of
packaging pipeline |

---------

Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com>
Co-authored-by: patryk-kaiser-ARM <patryk.kaiser@arm.com>
Co-authored-by: don <70039285+0-don@users.noreply.github.com>
Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Lukas Folle <126877803+lukas-folle-snkeos@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Chaya <cha182350@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Erik <erscor@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
@yuslepukhin
Copy link
Copy Markdown
Member

I am not sure this does not re-introduce the issue of symlinks escaping the directory. Plus, we always want the paths to be relative within the external data reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants