chore(deps): bump the inference-dependencies group across 1 directory with 11 updates #562

Merged
WilliamBerryiii merged 3 commits into main from dependabot/pip/evaluation/inference-dependencies-75952c0aae on May 1, 2026

Conversation

dependabot Bot commented on behalf of github on Apr 27, 2026

Bumps the inference-dependencies group with 11 updates in the /evaluation directory:

| Package | From | To |
| --- | --- | --- |
| numpy | 2.2.6 | 2.4.4 |
| marshmallow | 3.26.2 | 4.3.0 |
| packaging | 25.0 | 26.2 |
| onnxscript | 0.6.2 | 0.7.0 |
| onnxruntime-gpu | 1.24.4 | 1.25.1 |
| gymnasium | 1.2.3 | 1.3.0 |
| torch | 2.10.0 | 2.11.0 |
| tensordict | 0.12.1 | 0.12.2 |
| lerobot | 0.5.0 | 0.5.1 |
| hypothesis | 6.151.13 | 6.152.4 |
| matplotlib | 3.10.8 | 3.10.9 |

Updates numpy from 2.2.6 to 2.4.4

Release notes

Sourced from numpy's releases.

2.4.4 (Mar 29, 2026)

NumPy 2.4.4 Release Notes

NumPy 2.4.4 is a patch release that fixes bugs discovered after the 2.4.3 release. It should finally close issue #30816, the OpenBLAS threading problem on ARM.

This release supports Python versions 3.11-3.14

Contributors

A total of 8 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

  • Charles Harris
  • Daniel Haag +
  • Denis Prokopenko +
  • Harshith J +
  • Koki Watanabe
  • Marten van Kerkwijk
  • Matti Picus
  • Nathan Goldbaum

Pull requests merged

A total of 7 pull requests were merged for this release.

  • #30978: MAINT: Prepare 2.4.x for further development
  • #31049: BUG: Add test to reproduce problem described in #30816 (#30818)
  • #31052: BUG: fix FNV-1a 64-bit selection by using NPY_SIZEOF_UINTP (#31035)
  • #31053: BUG: avoid warning on ufunc with where=True and no output
  • #31058: DOC: document caveats of ndarray.resize on 3.14 and newer
  • #31079: TST: fix POWER VSX feature mapping (#30801)
  • #31084: MAINT: numpy.i: Replace deprecated sprintf with snprintf...

2.4.3 (Mar 9, 2026)

NumPy 2.4.3 Release Notes

NumPy 2.4.3 is a patch release that fixes bugs discovered after the 2.4.2 release. The most user-visible fix may be a threading fix for OpenBLAS on ARM, closing issue #30816.

This release supports Python versions 3.11-3.14

Contributors

A total of 11 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

  • Antareep Sarkar +

... (truncated)

Changelog

Sourced from numpy's changelog.

This is a walkthrough of the NumPy 2.4.0 release on Linux, which will be the first feature release using the numpy/numpy-release <https://github.com/numpy/numpy-release>__ repository.

The commands can be copied into the command line, but be sure to replace 2.4.0 with the correct version. This should be read together with the :ref:general release guide <prepare_release>.

Facility preparation

Before beginning to make a release, use the requirements/*_requirements.txt files to ensure that you have the needed software. Most software can be installed with pip, but some will require apt-get, dnf, or whatever your system uses for software. You will also need a GitHub personal access token (PAT) to push the documentation. There are a few ways to streamline things:

  • Git can be set up to use a keyring to store your GitHub personal access token. Search online for the details.

Prior to release

Add/drop Python versions

When adding or dropping Python versions, multiple config and CI files need to be edited in addition to changing the minimum version in pyproject.toml. Make these changes in an ordinary PR against main and backport if necessary. We currently release wheels for new Python versions after the first Python RC once manylinux and cibuildwheel support that new Python version.

Backport pull requests

Changes that have been marked for this release must be backported to the maintenance/2.4.x branch.

Update 2.4.0 milestones

Look at the issues/prs with 2.4.0 milestones and either push them off to a later version, or maybe remove the milestone. You may need to add a milestone.

Check the numpy-release repo

... (truncated)

Commits
  • be93fe2 Merge pull request #31090 from charris/prepare-2.4.4
  • f5245dc REL: Prepare for the NumPy 2.4.4 release
  • 02e838b Merge pull request #31084 from charris/backport-31056
  • fa74b2d MAINT: numpy.i: Replace deprecated sprintf with snprintf (#31056)
  • 533a6db Merge pull request #31079 from charris/backport-20801
  • 9e496cb TST: fix POWER VSX feature mapping (#30801)
  • 8052c4b Merge pull request #31058 from charris/backport-31021
  • 7f13b5a MAINT: Skip test on PyPy.
  • 4c5fdd6 MAINT: Remove unused import of tracemalloc.
  • a3ca5ed Update numpy/_core/src/multiarray/shape.c
  • Additional commits viewable in compare view

Updates marshmallow from 3.26.2 to 4.3.0

Changelog

Sourced from marshmallow's changelog.

4.3.0 (2026-04-03)

Features:

  • Add pre_load and post_load parameters to marshmallow.fields.Field for field-level pre- and post-processing (:issue:2787).
  • Typing: improvements to marshmallow.validate (:pr:2940).

4.2.4 (2026-04-02)

Bug fixes:

  • marshmallow.validate.URL and marshmallow.validate.Email accept Internationalized Domain Names (IDNs) (:issue:2821, :issue:2936). marshmallow.validate.Email also correctly rejects IDN domains with leading/trailing hyphens. Thanks :user:touhidurrr for the report.
  • Typing: Fix typing of nested in marshmallow.fields.Nested (:pr:2935).

4.2.3 (2026-03-25)

Bug fixes:

  • Make marshmallow.fields.Number and marshmallow.fields.Mapping abstract base classes to prevent using them within Schemas (:issue:2924). Thanks :user:MartingaleCoda for reporting.
  • Allow required to be set on marshmallow.fields.Constant (:issue:2900). Thanks :user:nosnickid for the report and :user:worksbyfriday for the PR.
  • Fix marshmallow.validate.OneOf emitting extra pairs when labels outnumber choices (:issue:2869). Thanks :user:T90REAL for the report and :user:rstar327 for the PR.
  • Fix behavior when passing a dot-delimited attribute name to partial for a key with data_key set (:pr:2903). Thanks :user:bysiber for the PR.
  • Fix Enum field by-name lookup to only return actual members (:pr:2902). Thanks :user:bysiber for the PR.
  • marshmallow.fields.DateTime with format="timestamp_ms" properly rejects bool values (:pr:2904). Thanks :user:bysiber for the PR.
  • Fix typing of error_messages argument to marshmallow.fields.Field (:pr:1636). Thanks :user:repole for reporting and :user:dhruvildarji for the PR.

Other changes:

  • Add ipaddress.* to marshmallow.Schema.TYPE_MAPPING (:issue:1695). Thanks :user:liberforce for the suggestion and :user:dhruvildarji for the PR.

4.2.2 (2026-02-04)

Bug fixes:

  • Fix behavior of fields.Constant(None) (:issue:2868).

... (truncated)

Commits

Updates packaging from 25.0 to 26.2

Release notes

Sourced from packaging's releases.

26.2

What's Changed

Fixes:

Documentation:

Internal:

New Contributors

Full Changelog: pypa/packaging@26.1...26.2

26.1

Features:

Behavior adaptations:

... (truncated)

Changelog

Sourced from packaging's changelog.

26.2 - 2026-04-24


Fixes:
  • Fix incorrect sysconfig var name for pyemscripten in (:pull:1160)
  • Make Version, Specifier, SpecifierSet, Tag, Marker, and Requirement pickle-safe
    and backward-compatible with pickles created in 25.0-26.1 (including references to the removed
    packaging._structures module) (:pull:1163, :pull:1168, :pull:1170, :pull:1171)
  • Re-export ExceptionGroup in metadata for now in (:pull:1164)

Documentation:

  • Add errors section and fix missing details in (:pull:1159)
  • Document our property-based test suite in (:pull:1167)
  • Fix a DirectUrl typo in (:pull:1167)
  • Add example of is_unsatisfiable in (:pull:1166)

Internal:

  • Enable the auditor persona on zizmor in (:pull:1158)
  • Test new pickle guarantees in (:pull:1174)
  • Use new native ReadTheDocs uv integration in (:pull:1175)

26.1 - 2026-04-14

Features:

  • PEP 783: add handling for Emscripten wheel tags in (:pull:804) (old name used in implementation, fixed in next release)
  • PEP 803: add handling for the abi3.abi3t free-threading tag in (:pull:1099)
  • PEP 723: add packaging.dependency_groups module, based on the dependency-groups package in (:pull:1065)
  • Add the packaging.direct_url module in (:pull:944)
  • Add the packaging.errors module in (:pull:1071)
  • Add SpecifierSet.is_unsatisfiable using ranges (new internals that will be expanded in future versions) in (:pull:1119)
  • Add create_compatible_tags_selector to select compatible tags in (:pull:1110)
  • Add a key argument to SpecifierSet.filter() in (:pull:1068)
  • Support & and | for Marker's in (:pull:1146)
  • Normalize Version.__replace__ and add Version.from_parts in (:pull:1078)
  • Add an option to validate compressed tag set sort order in parse_wheel_filename in (:pull:1150)

Behavior adaptations:

  • Narrow exclusion of pre-releases for <V.postN to match spec in (:pull:1140)
  • Narrow exclusion of post-releases for >V to match spec in (:pull:1141)
  • Rename format_full_version to _format_full_version to make it visibly private in (:pull:1125)
  • Restrict local version to ASCII in (:pull:1102)

Pylock (PEP 751) updates:

... (truncated)

Commits

Updates onnxscript from 0.6.2 to 0.7.0

Release notes

Sourced from onnxscript's releases.

v0.7.0

What's Changed

Optimizer and Rewriter

ONNX IR

Torch Lib

Core ONNX Script

New Features

Other Changes

New Contributors

... (truncated)

Commits
  • df97c94 Add an option to not inline a function when building the graph (#2851)
  • 90f754a chore(deps): bump actions/upload-pages-artifact from 4 to 5 (#2895)
  • b068297 Bumped version to 0.7.0 (#2894)
  • c8f5f6a Make GraphBuilder.init use keyword-only args after graph (#2893)
  • c6e8ec6 Handling initializers in GraphBuilder (#2889)
  • 63ffecf fix: normalize cache key dtype to prevent initializer name collisions (#2888)
  • 13f265c fix(fuse_batchnorm): support convtranpose + bn fusion with group != 1 (#2879)
  • 6c092e2 Add fusion rule to remove Expand before broadcast-capable binary operators (#...
  • c7d13fb Add input() and add_output() methods to GraphBuilder (#2828)
  • 864b785 Fix BatchNorm fusion producing invalid ONNX when Conv nodes share weight init...
  • Additional commits viewable in compare view

Updates onnxruntime-gpu from 1.24.4 to 1.25.1

Release notes

Sourced from onnxruntime-gpu's releases.

ONNX Runtime v1.25.1

n.b. This changelog is LLM generated. Only the contributor listing has been verified.

ONNX Runtime Release 1.25.1

📢 Announcements & Breaking Changes

ONNX Op Updates

  • Enhanced ONNX operator support with new opset versions: Reshape (opset 25), Transpose (opset 24) (#27752)

✨ New Features

📊 New ONNX Ops & Model Support

  • LinearAttention and CausalConvState operators for Qwen3.5 model support (#27907)
  • RotaryEmbedding (RotEMB) and RMSNorm operators added (#27752)
  • Linear Attention signature support (#27842)

🌐 Web & JavaScript

WebGPU EP

  • Qwen3.5 model support on WebGPU execution provider (#27996)
  • QMoE 1-token decode path optimization — fused operations to reduce GPU dispatches for improved performance (#27998)

🐛 Bug Fixes

Core Runtime Fixes

  • Improved filesystem error messages during Linux device discovery for better debugging experience (#27289)
  • Fixed missing include for SetRawDataInTensorProto in NVIDIA TensorRT RTX tests (#28065)

🙏 Contributors

Thanks to our 7 contributors for this release: @​guschmue, @​sanaa-hamel-microsoft, @​apsonawane, @​eserscor, @​ishwar-raut1, @​qjia7, @​theHamsta

Full Changelog: microsoft/onnxruntime@v1.25.0...v1.25.1

ONNX Runtime v1.25.0

📢 Announcements & Breaking Changes

... (truncated)

Commits

Updates gymnasium from 1.2.3 to 1.3.0

Release notes

Sourced from gymnasium's releases.

v1.3.0

Gymnasium v1.3.0

This release brings a new Taxi environment version, a new RepeatAction wrapper, and a range of bug fixes across vector environments and wrappers.

Core Changes

Bug Fixes

Third-Party Environments

10 new community environments have been added to the third-party environments list, including a new Cybersecurity environments section.

Full Changelog: Farama-Foundation/Gymnasium@v1.2.3...v1.3.0

Commits
  • eb5c00e Update to use Taxi-v4
  • 4436f89 fix incorrect TypeVar use in core for RenderFrame (#1560)
  • 877ba30 Update to 1.3.0
  • c3b809f Update Taxi to V4 and fix is_rainy implementation (#1561)
  • 9e6f855 Add RepeatAction wrapper (#1553)
  • 1532e66 Add external environment Hill Climb Racing Env (#1554)
  • df8704c Add boltcrypt to third party environments (#1557)
  • 01c0d39 Add external environment firecastrl (wildfire env) (#1551)
  • 9edc68e Fix spelling in test_mujoco_v5.py (#1550)
  • a31fa4b Change action seed for MuJoCo/test_verify_reward_survive test, to be valid ...
  • Additional commits viewable in compare view

Updates torch from 2.10.0 to 2.11.0

Release notes

Sourced from torch's releases.

PyTorch 2.11.0 Release Notes

Highlights

For more details about these highlighted features, you can look at the release blogpost. Below are the full release notes for this release.

Backwards Incompatible Changes

Release Engineering

... (truncated)

Commits
  • 70d99e9 [release only] Increase timeout for rocm libtorch and manywheel builds (#178006)
  • 3e05c5a [MPS] Properly handle conjugated tensors in bmm (#178010)
  • db741c7 [MPS] fix compiling of SDPA producing nan results (#178009)
  • 483b55d Update pytorch_sphinx_theme2 version to 0.4.6 (#177616)
  • 7f2cdeb [windows][smoke test] Add an option to install cuda if required cuda/cudnn on...
  • 76fd078 [release-only] Fix libtorch builds. Fix lint (#177299)
  • fa384de [Inductor][MPS] Fix half-precision type mismatches in Metal shader codegen (#...
  • 036b25f Let stable::from_blob accept a lambda as deleter (cherry-pick) (#176440)
  • 41f8e3e [CI] Stop using G3 runners (#177161)
  • e2fa295 [CD] Unpin cuda-bindings dependencies (#177159)
  • Additional commits viewable in compare view

Updates tensordict from 0.12.1 to 0.12.2

Release notes

Sourced from tensordict's releases.

TensorDict v0.12.2

Patch release with a bug fix for consolidated nested tensors.

Bug Fixes

  • Fix _ragged_idx loss during consolidation of nested tensors, which caused numerical incorrectness when the nested tensor had more than 2 dimensions and ragged_idx != 1 (#1675)

Installation

pip install tensordict==0.12.2

Full Changelog: pytorch/tensordict@v0.12.1...v0.12.2

Commits
  • 8ee33fa [Release] Bump version to 0.12.2
  • dcb6ddd [BugFix] fix ragged_idx of consolidated tensor (#1675)
  • 85ea4e7 [CI] Temporarily use vmoens/test-infra fork for macOS builds
  • See full diff in compare view

Updates lerobot from 0.5.0 to 0.5.1

Release notes

Sourced from lerobot's releases.

Release v0.5.1

What's Changed

dependabot Bot added the dependencies and python labels on Apr 27, 2026
github-actions Bot commented Apr 27, 2026

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 7 package(s) with unknown licenses.
See the Details below.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA a515251.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

License Issues

evaluation/uv.lock

| Package | Version | License | Issue Type |
| --- | --- | --- | --- |
| gymnasium | 1.3.0 | Null | Unknown License |
| hypothesis | 6.152.4 | Null | Unknown License |
| lerobot | 0.5.1 | Null | Unknown License |
| matplotlib | 3.10.9 | Null | Unknown License |
| onnxruntime-gpu | 1.25.1 | Null | Unknown License |
| onnxscript | 0.7.0 | Null | Unknown License |
| tensordict | 0.12.2 | Null | Unknown License |

OpenSSF Scorecard

| Package | Version | Score | Details |
| --- | --- | --- | --- |
| pip/gymnasium | 1.3.0 | Unknown | Unknown |
| pip/hypothesis | 6.152.4 | Unknown | Unknown |
| pip/lerobot | 0.5.1 | Unknown | Unknown |
| pip/matplotlib | 3.10.9 | Unknown | Unknown |
| pip/onnxruntime-gpu | 1.25.1 | Unknown | Unknown |
| pip/onnxscript | 0.7.0 | Unknown | Unknown |
| pip/tensordict | 0.12.2 | Unknown | Unknown |

Scanned Files

  • evaluation/uv.lock

github-actions Bot changed the title from "chore(deps): bump the inference-dependencies group across 1 directory with 11 updates" to "security(deps): bump the inference-dependencies group across 1 directory with 11 updates" on Apr 27, 2026
github-actions Bot commented Apr 27, 2026

AW Dependabot PR Review completed successfully!

@github-actions github-actions Bot left a comment

⚠️ Maintainer review recommended

Advisory Review Summary

Affected ecosystems / surfaces: python-runtime (pip — evaluation/)

| Package | From | To | Severity | Surface |
| --- | --- | --- | --- | --- |
| numpy | 2.2.6 | 2.4.4 | ⚠️ High-risk | python-runtime |
| marshmallow | 3.26.2 | 4.3.0 | ⚠️ High-risk (major) | python-runtime |
| packaging | 25.0 | 26.2 | Low | python-runtime |
| onnxscript | 0.6.2 | 0.7.0 | Low | python-runtime |
| onnxruntime-gpu | 1.24.4 | 1.25.0 | ⚠️ High-risk (CUDA ABI + security) | python-runtime |
| gymnasium | 1.2.3 | 1.3.0 | Low | python-runtime |
| torch | 2.10.0 | 2.11.0 | ⚠️ High-risk (CUDA ABI) | python-runtime |
| tensordict | 0.12.1 | 0.12.2 | Medium (torch-coupled) | python-runtime |
| lerobot | 0.5.0 | 0.5.1 | Low (pre-1.0 patch) | python-runtime |
| hypothesis | 6.151.13 | 6.152.3 | Low (dev only) | python-runtime |
| matplotlib | 3.10.8 | 3.10.9 | Low (dev only) | python-runtime |
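
The severity column above follows, in part, from the size of each version jump. As a rough illustration only, the sketch below classifies a bump as major, minor, or patch and lists any minor series that are skipped. The helper names (`parse_release`, `classify_bump`, `skipped_minor_series`) are hypothetical and not part of the review tooling, and the parser assumes plain dotted numeric releases.

```python
from __future__ import annotations


def parse_release(version: str) -> tuple[int, ...]:
    """Turn a plain dotted release such as '2.4.4' into (2, 4, 4), padded to 3 parts."""
    parts = tuple(int(part) for part in version.split("."))
    return (parts + (0, 0, 0))[:3]


def classify_bump(old: str, new: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for a version change."""
    for label, a, b in zip(("major", "minor", "patch"), parse_release(old), parse_release(new)):
        if a != b:
            return label
    return "none"


def skipped_minor_series(old: str, new: str) -> list[str]:
    """List the minor series strictly between two releases of the same major."""
    a, b = parse_release(old), parse_release(new)
    if a[0] != b[0]:
        return []  # a major bump is flagged by classify_bump instead
    return [f"{a[0]}.{minor}.x" for minor in range(a[1] + 1, b[1])]


if __name__ == "__main__":
    # Versions copied from the package table above.
    bumps = {
        "numpy": ("2.2.6", "2.4.4"),
        "marshmallow": ("3.26.2", "4.3.0"),
        "packaging": ("25.0", "26.2"),
        "tensordict": ("0.12.1", "0.12.2"),
    }
    for name, (old, new) in bumps.items():
        print(name, classify_bump(old, new), skipped_minor_series(old, new))
```

A real implementation would likely use `packaging.version.Version` rather than a hand-rolled parser, so that pre-release and post-release suffixes are handled.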

numpy

Version jump: 2.2.6 → 2.4.4 (skips the 2.3.x series entirely).

  • NumPy 2.3 introduced a new variable-width string dtype (numpy.strings) and related C-extension API additions. Code compiled against 2.2.x wheels is compatible at runtime, but C-extension authors must recompile against ≥2.3 to use the new APIs.
  • 2.4.4 is a pure patch release fixing an OpenBLAS ARM threading issue (#30816) and a FNV-1a 64-bit hash selection bug (#31052).
  • Release notes: https://github.com/numpy/numpy/releases/tag/v2.4.4

Repo-specific risk: training/rl independently pins numpy==1.26.4 (the <2.0.0 Isaac Sim constraint) — that surface is unaffected by this PR. The evaluation surface itself has no documented <2.0.0 constraint, so the 2.4.4 target is permissible.


marshmallow

Major version bump: 3.26.2 → 4.3.0. marshmallow 4.0 removed APIs deprecated in the 3.x series.

Notable breaking changes in 4.0 (source: marshmallow.readthedocs.io/redacted):

  • Schema.Meta field-level defaults restructured; missing / default semantics unified.
  • Several previously-deprecated constructor kwargs removed from Field subclasses.
  • @pre_load / @post_load pass-many behaviour changed.
  • Python ≥3.9 required (project already requires ≥3.12 — no conflict).

Any marshmallow schema code in evaluation/ must be audited against the 4.0 migration guide before merging.


onnxscript

Pre-1.0 minor bump: 0.6.2 → 0.7.0. No GHSA/CVE found. Pre-1.0 minor bumps may contain breaking changes. Upstream compare: microsoft/onnxscript@v0.6.2...v0.7.0


onnxruntime-gpu

CUDA minimum raised to 12.0; multiple security fixes.

Breaking changes in v1.25.0 (source: https://github.com/microsoft/onnxruntime/releases/tag/v1.25.0):

CUDA 11.x is no longer supported. Users pinned to CUDA 11.x should stay on ORT 1.24.x or upgrade their CUDA toolkit/driver.

ORT_API_VERSION updated to 25. C/C++ extension consumers compiled against older ORT headers must rebuild.

Security fixes bundled in 1.25.0:

Note: CVE-2026-27904 is cited in onnxruntime's release notes in relation to minimatch (a Node.js library used in ONNX Runtime's JS/build tooling). No GHSA record was located for this CVE ID at query time; the severity and affected-version range are not independently verified here. Do not treat this advisory summary as authoritative for that CVE.


gymnasium

Minor bump: 1.2.3 → 1.3.0. No GHSA/CVE found. Low risk. Source: https://github.com/Farama-Foundation/Gymnasium/releases


torch

Minor bump: 2.10.0 → 2.11.0. No GHSA/CVE found. CUDA ABI-sensitive: pre-built wheels are compiled against a specific CUDA version. Verify the evaluation GPU environment runs a compatible CUDA toolkit.


tensordict

Patch bump: 0.12.1 → 0.12.2. Tightly coupled with PyTorch; this patch version tracks torch==2.11.0. No GHSA/CVE found. Source: https://github.com/pytorch/tensordict/releases


lerobot

Pre-1.0 patch bump: 0.5.0 → 0.5.1. No GHSA/CVE found. Low risk. Source: https://github.com/huggingface/lerobot/releases


hypothesis / matplotlib

Dev-only patch bumps. No advisory or operational risk.


Uncovered-manifest note

training/il/lerobot/pyproject.toml pins lerobot==0.5.0 and numpy==2.2.6 independently but is not covered by .github/dependabot.yml. Consider adding a pip entry for /training/il/lerobot so those pins receive automated updates.
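
To make the drift described above concrete, here is a minimal sketch that scans manifest text for exact `==` pins and reports packages pinned to different versions in different manifests. The function names and the inline manifest excerpts are hypothetical. Only the pin values are taken from this review, with the evaluation/ pins assumed to match the PR's targets. The regex is a simplification that ignores extras and environment markers.

```python
import re
from collections import defaultdict

# Matches exact pins such as "numpy==2.2.6" inside a quoted dependency string.
PIN = re.compile(r"""["']\s*([A-Za-z0-9_.-]+)\s*==\s*([^"'\s;,]+)""")


def exact_pins(manifest_text: str) -> dict[str, str]:
    """Collect name -> version for every '==' pin found in a manifest's text."""
    return {name.lower(): version for name, version in PIN.findall(manifest_text)}


def divergent_pins(manifests: dict[str, str]) -> dict[str, dict[str, str]]:
    """Return packages that are pinned to different versions across manifests."""
    seen: dict[str, dict[str, str]] = defaultdict(dict)
    for path, text in manifests.items():
        for name, version in exact_pins(text).items():
            seen[name][path] = version
    return {
        name: by_path
        for name, by_path in seen.items()
        if len(set(by_path.values())) > 1
    }


# Hypothetical excerpts; real manifests contain many more entries.
MANIFESTS = {
    "evaluation/pyproject.toml": 'dependencies = ["numpy==2.4.4", "lerobot==0.5.1"]',
    "training/il/lerobot/pyproject.toml": 'dependencies = ["numpy==2.2.6", "lerobot==0.5.0"]',
}

if __name__ == "__main__":
    for name, by_path in divergent_pins(MANIFESTS).items():
        print(name, by_path)
```

Divergent pins are not necessarily a defect, since the surfaces may be installed in separate environments, but a report like this shows which manifests are falling behind.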


Validation advice

# In evaluation/
ruff check .
pytest -m "not e2e"        # non-GPU unit tests
# On a GPU node (CUDA ≥ 12.0 required):
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
python -c "import onnxruntime; print(onnxruntime.__version__)"
pytest -m e2e              # GPU-gated SIL smoke tests
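
The two one-liners above print versions but leave the comparison to the reader. The following sketch, assuming the CUDA ≥ 12.0 minimum stated in this review, turns that check into a pass/fail result. `cuda_meets_minimum` and `report` are hypothetical helper names. The only library calls used are `torch.version.cuda`, `torch.cuda.is_available()`, and `onnxruntime.get_available_providers()`, and each import is skipped when the package is not installed.

```python
from __future__ import annotations

import importlib.util

MIN_CUDA = (12, 0)  # minimum stated in the review for onnxruntime-gpu 1.25.x


def cuda_meets_minimum(version: str | None, minimum: tuple[int, int] = MIN_CUDA) -> bool:
    """True when a 'major.minor' CUDA version string is at least `minimum`."""
    if not version:  # torch.version.cuda is None on CPU-only builds
        return False
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= minimum


def report() -> None:
    """Print a short environment summary; skips packages that are absent."""
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed; skipping CUDA check")
    else:
        import torch

        built_for = torch.version.cuda
        print("torch CUDA build:", built_for, "available:", torch.cuda.is_available())
        print("meets minimum:", cuda_meets_minimum(built_for))

    if importlib.util.find_spec("onnxruntime") is None:
        print("onnxruntime is not installed; skipping provider check")
    else:
        import onnxruntime

        providers = onnxruntime.get_available_providers()
        print("CUDAExecutionProvider present:", "CUDAExecutionProvider" in providers)


if __name__ == "__main__":
    report()
```

Note that `torch.version.cuda` reports the CUDA version the wheel was built against, not the installed driver, so a passing result here does not replace the GPU-gated smoke tests.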

Advisory verdict: COMMENT — multiple high-risk signals fire: onnxruntime-gpu requires CUDA ≥ 12.0 (breaking for CUDA 11.x environments), marshmallow crosses a major version boundary, and numpy / torch involve CUDA/ABI-sensitive bumps that should be validated on GPU hardware before merging.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 920.4K

Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml
github-actions Bot commented Apr 27, 2026

AW Dependabot PR Review completed successfully!

@github-actions github-actions Bot left a comment

⚠️ Maintainer review recommended

Advisory Review Summary

Ecosystems / surfaces touched:

  • pip / uv under evaluation/ (python-runtime surface; 11 packages, single manifest)

Changed file: evaluation/pyproject.toml only. No uv.lock change is present in this PR — if lock-file regeneration is deferred to a separate PR, note that transitive pins are not yet updated.

Package table

| Package | From | To | Severity | Surface |
| --- | --- | --- | --- | --- |
| numpy | 2.2.6 | 2.4.4 | - (ABI-sensitive) | python-runtime |
| marshmallow | 3.26.2 | 4.3.0 | - (major bump) | python-runtime |
| packaging | 25.0 | 26.2 | - (major bump) | python-runtime |
| onnxscript | 0.6.2 | 0.7.0 | | python-runtime |
| onnxruntime-gpu | 1.24.4 | 1.25.0 | Security (CVE-2026-27904) | python-runtime |
| gymnasium | 1.2.3 | 1.3.0 | | python-runtime |
| torch | 2.10.0 | 2.11.0 | - (backward-incompat) | python-runtime |
| tensordict | 0.12.1 | 0.12.2 | | python-runtime |
| lerobot | 0.5.0 | 0.5.1 | - (pre-1.0) | python-runtime |
| hypothesis | 6.151.13 | 6.152.3 | | python-runtime (dev) |
| matplotlib | 3.10.8 | 3.10.9 | | python-runtime (dev) |

onnxruntime-gpu

CVE-2026-27904 — Bundled minimatch dependency upgraded from 3.1.2 → 3.1.4. This addresses a vulnerability in a JavaScript build-time/dev-time tool bundled inside the ORT distribution. Additional security hardening in 1.25.0: Pad Reflect vulnerability fix (onnxruntime#27652), transpose optimizer security fix (onnxruntime#27555), and hardened shell command handling.

Breaking change: CUDA minimum version raised to 12.0. CUDA 11.x is no longer supported. Environments pinned to CUDA 11.x must stay on ORT 1.24.x or upgrade CUDA before this PR lands.

Source: ONNX Runtime v1.25.0 release notes


marshmallow

Major version: 3.26.2 → 4.3.0. Marshmallow 4.0 removed all APIs deprecated in 3.x. Any schema using @post_load(pass_many=True), @validates_schema(pass_many=True), or affected Schema.Meta options will raise errors at runtime.

Audit all marshmallow.Schema subclasses in evaluation/ against the 3→4 migration guide (marshmallow.readthedocs.io/redacted). Run ruff check and targeted pytest before merging.

Source: marshmallow CHANGELOG
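
A first pass of the audit recommended above can be scripted. The sketch below uses the standard-library `ast` module to find decorators called with a `pass_many=` keyword, the pattern this review flags. `find_pass_many` and the sample schema are hypothetical, and the scan is purely syntactic: it does not confirm that a decorator comes from marshmallow, so hits still need a manual check against the migration guide.

```python
import ast

FLAGGED_KEYWORD = "pass_many"  # keyword named in the review above


def find_pass_many(source: str, filename: str = "<string>") -> list[tuple[int, str]]:
    """Return (line, decorator_name) for decorators called with pass_many=..."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            # Only decorator *calls* can carry keyword arguments.
            if isinstance(deco, ast.Call) and any(
                kw.arg == FLAGGED_KEYWORD for kw in deco.keywords
            ):
                hits.append((deco.lineno, ast.unparse(deco.func)))
    return sorted(hits)


# Hypothetical schema used only to exercise the scanner; it is parsed, never imported.
SAMPLE = '''
from marshmallow import Schema, post_load, validates_schema

class EpisodeSchema(Schema):
    @post_load(pass_many=True)
    def wrap(self, data, many, **kwargs):
        return data

    @validates_schema
    def check(self, data, **kwargs):
        pass
'''

if __name__ == "__main__":
    for line, name in find_pass_many(SAMPLE):
        print(f"line {line}: @{name}(... {FLAGGED_KEYWORD}=...)")
```

To cover a source tree, the same function could be applied to the text of each `*.py` file under evaluation/.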


numpy

2.2.6 → 2.4.4 skips the 2.3.x series (additional C-API / ABI changes). The 2.4.4 patch release closes the OpenBLAS threading bug on ARM (#30816) and fixes a FNV-1a 64-bit hash selection issue (#31052). Confirm CUDA wheel ABI compatibility with NumPy 2.4.x for co-installed packages (torch, onnxruntime-gpu).

Note: training/rl/pyproject.toml retains numpy==1.26.4 (within the >=1.26.0,<2.0.0 Isaac Sim constraint) — that surface is unaffected by this PR.

Source: NumPy 2.4.4 release notes


torch

2.10.0 → 2.11.0: minor bump with a documented "Backwards Incompatible Changes" section (Release Engineering category; details truncated in the Dependabot body). See the PyTorch 2.11.0 release blog (pytorch.org/redacted) for the full list.

Source: PyTorch 2.11.0 release notes


packaging

25.0 → 26.2 — major version jump. The 26.0/26.1 releases removed packaging._structures; 26.2 restores pickle backward-compatibility for serialized version/specifier objects. If evaluation pipelines cache pickled packaging objects, regenerate them after upgrade.

Source: packaging 26.2 release notes
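
If a pipeline does cache pickled objects, one defensive pattern is to treat a failed load as a cache miss and rebuild. The sketch below is a generic illustration of that pattern, not code from this repository. `load_or_rebuild` is a hypothetical helper, and the demo caches a plain string so that it runs without third-party packages.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")

# Errors that commonly signal a missing, truncated, or incompatible pickle,
# for example one that references a module or attribute that no longer exists.
STALE_CACHE_ERRORS = (
    FileNotFoundError,
    EOFError,
    pickle.UnpicklingError,
    AttributeError,
    ImportError,
)


def load_or_rebuild(cache: Path, rebuild: Callable[[], T]) -> T:
    """Load a pickled value, rebuilding and re-saving it if loading fails."""
    try:
        with cache.open("rb") as fh:
            return pickle.load(fh)  # only unpickle files you created yourself
    except STALE_CACHE_ERRORS:
        value = rebuild()
        cache.write_bytes(pickle.dumps(value))
        return value


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = Path(tmp) / "specifier.pkl"
        first = load_or_rebuild(cache_file, lambda: ">=2.4,<3")   # miss: rebuilds
        second = load_or_rebuild(cache_file, lambda: "unused")    # hit: loads
        print(first, second)
```

In a real pipeline the `rebuild` callable would reconstruct the `Version` or `SpecifierSet` from its source string, which stays stable across packaging releases.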


onnxscript

0.6.2 → 0.7.0 — pre-1.0 minor bump. No advisory found. Changelog: compare view.

gymnasium

1.2.3 → 1.3.0 — adds RepeatAction wrapper and external environment support. No breaking changes identified. Compare view.

tensordict / lerobot / hypothesis / matplotlib

Patch bumps. lerobot 0.5.1 is pre-1.0 and includes a fix for a breaking change from transformers 5.4.0 (lerobot#3231).


Advisory verdict: COMMENT — Three high-risk signals fire on the python-runtime surface: onnxruntime-gpu requires CUDA ≥ 12.0, marshmallow crosses a major breaking version, and numpy skips an ABI-changing minor series. Maintainer validation against the CUDA environment and existing schema code is recommended before merge. The security fix for CVE-2026-27904 in onnxruntime-gpu 1.25.0 is a positive motivator to merge once those checks pass.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 1.5M

Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
@katriendg

@dependabot rebase

dependabot Bot commented on behalf of github Apr 28, 2026

Looks like this PR has been edited by someone other than Dependabot. That means Dependabot can't rebase it - sorry!

If you're happy for Dependabot to recreate it from scratch, overwriting any edits, you can request @dependabot recreate.

@katriendg

@dependabot recreate

… with 11 updates

Bumps the inference-dependencies group with 11 updates in the /evaluation directory:

| Package | From | To |
| --- | --- | --- |
| [numpy](https://github.com/numpy/numpy) | `2.2.6` | `2.4.4` |
| [marshmallow](https://github.com/marshmallow-code/marshmallow) | `3.26.2` | `4.3.0` |
| [packaging](https://github.com/pypa/packaging) | `25.0` | `26.2` |
| [onnxscript](https://github.com/microsoft/onnxscript) | `0.6.2` | `0.7.0` |
| [onnxruntime-gpu](https://github.com/microsoft/onnxruntime) | `1.24.4` | `1.25.1` |
| [gymnasium](https://github.com/Farama-Foundation/Gymnasium) | `1.2.3` | `1.3.0` |
| [torch](https://github.com/pytorch/pytorch) | `2.10.0` | `2.11.0` |
| [tensordict](https://github.com/pytorch/tensordict) | `0.12.1` | `0.12.2` |
| [lerobot](https://github.com/huggingface/lerobot) | `0.5.0` | `0.5.1` |
| [hypothesis](https://github.com/HypothesisWorks/hypothesis) | `6.151.13` | `6.152.4` |
| [matplotlib](https://github.com/matplotlib/matplotlib) | `3.10.8` | `3.10.9` |



Updates `numpy` from 2.2.6 to 2.4.4
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst)
- [Commits](numpy/numpy@v2.2.6...v2.4.4)

Updates `marshmallow` from 3.26.2 to 4.3.0
- [Changelog](https://github.com/marshmallow-code/marshmallow/blob/dev/CHANGELOG.rst)
- [Commits](marshmallow-code/marshmallow@3.26.2...4.3.0)

Updates `packaging` from 25.0 to 26.2
- [Release notes](https://github.com/pypa/packaging/releases)
- [Changelog](https://github.com/pypa/packaging/blob/main/CHANGELOG.rst)
- [Commits](pypa/packaging@25.0...26.2)

Updates `onnxscript` from 0.6.2 to 0.7.0
- [Release notes](https://github.com/microsoft/onnxscript/releases)
- [Commits](microsoft/onnxscript@v0.6.2...v0.7.0)

Updates `onnxruntime-gpu` from 1.24.4 to 1.25.1
- [Release notes](https://github.com/microsoft/onnxruntime/releases)
- [Changelog](https://github.com/microsoft/onnxruntime/blob/main/docs/ReleaseManagement.md)
- [Commits](microsoft/onnxruntime@v1.24.4...v1.25.1)

Updates `gymnasium` from 1.2.3 to 1.3.0
- [Release notes](https://github.com/Farama-Foundation/Gymnasium/releases)
- [Commits](Farama-Foundation/Gymnasium@v1.2.3...v1.3.0)

Updates `torch` from 2.10.0 to 2.11.0
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](pytorch/pytorch@v2.10.0...v2.11.0)

Updates `tensordict` from 0.12.1 to 0.12.2
- [Release notes](https://github.com/pytorch/tensordict/releases)
- [Commits](pytorch/tensordict@v0.12.1...v0.12.2)

Updates `lerobot` from 0.5.0 to 0.5.1
- [Release notes](https://github.com/huggingface/lerobot/releases)
- [Commits](huggingface/lerobot@v0.5.0...v0.5.1)

Updates `hypothesis` from 6.151.13 to 6.152.4
- [Release notes](https://github.com/HypothesisWorks/hypothesis/releases)
- [Commits](HypothesisWorks/hypothesis@hypothesis-python-6.151.13...hypothesis-python-6.152.4)

Updates `matplotlib` from 3.10.8 to 3.10.9
- [Release notes](https://github.com/matplotlib/matplotlib/releases)
- [Commits](matplotlib/matplotlib@v3.10.8...v3.10.9)

---
updated-dependencies:
- dependency-name: gymnasium
  dependency-version: 1.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: hypothesis
  dependency-version: 6.152.3
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: lerobot
  dependency-version: 0.5.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: marshmallow
  dependency-version: 4.3.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: inference-dependencies
- dependency-name: matplotlib
  dependency-version: 3.10.9
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: numpy
  dependency-version: 2.4.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: onnxruntime-gpu
  dependency-version: 1.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: onnxscript
  dependency-version: 0.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: packaging
  dependency-version: '26.2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: inference-dependencies
- dependency-name: tensordict
  dependency-version: 0.12.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: torch
  dependency-version: 2.11.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot changed the title from "security(deps): bump the inference-dependencies group across 1 directory with 11 updates" to "chore(deps): bump the inference-dependencies group across 1 directory with 11 updates" Apr 28, 2026
@dependabot dependabot Bot force-pushed the dependabot/pip/evaluation/inference-dependencies-75952c0aae branch from d83d1e8 to d8de8f3 April 28, 2026 14:12
Contributor

@github-actions github-actions Bot left a comment

⚠️ Maintainer review recommended

Advisory Review Summary

PR: chore(deps): bump the inference-dependencies group across 1 directory with 11 updates
Author: dependabot[bot] · Surfaces touched: python-runtime (evaluation)
Manifests changed: evaluation/pyproject.toml only
Lock file status: evaluation/uv.lock was NOT updated in this PR — see critical note below.


Package Table

Package From To Severity Surface
marshmallow 3.26.2 4.3.0 ⚠️ Major bump python-runtime (evaluation)
torch 2.10.0 2.11.0 GPU ABI-sensitive python-runtime (evaluation)
onnxruntime-gpu 1.24.4 1.25.1 GPU ABI-sensitive python-runtime (evaluation)
numpy 2.2.6 2.4.4 Low (within 2.x) python-runtime (evaluation)
onnxscript 0.6.2 0.7.0 Low python-runtime (evaluation)
gymnasium 1.2.3 1.3.0 Low python-runtime (evaluation)
tensordict 0.12.1 0.12.2 Low python-runtime (evaluation)
lerobot 0.5.0 0.5.1 Low python-runtime (evaluation)
packaging 25.0 26.2 Low python-runtime (evaluation)
hypothesis 6.151.13 6.152.4 Low (dev) python-runtime (evaluation)
matplotlib 3.10.8 3.10.9 Low (dev) python-runtime (evaluation)

🔴 Critical: Lock File Not Updated

evaluation/uv.lock was not changed in this PR (the diff contains only 13 additions, all in evaluation/pyproject.toml). The lock file still resolves the old versions:

  • marshmallow: 3.26.2 (lock) vs 4.3.0 (manifest)
  • numpy: 2.2.6 (lock) vs 2.4.4 (manifest)
  • torch: 2.10.0 (lock) vs 2.11.0 (manifest)
  • onnxruntime-gpu: 1.24.4 (lock) vs 1.25.1 (manifest)

Any workflow using uv sync --frozen will install the old versions. A regenerated lock file is required before this PR is production-safe. Dependabot should have updated both files; the missing lock regeneration suggests a gap in the .github/dependabot.yml lockfile configuration.


marshmallow

3.26.2 → 4.3.0 — MAJOR version bump

No GHSA or CVE advisory was referenced in the PR body. No security vulnerability is claimed for this bump; it is a routine major-version upgrade.

marshmallow is listed as a direct dependency in evaluation/pyproject.toml but is not directly imported in any evaluation/ Python source files found. It is consumed transitively by azure-ai-ml==1.32.0. The lock file records azure-ai-ml as depending on { name = "marshmallow" } with no upper-bound constraint, so the version may be compatible—but this needs verification against azure-ai-ml's actual runtime behaviour with marshmallow 4.x.

Marshmallow 4.0 introduced breaking changes at the API boundary (e.g. Meta.strict removed, field default handling changed, unknown-field behaviour). Review the migration guide before merging: (marshmallow.readthedocs.io/redacted)

Validation Signal

Deterministic CI: PR Validation: pending

  • No check runs have reported yet (total_count: 0 at review time). The relevant authoritative runs for this surface are Evaluation Pytest Tests, Pytest Inference, and Python Lint.
  • ⚠️ Deterministic CI conclusion not yet available; verdict is advisory only.

Static impact reasoning: marshmallow is not directly imported in evaluation Python files; however, a major version bump without a corresponding lock file regeneration means the new version has not been resolved or tested by uv. Merge only after the lock file is regenerated and CI passes.


torch

2.10.0 → 2.11.0 — minor bump, GPU ABI-sensitive

No GHSA/CVE advisory referenced. Minor bump within the 2.x series. Release notes: https://github.com/pytorch/pytorch/releases/tag/v2.11.0

torch links against CUDA and carries CUDA kernel ABIs. Hosted CI cannot exercise GPU execution paths used in evaluation/sil/ workflows. GPU smoke-testing on target hardware before merge is recommended.

Validation Signal

Deterministic CI: PR Validation: pending. Evaluation Pytest Tests and Pytest Inference results are not yet available.
Static impact reasoning: Minor bump; no major-version ABI boundary crossed. GPU paths require hardware validation beyond hosted CI scope.


onnxruntime-gpu

1.24.4 → 1.25.1 — minor bump, GPU ABI-sensitive

No GHSA/CVE advisory referenced. onnxruntime-gpu links against CUDA/cuDNN execution providers. Even minor bumps can carry CUDA operator kernel changes. Validate on GPU nodes before merge.

Release notes: https://github.com/microsoft/onnxruntime/releases/tag/v1.25.1

Validation Signal

Deterministic CI: PR Validation: pending
Static impact reasoning: ABI-sensitive GPU package; hosted CI cannot validate CUDA execution providers.


numpy

2.2.6 → 2.4.4 — minor bump within 2.x

No GHSA/CVE advisory referenced. evaluation/ was already on numpy 2.x (2.2.6), so the 2.0 ABI boundary has already been crossed on this surface. This bump is low-risk for the evaluation surface.

Isaac Sim ABI guard: training/rl/scripts/train.sh enforces numpy>=1.26.0,<2.0.0 and training/rl/pyproject.toml pins numpy==1.26.4. This PR does not touch training/rl/, so the ABI guard is not violated. The two numpy versions coexist in different virtual environments.


Remaining Packages (Low Risk)

Package From To Notes
onnxscript 0.6.2 0.7.0 Minor bump; ONNX script authoring toolchain
gymnasium 1.2.3 1.3.0 Minor bump; RL environment framework
tensordict 0.12.1 0.12.2 Patch bump; pytorch tensordict
lerobot 0.5.0 0.5.1 Patch bump; LeRobot IL policy runtime
packaging 25.0 26.2 Major-number bump (25 → 26; Dependabot labels it semver-major); PEP 440 utilities
hypothesis 6.151.13 6.152.4 Minor bump; dev-only property-based testing
matplotlib 3.10.8 3.10.9 Patch bump; dev-only plotting

No advisories or breaking changes found for these packages.
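
Bump labels like those in the Notes column can be derived from the version strings rather than assigned by hand. The sketch below assumes plain dotted numeric versions and uses a hypothetical `classify_bump` helper. It follows the same major/minor/patch split that Dependabot's `update-type` metadata in this PR uses, where packaging is recorded as semver-major and hypothesis as semver-minor.

```python
def classify_bump(old: str, new: str) -> str:
    """Label a version change as 'major', 'minor', 'patch', or 'none'.

    Assumes plain dotted numeric versions (e.g. "6.151.13"); pre-release
    tags, epochs, and local versions are out of scope for this sketch.
    """
    old_parts = [int(p) for p in old.split(".")]
    new_parts = [int(p) for p in new.split(".")]

    # Pad to at least three components so "25.0" compares like "25.0.0".
    width = max(len(old_parts), len(new_parts), 3)
    old_parts += [0] * (width - len(old_parts))
    new_parts += [0] * (width - len(new_parts))

    # The first differing component among (major, minor) decides the label.
    for label, a, b in zip(("major", "minor"), old_parts, new_parts):
        if a != b:
            return label
    return "patch" if old_parts != new_parts else "none"


# Versions taken from the table above.
bumps = {
    "onnxscript": ("0.6.2", "0.7.0"),
    "gymnasium": ("1.2.3", "1.3.0"),
    "tensordict": ("0.12.1", "0.12.2"),
    "lerobot": ("0.5.0", "0.5.1"),
    "packaging": ("25.0", "26.2"),
    "hypothesis": ("6.151.13", "6.152.4"),
    "matplotlib": ("3.10.8", "3.10.9"),
}

labels = {name: classify_bump(old, new) for name, (old, new) in bumps.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

The label is purely positional. For pre-1.0 packages such as onnxscript, or calendar-style schemes, it says nothing about whether the release is actually breaking.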


Advisory verdict: COMMENT — Marshmallow 3.x → 4.x major version bump requires compatibility verification with azure-ai-ml==1.32.0; evaluation/uv.lock was not updated in this PR and must be regenerated before the changes take effect; CI is pending and GPU paths cannot be validated by hosted runners.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 2M

Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml Outdated
Comment thread evaluation/pyproject.toml
Comment thread evaluation/pyproject.toml Outdated
…nstraints

- Pin marshmallow<4, torch<2.11, numpy<2.3, packaging<26 to satisfy azure-ai-ml==1.32.0 and lerobot==0.5.1 transitive bounds
- Add matching ignore rules to dependabot.yml so future group PRs pre-filter incompatible bumps
- Refresh evaluation/uv.lock for compatible upgrades (lerobot 0.5.1, onnxruntime-gpu 1.25.1, onnxscript 0.7.0, gymnasium 1.3.0, tensordict 0.12.2, matplotlib 3.10.9, hypothesis 6.152.4)

🤖 - Generated by Copilot
@katriendg
Collaborator

Pushed a follow-up commit to make this PR mergeable.

CI failure root cause: four packages in the group bump exceeded transitive upper bounds:

Package Bump Constraint
marshmallow 4.3.0 azure-ai-ml==1.32.0 requires marshmallow<4.0.0
torch 2.11.0 lerobot==0.5.1 requires torch<2.11.0
numpy 2.4.4 lerobot==0.5.1 requires numpy<2.3.0
packaging 26.2 lerobot==0.5.1 requires packaging<26.0
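
The table above amounts to checking each proposed version against an exclusive upper bound. A minimal sketch of that check follows, assuming plain dotted numeric versions; `parse_version`, `violates_cap`, and the `CAPS`/`PROPOSED` dictionaries are hypothetical names introduced for illustration. In real tooling the `packaging` library's `SpecifierSet` is the more robust choice, since it also handles pre-releases and compound specifiers.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn "2.11.0" into (2, 11, 0). Plain dotted numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def violates_cap(candidate: str, exclusive_upper: str) -> bool:
    """True when candidate is at or above an exclusive upper bound (pkg<bound)."""
    a, b = parse_version(candidate), parse_version(exclusive_upper)
    # Pad the shorter tuple with zeros so "26.0" compares like "26.0.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# (constraining package, exclusive upper bound), as listed in the table above.
CAPS = {
    "marshmallow": ("azure-ai-ml==1.32.0", "4.0.0"),
    "torch": ("lerobot==0.5.1", "2.11.0"),
    "numpy": ("lerobot==0.5.1", "2.3.0"),
    "packaging": ("lerobot==0.5.1", "26.0"),
}

# A subset of the versions Dependabot proposed in this PR.
PROPOSED = {
    "marshmallow": "4.3.0",
    "torch": "2.11.0",
    "numpy": "2.4.4",
    "packaging": "26.2",
    "gymnasium": "1.3.0",
}

blocked = sorted(
    name
    for name, version in PROPOSED.items()
    if name in CAPS and violates_cap(version, CAPS[name][1])
)
for name in blocked:
    source, bound = CAPS[name]
    print(f"{name} {PROPOSED[name]} blocked: {source} requires {name}<{bound}")
```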

Changes in 64ef0be:

  • Reverted the four blocked packages to their last compatible versions (marshmallow==3.26.2, torch==2.10.0, numpy==2.2.6, packaging==25.0)
  • Refreshed evaluation/uv.lock
  • Added matching ignore rules to .github/dependabot.yml for the inference-dependencies group so future group PRs pre-filter these bumps until azure-ai-ml or lerobot lift the caps

All 217 evaluation pytests pass locally with 99.87% coverage. The seven remaining group bumps (lerobot 0.5.1, onnxruntime-gpu 1.25.1, onnxscript 0.7.0, gymnasium 1.3.0, tensordict 0.12.2, matplotlib 3.10.9, hypothesis 6.152.4) are preserved.

@codecov-commenter

codecov-commenter commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.59%. Comparing base (244531e) to head (a515251).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #562      +/-   ##
==========================================
+ Coverage   65.16%   66.59%   +1.42%     
==========================================
  Files         251      262      +11     
  Lines       15597    16611    +1014     
  Branches     2152     2294     +142     
==========================================
+ Hits        10164    11062     +898     
- Misses       5142     5262     +120     
+ Partials      291      287       -4     
Flag Coverage Δ *Carryforward flag
pester 83.13% <ø> (ø) Carriedforward from 64ef0be
pytest-data-pipeline 100.00% <ø> (ø) Carriedforward from 64ef0be
pytest-dataviewer 65.21% <ø> (-1.72%) ⬇️ Carriedforward from 64ef0be
pytest-dm-tools 100.00% <ø> (ø) Carriedforward from 64ef0be
pytest-evaluation 99.83% <ø> (?)
pytest-fuzz 4.99% <ø> (+0.09%) ⬆️ Carriedforward from 64ef0be
pytest-inference 0.00% <ø> (ø) Carriedforward from 64ef0be
pytest-training 82.14% <ø> (ø) Carriedforward from 64ef0be
vitest 51.06% <ø> (-1.96%) ⬇️ Carriedforward from 64ef0be

*This pull request uses carry forward flags. Click here to find out more.
see 30 files with indirect coverage changes

Contributor

@github-actions github-actions Bot left a comment

⚠️ Maintainer review recommended

Advisory Review Summary

Ecosystems and surfaces touched:

  • python-runtime: evaluation/pyproject.toml + evaluation/uv.lock (pip/uv)
  • dependabot-config: .github/dependabot.yml (upper-bound caps added)

PR body vs. diff discrepancy. The Dependabot-generated PR body lists 11 package bumps including numpy 2.2.6 → 2.4.4, marshmallow 3.26.2 → 4.3.0, packaging 25.0 → 26.2, and torch 2.10.0 → 2.11.0. A second commit by @katriendg reverted all four back to their original pinned versions and added matching ignore caps in dependabot.yml. The net change to evaluation/pyproject.toml is 7 packages, not 11.

Net package changes

Package From To Severity Surface
onnxruntime-gpu 1.24.4 1.25.1 ⚠️ ABI-sensitive python-runtime / evaluation
onnxscript 0.6.2 0.7.0 Medium (pre-1.0) python-runtime / evaluation
gymnasium 1.2.3 1.3.0 Low python-runtime / evaluation
tensordict 0.12.1 0.12.2 Low python-runtime / evaluation
lerobot 0.5.0 0.5.1 Low python-runtime / evaluation
hypothesis 6.151.13 6.152.4 Low (dev dep) python-runtime / evaluation
matplotlib 3.10.8 3.10.9 Low (dev dep) python-runtime / evaluation

Reverted (net no change from main): numpy (back to 2.2.6), marshmallow (back to 3.26.2), packaging (back to 25.0), torch (back to 2.10.0).


onnxruntime-gpu (1.24.4 → 1.25.1)

Advisory: No CVE or GHSA identifiers present in this PR. No open advisories found for this version range at the time of review.

Release notes: This is a minor bump within the 1.x series. ONNX Runtime 1.25.x release notes are available at github.com/microsoft/onnxruntime/releases/tag/v1.25.1 (could not be fetched from the sandbox).

Repo-specific risk: onnxruntime-gpu is an ABI-sensitive package per the repository surface rubric. It links against CUDA and cuDNN at specific ABI levels. While this is a minor bump (1.24.4 → 1.25.1), the CUDA toolkit version on evaluation GPU nodes must match the requirement matrix for ONNX Runtime 1.25. Hosted CI cannot exercise GPU code paths that depend on this.

Isaac Sim ABI guard: The diff does not touch training/rl/requirements.txt or training/rl/pyproject.toml, so the Isaac Sim numpy >=1.26.0,<2.0.0 pin guard does not apply here.

Validation Signal

  • Deterministic CI: PR Validation: in_progress (run 25106476055)
    • Relevant check runs for python-runtime (evaluation) surface: Evaluation Pytest Tests, Pytest Inference, Python Lint — conclusions not yet available (in progress).
    • ⚠️ Deterministic CI conclusion not yet available; verdict is advisory only.
  • Static impact reasoning: The diff exclusively modifies evaluation/ — no training/rl/ paths are touched, so the Isaac Sim ABI guard does not fire. The onnxruntime-gpu 1.25.1 bump is the sole ABI-sensitive change; GPU-specific behavior cannot be validated by hosted CI.

onnxscript (0.6.2 → 0.7.0)

Advisory: No CVE or GHSA identifiers. Package is pre-1.0; minor version increments may include breaking API changes without SemVer obligation. Release notes: github.com/microsoft/onnxscript/releases.

Repo-specific risk: Medium. If evaluation/ scripts call onnxscript APIs directly, validate against the 0.7 changelog. If onnxscript is only a transitive dependency of onnxruntime, risk is lower.


gymnasium (1.2.3 → 1.3.0)

Advisory: No CVE or GHSA identifiers. Minor bump within 1.x stable series. Low risk.


tensordict (0.12.1 → 0.12.2), lerobot (0.5.0 → 0.5.1), hypothesis (6.151.13 → 6.152.4), matplotlib (3.10.8 → 3.10.9)

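Advisory: No CVE or GHSA identifiers for any of these. tensordict, lerobot, and matplotlib are patch bumps; hypothesis is a minor bump (6.151 → 6.152) of a dev-only dependency. Low risk.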


Dependabot caps (.github/dependabot.yml)

The second commit adds four ignore rules to prevent future grouped bumps from re-introducing incompatible versions:

  • marshmallow >= 4.0.0 — correct; azure-ai-ml==1.32.0 has a transitive hard requirement on marshmallow < 4.
  • torch >= 2.11.0 — correct; lerobot == 0.5.1 constrains torch to < 2.11.
  • numpy >= 2.3.0: correct; lerobot==0.5.1 requires numpy<2.3.0.
  • packaging >= 26.0: correct; lerobot==0.5.1 requires packaging<26.0.

These caps are appropriate and prevent silent breakage in future Dependabot runs. Revisit when azure-ai-ml >= 2 or lerobot >= 0.6 is adopted.


Advisory verdict: COMMENT — CI is still in progress and onnxruntime-gpu is an ABI-sensitive bump that hosted CI cannot fully validate. All other changes are low-risk patch or minor bumps. The human-authored cap commit correctly reverts unsafe Dependabot bumps and adds protective ignore rules.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 1.7M

Comment thread evaluation/pyproject.toml
Comment thread evaluation/pyproject.toml
Comment thread .github/dependabot.yml
@katriendg
Collaborator

@WilliamBerryiii because of issues with bumping, I manually reverted some, and added current exclusions to dependabot.yml. Please review if OK before we merge? Thanks.

Contributor

@github-actions github-actions Bot left a comment

⚠️ Maintainer review recommended

Advisory Review Summary

Surfaces touched: python-runtime (evaluation): evaluation/pyproject.toml, evaluation/uv.lock; .github/dependabot.yml (ignore rules added).

PR body discrepancy: The Dependabot-generated description lists 11 package bumps, but numpy (2.2.6→2.4.4), marshmallow (3.26.2→4.3.0), torch (2.10.0→2.11.0), and packaging (25.0→26.2) are not changed in evaluation/pyproject.toml. This PR simultaneously adds upper-bound ignore rules in .github/dependabot.yml capping all four, attributed to azure-ai-ml<2 and lerobot<0.6 transitive constraints. The 7 bumps below reflect the actual manifest changes.

Package From To Severity Surface
onnxruntime-gpu 1.24.4 1.25.1 ⚠️ ABI-sensitive (CUDA) python-runtime/evaluation
onnxscript 0.6.2 0.7.0 Low (pre-1.0 minor) python-runtime/evaluation
gymnasium 1.2.3 1.3.0 Low python-runtime/evaluation
tensordict 0.12.1 0.12.2 Low python-runtime/evaluation
lerobot 0.5.0 0.5.1 Low python-runtime/evaluation
hypothesis 6.151.13 6.152.4 Low (dev dep) python-runtime/evaluation
matplotlib 3.10.8 3.10.9 Low (dev dep) python-runtime/evaluation

onnxruntime-gpu

Advisory enrichment: No GHSA or CVE identifiers found in the PR body. OSV.dev and NVD could not be queried (network sandbox). No known critical published advisory for the 1.24→1.25 version pair. Upstream repository: microsoft/onnxruntime; compare range: microsoft/onnxruntime@v1.24.4...v1.25.1

Repo-specific risk: onnxruntime-gpu is explicitly classified as ABI-sensitive under the python-runtime (evaluation) surface rubric (Isaac Sim / CUDA dependency chain). Direct usage found in evaluation/sil/play_policy.py:245. A minor version bump may shift CUDA toolkit or cuDNN requirements in ways that hosted CI (CPU-only runners) cannot validate.

Validation advice: Run play_policy.py against a GPU node to confirm CUDA runtime compatibility before merging.

Validation Signal

  • Deterministic CI: PR Validation: success (run link). Per-surface check runs for Evaluation Pytest Tests, Pytest Inference, and Python Lint could not be individually enumerated (MCP integrity filter); overall orchestrator conclusion is success.
  • Static impact reasoning: onnxruntime-gpu matches the python-runtime ABI-sensitive trigger list. Minor bump (1.24→1.25) reduces likelihood of a hard ABI break versus a major, but GPU driver compatibility on target hardware (H100, RTX PRO 6000 with GRID drivers) requires manual validation. Isaac Sim ABI guard is N/A — training/rl/ is not in this diff.

onnxscript

Advisory enrichment: No advisory identifiers in PR body. Pre-1.0 package; 0.6→0.7 is a minor version increment. Upstream compare: microsoft/onnxscript@0.6.2...0.7.0

Repo-specific risk: No direct import onnxscript found in evaluation/ Python sources — this is a transitive dependency of onnxruntime. Low operational risk.

Validation Signal

  • Deterministic CI: PR Validation: success (see above).
  • Static impact reasoning: Transitive-only usage; no direct API calls to patch.

gymnasium

Advisory enrichment: No advisory identifiers in PR body. Minor bump 1.2.3→1.3.0. Upstream: https://github.com/Farama-Foundation/Gymnasium/releases/tag/v1.3.0

Repo-specific risk: Used in evaluation/sil/monitor_checkpoints.py and evaluation/sil/play_policy.py. Minor bump within the stable 1.x series; API surface is stable.

Validation Signal

  • Deterministic CI: PR Validation: success (see above).
  • Static impact reasoning: 1.x minor bump; low risk.

tensordict

Advisory enrichment: No advisory identifiers in PR body. Patch bump 0.12.1→0.12.2.

Repo-specific risk: Torch companion library. Patch releases are generally safe; torch itself is pinned at 2.10.0 (unchanged) so there is no cross-version ABI risk between torch and tensordict in this PR.

Validation Signal

  • Deterministic CI: PR Validation: success (see above).
  • Static impact reasoning: Patch bump, low risk.

lerobot

Advisory enrichment: No advisory identifiers in PR body. Patch bump 0.5.0→0.5.1.

Repo-specific risk: Patch bump, low risk. See inline comment on this line for the version skew with evaluation/sil/docker/requirements-lerobot-eval.txt.

Validation Signal

  • Deterministic CI: PR Validation: success (see above).
  • Static impact reasoning: Patch bump; Docker requirements divergence noted in inline comment.

hypothesis / matplotlib

Advisory enrichment: Dev-only dependencies. No advisory identifiers. hypothesis 6.151.13→6.152.4 is a minor bump; matplotlib 3.10.8→3.10.9 is a patch bump. Minimal risk; test tooling only.


Dependabot Ignore Rules Added

This PR adds upper-bound caps in .github/dependabot.yml for the /evaluation pip ecosystem:

Package Cap Reason
marshmallow < 4.0.0 azure-ai-ml<2 transitive constraint
torch < 2.11.0 lerobot<0.6 transitive constraint
numpy < 2.3.0 lerobot<0.6 transitive constraint
packaging < 26.0 lerobot<0.6 transitive constraint

Revisit these caps once an azure-ai-ml 2.x or lerobot 0.6+ release lifts the corresponding upper bounds.


Uncovered Manifest

evaluation/sil/docker/requirements-lerobot-eval.txt pins lerobot==0.4.1. This file is not touched by this PR. It is tracked under the docker Dependabot entry at /evaluation/sil/docker (which manages Dockerfiles, not pip requirements files), so no pip Dependabot entry covers it. The version skew with the main manifest (lerobot==0.5.1) grows with each lerobot bump. Consider adding a pip Dependabot entry for /evaluation/sil/docker or manually aligning this file.
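
Until a pip Dependabot entry covers that directory, the skew could be surfaced by a small check. The sketch below assumes the requirements file uses simple `name==version` lines. The file excerpt, the `MANIFEST_PINS` dictionary, and the `parse_pins`/`find_skew` helpers are hypothetical stand-ins for illustration, not the repository's actual contents.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Collect exact pins ("name==version") from requirements-style text."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_skew(primary: dict[str, str], secondary: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {package: (primary_version, secondary_version)} where pins differ."""
    return {
        name: (primary[name], secondary[name])
        for name in primary.keys() & secondary.keys()
        if primary[name] != secondary[name]
    }


# Hypothetical excerpt standing in for requirements-lerobot-eval.txt.
DOCKER_REQUIREMENTS = """
# evaluation container pins
lerobot==0.4.1
"""

# Pins from the main manifest after this PR (subset, for illustration).
MANIFEST_PINS = {"lerobot": "0.5.1", "gymnasium": "1.3.0"}

skew = find_skew(MANIFEST_PINS, parse_pins(DOCKER_REQUIREMENTS))
for name, (manifest_version, docker_version) in skew.items():
    print(f"{name}: manifest pins {manifest_version}, Docker requirements pin {docker_version}")
```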


Advisory verdict: COMMENT — onnxruntime-gpu (1.24.4→1.25.1) is an ABI-sensitive CUDA package that triggers mandatory maintainer review per the python-runtime (evaluation) surface rubric. Overall PR Validation CI passed (success), but GPU driver compatibility on target hardware cannot be confirmed by hosted runners alone.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 1.4M

Comment thread evaluation/pyproject.toml
Comment thread .github/dependabot.yml
Comment thread evaluation/pyproject.toml
Comment thread evaluation/pyproject.toml
Comment thread evaluation/pyproject.toml
@WilliamBerryiii WilliamBerryiii merged commit 087f53a into main May 1, 2026
47 checks passed
@WilliamBerryiii WilliamBerryiii deleted the dependabot/pip/evaluation/inference-dependencies-75952c0aae branch May 1, 2026 20:14

Labels

dependencies Dependency version updates python Pull requests that update python code

3 participants