chore(deps): bump the inference-dependencies group across 1 directory with 11 updates #562
Dependency Review
The following issues were found:
Snapshot Warnings
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.
License Issues
evaluation/uv.lock
OpenSSF Scorecard
Scanned Files
|
|
✅ AW Dependabot PR Review completed successfully! |
Advisory Review Summary
Affected ecosystems / surfaces: python-runtime (pip — evaluation/)
| Package | From | To | Severity | Surface |
|---|---|---|---|---|
| numpy | 2.2.6 | 2.4.4 | | python-runtime |
| marshmallow | 3.26.2 | 4.3.0 | | python-runtime |
| packaging | 25.0 | 26.2 | Low | python-runtime |
| onnxscript | 0.6.2 | 0.7.0 | Low | python-runtime |
| onnxruntime-gpu | 1.24.4 | 1.25.0 | | python-runtime |
| gymnasium | 1.2.3 | 1.3.0 | Low | python-runtime |
| torch | 2.10.0 | 2.11.0 | | python-runtime |
| tensordict | 0.12.1 | 0.12.2 | Medium (torch-coupled) | python-runtime |
| lerobot | 0.5.0 | 0.5.1 | Low (pre-1.0 patch) | python-runtime |
| hypothesis | 6.151.13 | 6.152.3 | Low (dev only) | python-runtime |
| matplotlib | 3.10.8 | 3.10.9 | Low (dev only) | python-runtime |
numpy
Version jump: 2.2.6 → 2.4.4 (skips the 2.3.x series entirely).
- NumPy 2.3 introduced a new variable-width string dtype (`numpy.strings`) and related C-extension API additions. Code compiled against 2.2.x wheels is compatible at runtime, but C-extension authors must recompile against ≥2.3 to use the new APIs.
- 2.4.4 is a pure patch release fixing an OpenBLAS ARM threading issue (#30816) and a FNV-1a 64-bit hash selection bug (#31052).
- Release notes: https://github.com/numpy/numpy/releases/tag/v2.4.4
Repo-specific risk: training/rl independently pins numpy==1.26.4 (the <2.0.0 Isaac Sim constraint) — that surface is unaffected by this PR. The evaluation surface itself has no documented <2.0.0 constraint, so the 2.4.4 target is permissible.
marshmallow
Major version bump: 3.26.2 → 4.3.0. marshmallow 4.0 removed APIs deprecated in the 3.x series.
Notable breaking changes in 4.0 (source: marshmallow.readthedocs.io/redacted):
- `Schema.Meta` field-level defaults restructured; `missing`/`default` semantics unified.
- Several previously-deprecated constructor kwargs removed from `Field` subclasses.
- `@pre_load`/`@post_load` pass-many behaviour changed.
- Python ≥3.9 required (project already requires ≥3.12, so no conflict).
Any marshmallow schema code in evaluation/ must be audited against the 4.0 migration guide before merging.
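One way to scope that audit is to list which source files import marshmallow at all. The sketch below is a minimal illustration using only the standard library; `find_marshmallow_imports` is a hypothetical helper, not something that exists in the repository, and it only detects static `import` statements.

```python
import ast


def find_marshmallow_imports(source: str) -> list[str]:
    """Return the marshmallow module names imported by a piece of Python source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]
        else:
            continue
        # Keep only the marshmallow package itself or its submodules.
        found.extend(
            n for n in names if n == "marshmallow" or n.startswith("marshmallow.")
        )
    return sorted(set(found))


# Hypothetical sample source, standing in for a file under evaluation/.
sample = (
    "import os\n"
    "from marshmallow import Schema, fields\n"
    "import marshmallow.validate\n"
)
print(find_marshmallow_imports(sample))
```

Running the helper over each `*.py` file under `evaluation/` (for example with `pathlib.Path.rglob`) would give the list of files to check against the migration guide. An empty result would suggest marshmallow is only consumed transitively.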
onnxscript
Pre-1.0 minor bump: 0.6.2 → 0.7.0. No GHSA/CVE found. Pre-1.0 minor bumps may contain breaking changes. Upstream compare: microsoft/onnxscript@v0.6.2...v0.7.0
onnxruntime-gpu
CUDA minimum raised to 12.0; multiple security fixes.
Breaking changes in v1.25.0 (source: https://github.com/microsoft/onnxruntime/releases/tag/v1.25.0):
CUDA 11.x is no longer supported. Users pinned to CUDA 11.x should stay on ORT 1.24.x or upgrade their CUDA toolkit/driver.
ORT_API_VERSION updated to 25. C/C++ extension consumers compiled against older ORT headers must rebuild.
Security fixes bundled in 1.25.0:
- Fixed potential integer truncation → heap out-of-bounds read/write (PR microsoft/onnxruntime#27544)
- Addressed Pad Reflect vulnerability (PR microsoft/onnxruntime#27652)
- Security fix for transpose optimizer (PR microsoft/onnxruntime#27555)
- Upgraded minimatch 3.1.2 → 3.1.4 for CVE-2026-27904 (PR microsoft/onnxruntime#27667 — affects build tooling only; no Python runtime exposure)
- Added `onnx::TensorProto` data size validation before allocation (PR microsoft/onnxruntime#27547)
- Fixed misaligned address reads for tensor attributes from raw data buffers (PR microsoft/onnxruntime#27312)
Note: CVE-2026-27904 is cited in onnxruntime's release notes in relation to `minimatch` (a Node.js library used in ONNX Runtime's JS/build tooling). No GHSA record was located for this CVE ID at query time; the severity and affected-version range are not independently verified here. Do not treat this advisory summary as authoritative for that CVE.
gymnasium
Minor bump: 1.2.3 → 1.3.0. No GHSA/CVE found. Low risk. Source: https://github.com/Farama-Foundation/Gymnasium/releases
torch
Minor bump: 2.10.0 → 2.11.0. No GHSA/CVE found. CUDA ABI-sensitive: pre-built wheels are compiled against a specific CUDA version. Verify the evaluation GPU environment runs a compatible CUDA toolkit.
tensordict
Patch bump: 0.12.1 → 0.12.2. Tightly coupled with PyTorch; this patch version tracks torch==2.11.0. No GHSA/CVE found. Source: https://github.com/pytorch/tensordict/releases
lerobot
Pre-1.0 patch bump: 0.5.0 → 0.5.1. No GHSA/CVE found. Low risk. Source: https://github.com/huggingface/lerobot/releases
hypothesis / matplotlib
Dev-only patch bumps. No advisory or operational risk.
Uncovered-manifest note
training/il/lerobot/pyproject.toml pins lerobot==0.5.0 and numpy==2.2.6 independently but is not covered by .github/dependabot.yml. Consider adding a pip entry for /training/il/lerobot so those pins receive automated updates.
Validation advice
# In evaluation/
ruff check .
pytest -m "not e2e" # non-GPU unit tests
# On a GPU node (CUDA ≥ 12.0 required):
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
python -c "import onnxruntime; print(onnxruntime.__version__)"
pytest -m e2e # GPU-gated SIL smoke tests
Advisory verdict: COMMENT. Multiple high-risk signals fire: onnxruntime-gpu requires CUDA ≥ 12.0 (breaking for CUDA 11.x environments), marshmallow crosses a major version boundary, and numpy / torch involve CUDA/ABI-sensitive bumps that should be validated on GPU hardware before merging.
Note
🔒 Integrity filter blocked 1 item
The following item was blocked because it doesn't meet the GitHub integrity level.
- #562
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
  github:
    min-integrity: approved # merged | approved | unapproved | none
Generated by AW Dependabot PR Review for issue #562 · ● 920.4K
|
Advisory Review Summary
Ecosystems / surfaces touched:
- `pip`/`uv` under `evaluation/` → python-runtime surface (11 packages, single manifest)
Changed file: evaluation/pyproject.toml only. No uv.lock change is present in this PR — if lock-file regeneration is deferred to a separate PR, note that transitive pins are not yet updated.
Package table
| Package | From | To | Severity | Surface |
|---|---|---|---|---|
| numpy | 2.2.6 | 2.4.4 | — (ABI-sensitive) | python-runtime |
| marshmallow | 3.26.2 | 4.3.0 | — (major bump) | python-runtime |
| packaging | 25.0 | 26.2 | — (major bump) | python-runtime |
| onnxscript | 0.6.2 | 0.7.0 | — | python-runtime |
| onnxruntime-gpu | 1.24.4 | 1.25.0 | Security (CVE-2026-27904) | python-runtime |
| gymnasium | 1.2.3 | 1.3.0 | — | python-runtime |
| torch | 2.10.0 | 2.11.0 | — (backward-incompat) | python-runtime |
| tensordict | 0.12.1 | 0.12.2 | — | python-runtime |
| lerobot | 0.5.0 | 0.5.1 | — (pre-1.0) | python-runtime |
| hypothesis | 6.151.13 | 6.152.3 | — | python-runtime (dev) |
| matplotlib | 3.10.8 | 3.10.9 | — | python-runtime (dev) |
onnxruntime-gpu
CVE-2026-27904 — Bundled minimatch dependency upgraded from 3.1.2 → 3.1.4. This addresses a vulnerability in a JavaScript build-time/dev-time tool bundled inside the ORT distribution. Additional security hardening in 1.25.0: Pad Reflect vulnerability fix (onnxruntime#27652), transpose optimizer security fix (onnxruntime#27555), and hardened shell command handling.
Breaking change: CUDA minimum version raised to 12.0. CUDA 11.x is no longer supported. Environments pinned to CUDA 11.x must stay on ORT 1.24.x or upgrade CUDA before this PR lands.
Source: ONNX Runtime v1.25.0 release notes
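The CUDA ≥ 12.0 requirement can be checked on a GPU node before merging. The following is a minimal sketch, not part of the repository: `cuda_meets_minimum` and `report` are hypothetical helpers. It relies on `torch.version.cuda`, which reports the CUDA version the torch wheel was built against (not the installed driver), and on `onnxruntime.get_available_providers()`.

```python
def cuda_meets_minimum(cuda_version, minimum=(12, 0)):
    """Return True if a 'major.minor' CUDA version string is at least `minimum`."""
    if not cuda_version:  # torch.version.cuda is None on CPU-only builds
        return False
    parts = cuda_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= minimum


def report():
    """Print the torch CUDA build version and ONNX Runtime providers, if installed."""
    try:
        import onnxruntime
        import torch
    except ImportError as exc:
        print(f"skipped: {exc}")
        return
    status = "ok" if cuda_meets_minimum(torch.version.cuda) else "below 12.0"
    print("torch CUDA build:", torch.version.cuda, status)
    print("ORT providers:", onnxruntime.get_available_providers())


if __name__ == "__main__":
    report()
```

If `CUDAExecutionProvider` is missing from the printed provider list, ONNX Runtime would fall back to CPU execution, which is worth catching before the GPU-gated tests run.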
marshmallow
Major version: 3.26.2 → 4.3.0. Marshmallow 4.0 removed all APIs deprecated in 3.x. Any schema using @post_load(pass_many=True), @validates_schema(pass_many=True), or affected Schema.Meta options will raise errors at runtime.
Audit all `marshmallow.Schema` subclasses in `evaluation/` against the [3→4 migration guide](marshmallow.readthedocs.io/redacted). Run `ruff check` and targeted `pytest` before merging.
Source: marshmallow CHANGELOG
numpy
2.2.6 → 2.4.4 skips the 2.3.x series (additional C-API / ABI changes). The 2.4.4 patch release closes the OpenBLAS threading bug on ARM (#30816) and fixes a FNV-1a 64-bit hash selection issue (#31052). Confirm CUDA wheel ABI compatibility with NumPy 2.4.x for co-installed packages (torch, onnxruntime-gpu).
Note: training/rl/pyproject.toml retains numpy==1.26.4 (within the >=1.26.0,<2.0.0 Isaac Sim constraint) — that surface is unaffected by this PR.
Source: NumPy 2.4.4 release notes
torch
2.10.0 → 2.11.0: minor bump with a documented "Backwards Incompatible Changes" section (Release Engineering category; details truncated in the Dependabot body). See the [PyTorch 2.11.0 release blog](pytorch.org/redacted) for the full list.
Source: PyTorch 2.11.0 release notes
packaging
25.0 → 26.2 — major version jump. The 26.0/26.1 releases removed packaging._structures; 26.2 restores pickle backward-compatibility for serialized version/specifier objects. If evaluation pipelines cache pickled packaging objects, regenerate them after upgrade.
Source: packaging 26.2 release notes
onnxscript
0.6.2 → 0.7.0 — pre-1.0 minor bump. No advisory found. Changelog: compare view.
gymnasium
1.2.3 → 1.3.0 — adds RepeatAction wrapper and external environment support. No breaking changes identified. Compare view.
tensordict / lerobot / hypothesis / matplotlib
Patch bumps. lerobot 0.5.1 is pre-1.0 and includes a fix for a breaking change from transformers 5.4.0 (lerobot#3231).
Advisory verdict: COMMENT — Three high-risk signals fire on the python-runtime surface: onnxruntime-gpu requires CUDA ≥ 12.0, marshmallow crosses a major breaking version, and numpy skips an ABI-changing minor series. Maintainer validation against the CUDA environment and existing schema code is recommended before merge. The security fix for CVE-2026-27904 in onnxruntime-gpu 1.25.0 is a positive motivator to merge once those checks pass.
|
@dependabot rebase |
|
Looks like this PR has been edited by someone other than Dependabot. That means Dependabot can't rebase it - sorry! If you're happy for Dependabot to recreate it from scratch, overwriting any edits, you can request |
|
@dependabot recreate |
… with 11 updates

Bumps the inference-dependencies group with 11 updates in the /evaluation directory:

| Package | From | To |
| --- | --- | --- |
| [numpy](https://github.com/numpy/numpy) | `2.2.6` | `2.4.4` |
| [marshmallow](https://github.com/marshmallow-code/marshmallow) | `3.26.2` | `4.3.0` |
| [packaging](https://github.com/pypa/packaging) | `25.0` | `26.2` |
| [onnxscript](https://github.com/microsoft/onnxscript) | `0.6.2` | `0.7.0` |
| [onnxruntime-gpu](https://github.com/microsoft/onnxruntime) | `1.24.4` | `1.25.1` |
| [gymnasium](https://github.com/Farama-Foundation/Gymnasium) | `1.2.3` | `1.3.0` |
| [torch](https://github.com/pytorch/pytorch) | `2.10.0` | `2.11.0` |
| [tensordict](https://github.com/pytorch/tensordict) | `0.12.1` | `0.12.2` |
| [lerobot](https://github.com/huggingface/lerobot) | `0.5.0` | `0.5.1` |
| [hypothesis](https://github.com/HypothesisWorks/hypothesis) | `6.151.13` | `6.152.4` |
| [matplotlib](https://github.com/matplotlib/matplotlib) | `3.10.8` | `3.10.9` |

Updates `numpy` from 2.2.6 to 2.4.4
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst)
- [Commits](numpy/numpy@v2.2.6...v2.4.4)

Updates `marshmallow` from 3.26.2 to 4.3.0
- [Changelog](https://github.com/marshmallow-code/marshmallow/blob/dev/CHANGELOG.rst)
- [Commits](marshmallow-code/marshmallow@3.26.2...4.3.0)

Updates `packaging` from 25.0 to 26.2
- [Release notes](https://github.com/pypa/packaging/releases)
- [Changelog](https://github.com/pypa/packaging/blob/main/CHANGELOG.rst)
- [Commits](pypa/packaging@25.0...26.2)

Updates `onnxscript` from 0.6.2 to 0.7.0
- [Release notes](https://github.com/microsoft/onnxscript/releases)
- [Commits](microsoft/onnxscript@v0.6.2...v0.7.0)

Updates `onnxruntime-gpu` from 1.24.4 to 1.25.1
- [Release notes](https://github.com/microsoft/onnxruntime/releases)
- [Changelog](https://github.com/microsoft/onnxruntime/blob/main/docs/ReleaseManagement.md)
- [Commits](microsoft/onnxruntime@v1.24.4...v1.25.1)

Updates `gymnasium` from 1.2.3 to 1.3.0
- [Release notes](https://github.com/Farama-Foundation/Gymnasium/releases)
- [Commits](Farama-Foundation/Gymnasium@v1.2.3...v1.3.0)

Updates `torch` from 2.10.0 to 2.11.0
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](pytorch/pytorch@v2.10.0...v2.11.0)

Updates `tensordict` from 0.12.1 to 0.12.2
- [Release notes](https://github.com/pytorch/tensordict/releases)
- [Commits](pytorch/tensordict@v0.12.1...v0.12.2)

Updates `lerobot` from 0.5.0 to 0.5.1
- [Release notes](https://github.com/huggingface/lerobot/releases)
- [Commits](huggingface/lerobot@v0.5.0...v0.5.1)

Updates `hypothesis` from 6.151.13 to 6.152.4
- [Release notes](https://github.com/HypothesisWorks/hypothesis/releases)
- [Commits](HypothesisWorks/hypothesis@hypothesis-python-6.151.13...hypothesis-python-6.152.4)

Updates `matplotlib` from 3.10.8 to 3.10.9
- [Release notes](https://github.com/matplotlib/matplotlib/releases)
- [Commits](matplotlib/matplotlib@v3.10.8...v3.10.9)

---
updated-dependencies:
- dependency-name: gymnasium
  dependency-version: 1.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: hypothesis
  dependency-version: 6.152.3
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: lerobot
  dependency-version: 0.5.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: marshmallow
  dependency-version: 4.3.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: inference-dependencies
- dependency-name: matplotlib
  dependency-version: 3.10.9
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: numpy
  dependency-version: 2.4.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: onnxruntime-gpu
  dependency-version: 1.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: onnxscript
  dependency-version: 0.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
- dependency-name: packaging
  dependency-version: '26.2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: inference-dependencies
- dependency-name: tensordict
  dependency-version: 0.12.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: inference-dependencies
- dependency-name: torch
  dependency-version: 2.11.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: inference-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
d83d1e8 to d8de8f3
Advisory Review Summary
PR: chore(deps): bump the inference-dependencies group across 1 directory with 11 updates
Author: dependabot[bot] · Surfaces touched: python-runtime (evaluation)
Manifests changed: evaluation/pyproject.toml only
Lock file status: evaluation/uv.lock was NOT updated in this PR — see critical note below.
Package Table
| Package | From | To | Severity | Surface |
|---|---|---|---|---|
| marshmallow | 3.26.2 | 4.3.0 | | python-runtime (evaluation) |
| torch | 2.10.0 | 2.11.0 | GPU ABI-sensitive | python-runtime (evaluation) |
| onnxruntime-gpu | 1.24.4 | 1.25.1 | GPU ABI-sensitive | python-runtime (evaluation) |
| numpy | 2.2.6 | 2.4.4 | Low (within 2.x) | python-runtime (evaluation) |
| onnxscript | 0.6.2 | 0.7.0 | Low | python-runtime (evaluation) |
| gymnasium | 1.2.3 | 1.3.0 | Low | python-runtime (evaluation) |
| tensordict | 0.12.1 | 0.12.2 | Low | python-runtime (evaluation) |
| lerobot | 0.5.0 | 0.5.1 | Low | python-runtime (evaluation) |
| packaging | 25.0 | 26.2 | Low | python-runtime (evaluation) |
| hypothesis | 6.151.13 | 6.152.4 | Low (dev) | python-runtime (evaluation) |
| matplotlib | 3.10.8 | 3.10.9 | Low (dev) | python-runtime (evaluation) |
🔴 Critical: Lock File Not Updated
evaluation/uv.lock was not changed in this PR (the diff contains only 13 additions, all in evaluation/pyproject.toml). The lock file still resolves the old versions:
- `marshmallow` → `3.26.2` (lock) vs `4.3.0` (manifest)
- `numpy` → `2.2.6` (lock) vs `2.4.4` (manifest)
- `torch` → `2.10.0` (lock) vs `2.11.0` (manifest)
- `onnxruntime-gpu` → `1.24.4` (lock) vs `1.25.1` (manifest)
Any workflow using uv sync --frozen will install the old versions. A regenerated lock file is required before this PR is production-safe. Dependabot should have updated both files; the missing lock regeneration suggests a gap in the .github/dependabot.yml lockfile configuration.
marshmallow
3.26.2 → 4.3.0 — MAJOR version bump
No GHSA or CVE advisory was referenced in the PR body. No security vulnerability is claimed for this bump; it is a routine major-version upgrade.
marshmallow is listed as a direct dependency in evaluation/pyproject.toml but is not directly imported in any evaluation/ Python source files found. It is consumed transitively by azure-ai-ml==1.32.0. The lock file records azure-ai-ml as depending on { name = "marshmallow" } with no upper-bound constraint, so the version may be compatible—but this needs verification against azure-ai-ml's actual runtime behaviour with marshmallow 4.x.
Marshmallow 4.0 introduced breaking changes at the API boundary (e.g. Meta.strict removed, field default handling changed, unknown-field behaviour). Review the migration guide before merging: (marshmallow.readthedocs.io/redacted)
Validation Signal
Deterministic CI: PR Validation: pending
- No check runs have reported yet (total_count: 0 at review time). The relevant authoritative runs for this surface are `Evaluation Pytest Tests`, `Pytest Inference`, and `Python Lint`. ⚠️ Deterministic CI conclusion not yet available; verdict is advisory only.
Static impact reasoning: marshmallow is not directly imported in evaluation Python files; however, a major version bump without a corresponding lock file regeneration means the new version has not been resolved or tested by uv. Merge only after the lock file is regenerated and CI passes.
torch
2.10.0 → 2.11.0 — minor bump, GPU ABI-sensitive
No GHSA/CVE advisory referenced. Minor bump within the 2.x series. Release notes: https://github.com/pytorch/pytorch/releases/tag/v2.11.0
torch links against CUDA and carries CUDA kernel ABIs. Hosted CI cannot exercise GPU execution paths used in evaluation/sil/ workflows. GPU smoke-testing on target hardware before merge is recommended.
Validation Signal
Deterministic CI: PR Validation: pending — Evaluation Pytest Tests and Pytest Inference not yet available.
Static impact reasoning: Minor bump; no major-version ABI boundary crossed. GPU paths require hardware validation beyond hosted CI scope.
onnxruntime-gpu
1.24.4 → 1.25.1 — minor bump, GPU ABI-sensitive
No GHSA/CVE advisory referenced. onnxruntime-gpu links against CUDA/cuDNN execution providers. Minor bumps can still carry CUDA operator kernel changes. Validate on GPU nodes before merge.
Release notes: https://github.com/microsoft/onnxruntime/releases/tag/v1.25.1
Validation Signal
Deterministic CI: PR Validation: pending
Static impact reasoning: ABI-sensitive GPU package; hosted CI cannot validate CUDA execution providers.
numpy
2.2.6 → 2.4.4 — minor bump within 2.x
No GHSA/CVE advisory referenced. evaluation/ was already on numpy 2.x (2.2.6), so the 2.0 ABI boundary has already been crossed on this surface. This bump is low-risk for the evaluation surface.
Isaac Sim ABI guard: training/rl/scripts/train.sh enforces numpy>=1.26.0,<2.0.0 and training/rl/pyproject.toml pins numpy==1.26.4. This PR does not touch training/rl/, so the ABI guard is not violated. The two numpy versions coexist in different virtual environments.
Remaining Packages (Low Risk)
| Package | From | To | Notes |
|---|---|---|---|
| onnxscript | 0.6.2 | 0.7.0 | Minor bump; ONNX script authoring toolchain |
| gymnasium | 1.2.3 | 1.3.0 | Minor bump; RL environment framework |
| tensordict | 0.12.1 | 0.12.2 | Patch bump; pytorch tensordict |
| lerobot | 0.5.0 | 0.5.1 | Patch bump; LeRobot IL policy runtime |
| packaging | 25.0 | 26.2 | Major bump (semver-major in the Dependabot metadata); PEP 440 utilities |
| hypothesis | 6.151.13 | 6.152.4 | Patch bump; dev-only property-based testing |
| matplotlib | 3.10.8 | 3.10.9 | Patch bump; dev-only plotting |
No advisories or breaking changes found for these packages.
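The bump labels in tables like the one above can be cross-checked by comparing version components position by position. This is a minimal sketch under a simplifying assumption: versions are plain dotted integers with no pre-release or local suffixes. `update_type` is a hypothetical helper, and it applies the same positional rule to pre-1.0 packages, where a minor bump may still be breaking.

```python
def parse(version):
    """Split a plain dotted release such as '26.2' into a 3-tuple of ints."""
    parts = [int(p) for p in version.split(".")]
    # Pad short versions with zeros, then keep the first three components.
    return tuple(parts + [0] * (3 - len(parts)))[:3]


def update_type(old, new):
    """Classify a bump as 'major', 'minor', 'patch', or 'none'."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "patch"
    return "none"


for name, old, new in [
    ("packaging", "25.0", "26.2"),
    ("onnxscript", "0.6.2", "0.7.0"),
    ("hypothesis", "6.151.13", "6.152.4"),
    ("matplotlib", "3.10.8", "3.10.9"),
]:
    print(f"{name}: {update_type(old, new)}")
```

For real-world version strings, a full PEP 440 parser would be the safer choice; the tuple comparison here is only meant to make the classification rule explicit.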
Advisory verdict: COMMENT — Marshmallow 3.x → 4.x major version bump requires compatibility verification with azure-ai-ml==1.32.0; evaluation/uv.lock was not updated in this PR and must be regenerated before the changes take effect; CI is pending and GPU paths cannot be validated by hosted runners.
…nstraints
- Pin marshmallow<4, torch<2.11, numpy<2.3, packaging<26 to satisfy azure-ai-ml==1.32.0 and lerobot==0.5.1 transitive bounds
- Add matching ignore rules to dependabot.yml so future group PRs pre-filter incompatible bumps
- Refresh evaluation/uv.lock for compatible upgrades (lerobot 0.5.1, onnxruntime-gpu 1.25.1, onnxscript 0.7.0, gymnasium 1.3.0, tensordict 0.12.2, matplotlib 3.10.9, hypothesis 6.152.4)
🤖 - Generated by Copilot
|
Pushed a follow-up commit to make this PR mergeable. CI failure root cause: four packages in the group bump exceeded transitive upper bounds:
Changes in 64ef0be:
All 217 evaluation pytests pass locally with 99.87% coverage. The seven remaining group bumps ( |
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #562 +/- ##
==========================================
+ Coverage 65.16% 66.59% +1.42%
==========================================
Files 251 262 +11
Lines 15597 16611 +1014
Branches 2152 2294 +142
==========================================
+ Hits 10164 11062 +898
- Misses 5142 5262 +120
+ Partials 291 287 -4
|
Advisory Review Summary
Ecosystems and surfaces touched:
- `python-runtime`: `evaluation/pyproject.toml` + `evaluation/uv.lock` (pip/uv)
- `dependabot-config`: `.github/dependabot.yml` (upper-bound caps added)
PR body vs. diff discrepancy. The Dependabot-generated PR body lists 11 package bumps including `numpy 2.2.6 → 2.4.4`, `marshmallow 3.26.2 → 4.3.0`, `packaging 25.0 → 26.2`, and `torch 2.10.0 → 2.11.0`. A second commit by @katriendg reverted all four back to their original pinned versions and added matching ignore caps in `dependabot.yml`. The net change to `evaluation/pyproject.toml` is 7 packages, not 11.
Net package changes
| Package | From | To | Severity | Surface |
|---|---|---|---|---|
| `onnxruntime-gpu` | 1.24.4 | 1.25.1 | | python-runtime / evaluation |
| `onnxscript` | 0.6.2 | 0.7.0 | Medium (pre-1.0) | python-runtime / evaluation |
| `gymnasium` | 1.2.3 | 1.3.0 | Low | python-runtime / evaluation |
| `tensordict` | 0.12.1 | 0.12.2 | Low | python-runtime / evaluation |
| `lerobot` | 0.5.0 | 0.5.1 | Low | python-runtime / evaluation |
| `hypothesis` | 6.151.13 | 6.152.4 | Low (dev dep) | python-runtime / evaluation |
| `matplotlib` | 3.10.8 | 3.10.9 | Low (dev dep) | python-runtime / evaluation |
Reverted (net no change from main): numpy (back to 2.2.6), marshmallow (back to 3.26.2), packaging (back to 25.0), torch (back to 2.10.0).
onnxruntime-gpu (1.24.4 → 1.25.1)
Advisory: No CVE or GHSA identifiers present in this PR. No open advisories found for this version range at the time of review.
Release notes: This is a patch/minor bump within the 1.x series. ONNX Runtime 1.25.x release notes are available at github.com/microsoft/onnxruntime/releases/tag/v1.25.1 — unable to fetch from sandbox.
Repo-specific risk: onnxruntime-gpu is an ABI-sensitive package per the repository surface rubric. It links against CUDA and cuDNN at specific ABI levels. While this is a minor bump (1.24.4 → 1.25.1), the CUDA toolkit version on evaluation GPU nodes must match the requirement matrix for ONNX Runtime 1.25. Hosted CI cannot exercise GPU code paths that depend on this.
Isaac Sim ABI guard: The diff does not touch training/rl/requirements.txt or training/rl/pyproject.toml, so the Isaac Sim numpy >=1.26.0,<2.0.0 pin guard does not apply here.
Validation Signal
- Deterministic CI: `PR Validation: in_progress:in_progress`, run 25106476055
  - Relevant check runs for `python-runtime (evaluation)` surface: `Evaluation Pytest Tests`, `Pytest Inference`, `Python Lint`; conclusions not yet available (in progress). ⚠️ Deterministic CI conclusion not yet available; verdict is advisory only.
- Static impact reasoning: The diff exclusively modifies `evaluation/`; no `training/rl/` paths are touched, so the Isaac Sim ABI guard does not fire. The `onnxruntime-gpu` 1.25.1 bump is the sole ABI-sensitive change; GPU-specific behavior cannot be validated by hosted CI.
onnxscript (0.6.2 → 0.7.0)
Advisory: No CVE or GHSA identifiers. Package is pre-1.0; minor version increments may include breaking API changes without SemVer obligation. Release notes: github.com/microsoft/onnxscript/releases.
Repo-specific risk: Medium. If evaluation/ scripts call onnxscript APIs directly, validate against the 0.7 changelog. If onnxscript is only a transitive dependency of onnxruntime, risk is lower.
gymnasium (1.2.3 → 1.3.0)
Advisory: No CVE or GHSA identifiers. Minor bump within 1.x stable series. Low risk.
tensordict (0.12.1 → 0.12.2), lerobot (0.5.0 → 0.5.1), hypothesis (6.151.13 → 6.152.4), matplotlib (3.10.8 → 3.10.9)
Advisory: No CVE or GHSA identifiers for any of these. All are patch bumps. Low risk.
Dependabot caps (.github/dependabot.yml)
The second commit adds four ignore rules to prevent future grouped bumps from re-introducing incompatible versions:
- `marshmallow >= 4.0.0`: correct; `azure-ai-ml==1.32.0` has a transitive hard requirement on `marshmallow < 4`.
- `torch >= 2.11.0`: correct; `lerobot == 0.5.1` constrains torch to < 2.11.
- `numpy >= 2.3.0`: correct; transitive constraint from the above.
- `packaging >= 26.0`: correct; transitive constraint.
These caps are appropriate and prevent silent breakage in future Dependabot runs. Revisit when azure-ai-ml >= 2 or lerobot >= 0.6 is adopted.
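The effect of the four ignore rules can be illustrated with a small check that decides whether a candidate version falls at or above a cap. This is a sketch of the rule semantics as described in this review, not of Dependabot's actual implementation; `IGNORE_FROM` and `is_ignored` are hypothetical names, and versions are assumed to be plain dotted integers.

```python
# Lower bounds of the ignored ranges, as listed in the review above.
IGNORE_FROM = {
    "marshmallow": "4.0.0",
    "torch": "2.11.0",
    "numpy": "2.3.0",
    "packaging": "26.0",
}


def as_tuple(version):
    """Convert '2.11.0' into (2, 11, 0)."""
    return tuple(int(part) for part in version.split("."))


def is_ignored(name, candidate):
    """Return True if `candidate` is at or above the ignore floor for `name`."""
    floor = IGNORE_FROM.get(name)
    if floor is None:
        return False  # no cap recorded for this package
    a, b = as_tuple(candidate), as_tuple(floor)
    width = max(len(a), len(b))
    # Pad to equal length so '26.2' and '26.0.0' compare component-wise.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


for name, version in [("torch", "2.11.0"), ("torch", "2.10.0"), ("packaging", "26.2")]:
    print(name, version, "ignored" if is_ignored(name, version) else "allowed")
```

Under these caps the four reverted bumps from this PR would all be filtered, while the seven remaining bumps would pass through.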
Advisory verdict: COMMENT — CI is still in progress and onnxruntime-gpu is an ABI-sensitive bump that hosted CI cannot fully validate. All other changes are low-risk patch or minor bumps. The human-authored cap commit correctly reverts unsafe Dependabot bumps and adds protective ignore rules.
|
@WilliamBerryiii because of issues with bumping, I manually reverted some, and added current exclusions to dependabot.yml. Please review if OK before we merge? Thanks. |
Advisory Review Summary
Surfaces touched: python-runtime (evaluation) — evaluation/pyproject.toml, evaluation/uv.lock; .github/dependabot.yml (ignore rules added).
PR body discrepancy: The Dependabot-generated description lists 11 package bumps, but `numpy` (2.2.6→2.4.4), `marshmallow` (3.26.2→4.3.0), `torch` (2.10.0→2.11.0), and `packaging` (25.0→26.2) are not changed in `evaluation/pyproject.toml`. This PR simultaneously adds upper-bound ignore rules in `.github/dependabot.yml` capping all four, attributed to `azure-ai-ml<2` and `lerobot<0.6` transitive constraints. The 7 bumps below reflect the actual manifest changes.
| Package | From | To | Severity | Surface |
|---|---|---|---|---|
| `onnxruntime-gpu` | 1.24.4 | 1.25.1 | | python-runtime/evaluation |
| `onnxscript` | 0.6.2 | 0.7.0 | Low (pre-1.0 minor) | python-runtime/evaluation |
| `gymnasium` | 1.2.3 | 1.3.0 | Low | python-runtime/evaluation |
| `tensordict` | 0.12.1 | 0.12.2 | Low | python-runtime/evaluation |
| `lerobot` | 0.5.0 | 0.5.1 | Low | python-runtime/evaluation |
| `hypothesis` | 6.151.13 | 6.152.4 | Low (dev dep) | python-runtime/evaluation |
| `matplotlib` | 3.10.8 | 3.10.9 | Low (dev dep) | python-runtime/evaluation |
onnxruntime-gpu
Advisory enrichment: No GHSA or CVE identifiers found in the PR body. OSV.dev and NVD could not be queried (network sandbox). No known critical published advisory for the 1.24→1.25 version pair. Upstream repository: microsoft/onnxruntime; compare range: microsoft/onnxruntime@v1.24.4...v1.25.1
Repo-specific risk: onnxruntime-gpu is explicitly classified as ABI-sensitive under the python-runtime (evaluation) surface rubric (Isaac Sim / CUDA dependency chain). Direct usage found in evaluation/sil/play_policy.py:245. A minor version bump may shift CUDA toolkit or cuDNN requirements in ways that hosted CI (CPU-only runners) cannot validate.
Validation advice: Run play_policy.py against a GPU node to confirm CUDA runtime compatibility before merging.
Validation Signal
- Deterministic CI: `PR Validation: success`, run link. Per-surface check runs for `Evaluation Pytest Tests`, `Pytest Inference`, and `Python Lint` could not be individually enumerated (MCP integrity filter); overall orchestrator conclusion is `success`.
- Static impact reasoning: `onnxruntime-gpu` matches the python-runtime ABI-sensitive trigger list. Minor bump (1.24→1.25) reduces likelihood of a hard ABI break versus a major, but GPU driver compatibility on target hardware (H100, RTX PRO 6000 with GRID drivers) requires manual validation. Isaac Sim ABI guard is N/A: `training/rl/` is not in this diff.
onnxscript
Advisory enrichment: No advisory identifiers in PR body. Pre-1.0 package; 0.6→0.7 is a minor version increment. Upstream compare: microsoft/onnxscript@0.6.2...0.7.0
Repo-specific risk: No direct import onnxscript found in evaluation/ Python sources — this is a transitive dependency of onnxruntime. Low operational risk.
Validation Signal
- Deterministic CI: `PR Validation: success` (see above).
- Static impact reasoning: Transitive-only usage; no direct API calls to patch.
gymnasium
Advisory enrichment: No advisory identifiers in PR body. Minor bump 1.2.3→1.3.0. Upstream: https://github.com/Farama-Foundation/Gymnasium/releases/tag/v1.3.0
Repo-specific risk: Used in evaluation/sil/monitor_checkpoints.py and evaluation/sil/play_policy.py. Minor bump within the stable 1.x series; API surface is stable.
Validation Signal
- Deterministic CI: PR Validation: success (see above).
- Static impact reasoning: 1.x minor bump; low risk.
tensordict
Advisory enrichment: No advisory identifiers in PR body. Patch bump 0.12.1→0.12.2.
Repo-specific risk: Torch companion library. Patch releases are generally safe; torch itself is pinned at 2.10.0 (unchanged) so there is no cross-version ABI risk between torch and tensordict in this PR.
Validation Signal
- Deterministic CI: PR Validation: success (see above).
- Static impact reasoning: Patch bump, low risk.
lerobot
Advisory enrichment: No advisory identifiers in PR body. Patch bump 0.5.0→0.5.1.
Repo-specific risk: Patch bump, low risk. See inline comment on this line for the version skew with evaluation/sil/docker/requirements-lerobot-eval.txt.
Validation Signal
- Deterministic CI: PR Validation: success (see above).
- Static impact reasoning: Patch bump; Docker requirements divergence noted in inline comment.
hypothesis / matplotlib
Advisory enrichment: Dev-only dependencies. No advisory identifiers. hypothesis moves 6.151.13→6.152.4 (minor bump) and matplotlib moves 3.10.8→3.10.9 (patch bump). Minimal risk; test tooling only.
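The patch/minor/major labels used in the per-package notes can be reproduced mechanically. The following is a minimal sketch, assuming plain dotted numeric versions with no pre-release or local suffixes; `classify_bump` is a hypothetical helper, not something this repo or Dependabot provides.

```python
def parse_version(text):
    """Split a dotted release string such as '1.24.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def classify_bump(old, new):
    """Label a version change as 'major', 'minor', 'patch', or 'none'."""
    old_parts = parse_version(old)
    new_parts = parse_version(new)
    # Pad to three components so two-part versions like '25.0' compare cleanly.
    old_parts += (0,) * (3 - len(old_parts))
    new_parts += (0,) * (3 - len(new_parts))
    # Report the first (most significant) component that differs.
    for label, a, b in zip(("major", "minor", "patch"), old_parts, new_parts):
        if a != b:
            return label
    return "none"
```

For example, `classify_bump("1.24.4", "1.25.1")` labels the onnxruntime-gpu change as a minor bump, while `classify_bump("3.26.2", "4.3.0")` labels the marshmallow change as major. For pre-1.0 packages the label describes only which component changed, not how much compatibility upstream promises.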
Dependabot Ignore Rules Added
This PR adds upper-bound caps in .github/dependabot.yml for the /evaluation pip ecosystem:
| Package | Cap | Reason |
|---|---|---|
| marshmallow | < 4.0.0 | azure-ai-ml<2 transitive constraint |
| torch | < 2.11.0 | lerobot<0.6 transitive constraint |
| numpy | < 2.3.0 | azure-ai-ml<2 / lerobot<0.6 transitive constraint |
| packaging | < 26.0 | azure-ai-ml<2 transitive constraint |
Revisit these caps when azure-ai-ml>=2 or lerobot>=0.6 releases compatible versions.
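To see which proposed versions the caps listed above would hold back, the bounds can be checked with a few lines of Python. This is a sketch under stated assumptions: versions are plain dotted integers, and `CAPS` and `exceeds_cap` are hypothetical names. It illustrates the effect of the caps and does not reflect how Dependabot evaluates ignore rules internally.

```python
# Exclusive upper bounds copied from the caps listed in this section.
CAPS = {
    "marshmallow": "4.0.0",
    "torch": "2.11.0",
    "numpy": "2.3.0",
    "packaging": "26.0",
}


def as_tuple(version, width=3):
    """Convert '26.0' to (26, 0, 0) so versions of different lengths compare correctly."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def exceeds_cap(package, version, caps=CAPS):
    """Return True if `version` is at or above the exclusive cap for `package`."""
    cap = caps.get(package)
    if cap is None:
        return False  # No cap recorded for this package.
    return as_tuple(version) >= as_tuple(cap)
```

With these bounds, `exceeds_cap("numpy", "2.4.4")` and `exceeds_cap("torch", "2.11.0")` both return True, while the existing pins `numpy` 2.2.6 and `torch` 2.10.0 stay under their caps.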
Uncovered Manifest
evaluation/sil/docker/requirements-lerobot-eval.txt pins lerobot==0.4.1. This file is not touched by this PR. It is tracked under the docker Dependabot entry at /evaluation/sil/docker (which manages Dockerfiles, not pip requirements files), so no pip Dependabot entry covers it. The version skew with the main manifest (lerobot==0.5.1) grows with each lerobot bump. Consider adding a pip Dependabot entry for /evaluation/sil/docker or manually aligning this file.
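Until a Dependabot entry covers that file, the skew could be caught by a small check in CI. The following is a minimal sketch, assuming both sources can be reduced to simple `name==version` pins; `parse_pins` and `find_skew` are hypothetical helpers, and lines with extras, markers before the pin, or non-`==` specifiers are skipped.

```python
import re

# Matches simple pins such as "lerobot==0.4.1"; anything else is ignored.
PIN_PATTERN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([^\s;#]+)")


def parse_pins(requirements_text):
    """Return {package: version} for every exact pin in a requirements file body."""
    pins = {}
    for line in requirements_text.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_skew(primary_pins, secondary_pins):
    """Return {package: (primary, secondary)} for packages pinned differently."""
    shared = primary_pins.keys() & secondary_pins.keys()
    return {
        name: (primary_pins[name], secondary_pins[name])
        for name in shared
        if primary_pins[name] != secondary_pins[name]
    }
```

Feeding it the main manifest pin (lerobot 0.5.1) and the Docker requirements pin (lerobot 0.4.1) reports the mismatch described above; a CI step could fail whenever the returned mapping is non-empty.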
Advisory verdict: COMMENT — onnxruntime-gpu (1.24.4→1.25.1) is an ABI-sensitive CUDA package that triggers mandatory maintainer review per the python-runtime (evaluation) surface rubric. Overall PR Validation CI passed (success), but GPU driver compatibility on target hardware cannot be confirmed by hosted runners alone.
Note
🔒 Integrity filter blocked 1 item
The following item was blocked because it does not meet the GitHub integrity level.
- #562 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
    tools:
      github:
        min-integrity: approved  # merged | approved | unapproved | none

Generated by AW Dependabot PR Review for issue #562 · ● 1.4M
Bumps the inference-dependencies group with 11 updates in the /evaluation directory:
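| Package | From | To |
|---|---|---|
| numpy | 2.2.6 | 2.4.4 |
| marshmallow | 3.26.2 | 4.3.0 |
| packaging | 25.0 | 26.2 |
| onnxscript | 0.6.2 | 0.7.0 |
| onnxruntime-gpu | 1.24.4 | 1.25.1 |
| gymnasium | 1.2.3 | 1.3.0 |
| torch | 2.10.0 | 2.11.0 |
| tensordict | 0.12.1 | 0.12.2 |
| lerobot | 0.5.0 | 0.5.1 |
| hypothesis | 6.151.13 | 6.152.4 |
| matplotlib | 3.10.8 | 3.10.9 |

Updates numpy from 2.2.6 to 2.4.4
Release notes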
Sourced from numpy's releases.
... (truncated)
Changelog
Sourced from numpy's changelog.
... (truncated)
Commits
- be93fe2 Merge pull request #31090 from charris/prepare-2.4.4
- f5245dc REL: Prepare for the NumPy 2.4.4 release
- 02e838b Merge pull request #31084 from charris/backport-31056
- fa74b2d MAINT: numpy.i: Replace deprecated sprintf with snprintf (#31056)
- 533a6db Merge pull request #31079 from charris/backport-20801
- 9e496cb TST: fix POWER VSX feature mapping (#30801)
- 8052c4b Merge pull request #31058 from charris/backport-31021
- 7f13b5a MAINT: Skip test on PyPy.
- 4c5fdd6 MAINT: Remove unused import of tracemalloc.
- a3ca5ed Update numpy/_core/src/multiarray/shape.c

Updates marshmallow from 3.26.2 to 4.3.0
Changelog
Sourced from marshmallow's changelog.
... (truncated)
Commits
- b596fdb Bump version and update changelog
- 256f0aa Add pre/post_load parameters to Field (#2799)
- c847ad4 Typing improvements to marshmallow.validate (#2940)
- eb86322 Remove redundant docs job (#2939)
- a44ad62 Avoid infinite recursion in nesting docs (#2938)
- 3360e34 Bump version and update changelog
- 7b9ce45 Fix changelog typos and update releasing docs
- f07eadc Fix validate.Email to accept IDNs (#2937)
- 4acb783 Fix Unreachable Warning (#2935)
- 3492fae Remove redundant python-version (#2932)

Updates packaging from 25.0 to 26.2
Release notes
Sourced from packaging's releases.
... (truncated)
Changelog
Sourced from packaging's changelog.
... (truncated)
Commits
- 84a87ee Bump for release
- 4a616b6 docs: a few more updates to prepare for 26.2 (#1176)
- 9de6f44 ci: use native uv integration in rtd (#1175)
- bc76e14 chore: update changelog for 26.2 (#1161)
- 3f00091 tests: add a pickle check (#1174)
- 48a8a06 fix: make Requirements/Markers pickle-safe (#1171)
- 823b44e fix: make Tags pickle-safe (#1170)
- 4bed32d fix: make Specifier / SpecifierSet pickle-safe (#1168)
- 963118e fix: re-export ExceptionGroup for now (#1164)
- 66e34a8 docs(specifiers): add is_unsatisfiable() usage example (#1166)

Updates onnxscript from 0.6.2 to 0.7.0
Release notes
Sourced from onnxscript's releases.
... (truncated)
Commits
- df97c94 Add an option to not inline a function when building the graph (#2851)
- 90f754a chore(deps): bump actions/upload-pages-artifact from 4 to 5 (#2895)
- b068297 Bumped version to 0.7.0 (#2894)
- c8f5f6a Make GraphBuilder.init use keyword-only args after graph (#2893)
- c6e8ec6 Handling initializers in GraphBuilder (#2889)
- 63ffecf fix: normalize cache key dtype to prevent initializer name collisions (#2888)
- 13f265c fix(fuse_batchnorm): support convtranpose + bn fusion with group != 1 (#2879)
- 6c092e2 Add fusion rule to remove Expand before broadcast-capable binary operators (#...
- c7d13fb Add input() and add_output() methods to GraphBuilder (#2828)
- 864b785 Fix BatchNorm fusion producing invalid ONNX when Conv nodes share weight init...

Updates onnxruntime-gpu from 1.24.4 to 1.25.1
Release notes
Sourced from onnxruntime-gpu's releases.
... (truncated)
Commits
- 8a77e45 rel-1.25.1 cherry-pick round 2 (#28224)
- 6fd52e4 ORT 1.25.1 release: version bump and cherry-pick #27907 (#28149)
- 7a71bc5 Cherry-pick CI/pipeline fixes for rel-1.25.0 (#28106)
- 211edbc FF rel-1.25 to last merge prior to version bump & add first round of cherry p...
- 57b265e [MLAS] Add depthwise with multiplier conv special kernel for NCHW data layout...
- bec2792 Plugin EP event profiling APIs (#27649)
- a997c4f [VitisAI] external_ep_library typo fix (#27647)
- f2c28e2 S390x test fixes (#27404)
- 0f43e16 [QNN-EP] Fix use-after-free of logger object (#27804)
- f22e3a9 webgpu: Optimize DP4A SmallM MatMulNBits tiling (#27910)

Updates gymnasium from 1.2.3 to 1.3.0
Release notes
Sourced from gymnasium's releases.
Commits
- eb5c00e Update to use Taxi-v4
- 4436f89 fix incorrect TypeVar use in core for RenderFrame (#1560)
- 877ba30 Update to 1.3.0
- c3b809f Update Taxi to V4 and fix is_rainy implementation (#1561)
- 9e6f855 Add RepeatAction wrapper (#1553)
- 1532e66 Add external environment Hill Climb Racing Env (#1554)
- df8704c Add boltcrypt to third party environments (#1557)
- 01c0d39 Add external environment firecastrl (wildfire env) (#1551)
- 9edc68e Fix spelling in test_mujoco_v5.py (#1550)
- a31fa4b Change action seed for MuJoCo/test_verify_reward_survive test, to be valid ...

Updates torch from 2.10.0 to 2.11.0
Release notes
Sourced from torch's releases.
... (truncated)
Commits
- 70d99e9 [release only] Increase timeout for rocm libtorch and manywheel builds (#178006)
- 3e05c5a [MPS] Properly handle conjugated tensors in bmm (#178010)
- db741c7 [MPS] fix compiling of SDPA producing nan results (#178009)
- 483b55d Update pytorch_sphinx_theme2 version to 0.4.6 (#177616)
- 7f2cdeb [windows][smoke test] Add an option to install cuda if required cuda/cudnn on...
- 76fd078 [release-only] Fix libtorch builds. Fix lint (#177299)
- fa384de [Inductor][MPS] Fix half-precision type mismatches in Metal shader codegen (#...
- 036b25f Let stable::from_blob accept a lambda as deleter (cherry-pick) (#176440)
- 41f8e3e [CI] Stop using G3 runners (#177161)
- e2fa295 [CD] Unpin cuda-bindings dependencies (#177159)

Updates tensordict from 0.12.1 to 0.12.2
Release notes
Sourced from tensordict's releases.
Commits
- 8ee33fa [Release] Bump version to 0.12.2
- dcb6ddd [BugFix] fix ragged_idx of consolidated tensor (#1675)
- 85ea4e7 [CI] Temporarily use vmoens/test-infra fork for macOS builds

Updates lerobot from 0.5.0 to 0.5.1
Release notes
Sourced from lerobot's releases.