security: pytest harness, dependabot advisories, and OSSF Scorecard remediations#501
Conversation
- Bump lodash override to 4.18.0 (3 prototype pollution advisories) - Add follow-redirects >=1.16.0 <2.0.0 override (header leak advisory) - Update mlflow pins to stable 3.11.1 in IL training and eval configs - Add smol-toml >=1.6.1 root override to eliminate vulnerable 1.6.0 - Regenerate lockfiles for docs/docusaurus and evaluation 🔒 - Generated by Copilot
… compliance - pin numpy, marshmallow, packaging, torch to resolved uv.lock versions - regenerate uv.lock after pinning changes 🔒 - Generated by Copilot
- Accept packaging==26.1 from main over 25.0 📦 - Generated by Copilot
…ecard - add workflow-permissions.md cataloging 15 job-scoped write permissions across 9 workflows with rationale - cover SARIF upload, release artifact attachment, and Sigstore attestation cases - remove redundant top-level security-events permission from check-binary-integrity.yml - link new doc from docs/security/README.md - add intoto to cspell dictionary 📝 - Generated by Copilot
Dependency ReviewThe following issues were found:
Snapshot WarningsEnsure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice. License Issuesevaluation/uv.lock
OpenSSF ScorecardScorecard details
Scanned Files
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #501 +/- ##
==========================================
+ Coverage 65.07% 67.61% +2.53%
==========================================
Files 253 265 +12
Lines 15621 16851 +1230
Branches 2087 2266 +179
==========================================
+ Hits 10166 11394 +1228
Misses 5165 5165
- Partials 290 292 +2
🚀 New features to boost your workflow:
|
📝 - Generated by Copilot
- add 12 test modules and conftest under evaluation/tests (217 tests, 99.87% branch coverage) - add reusable workflow .github/workflows/evaluation-pytests.yml with codecov OIDC upload (flag: pytest-evaluation) - wire workflow into .github/workflows/pr-validation.yml and configure codecov.yml flag - pin evaluation/pyproject.toml security upgrades (numpy, marshmallow, packaging, torch, mlflow); requires-python >=3.12 - pin smol-toml 1.6.1 (exact) in package.json overrides - replace datetime.utcnow() with timezone-aware now() in evaluation/metrics/upload_artifacts.py Refs #440 🧪 - Generated by Copilot
…pendency-advisories # Conflicts: # evaluation/uv.lock # package.json
- Pin pip install of uv==0.10.9 with --require-hashes and SHA256 for all 17 wheels + sdist in both Dockerfiles - Resolves OSSF Scorecard Pinned-Dependencies findings: - data-management/viewer/backend/Dockerfile:12 - evaluation/sil/docker/Dockerfile.lerobot-eval:8-9 - Expected score: Pinned-Dependencies 9 -> 10 🤖 - Generated by Copilot
- Add cspell words array for envaccount, envcontainer, myacct, mycontainer, noseparator, preds, xticklabels - Loosen evaluation packaging pin to >=24.2,<26.0 to satisfy lerobot 0.5.0 - Add E402 noqa to test imports gated by pytest.importorskip - Discard unused mock_mlflow tuple element to clear RUF059 - Trim trailing blank line from workflow-permissions.md to clear MD012 🤖 - Generated by Copilot
- relax packaging pin to <26.0 for lerobot 0.5.0 compatibility - add E402 noqa for sys.path-dependent imports in policy tests - rename unused mock fixtures to _mock_mlflow (RUF059) - format lerobot eval test files - fix MD012 trailing blank in workflow-permissions docs - add envaccount, envcontainer, mycontainer to cspell dictionary 🔧 - Generated by Copilot
- Reformat lerobot eval test files with ruff - Pin packaging==25.0 to satisfy SHA-pinning compliance check 🤖 - Generated by Copilot
…pendency-advisories # Conflicts: # evaluation/sil/docker/Dockerfile.lerobot-eval
- Update _load_skrl test to assert enable_training_mode(enabled=False, apply_to_models=True) - Source switched to skrl 2.0 API in #492; test on this branch was stale 🤖 Generated by Copilot
…pendency-advisories
|
A few nits from review — none are blockers, just observations for awareness: 1. Tests use hardcoded 2. 3. |
rezatnoMsirhC
left a comment
There was a problem hiding this comment.
Solid PR — thorough Dockerfile hash pinning across all 18 platform artifacts, correct OSSF Token-Permissions fix in check-binary-integrity.yml, well-structured test harness with good fixture isolation in conftest.py, and a correct datetime.utcnow() → datetime.now(UTC) fix. The new reusable workflow is correctly SHA-pinned with persist-credentials: false and proper OIDC-gated Codecov upload.
Four non-blocking comments inline. The packaging version discrepancy (26.1 in description vs 25.0 in code) is worth a follow-up clarification before this branch is used as a dependency baseline.
Align evaluation-pytests.yml with other workflows pinned to v8.1.0 (08807647e7069bb48b6ef5acd8ec9567f424441b). 🔧 - Generated by Copilot
…sions doc - resolve PYSEC-2025-49 by bumping packaging from 25.0 to 26.1 - add missing top-level heading to docs/security/workflow-permissions.md 🔒 - Generated by Copilot
…nize tests - replace hardcoded paths with env vars (MLFLOW_CONFIG_PATH, AML_DOWNLOAD_DIR, AML_CONFIG_PATH, DATA_ROOT, DATASET_CONFIG_PATH) with sensible defaults - replace fragile runpy/exec+source test loading with importlib spec loader - use tmp_path and monkeypatch.setenv for test isolation ♻️ - Generated by Copilot
…duplicate H1 - revert packaging==26.1 to ==25.0 because lerobot==0.5.0 requires packaging<26.0; PYSEC-2025-49 is a setuptools advisory (fixed in 78.1.1), not packaging - remove duplicate H1 in workflow-permissions.md; markdownlint MD025 treats frontmatter title: as the document H1 🔧 - Generated by Copilot
- branch protection requires a status check named 'pr-validation-summary' but no workflow produced it, leaving every PR stuck on 'Expected - Waiting for status to be reported' - new aggregator runs after every PR validation job with if: always(), reads needs.*.result, and fails only when an upstream job neither succeeded nor was skipped 🔧 - Generated by Copilot
🤖 I have created a release *beep* *boop* --- ## [0.8.0](v0.7.4...v0.8.0) (2026-05-08) ### ⚠ BREAKING CHANGES * **dataviewer:** bump frontend stack to React 19, Vite 8, Tailwind v4, MSAL 5, ESLint 10 ([#524](#524)) ### ✨ Features * **agents:** add automated validation for high-risk Dependabot bumps ([#574](#574)) ([8c3686a](8c3686a)), closes [#573](#573) * **data:** add camera selector to annotation workspace and fix AV1 frame extraction ([#591](#591)) ([c809d2f](c809d2f)) * **data:** seed dataviewer frontend test foundation and per-section codecov flags ([#594](#594)) ([c06c4e3](c06c4e3)) * **dataviewer:** add OWASP security middleware stack ([#439](#439)) ([239edb9](239edb9)) * **infrastructure:** add conversion pipeline Terraform module ([#542](#542)) ([244531e](244531e)) * **infrastructure:** upgrade OSMO to chart 1.2.1 / image 6.2 with secure auth and skrl 2.0.0 compatibility ([#492](#492)) ([edfd7a5](edfd7a5)) * **pipeline:** add ACSA setup for ROS2 bag sync to Blob ([#451](#451)) ([c271a54](c271a54)) * **workflows:** add advisory Dependabot PR reviewer agentic workflow ([#498](#498)) ([d4bb140](d4bb140)) * **workflows:** trigger AW Dependabot PR reviewer after PR Validation ([#580](#580)) ([7ab3d16](7ab3d16)) ### 🐛 Bug Fixes * **ci:** correct stale version comment for actions/create-github-app-token ([#506](#506)) ([b2e9a54](b2e9a54)) * **ci:** restore data-pipeline and training broken tests by domain folder restructure ([#547](#547)) ([06d8472](06d8472)) * **docs:** update remaining stale 'Coming soon' labels in docs/README.md ([#507](#507)) ([02439d6](02439d6)) * **docs:** update stale coming soon label for Training section ([#472](#472)) ([46db49b](46db49b)) * **evaluation:** scope SIL AzureML validation code path and script reference ([#387](#387)) ([9f138a9](9f138a9)) * **infrastructure:** OSMO workflow execution, PostgreSQL public access, and quickstart corrections ([#477](#477)) ([9ed2da6](9ed2da6)) * **scripts:** exclude CHANGELOG.md from changed-files msdate check ([#644](#644)) ([8133bdc](8133bdc)) * **workflows:** allow dependabot[bot] to activate AW Dependabot PR Review ([#586](#586)) ([39dc022](39dc022)) * **workflows:** correct branches filter on AW Dependabot PR Review workflow_run trigger ([#584](#584)) ([fe06b52](fe06b52)) * **workflows:** normalize validate.yaml placeholder env/compute values ([#510](#510)) ([340ff44](340ff44)) * **workflows:** recompile aw-dependabot-pr-review lock file ([#576](#576)) ([d77c167](d77c167)) * **workflows:** switch AW Dependabot PR Review to pull_request_target ([#589](#589)) ([3f1edd1](3f1edd1)) ### 📚 Documentation * **docs:** Fix deployment guide links ([#614](#614)) ([0070b04](0070b04)) * document dependency-pinning-artifacts directory purpose ([#508](#508)) ([50e0010](50e0010)) ### 📦 Build System * **training:** standardize on Python 3.12 across manifests, containers, and runtime scripts ([#541](#541)) ([7ad014a](7ad014a)) ### 🔧 Operations * **build:** add Copilot cloud agent setup-steps workflow ([#593](#593)) ([c912668](c912668)) ### 🔧 Miscellaneous * **build:** exclude auto-generated CHANGELOG.md from cspell and seed dictionary ([#582](#582)) ([de1dd57](de1dd57)) * **build:** redesign codecov flags and split pytest CI per component ([#520](#520)) ([357e745](357e745)) * **dataviewer:** bump frontend stack to React 19, Vite 8, Tailwind v4, MSAL 5, ESLint 10 ([#524](#524)) ([50f8ad4](50f8ad4)) * **dataviewer:** repoint stale src/dataviewer references to data-management/viewer ([#504](#504)) ([88fa1b4](88fa1b4)), closes [#503](#503) * **deps-dev:** bump basic-ftp from 5.3.0 to 5.3.1 ([#618](#618)) ([ca10f2a](ca10f2a)) * **deps-dev:** bump globals from 15.15.0 to 17.5.0 in /data-management/viewer/frontend ([#527](#527)) ([0e0b2ae](0e0b2ae)) * **deps-dev:** bump ip-address from 10.1.0 to 10.2.0 ([#616](#616)) ([816c9cf](816c9cf)) * **deps-dev:** bump lint-staged from 16.4.0 to 17.0.2 in the root-npm-dependencies group across 1 directory ([#626](#626)) ([0e2f293](0e2f293)) * **deps-dev:** bump pydantic from 2.13.3 to 2.13.4 in the python-dependencies group across 1 directory ([#629](#629)) ([c24f1c1](c24f1c1)) * **deps-dev:** bump the python-dependencies group across 1 directory with 2 updates ([#514](#514)) ([8410f4b](8410f4b)) * **deps:** bump azure-core from 1.39.0 to 1.40.0 in /evaluation in the inference-dependencies group across 1 directory ([#597](#597)) ([6141db4](6141db4)) * **deps:** bump cryptography from 46.0.6 to 46.0.7 in /data-management/viewer ([#424](#424)) ([5fb6d58](5fb6d58)) * **deps:** bump cryptography from 46.0.6 to 46.0.7 in /data-management/viewer/backend ([#423](#423)) ([b516ad5](b516ad5)) * **deps:** bump lucide-react from 0.469.0 to 1.8.0 in /data-management/viewer/frontend ([#528](#528)) ([1bdfc1e](1bdfc1e)) * **deps:** bump nginx from `8aa63af` to `5616878` in /data-management/viewer/frontend ([#511](#511)) ([9e7e20e](9e7e20e)) * **deps:** bump nginx from 1.27-alpine to 1.29-alpine in /data-management/viewer/frontend ([#484](#484)) ([0e5c3dd](0e5c3dd)) * **deps:** bump node from `435f353` to `e49fd70` in /data-management/viewer/frontend ([#560](#560)) ([2884649](2884649)) * **deps:** bump react-is from 18.3.1 to 19.2.5 in /data-management/viewer/frontend ([#530](#530)) ([d51318c](d51318c)) * **deps:** bump tensordict from 0.11.0 to 0.12.1 in /evaluation in the inference-dependencies group across 1 directory ([#456](#456)) ([b24e733](b24e733)) * **deps:** bump the dataviewer-backend-dependencies group across 1 directory with 2 updates ([#531](#531)) ([171a1da](171a1da)) * **deps:** bump the dataviewer-backend-dependencies group across 1 directory with 5 updates ([#516](#516)) ([4f9a577](4f9a577)) * **deps:** bump the dataviewer-backend-dependencies group across 1 directory with 5 updates ([#602](#602)) ([6c27ab5](6c27ab5)) * **deps:** bump the dataviewer-dependencies group across 1 directory with 2 updates ([#529](#529)) ([8646971](8646971)) * **deps:** bump the dataviewer-dependencies group across 1 directory with 3 updates ([#601](#601)) ([d28fb50](d28fb50)) * **deps:** bump the dataviewer-dependencies group across 1 directory with 3 updates ([#632](#632)) ([4ca5f3e](4ca5f3e)) * **deps:** bump the dataviewer-dependencies group across 1 directory with 5 updates ([#515](#515)) ([109ee81](109ee81)) * **deps:** bump the dataviewer-frontend-patch-minor group across 1 directory with 6 updates ([#630](#630)) ([04d5dfd](04d5dfd)) * **deps:** bump the dataviewer-frontend-patch-minor group across 1 directory with 9 updates ([#563](#563)) ([c08f450](c08f450)) * **deps:** bump the docusaurus-dependencies group across 1 directory with 4 updates ([#627](#627)) ([f5825fc](f5825fc)) * **deps:** bump the docusaurus-dependencies group across 1 directory with 6 updates ([#599](#599)) ([b859344](b859344)) * **deps:** bump the github-actions group across 1 directory with 4 updates ([#459](#459)) ([2609c52](2609c52)) * **deps:** bump the github-actions group across 1 directory with 4 updates ([#517](#517)) ([f54bf5d](f54bf5d)) * **deps:** bump the inference-dependencies group across 1 directory with 11 updates ([#562](#562)) ([087f53a](087f53a)) * **deps:** bump the inference-dependencies group across 1 directory with 2 updates ([#628](#628)) ([4a3be47](4a3be47)) * **deps:** bump the pip group across 2 directories with 1 update ([#494](#494)) ([a14b6b0](a14b6b0)) * **docs:** update stale Python 3.11 references to 3.12 ([#575](#575)) ([6f85c95](6f85c95)) * **scripts:** remove redundant SC1091 disables in OSMO deploy scripts ([#509](#509)) ([ae1cb82](ae1cb82)) ### 🔒 Security * **build:** pin dependencies and hash-verify downloads ([#465](#465)) ([0289f49](0289f49)) * **build:** remediate dependency security advisories ([#479](#479)) ([7196d6d](7196d6d)) * **deps-dev:** bump basic-ftp from 5.2.1 to 5.2.2 ([#454](#454)) ([cb158f1](cb158f1)) * **deps-dev:** bump basic-ftp from 5.2.2 to 5.3.0 ([#495](#495)) ([e983b8b](e983b8b)) * **deps-dev:** bump hypothesis from 6.152.3 to 6.152.4 in the python-dependencies group ([#598](#598)) ([83384d2](83384d2)) * **deps-dev:** bump markdownlint-cli2 from 0.22.0 to 0.22.1 in the root-npm-dependencies group ([#559](#559)) ([32bde35](32bde35)) * **deps-dev:** bump picomatch from 2.3.1 to 2.3.2 in /docs/docusaurus ([#455](#455)) ([66f86ca](66f86ca)) * **deps-dev:** bump postcss from 8.5.10 to 8.5.12 in /data-management/viewer/frontend ([#569](#569)) ([a652dba](a652dba)) * **deps-dev:** bump the python-dependencies group with 2 updates ([#457](#457)) ([749d231](749d231)) * **deps-dev:** bump the python-dependencies group with 2 updates ([#485](#485)) ([71b44fd](71b44fd)) * **deps-dev:** bump the python-dependencies group with 3 updates ([#564](#564)) ([9fc52fd](9fc52fd)) * **deps-dev:** bump typescript from 6.0.2 to 6.0.3 in /docs/docusaurus in the docusaurus-dependencies group ([#513](#513)) ([5694dbc](5694dbc)) * **deps:** bump azureml/openmpi4.1.0-ubuntu22.04 from 20260303.v5 to 20260409.v4 in /evaluation/sil/docker ([#480](#480)) ([25d4df8](25d4df8)) * **deps:** bump cryptography from 46.0.6 to 46.0.7 in /evaluation in the uv group across 1 directory ([#538](#538)) ([92c5b2e](92c5b2e)) * **deps:** bump diffusers from 0.35.2 to 0.38.0 in /training/il/lerobot ([#638](#638)) ([6261d19](6261d19)) * **deps:** bump follow-redirects from 1.15.11 to 1.16.0 in /docs/docusaurus ([#469](#469)) ([0458908](0458908)) * **deps:** bump gitpython and mako for lerobot IL training ([#623](#623)) ([9f8022b](9f8022b)) * **deps:** bump node from 24.14.1-slim to 25.9.0-slim in /data-management/viewer/frontend ([#482](#482)) ([1532d09](1532d09)) * **deps:** bump packaging from 26.0 to 26.1 in /evaluation in the inference-dependencies group ([#483](#483)) ([f4afb6c](f4afb6c)) * **deps:** bump pillow from 12.1.1 to 12.2.0 ([#467](#467)) ([39fb663](39fb663)) * **deps:** bump python from 3.11-slim to 3.14-slim in /data-management/viewer/backend ([#481](#481)) ([7af9dfc](7af9dfc)) * **deps:** bump the dataviewer-backend-dependencies group across 1 directory with 15 updates ([#428](#428)) ([e4446a2](e4446a2)) * **deps:** bump the dataviewer-backend-dependencies group in /data-management/viewer/backend with 4 updates ([#487](#487)) ([0f57c5b](0f57c5b)) * **deps:** bump the dataviewer-backend-dependencies group in /data-management/viewer/backend with 8 updates ([#566](#566)) ([d6e7869](d6e7869)) * **deps:** bump the dataviewer-dependencies group across 1 directory with 5 updates ([#464](#464)) ([24c208d](24c208d)) * **deps:** bump the dataviewer-dependencies group in /data-management/viewer with 2 updates ([#486](#486)) ([90149f3](90149f3)) * **deps:** bump the dataviewer-dependencies group in /data-management/viewer with 6 updates ([#565](#565)) ([f0bb36b](f0bb36b)) * **deps:** bump the dataviewer-frontend-patch-minor group across 1 directory with 10 updates ([#613](#613)) ([e481f83](e481f83)) * **deps:** bump the github-actions group across 1 directory with 4 updates ([#534](#534)) ([5478ab6](5478ab6)) * **deps:** bump the github-actions group with 2 updates ([#488](#488)) ([4e6ce98](4e6ce98)) * **deps:** bump the github-actions group with 3 updates ([#567](#567)) ([48c38dc](48c38dc)) * **deps:** bump the github-actions group with 3 updates ([#634](#634)) ([00cfb49](00cfb49)) * **deps:** bump the github-actions group with 6 updates ([#603](#603)) ([73eb79a](73eb79a)) * **deps:** bump the training-dependencies group across 1 directory with 23 updates ([#463](#463)) ([d5a8656](d5a8656)) * **deps:** bump yaml from 2.8.2 to 2.8.3 in /data-management/viewer/frontend ([#453](#453)) ([10449df](10449df)) * pytest harness, dependabot advisories, and OSSF Scorecard remediations ([#501](#501)) ([e8756e8](e8756e8)) * **scripts:** pin and hash-verify all shell script downloads ([#468](#468)) ([0c2bb9c](0c2bb9c)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: physical-ai-toolchain-release[bot] <267194360+physical-ai-toolchain-release[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Closes #440
Closes #502
Description
This PR bundles three related work streams that all touch the security/dependency-pinning surface and share a common dependency closure:
evaluation/package.The streams are bundled because the security upgrades (notably
mlflow 3.11.1requiring Python>=3.12) cascade into the dependency closure exercised by the new test suite, and the Scorecard remediations touch the same Dockerfiles and workflow files — splitting them would leave the branch in a non-buildable or partially-remediated state.Test Harness
evaluation/tests/covering policy evaluation, runners, plotting, AzureML/MLflow bootstrap, blob/model download, batch eval, robot types, and artifact upload — 217 tests, 99.87% branch coverage..github/workflows/evaluation-pytests.ymlwithif: !cancelled()guard, OIDC-authenticated Codecov upload under flagpytest-evaluation, and Python 3.12 runtime..github/workflows/pr-validation.yml;codecov.ymlupdated with the new flag.evaluation/metrics/upload_artifacts.py: replaced deprecateddatetime.utcnow()with timezone-awaredatetime.now(timezone.utc)to silence test warnings and match Python 3.12 guidance.Dependabot Security Advisories Addressed
>=1.16.0 <2.0.0npm overridedocs/docusaurus/package.json3.11.0rc1→3.11.1evaluation/pyproject.toml,requirements-lerobot-eval.txt,training/il/lerobot/pyproject.toml1.6.1(exact) npm overridepackage.json4.17.21→4.18.1docs/docusaurus/package.jsonOSSF Scorecard Remediations (closes #502)
Token-Permissions (documentation)
docs/security/workflow-permissions.mddocuments the three workflows that require write-scoped permissions and the rationale for each (CodeQLsecurity-events: write, Dependency Reviewpull-requests: write, Scorecardid-token: write+security-events: write).Pinned-Dependencies
pip installofuv==0.10.9by SHA256 hash with--require-hashesin both Dockerfiles that previously used unpinnedpipCommandinvocations:data-management/viewer/backend/Dockerfileevaluation/sil/docker/Dockerfile.lerobot-evalDependency Conflict Resolution
The
mlflow 3.11.1upgrade requiredpython >=3.12. To keepevaluation/solvable the following bumps were applied inevaluation/pyproject.toml:requires-python:>=3.11→>=3.12numpy:<2.0.0→<3.0.0marshmallow:<4.0→<5.0torch:<2.8→<3.0packaging: pinned to26.1(matches mlflow upper bound;lerobot 0.5.0upper-bound conflict tolerated via existing lockfile resolution, see test(evaluation): add unit test infrastructure and initial test suite #440)evaluation/uv.lockwas re-resolved against the merged tree.Deviations from Plan
1.6.1rather than>=1.6.1to satisfy the repository's dependency-pinning lint rule.if: !cancelled()so cancellations do not erroneously fail coverage reporting.--require-hasheswith the full 18-artifact hash list instead of pinning to a single platform-specific wheel hash, to preserve cross-arch/OS portability.Type of Change
Component(s) Affected
evaluation/— pytest harness, security pins,requires-pythonfloor.github/workflows/— new reusableevaluation-pytestsworkflow + PR validation wiringdocs/docusaurus/— npm advisory remediation (follow-redirects, lodash)docs/security/— workflow-permissions exceptions documentationtraining/il/lerobot/— mlflow pin alignmentdata-management/viewer/backend/Dockerfile— uv pinned by SHA256evaluation/sil/docker/Dockerfile.lerobot-eval— uv pinned by SHA256package.json— smol-toml override,codecov.ymlflagTesting Performed
pytest evaluation/tests/— 217/217 passing, 99.87% branch coveragenpm audit(root): 0 vulnerabilitiesnpm audit(docs/docusaurus): 0 vulnerabilitiesuv lock(evaluation): resolves cleanly against the merged treepipCommandfindings remained for Pinned-Dependencies, both addressed by this PRDocumentation Impact
docs/security/workflow-permissions.mddocumenting Token-Permissions exceptionsBug Fix Checklist
Checklist
if: !cancelled()main(merged viaortstrategy)🤖 - Generated by Copilot