Skip to content

feat(scripts): consolidate scripts library paths and enhance dataviewer#383

Merged
agreaves-ms merged 10 commits into
mainfrom
feat/dataviewer-additional-fixes
Apr 6, 2026
Merged

feat(scripts): consolidate scripts library paths and enhance dataviewer#383
agreaves-ms merged 10 commits into
mainfrom
feat/dataviewer-additional-fixes

Conversation

@agreaves-ms
Copy link
Copy Markdown
Contributor

@agreaves-ms agreaves-ms commented Apr 2, 2026

feat(scripts): consolidate scripts library paths and enhance dataviewer

Pull Request

Description

This PR eliminates symlink indirection across the repository by replacing all library symlinks with direct references to the canonical scripts/lib/ location using a dynamic REPO_ROOT resolution pattern. Alongside the library consolidation, the dataviewer backend gained ffmpeg-based frame extraction with an OpenCV fallback and a complete pandas-to-PyArrow migration in the LeRobot loader. Development conventions were codified into the copilot instructions file to establish normative guidance for agents.

The symlink removal is a breaking change — any external tooling or workflows that sourced libraries from the intermediate symlink paths (scripts/lib/, infrastructure/setup/lib/, data-management/setup/lib/, evaluation/sil/scripts/lib/, training/*/scripts/lib/) must update to the canonical scripts/lib/ location.

Library Consolidation

Nine symlinks across five directories were deleted, and all consuming scripts were updated to compute the repository root at runtime via git rev-parse --show-toplevel with a cd-based fallback. This touched 23 shell scripts (deploy, cleanup, optional, submission, and pipeline scripts) and 24+ PowerShell scripts (CI linting, security, and test infrastructure).

  • Removed symlinks in data-management/setup/lib/, infrastructure/setup/lib/, scripts/lib/, evaluation/sil/scripts/lib/, training/il/scripts/lib/, and training/rl/scripts/lib/
  • Updated all shell scripts with the REPO_ROOT derivation pattern sourcing $REPO_ROOT/scripts/lib/common.sh and $REPO_ROOT/scripts/lib/terraform-outputs.sh
  • Rewrote PowerShell imports from scripts/lib/Modules/CIHelpers.psm1 to scripts/lib/Modules/CIHelpers.psm1 across CI linting, security, and test scripts
  • Updated shellcheck source directives to match the new canonical paths

Dataviewer Backend

The dataset service received two significant improvements targeting performance and compatibility.

  • Refactored hdf5_handler.py with a two-tier frame extraction strategy: ffmpeg via subprocess with a 10-second timeout as the primary path, falling back to OpenCV when ffmpeg is unavailable
    • Added _extract_frame_ffmpeg() and _extract_frame_cv2() static methods
    • Improved cv2 import error handling with early return and logger warning for graceful degradation
  • Migrated lerobot_loader.py from pandas to PyArrow table API, eliminating table.to_pandas() conversions
    • Added _column_to_numpy() helper for list-typed parquet columns
    • Replaced df.iterrows() with direct column access via table.column().to_numpy() and PyArrow compute filters
  • Promoted numpy, pyarrow, and Pillow from optional to base dependencies in pyproject.toml
  • Added TestVideoGenerationFallback test class in test_hdf5_handler.py covering cv2 import failure scenarios

Frontend

  • Added overflow-y-auto to the annotation workspace container in AnnotationWorkspaceContent.tsx and removed constraining min-h-0 from the Tabs component
  • Simplified the export dialog episode display from comma-separated enumeration to a count in ExportDialog.tsx

Documentation and Configuration

  • Extended .github/copilot-instructions.md with comprehensive conventions covering Python, TypeScript, testing, Git workflow, CI/CD pipeline, and validation commands
  • Updated .github/instructions/shell-scripts.instructions.md with the REPO_ROOT derivation pattern and Repository Root subsection
  • Corrected library path references in docs/reference/scripts.md
  • Updated .gitignore to exclude the .copilot-tracking/ directory

Type of Change

  • 🐛 Bug fix (non-breaking change fixing an issue)
  • ✨ New feature (non-breaking change adding functionality)
  • 💥 Breaking change (fix or feature causing existing functionality to change)
  • 📚 Documentation update
  • 🏗️ Infrastructure change (Terraform/IaC)
  • ♻️ Refactoring (no functional changes)

Component(s) Affected

  • infrastructure/terraform/prerequisites/ - Azure subscription setup
  • infrastructure/terraform/ - Terraform infrastructure
  • infrastructure/setup/ - OSMO control plane / Helm
  • workflows/ - Training and evaluation workflows
  • training/ - Training pipelines and scripts
  • docs/ - Documentation

Testing Performed

  • Terraform plan reviewed (no unexpected changes)
  • Terraform apply tested in dev environment
  • Training scripts tested locally with Isaac Sim
  • OSMO workflow submitted successfully
  • Smoke tests passed (smoke_test_azure.py)

Documentation Impact

  • No documentation changes needed
  • Documentation updated in this PR
  • Documentation issue filed

Bug Fix Checklist

Complete this section for bug fix PRs. Skip for other contribution types.

  • Linked to issue being fixed
  • Regression test included, OR
  • Justification for no regression test:

Checklist

@agreaves-ms agreaves-ms requested a review from a team as a code owner April 2, 2026 05:15
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 2, 2026

Dependency Review

The following issues were found:

  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 64 package(s) with unknown licenses.
  • ⚠️ 6 packages with OpenSSF Scorecard issues.

View full job summary

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 63.70656% with 94 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.46%. Comparing base (78e90d0) to head (2b896cc).

Files with missing lines Patch % Lines
.../viewer/backend/src/api/services/lerobot_loader.py 10.00% 36 Missing ⚠️
scripts/lib/Get-VerifiedDownload.ps1 71.69% 30 Missing ⚠️
...rc/api/services/dataset_service/lerobot_handler.py 48.38% 16 Missing ⚠️
scripts/tests/Mocks/GitMocks.psm1 0.00% 5 Missing ⚠️
scripts/tests/pester.config.ps1 0.00% 3 Missing ⚠️
...d/src/api/services/dataset_service/hdf5_handler.py 80.00% 1 Missing ⚠️
scripts/lib/terraform-outputs.ps1 97.91% 1 Missing ⚠️
scripts/security/Test-SHAStaleness.ps1 50.00% 1 Missing ⚠️
scripts/tests/Invoke-PesterTests.ps1 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #383      +/-   ##
==========================================
+ Coverage   43.72%   50.46%   +6.74%     
==========================================
  Files         243      267      +24     
  Lines       14910    18098    +3188     
  Branches     1855     1855              
==========================================
+ Hits         6519     9134    +2615     
- Misses       8101     8674     +573     
  Partials      290      290              
Flag Coverage Δ *Carryforward flag
pester 81.96% <77.59%> (+4.28%) ⬆️
pytest 6.89% <ø> (ø) Carriedforward from eb908e4
pytest-dataviewer 61.97% <30.26%> (-0.01%) ⬇️
vitest 50.72% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines Coverage Δ
...nnotation-workspace/AnnotationWorkspaceContent.tsx 86.66% <ø> (ø)
...er/frontend/src/components/export/ExportDialog.tsx 15.15% <ø> (ø)
scripts/Update-TerraformDocs.ps1 72.85% <100.00%> (ø)
scripts/lib/Modules/CIHelpers.psm1 100.00% <ø> (ø)
scripts/linting/ConvertTo-JUnitXml.ps1 93.18% <100.00%> (ø)
scripts/linting/Invoke-FrontmatterValidation.ps1 91.29% <100.00%> (ø)
scripts/linting/Invoke-GoLint.ps1 91.01% <100.00%> (ø)
scripts/linting/Invoke-GoTest.ps1 92.76% <100.00%> (ø)
scripts/linting/Invoke-LinkLanguageCheck.ps1 82.75% <100.00%> (ø)
scripts/linting/Invoke-MsDateFreshnessCheck.ps1 90.35% <100.00%> (ø)
... and 22 more
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@nguyena2 nguyena2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR removes scripts/lib/Modules, but these workflows still import CIHelpers from scripts/lib/Modules/CIHelpers.psm1. Please update all workflow imports (and pester-tests path mapping logic) to CIHelpers.psm1 to match the new canonical location.

@agreaves-ms agreaves-ms force-pushed the feat/dataviewer-additional-fixes branch from 7b7cdde to 3f97911 Compare April 2, 2026 17:56
Comment thread .github/instructions/shell-scripts.instructions.md Outdated
Comment thread data-management/setup/deploy-dataviewer.sh Outdated
Comment thread infrastructure/setup/01-deploy-robotics-charts.sh Outdated
Comment thread evaluation/sil/scripts/submit-osmo-eval.sh Outdated
Comment thread training/rl/scripts/submit-azureml-training.sh Outdated
Comment thread data-management/viewer/backend/tests/test_hdf5_handler.py
…raction

- add fallback for video frame extraction using OpenCV
- include new dependencies: numpy, pyarrow, and Pillow
- update tests to cover new video generation logic

🎥 - Generated by Copilot
🔧 - Generated by Copilot
… path

- Replace scripts/lib/Modules/CIHelpers.psm1 with shared/lib/Modules/CIHelpers.psm1 across 6 workflow files
- Update pester-tests.yml changed-file path mapping from scripts/lib/ to shared/lib/

🤖 - Generated by Copilot
- Parenthesize fallback in REPO_ROOT derivation across 23 shell scripts to prevent two-line output
- Update shell-scripts.instructions.md template with corrected pattern
- Remove dead pyarrow try/except guard in lerobot_loader.py
- Add from __future__ import annotations to lerobot_handler.py and lerobot_loader.py
- Remove quoted type annotation made redundant by future annotations
- Fix cv2 fallback test to clear sys.modules before mocking __import__

🤖 - Generated by Copilot
- fix segmented Join-Path references using 'shared' 'lib' string segments
- fix path depth off-by-one in test files (shared/ci/tests/ was depth 3, scripts/tests/ is depth 2)
- fix pester.config.ps1 logs path and repoRoot Split-Path calculation

🔧 - Generated by Copilot
@agreaves-ms agreaves-ms force-pushed the feat/dataviewer-additional-fixes branch from 960f1e8 to 46682d9 Compare April 6, 2026 17:55
- fix CIHelpers import path in Invoke-TerraformDocsCheck.ps1
- fix dot-source path in Update-TerraformDocs.Tests.ps1
- tag subprocess-spawning test blocks as Integration
- format markdown table column alignment

🔧 - Generated by Copilot
- test successful extraction returns JPEG bytes
- test graceful fallback when ffmpeg is missing
- test non-zero exit code returns None
- test seek time calculation from frame index and fps

🧪 - Generated by Copilot
@agreaves-ms agreaves-ms changed the title feat(scripts): consolidate shared library paths and enhance dataviewer feat(scripts): consolidate scripts library paths and enhance dataviewer Apr 6, 2026
@agreaves-ms agreaves-ms merged commit 176d9c9 into main Apr 6, 2026
30 checks passed
@agreaves-ms agreaves-ms deleted the feat/dataviewer-additional-fixes branch April 6, 2026 19:14
WilliamBerryiii pushed a commit that referenced this pull request Apr 8, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.6.0](v0.5.0...v0.6.0)
(2026-04-08)


### ✨ Features

* **build:** add terraform-docs generation pipeline
([#378](#378))
([78e90d0](78e90d0))
* **infrastructure:** enable optional AML diagnostic logs
([#400](#400))
([58dd8db](58dd8db))
* **scripts:** consolidate scripts library paths and enhance dataviewer
([#383](#383))
([176d9c9](176d9c9))


### 🐛 Bug Fixes

* **build:** remediate CVEs, enforce equality pinning, repair Dependabot
config
([#391](#391))
([0c29148](0c29148))
* **infrastructure:** add Storage File Data Privileged Contributor role
for ML identity
([#380](#380))
([378f7ed](378f7ed))
* **infrastructure:** replace hardcoded NAT Gateway availability zones
with variable
([#356](#356))
([a1397bd](a1397bd))
* **infrastructure:** resolve TFLint violations and enable hard-fail
([#376](#376))
([dfb55cd](dfb55cd))
* **scripts:** add dot-source guard to Invoke-MsDateFreshnessCheck.ps1
([#397](#397))
([f6f22c3](f6f22c3))
* **training:** validate AzureML and OSMO RL submissions end to end
([#372](#372))
([49904d3](49904d3))


### 📚 Documentation

* **infrastructure:** add terraform-docs tooling and improve developer
experience
([#365](#365))
([a0fb03a](a0fb03a))
* **reference:** centralize workflow template docs and convert workflow
READMEs to pointer index
([#379](#379))
([68097e4](68097e4))


### 🔧 Miscellaneous

* **deps-dev:** bump the npm_and_yarn group across 1 directory with 2
updates
([#374](#374))
([d848c8b](d848c8b))
* **deps-dev:** bump vite from 6.4.1 to 6.4.2 in
/data-management/viewer/frontend in the npm_and_yarn group across 1
directory
([#395](#395))
([6ec7f19](6ec7f19))
* **deps:** bump the github-actions group across 1 directory with 7
updates
([#370](#370))
([4d1b951](4d1b951))
* **deps:** bump the uv group across 2 directories with 1 update
([#373](#373))
([ba66ed9](ba66ed9))


### 🔒 Security

* **deps-dev:** bump brace-expansion from 1.1.12 to 1.1.13 in
/docs/docusaurus in the npm_and_yarn group across 1 directory
([#389](#389))
([27129d9](27129d9))
* **deps-dev:** bump the npm_and_yarn group across 2 directories with 2
updates
([#363](#363))
([aeae624](aeae624))
* **deps-dev:** bump the python-dependencies group with 5 updates
([#403](#403))
([bb85560](bb85560))
* **deps:** bump cryptography from 46.0.5 to 46.0.6 in /training/rl
([#367](#367))
([a82dd68](a82dd68))
* **deps:** bump the inference-dependencies group in /evaluation with 2
updates
([#401](#401))
([c88d253](c88d253))
* **deps:** bump the pip group across 4 directories with 2 updates
([#411](#411))
([1230fe0](1230fe0))
* **deps:** bump the training-dependencies group across 1 directory with
67 updates
([#375](#375))
([8e05172](8e05172))
* **deps:** bump the uv group across 2 directories with 1 update
([#382](#382))
([b6c7aea](b6c7aea))
* **deps:** update marshmallow requirement from &lt;4.3.0,&gt;=3.5 to
&gt;=3.5,&lt;4.4.0 in /evaluation in the inference-dependencies group
([#393](#393))
([599c7eb](599c7eb))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: physical-ai-toolchain-release[bot] <267194360+physical-ai-toolchain-release[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants