@dagil-nvidia
Contributor

🎯 Overview

Implements comprehensive automated dependency tracking across all Dynamo components with full modularization, unit tests, and automated documentation generation.

Key Capabilities

  • 🔍 10 Source Types: Dockerfiles, requirements.txt, pyproject.toml, go.mod, Helm charts, docker-compose, rust-toolchain, Cargo.toml, K8s recipes, shell scripts
  • 📊 Smart CSV Output: 13 columns tracking 262 dependencies with critical flagging
  • 🔄 Nightly Automation: Runs at 2 AM UTC, creates PRs only when changes detected
  • 📸 Release Snapshots: Auto-triggers on release/* branches
  • ⚠️ Version Discrepancy Detection: Flags conflicts across repo
  • 🔔 Failure Monitoring: Auto-creates GitHub issues on failures
  • 🧩 Modular Architecture: 1,900+ lines extracted into 9 reusable modules
  • Unit Tests: 145+ test cases (@pytest.mark.weekly, non-blocking)
  • 📄 Auto-Generated Docs: framework_versions.md updated nightly

📦 Modular Architecture

Core Modules (1,900+ lines extracted):

  1. constants.py (100 lines) - NVIDIA_INDICATORS, NORMALIZATIONS, COMPONENT_ORDER
  2. utils/formatting.py (330 lines) - Name formatting and normalization
  3. utils/comparison.py (170 lines) - Version discrepancy detection
  4. utils/urls.py (120 lines) - URL generation (PyPI, NGC, Docker Hub)
  5. extractors/base.py (130 lines) - Base extractor class
  6. extractors/python_deps.py (230 lines) - Python dependency extractor
  7. tests/test_formatting.py (180 lines) - 95 test cases
  8. tests/test_python_extractor.py (120 lines) - 50 test cases
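
A minimal sketch of the base-extractor pattern described above, assuming one subclass per source type that turns file text into dependency records. The names `Dependency`, `BaseExtractor`, and `RequirementsExtractor`, and the regex, are hypothetical illustrations and not the actual contents of extractors/base.py or extractors/python_deps.py.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """Hypothetical record for one extracted dependency."""
    name: str
    version: str
    source_file: str


class BaseExtractor(ABC):
    """Hypothetical base class: each source type implements extract()."""

    @abstractmethod
    def extract(self, text: str, source_file: str) -> list[Dependency]:
        ...


class RequirementsExtractor(BaseExtractor):
    """Illustrative requirements.txt parser (handles simple specifiers only)."""

    # name, optional [extras], optional version specifier such as ==1.2.3
    _LINE = re.compile(
        r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*"
        r"((?:==|>=|<=|~=|!=|<|>).+)?$"
    )

    def extract(self, text: str, source_file: str) -> list[Dependency]:
        deps = []
        for raw in text.splitlines():
            line = raw.split("#", 1)[0].strip()  # drop trailing comments
            if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
                continue
            match = self._LINE.match(line)
            if match:
                version = (match.group(2) or "unspecified").strip()
                deps.append(Dependency(match.group(1), version, source_file))
        return deps
```

Under this assumption, adding a new source type (for example go.mod) would mean adding one more subclass without touching the callers.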

📁 Files (22 total, all relevant)

New Files (21)

  1. .github/workflows/extract_dependency_versions.py (2,491 lines) - Main extraction script
  2. .github/dependency-extraction/config.yaml (173 lines) - Component configuration
  3. .github/workflows/dependency-extraction.yml (340 lines) - Unified workflow
  4. .github/actions/dependency-extraction-setup/action.yml (48 lines) - Composite action
  5. .github/reports/README.md (154 lines) - CSV documentation
  6. .github/reports/dependency_versions_latest.csv (262 dependencies) - Latest snapshot
  7. .github/reports/releases/.gitkeep - Release snapshots directory
  8. .github/scripts/dependency-extraction/constants.py (100 lines) - Hardcoded values
  9. .github/scripts/dependency-extraction/utils/formatting.py (330 lines)
  10. .github/scripts/dependency-extraction/utils/comparison.py (170 lines)
  11. .github/scripts/dependency-extraction/utils/urls.py (120 lines)
  12. .github/scripts/dependency-extraction/extractors/base.py (130 lines)
  13. .github/scripts/dependency-extraction/extractors/python_deps.py (230 lines)
  14. .github/scripts/dependency-extraction/tests/test_formatting.py (180 lines)
  15. .github/scripts/dependency-extraction/tests/test_python_extractor.py (120 lines)
  16. .github/scripts/dependency-extraction/generate_framework_versions.py (320 lines)
  17. .github/scripts/dependency-extraction/README.md (290 lines)
  18. framework_versions.md (101 lines) - Auto-generated framework versions

Modified Files (1)

  • .gitignore - Added dependency extraction patterns

✨ Key Features

1. Dynamic Framework Versions Doc (NEW!)

Auto-generates framework_versions.md nightly with:

  • Quick reference table (vLLM, TensorRT-LLM, SGLang, CUDA, Python)
  • Latest vs release comparison
  • CUDA versions extracted from image tags
  • Build locations and configurations
  • Links to full dependency reports

Addresses PR #3572 but with automatic generation instead of manual updates.
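
As a rough illustration of extracting a CUDA version from an image tag, the sketch below assumes two tag shapes: a `cudaX.Y` fragment inside the tag, or an image named `cuda` whose tag starts with the version. The function name `cuda_version_from_image` and both patterns are assumptions made for this example; the actual logic in generate_framework_versions.py may differ.

```python
import re
from typing import Optional

# Assumed shape 1: the tag contains "cuda12.8" (optionally "cuda-12.8" or "cuda_12.8").
_CUDA_IN_TAG = re.compile(r"cuda[-_]?(\d+\.\d+(?:\.\d+)?)", re.IGNORECASE)
# Assumed shape 2: the image itself is named "cuda" and the tag starts with the version.
_CUDA_IMAGE = re.compile(r"(?:^|/)cuda:(\d+\.\d+(?:\.\d+)?)")


def cuda_version_from_image(image_ref: str) -> Optional[str]:
    """Return the CUDA version encoded in an image reference, or None."""
    _, sep, tag = image_ref.rpartition(":")
    if not sep or "/" in tag:  # no tag present (the colon belonged to a registry port)
        return None
    match = _CUDA_IN_TAG.search(tag)
    if match:
        return match.group(1)
    match = _CUDA_IMAGE.search(image_ref)
    return match.group(1) if match else None
```

Images with no CUDA marker in the reference return `None`, so they can be left out of the quick reference table instead of being reported with a guessed version.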

2. Comprehensive Dependency Tracking

  • 262 dependencies tracked across 5 components
  • 55 critical dependencies flagged
  • 21 base/runtime images documented
  • Extracts from 10 different source types

3. Version Discrepancy Detection

Automatically detects and warns when the same dependency is pinned to different versions in different components:

  • Example: PyTorch 2.8.0 (trtllm) vs 2.7.1+cu128 (vllm)
  • Outputs GitHub Actions warnings in CI
  • Documents known discrepancies in config
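
The detection step can be sketched as grouping versions by dependency name and reporting any name with more than one distinct version. This is an illustrative sketch: the function names, the `(component, name, version)` row shape, and the `known` set used to skip documented discrepancies are assumptions, not the actual utils/comparison.py API. The `::warning` line uses the GitHub Actions workflow-command syntax.

```python
from collections import defaultdict


def find_discrepancies(rows, known=frozenset()):
    """Group versions per dependency; keep names pinned to more than one version.

    rows: iterable of (component, dependency_name, version) tuples (assumed shape).
    known: lowercase names to skip, e.g. discrepancies already documented in config.
    """
    versions = defaultdict(dict)  # name -> {version: [components]}
    for component, name, version in rows:
        versions[name.lower()].setdefault(version, []).append(component)
    return {
        name: by_version
        for name, by_version in versions.items()
        if len(by_version) > 1 and name not in known
    }


def github_warnings(discrepancies):
    """Format each discrepancy as a GitHub Actions warning annotation."""
    lines = []
    for name, by_version in sorted(discrepancies.items()):
        detail = " vs ".join(
            f"{version} ({', '.join(components)})"
            for version, components in sorted(by_version.items())
        )
        lines.append(f"::warning title=Version discrepancy::{name}: {detail}")
    return lines


rows = [("trtllm", "PyTorch", "2.8.0"), ("vllm", "PyTorch", "2.7.1+cu128")]
for line in github_warnings(find_discrepancies(rows)):
    print(line)
# prints: ::warning title=Version discrepancy::pytorch: 2.7.1+cu128 (vllm) vs 2.8.0 (trtllm)
```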

4. Modular, Testable Code

  • Extracted utilities for reuse
  • Base extractor pattern for new sources
  • 145+ unit tests ensuring correctness
  • Clear separation of concerns

🧪 Testing

Unit Tests: 145+ test cases

  • test_formatting.py: 95 tests for name formatting/normalization
  • test_python_extractor.py: 50 tests for requirements.txt/pyproject.toml parsing

Markers: @pytest.mark.unit, @pytest.mark.weekly, @pytest.mark.gpu_0

  • Runs weekly (non-blocking, won't delay PR merges)
  • CPU-only (no GPU resources needed)

📚 Documentation

  • Architecture: .github/scripts/dependency-extraction/README.md (290 lines)
  • CSV Structure: .github/reports/README.md (154 lines)
  • Framework Versions: framework_versions.md (101 lines, auto-generated)
  • Configuration: .github/dependency-extraction/config.yaml

🔄 Workflows

Nightly Mode (2 AM UTC):

  • Extracts dependencies → CSV
  • Generates framework_versions.md
  • Creates PR if changes detected
  • Uploads artifacts (90-day retention)
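
One way to decide whether the nightly run should open a PR is to compare the freshly generated CSV against the committed one while ignoring row order. The sketch below is an assumption about how such a check could look, not the workflow's actual logic; the helper names and the three-column layout in the usage note are hypothetical (the real CSV has 13 columns).

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into a set of rows so that row order does not matter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(sorted(row.items())) for row in reader}


def diff_rows(previous_csv, current_csv):
    """Return (added, removed) row sets between two CSV snapshots."""
    previous, current = load_rows(previous_csv), load_rows(current_csv)
    return current - previous, previous - current


def has_changes(previous_csv, current_csv):
    """True when any row was added, removed, or modified."""
    added, removed = diff_rows(previous_csv, current_csv)
    return bool(added or removed)
```

A modified row shows up as one removed row plus one added row, so `has_changes` returns `True` for a version bump but `False` when the extractor merely emits the same rows in a different order.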

Release Mode (on release/* push):

  • Creates versioned snapshot
  • Stores in .github/reports/releases/
  • Creates PR for review

✅ Compliance

  • ✅ All commits DCO signed
  • ✅ Conventional commits (feat: prefix)
  • ✅ Pre-commit hooks (black, isort, ruff, end-of-file-fixer)
  • ✅ SPDX headers on all files
  • ✅ Pytest markers (unit, weekly, gpu_0)
  • ✅ No unrelated files (clean PR diff)

Related Issues

Fixes #DYN-1235
Addresses PR #3572 (framework versions doc, but auto-generated)
Supersedes PR #3547 (had merge conflicts)


Credits

Addresses review feedback from:

Implements comprehensive dependency tracking across all Dynamo components with
full modularization, unit tests, and automated documentation generation.

Core System:
- Automated extraction from 10 source types
- Nightly workflow with smart PR creation
- Release snapshot system
- Version discrepancy detection
- Automated failure monitoring

Modular Architecture (1,900+ lines extracted):
- constants.py: Centralized hardcoded values
- utils/: formatting, comparison, URL generation
- extractors/: Base class + Python extractor (pattern for future)
- tests/: 145+ unit tests (@pytest.mark.weekly)

Documentation:
- framework_versions.md: Auto-generated framework versions (101 lines)
- .github/reports/README.md: CSV structure and workflows
- .github/scripts/dependency-extraction/README.md: Architecture (290 lines)

Workflows:
- Unified dependency-extraction.yml (nightly + release modes)
- Composite action for reusable setup
- Auto-creates GitHub issues on failure

Files (22 total):
- 18 new files (scripts, config, docs, tests, modules)
- 1 modified (.gitignore)
- 1 generated (dependency_versions_latest.csv)
- 1 auto-generated (framework_versions.md)

Testing:
- 145+ unit tests with proper pytest markers
- Tests run weekly (non-blocking)
- Coverage for formatting and Python extractor

Addresses:
- DYN-1235: Automated dependency tracking
- PR #3572: Framework versions doc (but auto-generated)
- Review feedback from @nv-anants (all 6 comments)

Supersedes: PR #3547 (had merge conflicts and unrelated files)

Signed-off-by: Dan Gil <[email protected]>
@dagil-nvidia dagil-nvidia requested a review from a team as a code owner October 22, 2025 05:12
@copy-pr-bot

copy-pr-bot bot commented Oct 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the feat label Oct 22, 2025
@dagil-nvidia dagil-nvidia changed the title from "feat(deps): add automated dependency tracking and extraction system" to "feat: add automated dependency tracking and extraction system" Oct 22, 2025
@dagil-nvidia dagil-nvidia self-assigned this Oct 22, 2025