Contributor

@deanq deanq commented Aug 16, 2025

Summary

Implements parallel-connection download acceleration strategies for the Tetra boot process:

  • Nala for system dependencies
  • UV for Python dependencies
  • hf_transfer/hf_xet or manual parallel file downloads (tetra strategy)

All three are also run in parallel with one another.

Test Plan

  • Unit tests for all download strategies and components
  • Integration tests for HF model acceleration workflows
  • Edge case testing for network failures and fallbacks
  • Performance testing with large models (>1GB)
  • Backward compatibility verification
  • Local handler testing with comprehensive test cases

Requires release of runpod/tetra-rp#83 first

deanq added 7 commits August 15, 2025 17:05
Add core download acceleration modules with aria2c integration:
- download_accelerator.py: Main acceleration classes with multi-connection downloads
- huggingface_accelerator.py: Specialized HF model acceleration
- constants.py: Download acceleration configuration constants
- __init__.py: Package structure for src module
Enhanced dependency installation with intelligent acceleration:
- Auto-detects large packages for acceleration (torch, transformers, etc.)
- Integrates with remote executor for acceleration control
- Maintains backward compatibility with existing workflows
- Provides graceful fallback when aria2c unavailable
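The aria2c detection and fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from `download_accelerator.py`: the function names `build_aria2c_command` and `download`, the default of 8 connections, and the `urllib` fallback are all assumptions made for the example.

```python
import shutil
import subprocess
import urllib.request
from pathlib import Path


def build_aria2c_command(url: str, dest: Path, connections: int = 8) -> list[str]:
    """Build an aria2c call that splits one file across several connections.

    Hypothetical helper; the flag values are illustrative defaults.
    """
    return [
        "aria2c",
        f"--max-connection-per-server={connections}",
        f"--split={connections}",
        f"--dir={dest.parent}",
        f"--out={dest.name}",
        url,
    ]


def download(url: str, dest: Path, connections: int = 8) -> str:
    """Download url to dest and return the name of the method that was used."""
    if shutil.which("aria2c") is not None:
        # aria2c is on PATH: use the multi-connection path.
        subprocess.run(build_aria2c_command(url, dest, connections), check=True)
        return "aria2c"
    # Fallback (assumed): a plain single-connection standard-library download.
    urllib.request.urlretrieve(url, dest)
    return "urllib"
```

Keeping the command construction separate from the subprocess call makes the fallback decision easy to unit test without touching the network.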
Enhanced workspace manager with HuggingFace model pre-caching:
- Pre-cache specified HF models before function execution
- Integrates with volume-aware caching system
- Optimizes cold start times for ML workloads
Comprehensive test suite for download acceleration:
- Integration tests for aria2 detection and fallback behavior
- HF model acceleration testing with authentication
- Volume-aware acceleration scenarios
- Error handling and performance validation
- Update test files moved to src/ directory
- Enhanced test coverage for acceleration features
- Updated dependencies and documentation
- Submodule updates for tetra-rp
@pandyamarut pandyamarut self-requested a review August 18, 2025 22:03
Contributor

@pandyamarut pandyamarut left a comment

Minor comments on the library/model lists.

@deanq deanq marked this pull request as ready for review August 19, 2025 17:49
@deanq deanq requested a review from pandyamarut August 19, 2025 17:50
deanq added 12 commits August 19, 2025 16:07
- Added nala accelerated installation for large system packages
- Enhanced DependencyInstaller with automatic nala fallback to apt-get
- Updated Docker images to include nala package manager
- Added comprehensive system package acceleration tests
- Improved acceleration logging with system package status
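A rough sketch of the nala-to-apt-get fallback mentioned above, assuming the choice is made by checking whether `nala` is on `PATH`. The function name `build_system_install_command` and the injectable `which` parameter are hypothetical and exist only to keep the example testable; the actual `DependencyInstaller` may decide differently.

```python
import shutil
from typing import Callable, Optional, Sequence


def build_system_install_command(
    packages: Sequence[str],
    accelerate: bool = True,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the install command, preferring nala when it is available.

    Hypothetical helper: falls back to apt-get when acceleration is off
    or when nala cannot be found on PATH.
    """
    if accelerate and which("nala") is not None:
        return ["nala", "install", "-y", *packages]
    return ["apt-get", "install", "-y", *packages]
```

For example, `build_system_install_command(["ffmpeg"], which=lambda name: None)` yields the `apt-get` form because the stubbed lookup reports that nala is missing.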
Simplify dependency installation by removing aria2c acceleration for Python packages.
UV's built-in parallel downloading and caching are superior and eliminate the need
for additional complexity.

Changes:
- Remove LARGE_PACKAGE_PATTERNS from constants.py
- Simplify DependencyInstaller.install_dependencies() to single parameter
- Remove Python package acceleration logic and related methods
- Update RemoteExecutor to use simplified API
- Update tests to match new simplified interface

System package acceleration (nala) and HuggingFace model acceleration remain intact
as they provide meaningful performance benefits over standard tools.

Core functionality verified:
- All handler tests pass (8/8)
- All unit tests pass (98/98)
- Code quality checks pass (format, lint, typecheck)
Add conditional acceleration logic: pass accelerate_downloads to installers, and cache HF models only when acceleration is enabled and models are specified
…bled

Implement _install_with_pip() method and route between UV (accelerated) vs pip (standard) based on accelerate_downloads parameter
Add HfXetDownloader for subsequent downloads, implement smart strategy: hf_xet for cached files → hf_transfer for fresh downloads → fallback
Add tests for both acceleration enabled/disabled scenarios, verify UV vs pip routing, update existing test assertions
Update test expectations to handle accelerate_downloads parameter in integration scenarios
Update build files and dependency locks to support new acceleration functionality
Always use UV for Python package installation regardless of acceleration setting.
The _install_with_pip method has been removed as UV provides more reliable
virtual environment handling and package management.

- Remove _install_with_pip() method (70 lines)
- Simplify install_dependencies() to always use UV
- Maintain differential installation when acceleration is enabled
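The commit does not spell out what "differential installation" covers. One plausible reading, sketched below purely as an assumption, is that only packages not already present in the environment are passed to `uv pip install`. All names here (`missing_requirements`, `build_uv_command`, and so on) are hypothetical, and the sketch deliberately ignores version constraints.

```python
import re
from importlib import metadata
from typing import Iterable, Optional, Set


def normalize(name: str) -> str:
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(spec: str) -> str:
    """Extract the bare package name from a spec such as 'torch>=2.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
    if match is None:
        raise ValueError(f"cannot parse requirement: {spec!r}")
    return normalize(match.group(1))


def missing_requirements(
    specs: Iterable[str], installed: Optional[Set[str]] = None
) -> list[str]:
    """Return the specs whose package is not installed (versions are ignored)."""
    if installed is None:
        installed = {
            normalize(dist.metadata["Name"])
            for dist in metadata.distributions()
            if dist.metadata["Name"]
        }
    return [spec for spec in specs if requirement_name(spec) not in installed]


def build_uv_command(specs: Iterable[str]) -> list[str]:
    """Build the uv invocation for the remaining specs."""
    return ["uv", "pip", "install", *specs]
```

A fuller version would compare installed versions against each specifier; this one only shows the shape of the "install what is missing" idea.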
Update dependency installer tests to reflect the removal of pip support:
- Fix test_install_dependencies_with_acceleration_disabled to expect UV
- Rename test_install_dependencies_pip_failure to test_install_dependencies_uv_failure
- Update assertions to check for "uv pip" commands
- Update test descriptions and expected error messages

All tests now correctly validate UV-only package installation behavior.
Rename test_pip_no_acceleration.json to test_uv_no_acceleration.json
and update content to reflect UV-only package installation:
- Update function name from test_pip_installation_without_acceleration
  to test_uv_installation_without_acceleration
- Update success message to reference UV instead of pip
- Maintain same test logic for package import validation

This test validates that packages installed with accelerate_downloads=False
are properly available using UV package manager.
deanq added 3 commits August 20, 2025 23:06
Add parallel installation of dependencies when acceleration is enabled:
- Add async wrappers for dependency and model download methods
- Implement _install_dependencies_parallel() using asyncio.gather()
- Add _install_dependencies_sequential() for non-accelerated path
- Add _process_parallel_results() for error handling
- Route between parallel/sequential execution based on accelerate_downloads flag

When accelerate_downloads=True, system packages, Python packages, and HF model
downloads execute concurrently for improved performance.
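The concurrent path described in this commit can be illustrated with a small `asyncio.gather()` sketch. The helper names `run_parallel` and `process_parallel_results` only loosely mirror the methods listed above, and the result shape (`success` / `error` keys) is an assumption, not the PR's actual return format.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def run_parallel(tasks: Dict[str, Awaitable[Any]]) -> Dict[str, Any]:
    """Await all tasks concurrently and map each name to its result or exception."""
    names = list(tasks)
    # return_exceptions=True keeps one failure from cancelling the other tasks.
    results = await asyncio.gather(*tasks.values(), return_exceptions=True)
    return dict(zip(names, results))


def process_parallel_results(results: Dict[str, Any]) -> Dict[str, Any]:
    """Collapse per-task outcomes into a single success/error summary."""
    failures = {
        name: outcome
        for name, outcome in results.items()
        if isinstance(outcome, BaseException)
    }
    if failures:
        details = "; ".join(f"{name}: {exc}" for name, exc in failures.items())
        return {"success": False, "error": details}
    return {"success": True, "error": None}
```

Collecting exceptions instead of raising on the first one lets the caller report every failed step (system packages, Python packages, model downloads) in a single message.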
Add accelerate_model_download_async() method to WorkspaceManager to support
parallel execution of model downloads when acceleration is enabled.

This async wrapper allows HF model downloads to run concurrently with
dependency installations for improved performance.
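One common way to write such an async wrapper around a blocking download is `asyncio.to_thread()` (Python 3.9+). Whether the PR uses this or an executor is not stated, so treat the following as an assumed sketch; the blocking `accelerate_model_download` here is a stand-in that only sleeps.

```python
import asyncio
import time
from typing import Any, Dict


def accelerate_model_download(model_id: str) -> Dict[str, Any]:
    """Stand-in for a blocking download; it just sleeps briefly."""
    time.sleep(0.01)
    return {"success": True, "model": model_id}


async def accelerate_model_download_async(model_id: str) -> Dict[str, Any]:
    """Run the blocking download in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(accelerate_model_download, model_id)
```

Because the blocking call runs in a thread, other coroutines scheduled on the same loop can make progress while the download is in flight.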
Update test mocks and expectations for parallel execution implementation:
- Fix AsyncMock setup for async dependency installation methods
- Update test_dependency_management.py for async method calls
- Update test_download_acceleration_integration.py for parallel execution
- Update test_remote_executor.py with proper AsyncMock usage

All tests now properly mock async methods and validate parallel execution
behavior when acceleration is enabled.
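A minimal sketch of the mocking pattern these test updates refer to, using `unittest.mock.AsyncMock`. The `install_all` function and the installer attribute names are hypothetical and are not taken from the repository's test files.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def install_all(installer, packages, models) -> bool:
    """Hypothetical code under test: run both async steps concurrently."""
    deps, hf = await asyncio.gather(
        installer.install_dependencies_async(packages),
        installer.accelerate_model_download_async(models),
    )
    return deps["success"] and hf["success"]


def make_installer_mock() -> MagicMock:
    """Build a mock whose async methods can be awaited and inspected."""
    installer = MagicMock()
    installer.install_dependencies_async = AsyncMock(return_value={"success": True})
    installer.accelerate_model_download_async = AsyncMock(return_value={"success": True})
    return installer
```

A plain `MagicMock` method returns a non-awaitable value, so awaiting it fails; `AsyncMock` returns a coroutine and also provides assertions such as `assert_awaited_once_with`.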
@deanq deanq changed the title from "feat: Add download acceleration with aria2c integration" to "feat: Add download acceleration for dependencies & hugging face" Aug 21, 2025
deanq added 4 commits August 21, 2025 03:33
- Remove 4 obsolete test files (debug logging, subprocess debug, vLLM symlink, redundant HF)
- Add 6 new comprehensive test files covering advanced functionality:
  * test_system_dependencies.json - System package installation
  * test_class_persistence.json - Instance reuse with instance_id
  * test_function_args.json - Serialized arguments/kwargs testing
  * test_mixed_dependencies.json - Combined system + Python dependencies
  * test_class_custom_method.json - Custom method execution
  * test_error_scenarios.json - Error handling and edge cases
- Update CLAUDE.md to fix test file location references

Total test coverage: 11 files (was 5) covering all handler functionality
- Remove custom HfXetDownloader class (~160 lines) - now redundant
- Update huggingface_hub requirement to >=0.32.0 for automatic hf_xet
- Leverage HF Hub's native snapshot_download() with transparent acceleration
- Simplify HuggingFaceAccelerator to use HF's built-in caching and Xet support
- Update workspace_manager to trust HF's cache hierarchy (HF_HOME only)
- Remove manual Xet detection and file-by-file download logic
- Update tests to reflect native HF Hub integration approach
- Add documentation for automatic HF acceleration features

Benefits:
- Automatic chunk-level deduplication via native hf_xet integration
- Simplified codebase with 332 fewer lines of redundant code
- Better performance using HF's battle-tested acceleration
- Future-proof - automatically works with new Xet-enabled repos
- Transparent operation - no code changes needed for acceleration
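For orientation, relying on the Hub's native download path looks roughly like the sketch below. `snapshot_download` is the real `huggingface_hub` function; the wrapper `precache_model`, its injectable `downloader` parameter, and the `HF_TOKEN` lookup are assumptions added for the example and are not the PR's `HuggingFaceAccelerator`.

```python
import os
from typing import Callable, Optional


def precache_model(
    repo_id: str,
    token: Optional[str] = None,
    downloader: Optional[Callable[..., str]] = None,
) -> str:
    """Download a model snapshot into the HF cache and return its local path.

    Hypothetical wrapper: the cache location is left to huggingface_hub,
    which honors HF_HOME, and no cache_dir is passed explicitly.
    """
    if downloader is None:
        # Imported lazily so the module loads even without huggingface_hub.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download
    return downloader(repo_id=repo_id, token=token or os.environ.get("HF_TOKEN"))
```

With the default downloader, any acceleration comes from the library itself, so the wrapper contains no transfer logic of its own.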
- Add strategy pattern for HF model downloads with tetra and native implementations
- Implement model pattern matching for selective acceleration
- Add comprehensive test coverage for download strategies
- Integrate with existing workspace and cache management systems
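The strategy pattern with pattern-based selection could look something like this. It is an illustrative sketch only: the class names, the glob-style patterns, and the rule that a matching model gets the tetra strategy are assumptions, and the `download` methods are placeholders that return a label instead of fetching anything.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Protocol, Sequence


class DownloadStrategy(Protocol):
    name: str

    def download(self, model_id: str) -> str: ...


@dataclass
class NativeStrategy:
    """Placeholder for the path that defers to huggingface_hub."""

    name: str = "native"

    def download(self, model_id: str) -> str:
        return f"native:{model_id}"


@dataclass
class TetraStrategy:
    """Placeholder for the manual parallel file download path."""

    name: str = "tetra"

    def download(self, model_id: str) -> str:
        return f"tetra:{model_id}"


def select_strategy(model_id: str, patterns: Sequence[str]) -> DownloadStrategy:
    """Pick the tetra strategy when the model id matches any glob pattern."""
    if any(fnmatchcase(model_id, pattern) for pattern in patterns):
        return TetraStrategy()
    return NativeStrategy()
```

Callers depend only on the `DownloadStrategy` protocol, so adding a third strategy would not change the call sites.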
@deanq deanq merged commit f17e013 into main Aug 27, 2025
14 checks passed
@deanq deanq deleted the deanq/ae-1075-download-accelerator branch August 27, 2025 21:58
3 participants