generated from runpod-workers/worker-template
feat: Add download acceleration for dependencies & hugging face #22
Merged
Conversation
Add core download acceleration modules with aria2c integration:
- download_accelerator.py: main acceleration classes with multi-connection downloads
- huggingface_accelerator.py: specialized HF model acceleration
- constants.py: download acceleration configuration constants
- __init__.py: package structure for the src module
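The multi-connection downloads described above can be sketched as follows. This is a minimal illustration, not the PR's actual `download_accelerator.py`: the helper names are hypothetical, and only standard aria2c flags are used (segmented downloads with a plain single-connection fallback when the binary is absent).

```python
import shutil
import subprocess
from pathlib import Path

def build_aria2c_command(url: str, dest: Path, connections: int = 8) -> list[str]:
    """Build an aria2c invocation that splits one file across several connections."""
    return [
        "aria2c",
        f"--max-connection-per-server={connections}",  # parallel connections to one host
        f"--split={connections}",                      # number of segments per file
        "--min-split-size=1M",                         # do not split tiny files
        "--dir", str(dest.parent),
        "--out", dest.name,
        url,
    ]

def download(url: str, dest: Path) -> None:
    """Use aria2c when available; fall back to a plain urllib download otherwise."""
    if shutil.which("aria2c"):
        subprocess.run(build_aria2c_command(url, dest), check=True)
    else:
        import urllib.request
        urllib.request.urlretrieve(url, dest)  # single-connection fallback
```

The `shutil.which` check is what makes the fallback graceful: environments without aria2c still download, just without segmentation.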
Enhanced dependency installation with intelligent acceleration:
- Auto-detects large packages for acceleration (torch, transformers, etc.)
- Integrates with the remote executor for acceleration control
- Maintains backward compatibility with existing workflows
- Provides graceful fallback when aria2c is unavailable
Enhanced workspace manager with HuggingFace model pre-caching:
- Pre-cache specified HF models before function execution
- Integrates with the volume-aware caching system
- Optimizes cold-start times for ML workloads
Comprehensive test suite for download acceleration:
- Integration tests for aria2 detection and fallback behavior
- HF model acceleration testing with authentication
- Volume-aware acceleration scenarios
- Error handling and performance validation
Other updates:
- Test files moved to the src/ directory
- Enhanced test coverage for acceleration features
- Updated dependencies and documentation
- Submodule updates for tetra-rp
pandyamarut approved these changes (Aug 18, 2025)
pandyamarut suggested changes (Aug 18, 2025)
pandyamarut (Contributor) left a comment:
Minor comments on the library/model lists.
- Added nala-accelerated installation for large system packages
- Enhanced DependencyInstaller with automatic nala fallback to apt-get
- Updated Docker images to include the nala package manager
- Added comprehensive system package acceleration tests
- Improved acceleration logging with system package status
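The nala-with-fallback behaviour can be sketched as a small command selector. This is a hypothetical helper, not the PR's DependencyInstaller: it only shows the pattern of preferring nala (which downloads from parallel mirrors) and dropping back to apt-get when nala is not installed.

```python
import shutil

def apt_install_command(packages: list[str]) -> list[str]:
    """Prefer nala for system packages; fall back to apt-get when absent."""
    manager = "nala" if shutil.which("nala") else "apt-get"
    return [manager, "install", "-y", *packages]
```

The returned argv can be handed directly to `subprocess.run(..., check=True)`; both tools accept the same `install -y` form for this basic case.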
Simplify dependency installation by removing aria2c acceleration for Python packages. UV's built-in parallel downloading and caching is superior and eliminates the need for additional complexity.

Changes:
- Remove LARGE_PACKAGE_PATTERNS from constants.py
- Simplify DependencyInstaller.install_dependencies() to a single parameter
- Remove Python package acceleration logic and related methods
- Update RemoteExecutor to use the simplified API
- Update tests to match the new simplified interface

System package acceleration (nala) and HuggingFace model acceleration remain intact, as they provide meaningful performance benefits over standard tools.

Core functionality verified:
- All handler tests pass (8/8)
- All unit tests pass (98/98)
- Code quality checks pass (format, lint, typecheck)
Add conditional acceleration logic: pass accelerate_downloads through to the installers, and cache HF models only when acceleration is enabled and models are specified.
Implement the _install_with_pip() method and route between UV (accelerated) and pip (standard) based on the accelerate_downloads parameter.
Add HfXetDownloader for subsequent downloads, implement smart strategy: hf_xet for cached files → hf_transfer for fresh downloads → fallback
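The smart strategy selection above can be sketched as a pure decision function. This is an illustrative sketch, not the PR's HfXetDownloader (which a later commit removes in favour of native HF Hub support); the function name and boolean inputs are hypothetical.

```python
def pick_download_strategy(
    is_cached: bool,
    hf_xet_available: bool,
    hf_transfer_available: bool,
) -> str:
    """Choose the HF download backend:
    - hf_xet for files already in the cache (chunk-level dedup makes re-pulls cheap)
    - hf_transfer for fresh downloads (high-throughput transfer client)
    - the default huggingface_hub path as the fallback
    """
    if is_cached and hf_xet_available:
        return "hf_xet"
    if hf_transfer_available:
        return "hf_transfer"
    return "default"
```

Keeping the decision pure (no I/O) is what makes it trivial to unit-test against all cache/availability combinations.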
Add tests for both acceleration enabled/disabled scenarios, verify UV vs pip routing, update existing test assertions
Update test expectations to handle accelerate_downloads parameter in integration scenarios
Update build files and dependency locks to support new acceleration functionality
Always use UV for Python package installation regardless of the acceleration setting. The _install_with_pip method has been removed, as UV provides more reliable virtual environment handling and package management.
- Remove the _install_with_pip() method (70 lines)
- Simplify install_dependencies() to always use UV
- Maintain differential installation when acceleration is enabled
Update dependency installer tests to reflect the removal of pip support:
- Fix test_install_dependencies_with_acceleration_disabled to expect UV
- Rename test_install_dependencies_pip_failure to test_install_dependencies_uv_failure
- Update assertions to check for "uv pip" commands
- Update test descriptions and expected error messages
All tests now correctly validate UV-only package installation behavior.
Rename test_pip_no_acceleration.json to test_uv_no_acceleration.json and update its content to reflect UV-only package installation:
- Update the function name from test_pip_installation_without_acceleration to test_uv_installation_without_acceleration
- Update the success message to reference UV instead of pip
- Maintain the same test logic for package import validation
This test validates that packages installed with accelerate_downloads=False are properly available via the UV package manager.
Add parallel installation of dependencies when acceleration is enabled:
- Add async wrappers for the dependency and model download methods
- Implement _install_dependencies_parallel() using asyncio.gather()
- Add _install_dependencies_sequential() for the non-accelerated path
- Add _process_parallel_results() for error handling
- Route between parallel and sequential execution based on the accelerate_downloads flag
When accelerate_downloads=True, system packages, Python packages, and HF model downloads execute concurrently for improved performance.
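The concurrent execution of the three install paths can be sketched with `asyncio.gather`. The coroutine bodies below are placeholders for the real subprocess work, and the function names are hypothetical; the point is the fan-out with `return_exceptions=True` so one failure can be reported without cancelling its siblings, mirroring the error-handling step described above.

```python
import asyncio

async def install_system_packages(pkgs: list[str]) -> str:
    await asyncio.sleep(0)  # placeholder for an apt/nala subprocess
    return f"system:{len(pkgs)}"

async def install_python_packages(pkgs: list[str]) -> str:
    await asyncio.sleep(0)  # placeholder for a uv subprocess
    return f"python:{len(pkgs)}"

async def cache_hf_models(models: list[str]) -> str:
    await asyncio.sleep(0)  # placeholder for HF snapshot downloads
    return f"models:{len(models)}"

async def install_parallel(system: list[str], python: list[str], models: list[str]) -> list[str]:
    # return_exceptions=True collects failures instead of cancelling the other tasks
    results = await asyncio.gather(
        install_system_packages(system),
        install_python_packages(python),
        cache_hf_models(models),
        return_exceptions=True,
    )
    errors = [r for r in results if isinstance(r, Exception)]
    if errors:
        raise errors[0]  # surface the first failure after all tasks settle
    return results
```

`gather` preserves argument order in its result list, which makes post-processing the three outcomes straightforward.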
Add accelerate_model_download_async() method to WorkspaceManager to support parallel execution of model downloads when acceleration is enabled. This async wrapper allows HF model downloads to run concurrently with dependency installations for improved performance.
Update test mocks and expectations for the parallel execution implementation:
- Fix AsyncMock setup for async dependency installation methods
- Update test_dependency_management.py for async method calls
- Update test_download_acceleration_integration.py for parallel execution
- Update test_remote_executor.py with proper AsyncMock usage
All tests now properly mock async methods and validate parallel execution behavior when acceleration is enabled.
- Remove 4 obsolete test files (debug logging, subprocess debug, vLLM symlink, redundant HF)
- Add 6 new comprehensive test files covering advanced functionality:
  * test_system_dependencies.json: system package installation
  * test_class_persistence.json: instance reuse with instance_id
  * test_function_args.json: serialized arguments/kwargs testing
  * test_mixed_dependencies.json: combined system + Python dependencies
  * test_class_custom_method.json: custom method execution
  * test_error_scenarios.json: error handling and edge cases
- Update CLAUDE.md to fix test file location references
Total test coverage: 11 files (was 5) covering all handler functionality
- Remove the custom HfXetDownloader class (~160 lines), now redundant
- Update the huggingface_hub requirement to >=0.32.0 for automatic hf_xet
- Leverage HF Hub's native snapshot_download() with transparent acceleration
- Simplify HuggingFaceAccelerator to use HF's built-in caching and Xet support
- Update workspace_manager to trust HF's cache hierarchy (HF_HOME only)
- Remove manual Xet detection and file-by-file download logic
- Update tests to reflect the native HF Hub integration approach
- Add documentation for automatic HF acceleration features

Benefits:
- Automatic chunk-level deduplication via native hf_xet integration
- Simplified codebase with 332 fewer lines of redundant code
- Better performance using HF's battle-tested acceleration
- Future-proof: automatically works with new Xet-enabled repos
- Transparent operation: no code changes needed for acceleration
- Add a strategy pattern for HF model downloads with tetra and native implementations
- Implement model pattern matching for selective acceleration
- Add comprehensive test coverage for download strategies
- Integrate with the existing workspace and cache management systems
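The model pattern matching for selective acceleration can be sketched with stdlib glob matching. The function name and pattern format are assumptions for illustration, not the PR's actual implementation.

```python
from fnmatch import fnmatch

def should_accelerate(model_id: str, patterns: list[str]) -> bool:
    """Accelerate only models whose repo id matches one of the glob patterns,
    e.g. patterns=["meta-llama/*"] selects all Meta Llama repos."""
    return any(fnmatch(model_id, pattern) for pattern in patterns)
```

Glob-style patterns keep the configuration readable while still allowing an opt-in-everything `["*"]` default.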
pandyamarut approved these changes (Aug 27, 2025)
Summary
Implements parallel-connection download acceleration strategies for the Tetra boot process:
All of the above steps are also performed in parallel.
Test Plan
Requires release of runpod/tetra-rp#83 first