
Conversation


@deanq deanq commented Aug 15, 2025

Introduces download-acceleration support in the Tetra runtime. This also speeds up remote-execution startup times by pre-caching pip dependencies and HuggingFace models.

  • Parallel downloads for large files (see the sketch after this list)
  • Accelerated downloads applied to known large pip libraries and HuggingFace models
  • Smart caching of HuggingFace models in the container or on a network volume
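
For context, acceleration of this kind typically splits a large file into byte ranges and fetches them concurrently. The following is a minimal, hypothetical sketch of that idea (function name and chunking strategy are assumptions, and it presumes the server supports Content-Length and Range requests); the actual accelerator lives in the worker runtime, see the linked issue below:

import concurrent.futures
import requests

def download_parallel(url: str, dest: str, chunks: int = 8) -> None:
    """Fetch `url` into `dest` using `chunks` parallel ranged GETs."""
    # Ask the server for the total size up front.
    size = int(requests.head(url, allow_redirects=True).headers["Content-Length"])
    step = size // chunks

    def fetch(i: int) -> tuple[int, bytes]:
        start = i * step
        end = size - 1 if i == chunks - 1 else start + step - 1
        resp = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
        resp.raise_for_status()
        return start, resp.content

    # Write each chunk at its offset as it arrives.
    with open(dest, "wb") as f, concurrent.futures.ThreadPoolExecutor(chunks) as pool:
        for start, data in pool.map(fetch, range(chunks)):
            f.seek(start)
            f.write(data)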

Adds new accelerate_downloads and hf_models_to_cache parameters to the @remote decorator, with full backward compatibility.

# Accelerated downloads are on by default. To turn it off...
from tetra_rp import remote, LiveServerless

gpu_config = LiveServerless(name="my_server")

@remote(
    resource_config=gpu_config,
    dependencies=["vllm"],
    accelerate_downloads=False,
)
...
# Cache the model in the container after the first download
from tetra_rp import remote, LiveServerless

gpu_config = LiveServerless(name="my_server")

@remote(
    resource_config=gpu_config,
    dependencies=["vllm"],
    hf_models_to_cache=["facebook/opt-125m"],
)
...
# Cache the model on the network volume after the first download
from tetra_rp import remote, LiveServerless, NetworkVolume

gpu_config = LiveServerless(
    name="my_server",
    networkVolume=NetworkVolume(name="my_volume")
)

@remote(
    resource_config=gpu_config,
    dependencies=["diffusers", "torch", "transformers", "accelerate", "xformers"],
    hf_models_to_cache=["runwayml/stable-diffusion-v1-5"]
)
...

Related to runpod-workers/worker-tetra#22

deanq added 6 commits August 15, 2025 16:53
- feat: add pydantic dependency and bump to v0.10.0
- feat: extend protobuf protocol for download acceleration
- feat: implement download acceleration in client interface
- feat: update class execution system for download acceleration
- feat: update stubs to support download acceleration parameters
- test: update tests for download acceleration compatibility

@pandyamarut pandyamarut left a comment


/LGTM

@deanq deanq merged commit e47c9e3 into main Aug 19, 2025
7 checks passed
@deanq deanq deleted the deanq/ae-1075-download-accelerator branch August 19, 2025 00:17
@github-actions github-actions bot mentioned this pull request Aug 19, 2025
pandyamarut pushed a commit that referenced this pull request Sep 9, 2025
…ls (#83)

* feat: add pydantic dependency and bump to v0.10.0

- Add pydantic>=2.0.0 for enhanced protocol models
- Version bump to 0.10.0 for download acceleration feature

* feat: extend protobuf protocol for download acceleration

- Add accelerate_downloads and hf_models_to_cache fields to FunctionRequest
- Enhance Pydantic models with improved type annotations and documentation
- Maintain backward compatibility with existing protocol
- Support HuggingFace model pre-caching for faster inference startup
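
A minimal sketch of how the two new fields might sit on the Pydantic request model (the surrounding fields and exact defaults are assumptions, chosen to match the backward-compatibility note above):

from typing import List, Optional
from pydantic import BaseModel

class FunctionRequest(BaseModel):
    function_name: str                              # assumed existing field
    # ...other existing request fields elided...
    accelerate_downloads: bool = True               # new; acceleration is on by default
    hf_models_to_cache: Optional[List[str]] = None  # new; HF models to pre-cache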

* feat: implement download acceleration in client interface

- Add accelerate_downloads and hf_models_to_cache parameters to @Remote decorator
- Update function and class decoration to pass acceleration options
- Extend docstring with comprehensive parameter documentation
- Enable HuggingFace model pre-caching through decorator configuration

* feat: update class execution system for download acceleration

- Add acceleration parameters to create_remote_class function
- Store acceleration settings in RemoteClassWrapper instances
- Pass acceleration options through to remote execution requests
- Maintain compatibility with existing class decoration patterns
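
A hypothetical sketch of how the wrapper might carry these settings (create_remote_class and RemoteClassWrapper are named in the commit; the constructor signature and attribute names are assumptions):

class RemoteClassWrapper:
    def __init__(self, cls, resource_config,
                 accelerate_downloads: bool = True,
                 hf_models_to_cache: list[str] | None = None):
        self.cls = cls
        self.resource_config = resource_config
        # Stored here so every remote call on the class can include
        # them in its execution request.
        self.accelerate_downloads = accelerate_downloads
        self.hf_models_to_cache = hf_models_to_cache or []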

* feat: update stubs to support download acceleration parameters

- Extend prepare_request methods to accept acceleration parameters
- Update request building to include new acceleration fields
- Maintain consistency across execution pathways
- Preserve existing stub interface contracts
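
A hypothetical shape for the extended stub method (the stub class name is invented for illustration, real signatures may differ; FunctionRequest as sketched above):

class ServerlessStub:
    def prepare_request(self, function_name: str,
                        accelerate_downloads: bool = True,
                        hf_models_to_cache: list[str] | None = None) -> FunctionRequest:
        # Forward the acceleration options into the request payload.
        return FunctionRequest(
            function_name=function_name,
            accelerate_downloads=accelerate_downloads,
            hf_models_to_cache=hf_models_to_cache,
        )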

* test: update tests for download acceleration compatibility

- Update create_remote_class calls to include new acceleration parameters
- Ensure all existing tests pass with enhanced function signatures
- Add proper parameter defaults for backward compatibility
- Maintain test coverage for class execution patterns