Share HuggingFace downloads between test runs #8

Merged: khluu merged 1 commit into vllm-project:main from DarkLight1337:cache-hf on Jun 25, 2024

Conversation

DarkLight1337 (Member) commented Jun 25, 2024

The model tests take over 50 minutes, which is long enough that a run risks being interrupted before it finishes. This PR (ported from vllm-project/vllm#4874) attempts to reduce the running time by sharing the HuggingFace cache between test runs, so that models do not need to be downloaded each time.

Please share any concerns you have about this approach. I'm also not sure how to measure the resulting speed-up, since there is no guarantee that model tests are re-run on the same machine (and can therefore use the cache effectively).
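
One rough way to see whether a given run actually benefits from the cache is to log, at the start of the job, whether each model is already present in the shared directory. The sketch below is only an illustration and not part of this PR. It assumes the standard Hugging Face hub cache layout (`$HF_HOME/hub/models--<org>--<name>/snapshots/<revision>/`), simplifies the default path resolution (it ignores `XDG_CACHE_HOME` and legacy variables), and uses hypothetical helper names and an example model ID.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None) -> Path:
    """Simplified resolution of the hub cache directory (hypothetical helper).

    Honors HF_HUB_CACHE first, then HF_HOME, then falls back to
    ~/.cache/huggingface. XDG_CACHE_HOME and legacy variables are ignored.
    """
    env = os.environ if env is None else env
    if "HF_HUB_CACHE" in env:
        return Path(env["HF_HUB_CACHE"])
    default_home = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    hf_home = env.get("HF_HOME") or default_home
    return Path(hf_home) / "hub"


def is_model_cached(repo_id: str, cache_dir: Path) -> bool:
    """Return True if at least one snapshot directory exists for repo_id.

    This only checks for the directory, so a partially downloaded snapshot
    would still count as a hit.
    """
    repo_dir = cache_dir / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    return snapshots.is_dir() and any(p.is_dir() for p in snapshots.iterdir())


if __name__ == "__main__":
    cache = hub_cache_dir()
    # Example model ID only; substitute the models the test suite loads.
    for repo_id in ["facebook/opt-125m"]:
        status = "hit" if is_model_cached(repo_id, cache) else "miss"
        print(f"{repo_id}: cache {status} in {cache}")
```

Comparing the hit/miss lines against job durations over several runs would give a crude indication of how often a job lands on a machine with a warm cache.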

Note: hostPath volumes in Kubernetes have associated security risks. Is there another way for agent-stack-k8s to use a persistent volume?
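
For readers unfamiliar with the setup, the general shape of a hostPath-backed cache can be sketched as a pod spec fragment, built here as a Python dict. This is an assumption-laden illustration, not the configuration from this PR: the host path, mount path, image, and names are hypothetical, and how the fragment is passed to agent-stack-k8s is left out. Only the Kubernetes fields themselves (`volumes`, `hostPath`, `volumeMounts`, `env`) are standard.

```python
import json


def hf_cache_pod_spec(host_path: str, mount_path: str, image: str) -> dict:
    """Build a pod spec fragment that mounts a node directory as the HF cache.

    All argument values are supplied by the caller; nothing here reflects the
    actual CI configuration.
    """
    volume_name = "hf-cache"  # hypothetical volume name
    return {
        "containers": [
            {
                "name": "test",  # hypothetical container name
                "image": image,
                # Point Hugging Face libraries at the mounted directory.
                "env": [{"name": "HF_HOME", "value": mount_path}],
                "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
            }
        ],
        "volumes": [
            {
                "name": volume_name,
                # hostPath exposes a directory on the node; it is created
                # on first use if it does not exist yet.
                "hostPath": {"path": host_path, "type": "DirectoryOrCreate"},
            }
        ],
    }


if __name__ == "__main__":
    spec = hf_cache_pod_spec(
        host_path="/mnt/hf-cache",            # hypothetical node directory
        mount_path="/root/.cache/huggingface",  # hypothetical mount point
        image="example/test-image:latest",    # placeholder image
    )
    print(json.dumps(spec, indent=2))
```

Because the volume lives on the node, the cache is only shared between jobs scheduled onto the same machine, which is the limitation raised above.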

DarkLight1337 (Member, Author) commented Jun 26, 2024

I haven't noticed any significant improvement for models-test so far...
