[Bug] deepseek v3 deployment on h200 #3049
lmdeploy serve api_server deepseek-ai/DeepSeek-V3 --tp 8 --backend pytorch --log-level INFO
Errr, it is just desperately slow. We have not performed much optimization on weight loading.
I did a test, and it still takes over 20 minutes to finish loading. Is this within expectations?
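To tell how much of the 20 minutes is raw disk I/O versus deserialization and GPU transfer, it can help to time a plain sequential read of the checkpoint directory first. The sketch below is a minimal, hypothetical diagnostic (the `CKPT_DIR` path is an assumption; adjust it to wherever the DeepSeek-V3 weights actually live):

```python
import os
import time

# Hypothetical checkpoint directory -- adjust to your local path.
CKPT_DIR = "/models/DeepSeek-V3"

def measure_read_throughput(directory, chunk_size=64 * 1024 * 1024):
    """Sequentially read every regular file in `directory` and
    return (total_bytes_read, elapsed_seconds)."""
    total_bytes = 0
    start = time.perf_counter()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            # Read in large chunks to approximate streaming weight loads.
            while chunk := f.read(chunk_size):
                total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes, elapsed

if __name__ == "__main__" and os.path.isdir(CKPT_DIR):
    nbytes, secs = measure_read_throughput(CKPT_DIR)
    print(f"read {nbytes / 1e9:.2f} GB in {secs:.1f} s "
          f"({nbytes / 1e9 / max(secs, 1e-9):.2f} GB/s)")
```

If this raw read already takes many minutes, the bottleneck is storage rather than the loader itself.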
@zhyncs hi, we don't have DeepSeek-V3, so I tested on DeepSeek-V2-Chat with tp=8; loading time was reduced from 15 min to 8 min.
Hi @RunningLeon, I've granted @grimoire access to the H200. @grimoire, could you please help verify? Thanks!
Is the model placed on NFS? The uncached first-time load could be slow. |
@grimoire I think it's placed on SSD. |
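One way to check whether the slowness is an uncached first-time read (as suggested above for NFS) is to time the same file twice: on a local SSD, the second read is typically served from the OS page cache and should be much faster. This is a hypothetical sketch; the `SHARD` path is an assumed placeholder, not an actual filename from this deployment:

```python
import os
import time

# Hypothetical path to a single weight shard -- adjust to your setup.
SHARD = "/models/DeepSeek-V3/model-shard.safetensors"

def timed_read(path, chunk_size=64 * 1024 * 1024):
    """Read `path` once in large chunks; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total, time.perf_counter() - start

if __name__ == "__main__" and os.path.isfile(SHARD):
    cold = timed_read(SHARD)  # first read: may hit the backing store if uncached
    warm = timed_read(SHARD)  # second read: likely served from the page cache
    print(f"cold: {cold[1]:.2f}s, warm: {warm[1]:.2f}s")
```

A large cold/warm gap points at storage latency (e.g. NFS or a cold cache); similar times on both reads suggest the bottleneck is elsewhere in the loading path.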
Checklist
Describe the bug
The server gets stuck at this point; weight loading is extremely slow.
Reproduction
As mentioned above.
Environment
Error traceback