[Bug] lmdeploy Turbomind inference crashes Jupyter #3128
Comments
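@lvhan028 I have tried this several times and it crashes every time. The only run that did not crash was the one without the quant_policy=4 setting.
The pytorch engine does not support the gptq quantization algorithm.
After switching to awq, I get:
Messages like loc("/root/anaconda3/lib/python3.12/site-packages/lmdeploy/pytorch/kernels/cuda/pagedattention.py":475:11): error: operation scheduled before its operands can be ignored.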
Checklist
Describe the bug
Running inference with lmdeploy under Python 3.12 in VS Code crashes the Jupyter kernel. I don't know what is causing it.
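A kernel crash usually leaves no Python traceback in the notebook, so one way to narrow it down is to run the failing code in a separate process and capture whatever it prints before dying. The snippet below is only a sketch and not part of the original report: `run_isolated` is a hypothetical helper name, and `os.abort()` stands in for the real inference script, which is not shown in this issue.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python process so a native crash cannot take down the notebook kernel."""
    return subprocess.run(
        # -X faulthandler makes the child dump a Python-level traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Stand-in for the crashing inference code: os.abort() simulates a native crash.
result = run_isolated("import os; os.abort()")
print("return code:", result.returncode)  # non-zero when the child process crashed
print(result.stderr)                      # whatever the child wrote before dying
```

Replacing the stand-in string with the actual lmdeploy inference code (with and without `quant_policy=4`) should leave the kernel alive and give a return code and stderr output that can be pasted into the issue.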
Reproduction
Environment
Error traceback
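After switching to the pytorch inference backend, the following error is reported. Does this mean awq quantization is required?
After switching to awq with the pytorch backend, I get the following error message.
awq with Turbomind crashes Jupyter in the same way.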