No noticeable inference speedup after quantizing InternVL2.5-2B #3135

Open
nzomi opened this issue Feb 12, 2025 · 3 comments

Comments

@nzomi

nzomi commented Feb 12, 2025

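Hello developers. After deploying the model, I relay the results through the OpenAI API. In routine tests, inference with the swift framework looks completely normal, but the lmdeploy + OpenAI interface produces repeated output. Even with repetition_penalty and the other parameters aligned, the output still repeats. Is this a bug?
The base model is InternVL2.5-2B.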

@lvhan028
Collaborator

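The title mentions a quantized model, but I don't see quantization mentioned in the body. So is this model quantized?
Could you try setting chat_template to internvl2_5?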

@nzomi
Author

nzomi commented Feb 13, 2025

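@lvhan028 Yes, it was quantized from InternVL2.5-2B. It was indeed a chat_template problem: after changing it to internvl2_5, the results are normal. Many thanks to the developers.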
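
For readers hitting the same repetition problem, the fix described above amounts to passing the chat template name explicitly when starting the server. A minimal sketch, assuming the `lmdeploy serve api_server` command with its `--chat-template` and `--server-port` options; the model path and port are hypothetical placeholders, and the helper `build_serve_command` is made up for illustration:

```python
import shlex


def build_serve_command(model_path, chat_template="internvl2_5", port=23333):
    """Assemble an `lmdeploy serve api_server` command as an argument list.

    `model_path` and `port` are illustrative placeholders, not values
    taken from this thread.
    """
    return [
        "lmdeploy", "serve", "api_server", model_path,
        "--chat-template", chat_template,  # template name suggested in the thread
        "--server-port", str(port),
    ]


# Hypothetical local path to the quantized model.
cmd = build_serve_command("./internvl2_5-2b-awq")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where lmdeploy is installed.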

@nzomi nzomi closed this as completed Feb 13, 2025
@nzomi nzomi reopened this Feb 13, 2025
@nzomi nzomi changed the title from "Quantized model produces repeated inference output" to "No noticeable inference speedup after quantizing InternVL2.5-2B" Feb 13, 2025
@nzomi
Author

nzomi commented Feb 13, 2025

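@lvhan028 Also, in my tests the 2B model shows no difference in speed before and after quantization when running inference on the same batch of images, whereas for the 8B model quantization clearly doubles the speed. Is this a limitation of the 2B model itself?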
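
One way to make the before/after comparison above concrete is to time both models on the same inputs and report the ratio. A rough sketch using only the standard library; `run_once` stands in for whatever function sends one image to the deployed model, and every name here is hypothetical rather than part of lmdeploy:

```python
import time


def mean_latency(run_once, inputs, warmup=1):
    """Return average seconds per item for `run_once` over `inputs`.

    The first `warmup` items are run once beforehand and not timed,
    so one-time startup costs do not skew the average.
    """
    for item in inputs[:warmup]:
        run_once(item)
    start = time.perf_counter()
    for item in inputs:
        run_once(item)
    return (time.perf_counter() - start) / len(inputs)


def speedup(baseline_latency, quantized_latency):
    """Ratio above 1.0 means the quantized model is faster."""
    return baseline_latency / quantized_latency
```

Calling `mean_latency` once per model on the same image batch and passing both results to `speedup` gives a single number; a value near 1.0 would correspond to the "no difference" reported for the 2B model, and a value near 2.0 to the doubling reported for 8B.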
