
Tried 8-bit and 4-bit quantization and did not notice a clear speed difference #22

Open
liaoweiguo opened this issue Jul 4, 2023 · 1 comment

Comments

@liaoweiguo

13900KF
I can't really say anything about accuracy; I haven't tested it.

@li-plus
Owner

li-plus commented Jul 13, 2023

You can add the -v flag to measure the time taken per token; see #31. On CPU, int4 should give a noticeable speedup over int8.
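
To compare the two quantization levels by numbers rather than by feel, the per-token timings can be reduced to a single ratio. The following is only a rough Python sketch and not part of this project: `ms_per_token` and `speedup` are hypothetical helpers, and `generate` stands for any callable you supply that yields tokens one at a time.

```python
import time
from typing import Callable, Iterable


def ms_per_token(generate: Callable[[], Iterable[str]]) -> float:
    """Return the mean wall-clock milliseconds per generated token.

    `generate` is a hypothetical zero-argument callable that yields
    tokens one by one; plug in whatever model wrapper you actually use.
    """
    start = time.perf_counter()
    n_tokens = sum(1 for _ in generate())  # consume the stream, counting tokens
    elapsed = time.perf_counter() - start
    if n_tokens == 0:
        raise ValueError("generator produced no tokens")
    return elapsed * 1000.0 / n_tokens


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms
```

Running the same prompt through the int8 and int4 models and passing both results to `speedup` (int8 as the baseline) gives one figure to report, which is easier to judge than a subjective impression.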
