Importance Matrix calculation for AWQ as well? #305

Closed
sorasoras opened this issue Jan 12, 2024 · 2 comments

Comments

@sorasoras

ggerganov/llama.cpp#4856
ggerganov/llama.cpp#4861

There might be some interesting ideas there that you could use to improve AutoAWQ. It does look pretty promising, or the two approaches could be combined to generate a quantization.
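For context, the idea in those llama.cpp PRs is to accumulate per-channel activation statistics (an "importance matrix") over a calibration set and use them to weight the quantization error. Below is a minimal PyTorch sketch of that idea only; the function names and the simple round-to-nearest quantizer are illustrative placeholders, not AutoAWQ or llama.cpp code.

```python
# Sketch: per-input-channel "importance" statistics (mean squared activation,
# similar in spirit to llama.cpp's imatrix) used to weight quantization error.
# All names below are hypothetical, not an existing AutoAWQ API.

import torch
import torch.nn as nn


@torch.no_grad()
def collect_importance(linear: nn.Linear, calib_inputs: list[torch.Tensor]) -> torch.Tensor:
    """Accumulate mean squared activations per input channel over calibration data."""
    importance = torch.zeros(linear.in_features)
    n = 0
    for x in calib_inputs:                 # x: [tokens, in_features]
        importance += x.float().pow(2).sum(dim=0)
        n += x.shape[0]
    return importance / max(n, 1)


@torch.no_grad()
def weighted_quant_error(weight: torch.Tensor, importance: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Round-to-nearest quantization error, weighted by per-channel importance."""
    # Simple symmetric per-output-channel quantization as a stand-in quantizer.
    qmax = 2 ** (n_bits - 1) - 1
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax
    q = (weight / scale).round().clamp(-qmax - 1, qmax) * scale
    err = (weight - q).pow(2)              # [out_features, in_features]
    return (err * importance.unsqueeze(0)).sum()


# Example usage with random calibration data:
layer = nn.Linear(512, 512)
calib = [torch.randn(128, 512) for _ in range(8)]
imp = collect_importance(layer, calib)
print(weighted_quant_error(layer.weight, imp))
```

A search over AWQ-style per-channel scales could then minimize `weighted_quant_error` instead of an unweighted reconstruction error, which is roughly how the two approaches might be combined.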

@casper-hansen
Owner

Hi @sorasoras, I appreciate the excitement over the new llama.cpp K-bit quants that are slowly rolling in. At the current stage of AutoAWQ, I cannot take on large efforts like this on my own. Although I wish I had the time (and compute + $), I do not. So the priority is to make AutoAWQ smooth to work with and to provide quantization support for newer models as they come out.

@casper-hansen
Owner

Closing this for now as it's not a planned update. However, I am open to any PRs that use importance matrices to improve AWQ!

@casper-hansen closed this as not planned on Jan 19, 2024