Importance Matrix calculation for AWQ as well? #305

Closed
sorasoras opened this issue Jan 12, 2024 · 2 comments

Comments

@sorasoras

ggerganov/llama.cpp#4856
ggerganov/llama.cpp#4861

There might be some interesting ideas there that you could use to improve AutoAWQ. It does look pretty promising, or the two approaches could be combined to generate a quantization.
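For context, the idea in those llama.cpp PRs is to accumulate per-channel activation statistics (an "importance matrix") over a calibration set and use them to weight the quantization error. Below is a minimal PyTorch sketch of that idea only; the function names and the simple round-to-nearest quantizer are illustrative placeholders, not AutoAWQ or llama.cpp code.

```python
# Sketch: per-input-channel "importance" statistics (mean squared activation,
# similar in spirit to llama.cpp's imatrix) used to weight quantization error.
# All names below are hypothetical, not an existing AutoAWQ API.

import torch
import torch.nn as nn


@torch.no_grad()
def collect_importance(linear: nn.Linear, calib_inputs: list[torch.Tensor]) -> torch.Tensor:
    """Accumulate mean squared activations per input channel over calibration data."""
    importance = torch.zeros(linear.in_features)
    n = 0
    for x in calib_inputs:                 # x: [tokens, in_features]
        importance += x.float().pow(2).sum(dim=0)
        n += x.shape[0]
    return importance / max(n, 1)


@torch.no_grad()
def weighted_quant_error(weight: torch.Tensor, importance: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Round-to-nearest quantization error, weighted by per-channel importance."""
    # Simple symmetric per-output-channel quantization as a stand-in quantizer.
    qmax = 2 ** (n_bits - 1) - 1
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax
    q = (weight / scale).round().clamp(-qmax - 1, qmax) * scale
    err = (weight - q).pow(2)              # [out_features, in_features]
    return (err * importance.unsqueeze(0)).sum()


# Example usage with random calibration data:
layer = nn.Linear(512, 512)
calib = [torch.randn(128, 512) for _ in range(8)]
imp = collect_importance(layer, calib)
print(weighted_quant_error(layer.weight, imp))
```

A search over AWQ-style per-channel scales could then minimize `weighted_quant_error` instead of an unweighted reconstruction error, which is roughly how the two approaches might be combined.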

@casper-hansen
Owner

Hi @sorasoras, I appreciate the excitement over the new llama.cpp K-bit quants that are slowly rolling in. At the current stage of AutoAWQ, I cannot take on large efforts like this on my own. Although I wish I had the time (and compute + $), I do not. So the priority is to make AutoAWQ smooth to work with and to provide quantization support for newer models as they come out.

@casper-hansen
Owner

Closing this for now as it's not a planned update. However, I am open to any PRs that use importance matrices to improve AWQ!

@casper-hansen closed this as not planned on Jan 19, 2024