Sanitize imatrix #735
Conversation
Thanks, this looks like a great feature, given there is no need to re-make an imatrix. I've tested this to re-make two previously broken/NaN quants:

Unfortunately, it didn't help here given the NaNs were in … I tried this a second time back at my desk, but it is still not passing the …

Yes, it does seem to fix this one. I re-quantized using the existing imatrix with this PR, and the resulting GGUF passes the new … I released this model now here: https://huggingface.co/ubergarm/Qwen3-Coder-480B-A35B-Instruct-GGUF#iq2_kl-169597-gib-3034-bpw
I added even more checks specifically for …
Great! Yes! I just re-built this PR #735 @ … I was not able to cleanly rebase #624 on it, so I didn't use the quantization tweaks, despite this being IQ3_K, which would likely benefit from that PR. I do not have any broken NaN quants left now, so I can't test … Thanks for adding this feature!
How can I identify if some of the GGUFs I've produced contain NaNs?

Just add …
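The gist of such a check is a scan of each tensor's dequantized float data for non-finite values. A minimal, hedged sketch follows; the `find_nan_tensors` helper and the synthetic tensor names are hypothetical, and actually extracting float data from a GGUF file depends on your tooling (this is not the project's built-in check):

```python
import math

def find_nan_tensors(tensors):
    """Return the names of tensors whose float data contains NaN or Inf.

    `tensors` maps tensor names to iterables of dequantized floats;
    how you obtain those from a GGUF file depends on your tooling.
    """
    return [
        name
        for name, values in tensors.items()
        if any(not math.isfinite(v) for v in values)
    ]

# Synthetic example (hypothetical tensor names):
tensors = {
    "blk.0.ffn_up.weight": [0.5, 1.0, 2.0],
    "blk.1.ffn_up.weight": [0.5, float("nan"), 2.0],
}
print(find_nan_tensors(tensors))  # ['blk.1.ffn_up.weight']
```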
This PR "sanitizes" the provided imatrix before using it for quantization, so that hopefully we will no longer find NaNs in quantized models.

For now we don't cover quantization types where the quantization is done via a function in ggml-quants.c. These are basically all legacy quants, k-quants, and i-quants (but repacked variants of these are covered).

@ubergarm Hopefully this PR prevents NaNs in your models where we have observed them.
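The sanitization idea can be illustrated with a small sketch. This is an assumption about the general approach, not the PR's actual replacement rule: imatrix entries are accumulated sums of squared activations, so they should be finite and non-negative; any entry that isn't is replaced with the mean of the valid entries, or a fallback when none are valid.

```python
import math

def sanitize_imatrix(values, fallback=1.0):
    """Replace non-finite or negative imatrix entries with the mean of
    the valid entries (or `fallback` if no entry is valid).

    Illustrative only: the replacement rule actually used by the PR
    may differ.
    """
    valid = [v for v in values if math.isfinite(v) and v >= 0.0]
    fill = sum(valid) / len(valid) if valid else fallback
    return [v if math.isfinite(v) and v >= 0.0 else fill for v in values]

# A NaN entry is replaced with the mean of the remaining valid entries:
print(sanitize_imatrix([1.0, float("nan"), 3.0]))  # [1.0, 2.0, 3.0]
```

Running the sanitizer before quantization means the quantization code only ever sees finite, non-negative importance weights, which is why previously broken imatrix files can be reused without regenerating them.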