Add python script that converts GGUF imatrix files to the format supported here #1405

Merged
ikawrakow merged 1 commit into main from s6/imatrix_conv on Mar 11, 2026

Conversation

@saood06
Collaborator

@saood06 saood06 commented Mar 11, 2026

As mentioned here, the new GGUF imatrix files can now be found on a lot of recent model releases on huggingface. This mostly vibe-coded script allows conversion without needing a compiled build of mainline.

By default it swaps the .gguf extension for .dat, but you can use --outfile to specify an output file.
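That default can be sketched in a few lines; this is an illustration of the described behavior, not the PR's actual code (the helper name default_outfile is hypothetical):

```python
from pathlib import Path

def default_outfile(infile, outfile=None):
    """Mirror the described default: use --outfile if given,
    otherwise swap the .gguf extension for .dat."""
    return Path(outfile) if outfile else Path(infile).with_suffix(".dat")
```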

Tested by converting an imatrix and using the result to quantize a model, but I didn't run any objective tests (like perplexity).

@saood06 saood06 requested a review from ikawrakow March 11, 2026 13:53
@ikawrakow ikawrakow merged commit 2161ee0 into main Mar 11, 2026
@ubergarm
Contributor

Thanks @saood06, it's good to have this ability here in addition to the mainline llama-imatrix --output-format dat --in-file imatrix.gguf --output-file imatrix.dat conversion method.

I haven't tried it yet, but I'm curious in general whether imatrix files computed against the new mainline pre-merged ffn_(gate|up)_exps bf16s will apply to non-pre-merged bf16 GGUFs...

I'm guessing not unfortunately...

@ikawrakow
Owner

I'm guessing not unfortunately...

The imatrix for the ffn_up and ffn_gate tensors is exactly the same, as these two tensors "see" the exact same activations. Hence, it would be a matter of a quick hack to make it work.
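That "quick hack" amounts to duplicating the merged tensor's imatrix entry under the two split names, since both split tensors share the same input. A sketch, assuming the merged tensors are keyed like blk.N.ffn_gate_up_exps.weight (the exact naming convention is an assumption here):

```python
import re

def split_merged_entries(entries):
    """Duplicate any merged gate+up imatrix entry into separate gate and
    up entries. Both split tensors see the same activations, so the
    imatrix values are identical for each."""
    out = {}
    for name, values in entries.items():
        m = re.match(r"(blk\.\d+\.)ffn_gate_up_exps(\.weight)$", name)
        if m:
            out[m.group(1) + "ffn_gate_exps" + m.group(2)] = values
            out[m.group(1) + "ffn_up_exps" + m.group(2)] = values
        else:
            out[name] = values
    return out
```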

Having said that, I don't think such nonsense should be supported.

Oh, is somebody going to tell AesSedai that his new GGUFs do not work here?

@saood06
Collaborator Author

saood06 commented Mar 11, 2026

Thanks @saood06 its good to have this ability here

Well, like I said, the script was mostly vibe coded. The PR was delayed because the original working script was too ugly; the current script was made with newer models.

I haven't tried it yet, but am curious in general if imatrix files computed against the new mainline pre-merged ffn_(gate|up)_exps bf16s will apply to non pre-merged bf16 ggufs...

I'm guessing not unfortunately...

Your guess is correct. The script does only a basic conversion, nothing fancy.

