Move duplicated imatrix code into single common imatrix-loader.cpp by bartowski1182 · Pull Request #22445 · ggml-org/llama.cpp

bartowski1182 · 2026-04-27T19:39:08Z

Overview

quantize.cpp and imatrix.cpp duplicated the same code for loading the imatrix.

This change moves those functions into a common file, imatrix-loader.cpp, which provides the same imatrix and legacy imatrix loading functions.

The two tools handled post-loading differently: imatrix.cpp keeps the raw sums and counts so they can be merged (with --in-file), whereas quantize.cpp normalized immediately. The new common_imatrix struct therefore stores the information as it's loaded, and each caller can do with it as it needs.
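As a rough illustration of why storing the raw data serves both callers, here is a minimal sketch. It is not the code from this PR: the type `imatrix_entry_sketch`, its fields, and the helpers `merge_entry` and `normalize_entry` are hypothetical names, and the single per-tensor count is a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one tensor's entry in a shared imatrix container.
// The real common_imatrix layout may differ; this only mirrors the idea of
// keeping raw sums and counts exactly as loaded.
struct imatrix_entry_sketch {
    std::vector<float> sums;          // accumulated values, one per column
    float              count = 0.0f;  // how many samples went into the sums
};

// imatrix-style use: merge two raw entries (as when combining input files).
// Raw sums and counts add directly; normalized values would not.
inline void merge_entry(imatrix_entry_sketch & dst, const imatrix_entry_sketch & src) {
    assert(dst.sums.size() == src.sums.size());
    for (std::size_t i = 0; i < dst.sums.size(); ++i) {
        dst.sums[i] += src.sums[i];
    }
    dst.count += src.count;
}

// quantize-style use: turn raw sums into per-column averages.
// Falling back to 1.0f when no samples were seen is an assumption of this sketch.
inline std::vector<float> normalize_entry(const imatrix_entry_sketch & e) {
    std::vector<float> out(e.sums.size(), 1.0f);
    if (e.count > 0.0f) {
        for (std::size_t i = 0; i < e.sums.size(); ++i) {
            out[i] = e.sums[i] / e.count;
        }
    }
    return out;
}
```

Normalizing only at the point of use keeps merging exact: averaging two already-normalized entries would weight both inputs equally regardless of how many samples each one saw.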

Also fixed a minor bug that caused --show-statistics to fail (params weren't set before being used), so that I could use it to compare results before and after the change.

Additional information

Tested parity of before and after by running the following scenarios and running sha256sum on the results:

./build/bin/llama-imatrix --in-file ../google_gemma-4-E4B-it-imatrix.gguf -m ../google_gemma-4-E4B-it-bf16.gguf -f ../groups_merged.txt -o test.gguf
./build/bin/llama-imatrix -m ../google_gemma-4-E4B-it-bf16.gguf -f ../groups_merged.txt -o test-legacy.dat --output-format dat
./build/bin/llama-imatrix --in-file test-legacy.dat -m ../google_gemma-4-E4B-it-bf16.gguf -f ../groups_merged.txt -o test-legacy-in.dat --output-format dat
./build/bin/llama-quantize --imatrix test.gguf ../google_gemma-4-E4B-it-bf16.gguf ./gemma-4-test-Q4_K_M.gguf Q4_K_M
./build/bin/llama-quantize --imatrix test-legacy.dat ../google_gemma-4-E4B-it-bf16.gguf ./gemma-4-test-Q4_K_M.gguf Q4_K_M

I also ran --show-statistics, since that's the other place where imatrix.cpp uses the loader, and the results are the same before and after as well.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES, primary method of deduplication; all testing was run manually

(ironically a duplicated PR because I'm bad at git)

bartowski1182 · 2026-06-04T14:12:53Z

@ggerganov can I get another review on this one?

* origin/master: (57 commits)
  server : disable on-device spec checkpoints (ggml-org#24108)
  arg: fix double mtp downloads (ggml-org#24128)
  webui: [a11y] fix keyboard navigation issues in chat interface and sidebar (ggml-org#23132)
  Move duplicated imatrix code into single common imatrix-loader.cpp (ggml-org#22445)
  ui: Fixed packages (ggml-org#24119)
  ui: added single line reasoning preview (ggml-org#23601)
  return filter to save memory (ggml-org#24125)
  convert: Fix Gemma 4 Unified conversion (ggml-org#24118)
  ggml: vectorize ggml_vec_dot_q4_1_q8_1 with WASM SIMD128 (ggml-org#22209)
  server: avoid unnecessary checkpoint restore when new tokens are present (ggml-org#24110)
  agents: refactor, include more guidelines (ggml-org#24111)
  webui: fix tool selector toggle/counter, key tools by stable identity (ggml-org#24065)
  build : use umbrella Headers directory for XCFramework module map (ggml-org#23974)
  server : add header to tools/server/server-http.h (ggml-org#24089)
  cmake: skip cvector-generator and export-lora when CPU backend is disabled (ggml-org#24053)
  fix(mtmd): handle Gemma 4 audio projector embedding size (ggml-org#24091)
  readme : add status badges (ggml-org#24104)
  tests : refactor test-save-load-state to accept token input (ggml-org#24073)
  metal : reduce rset heartbeat from 500ms -> 5ms (ggml-org#24074)
  ggml-webgpu: FlashAttention refactor + standardize quantization support (ggml-org#23834)
  ...

Deduplicate imatrix loading code

5f096e9

github-actions Bot added the examples label Apr 27, 2026

Add back LLAMA_TRACE, early exit on quantize missing metadata

dd38038

bartowski1182 marked this pull request as ready for review April 28, 2026 16:14

bartowski1182 requested review from a team and ggerganov as code owners April 28, 2026 16:14

pwilkin approved these changes Apr 28, 2026


ggerganov approved these changes Jun 4, 2026


pwilkin merged commit e7bcf1c into ggml-org:master Jun 4, 2026
40 of 46 checks passed

pwilkin merged 2 commits into ggml-org:master from bartowski1182:imatrix-dup
