
UPSTREAM PR #17728: ggml-cpu: aarm64: remove asserts always evaluating to false in repack#413

Open
loci-dev wants to merge 1 commit into main from upstream-PR17728-branch_Alcpz-Alcpz/fix_wrong_asserts
Conversation


@loci-dev loci-dev commented Dec 3, 2025

Mirrored from ggml-org/llama.cpp#17728

I got an email reporting the issue. I can't find the original comment, but for gemv these asserts don't make sense, since nr will always be 1.


loci-review bot commented Dec 3, 2025

Explore the complete analysis inside the Version Insights

Performance Review Summary - PR #413

Analysis: This PR removes two incorrect assertions from ARM64 GEMV functions (ggml_gemv_q4_K_8x4_q8_K and ggml_gemv_q4_K_8x8_q8_K) in repack.cpp. The assertions assert(nr % 4 == 0) were semantically wrong for vector operations where nr = 1 is expected.

Performance Impact: The modified GEMV functions are not among the top performance-impacted functions. The observed performance changes in the version comparison (parameter-handling functions showing 7-12 ns increases in per-call time) are unrelated to this PR. The assertion removal eliminates 1-2 instructions per call, resulting in negligible improvement (<0.1 ns).

Inference Impact: No impact on tokens per second. The core inference functions (llama_decode, llama_encode, llama_tokenize) are unchanged. GEMV operations are utility functions in the quantized computation pipeline, not direct inference entry points.

Power Consumption: The 0.21% reduction in libggml-cpu.so power consumption (244 nJ decrease) is driven by the 88 ns throughput improvement in ggml_compute_forward_silu_back_f16, not by this PR's changes.

Conclusion: This is a correctness fix with no measurable performance impact. The change enables proper execution of GEMV operations without affecting inference throughput or power consumption.

@loci-dev loci-dev force-pushed the main branch 26 times, most recently from ca4155f to b86b588 Compare December 5, 2025 22:08
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from 048ad94 to 6c1fde6 Compare February 3, 2026 13:32
