Explore the complete analysis inside the Version Insights.

Performance Review Summary - PR #413

Analysis: This PR removes two incorrect assertions from ARM64 GEMV functions (

Performance Impact: The modified GEMV functions are not among the top performance-impacted functions. The observed performance changes in the version comparison (parameter handling functions showing 7-12 ns increases in throughput) are unrelated to this PR. The assertion removal eliminates 1-2 instructions per call, resulting in negligible improvement (<0.1 ns).

Inference Impact: No impact on tokens per second. The core inference functions (

Power Consumption: The 0.21% reduction in

Conclusion: This is a correctness fix with no measurable performance impact. The change enables proper execution of GEMV operations without affecting inference throughput or power consumption.
Mirrored from ggml-org/llama.cpp#17728
I got an email reporting the issue. I can't find the original comment, but for GEMV these asserts don't make sense, as nr will always be 1.
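To illustrate the reasoning, here is a minimal sketch in C. It is not the llama.cpp code. The removed assertions are not quoted in this PR description, so the row-blocking condition below (`nr % 4 == 0`) is an assumption about the kind of check a GEMM kernel might carry, and `rows_are_blocked` and `toy_gemv` are hypothetical names. The point is only that any check requiring `nr` to be a multiple of a block height greater than 1 can never hold when `nr` is 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row-blocking check of the kind a GEMM kernel might make.
 * Returns 1 when nr is a multiple of block_rows, 0 otherwise.
 * ASSUMPTION: the real removed asserts are not shown in the source. */
static int rows_are_blocked(int nr, int block_rows) {
    return nr % block_rows == 0;
}

/* Toy GEMV: y = W * x, where W is stored row-major as nc rows of n floats.
 * nr is the number of activation rows; for a matrix-vector product it is 1,
 * so that is the only precondition on nr that makes sense here. */
static void toy_gemv(int n, float *y, const float *w, const float *x,
                     int nr, int nc) {
    assert(nr == 1);
    for (int c = 0; c < nc; ++c) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k) {
            sum += w[(size_t)c * n + k] * x[k];
        }
        y[c] = sum;
    }
}
```

With `nr == 1`, `rows_are_blocked(1, 4)` returns 0, so an `assert` on that condition would abort every GEMV call in a build with assertions enabled, while `toy_gemv` itself only needs `nr == 1`.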