UPSTREAM PR #17827: common : change --color to accept on/off/auto, default to auto (#470)
Conversation
Performance Analysis Summary: PR #470

Overview: PR #470 introduces argument parsing enhancements for the `--color` flag.

Key Findings

Performance-Critical Areas Impact: The analyzed functions showing performance changes are located exclusively in the argument parsing code.

Function-Level Changes: The top 10 functions by response time change are argument parsing lambdas, with increases ranging from 13,400 ns to 19,600 ns in absolute terms. The highest throughput change observed is 269 ns (from 12 ns to 269 ns). All affected functions are lambda operators in the argument parsing code.

Inference Performance Impact: No impact on tokens per second. The modified code paths execute during initialization only, before model loading and inference begin. The functions responsible for tokenization and inference (llama_decode, llama_encode, llama_tokenize) show no changes in response time or throughput. The reference metric of a 7% tokens-per-second reduction per 2 ms of llama_decode slowdown does not apply, since llama_decode remains unmodified.

Power Consumption Analysis: Power consumption changes across all binaries remain within measurement noise (< 0.2%). The llama-cvector-generator binary shows a 0.067% improvement (249,347 nJ to 249,179 nJ). All other binaries show negligible changes: llama-gguf-split (+0.187%), llama-tokenize (+0.157%), llama-quantize (+0.120%), llama-run (+0.093%), llama-bench (+0.091%), llama-tts (+0.086%). Core inference libraries (libllama.so, libggml.so) show zero change.

Code Implementation: The PR adds on/off/auto handling for `--color`, defaulting to auto.
Force-pushed: a9fcc24 to ea62cd5
Force-pushed: 4b559d8 to 23789fa
Mirrored from ggml-org/llama.cpp#17827
Change `--color` to accept on/off/auto, just like `-fa` and `--log-colors`. Default to auto, just like `--log-colors`.