logs : reduce #23021

Merged
ggerganov merged 7 commits into master from gg/logs-reduce
May 14, 2026

ggerganov (Member) commented May 13, 2026

Overview

Reducing the amount of logs that we print by default. Feedback is welcome about what to remove further and what to keep.

Additional information

  • Add new log level for the llama tools and examples: LOG_LEVEL_TRACE = 4
  • Update LOG_LEVEL_DEBUG from 4 to 5
  • Change all INFO logs coming from libllama, libmtmd and libggml* to LOG_LEVEL_TRACE
  • Update some of the logs in libcommon accordingly
  • Enable timestamps by default
  • Print available devices on startup
  • Add periodic slot generation speed logs
  • Print the selected log verbosity on start
  • Print server tasks sampling parameters on start in the trace logs

The old level of logging can be enabled by adding -lv 4 to the CLI args.
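
To illustrate how the renumbered levels interact with the `-lv` value, here is a minimal sketch. It is not the code from `common/log.h`: all `sketch_*` names are hypothetical, the values 0 to 3 and the default verbosity of 3 are assumptions, and only `TRACE = 4` and `DEBUG = 5` come from the description above.

```cpp
#include <cassert>

// Hypothetical names. Only TRACE = 4 and DEBUG = 5 are taken from the PR
// description; the values 0..3 are assumptions made for this sketch.
enum sketch_log_level {
    SKETCH_LOG_OUTPUT = 0, // assumed
    SKETCH_LOG_ERROR  = 1, // assumed
    SKETCH_LOG_WARN   = 2, // assumed
    SKETCH_LOG_INFO   = 3, // assumed
    SKETCH_LOG_TRACE  = 4, // new level added by the PR
    SKETCH_LOG_DEBUG  = 5, // moved from 4 to 5 by the PR
};

// A message is printed when its level does not exceed the selected verbosity.
constexpr bool sketch_should_log(int msg_level, int verbosity) {
    return msg_level <= verbosity;
}

// Small self-check of the behaviour described in the PR text.
inline void sketch_self_check() {
    const int assumed_default = SKETCH_LOG_INFO; // assumption: default verbosity is 3
    const int old_amount      = 4;               // what `-lv 4` selects

    assert(!sketch_should_log(SKETCH_LOG_TRACE, assumed_default)); // library logs hidden
    assert( sketch_should_log(SKETCH_LOG_TRACE, old_amount));      // shown again with -lv 4
    assert(!sketch_should_log(SKETCH_LOG_DEBUG, old_amount));      // debug needs a higher value
}
```

Under these assumptions, the former INFO messages from libllama, libmtmd and libggml* sit one step above the default threshold, which is why `-lv 4` brings them back without also enabling debug output.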

Requirements

ggerganov requested review from a team and JohannesGaessler as code owners May 13, 2026 19:08
Comment on lines -752 to -754
if (!is_resume) {
    mtmd_helper_log_set(common_log_default_callback, nullptr);
}

ggerganov (Member Author), May 13, 2026

@ngxson I wasn't sure what the intent was of gating this with is_resume. I just moved the mtmd log initialization to the constructor server_context_impl(). Please confirm this is OK.

Contributor

The reason it was gated is that mtmd_helper_log_set is not thread-safe, so it should only be called once, when the program starts.

Alternatively, you can call mtmd_helper_log_set unconditionally in the server_context_impl::init() function.
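
As a generic illustration of the "call it only once" constraint, the sketch below guards a non-thread-safe setter with a function-local static, whose initialization C++11 guarantees to run exactly once even under concurrent calls. This is not what the PR does (it moves the call into the constructor), and every `sketch_*` name is a hypothetical stand-in rather than part of the mtmd API.

```cpp
// Hypothetical stand-in for a non-thread-safe setter such as mtmd_helper_log_set.
using sketch_log_callback = void (*)(int level, const char * text, void * user_data);

static sketch_log_callback g_sketch_callback  = nullptr;
static int                 g_sketch_set_calls = 0; // counts how often the setter ran

// Not thread-safe on its own: it writes to globals without synchronization.
static void sketch_log_set(sketch_log_callback cb) {
    g_sketch_callback = cb;
    g_sketch_set_calls++;
}

// Hypothetical default callback that ignores its arguments.
static void sketch_default_callback(int, const char *, void *) {}

// Safe to call from any number of places: the lambda body runs exactly once,
// because initialization of a function-local static is synchronized (C++11).
static void sketch_init_logging_once() {
    static const bool done = [] {
        sketch_log_set(sketch_default_callback);
        return true;
    }();
    (void) done; // silence unused-variable warnings
}
```

Calling `sketch_init_logging_once()` from several code paths then has the same effect as a single call at program start.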

github-actions bot added the examples, server, ggml, and Apple Metal labels May 13, 2026
ggerganov (Member Author) commented May 14, 2026

@ggml-org/maintainers After this change the server logs will be quite reduced by default. You can easily go back to the old amount of logs if you are used to them by adding -lv 4 to your command.

pwilkin (Member) left a comment

Good change, I like the added verbosity levels. Previously it was really hard to find a balance between standard mode that hid a lot of things and verbose mode that spammed tons of info on every token generated.

Comment on lines +622 to +625
server_context_impl() {
    mtmd_helper_log_set(common_log_default_callback, nullptr);
}

Contributor

this should also be fine

CISC (Member) left a comment

Very nice.

Comment thread common/log.h Outdated
Comment thread tools/mtmd/mtmd-helper.cpp Outdated
ggerganov merged commit 67b2b7f into master May 14, 2026
19 checks passed
ggerganov deleted the gg/logs-reduce branch May 14, 2026 10:05
xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 14, 2026
* logs : reduce

* args : fix envs

* server : fix build

* common : print verbosity level at start

* server : clean-up logs

* server : print prompt processing timings + sampling params

* minor : whitespaces
dandm1 pushed a commit to dandm1/llama.cpp that referenced this pull request May 16, 2026
Bikkies added a commit to Bikkies/llama.cpp that referenced this pull request May 17, 2026
- Opt-in flag that surfaces the per-device free-memory probe table (currently TRACE-only after ggml-org#23021) at INFO when set; default off.
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request May 19, 2026
baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026
winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
Labels

Apple Metal, examples, ggml, server

5 participants