logs : reduce by ggerganov · Pull Request #23021 · ggml-org/llama.cpp

ggerganov · 2026-05-13T19:08:28Z

Overview

Reduce the amount of logs printed by default. Feedback is welcome on what else to remove and what to keep.

Additional information

- Add new log level for the llama tools and examples: LOG_LEVEL_TRACE = 4
- Update LOG_LEVEL_DEBUG from 4 to 5
- Change all INFO logs coming from libllama, libmtmd and libggml* to LOG_LEVEL_TRACE
- Update some of the logs in libcommon accordingly
- Enable timestamps by default
- Print available devices on startup
- Add periodic slot generation speed logs
- Print the selected log verbosity on start
- Print server tasks sampling parameters on start in the trace logs

The old level of logging can be enabled by adding -lv 4 to the CLI args.
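
To illustrate how the renumbered levels interact with -lv, here is a minimal sketch of a verbosity threshold check. It is not the code from common/log.h: the `sketch_*` names are hypothetical, the INFO value of 3 and the "print when level <= verbosity" rule are assumptions, and only TRACE = 4 and DEBUG = 5 come from the description above.

```cpp
// Hypothetical stand-ins for the log levels discussed in this PR.
// Only TRACE = 4 and DEBUG = 5 are taken from the PR description.
enum sketch_log_level : int {
    SKETCH_LOG_LEVEL_INFO  = 3, // assumed value, not stated in the PR
    SKETCH_LOG_LEVEL_TRACE = 4, // new level added by the PR
    SKETCH_LOG_LEVEL_DEBUG = 5, // moved from 4 to 5 by the PR
};

// Assumed rule: a message is printed when its level does not exceed
// the verbosity selected on the command line (for example with -lv).
constexpr bool sketch_should_log(int msg_level, int verbosity) {
    return msg_level <= verbosity;
}
```

Under these assumptions, a verbosity of 4 shows the messages that were moved to TRACE while still hiding DEBUG output, which would match the note that -lv 4 restores the old amount of logging.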

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: NO

ggerganov · 2026-05-13T19:12:32Z

-            if (!is_resume) {
-                mtmd_helper_log_set(common_log_default_callback, nullptr);
-            }


@ngxson I wasn't sure what the intent was in gating this with is_resume, so I moved the mtmd log initialization to the server_context_impl() constructor. Please confirm this is OK.

The reason it was gated is that mtmd_helper_log_set is not thread-safe, so it should only be called once, when the program starts.

Alternatively, you can call mtmd_helper_log_set unconditionally in the server_context_impl::init() function.
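
One way to guarantee the "called once" property regardless of where the call sits is a function-local static initializer, which C++11 guarantees to run exactly once even under concurrent calls. The sketch below is not what this PR does (the PR moves the call into the constructor); `sketch_log_set`, `sketch_default_callback` and `sketch_log_init_once` are hypothetical stand-ins for a non-thread-safe setter such as mtmd_helper_log_set and its caller.

```cpp
#include <cassert>

// Hypothetical callback type and global state standing in for a
// non-thread-safe logging setter; these are not the mtmd API.
using sketch_log_callback = void (*)(int level, const char * text, void * user_data);

static sketch_log_callback g_sketch_callback  = nullptr;
static void *              g_sketch_user_data = nullptr;
static int                 g_sketch_set_calls = 0; // counts setter invocations

// Not thread-safe: plain writes to globals.
static void sketch_log_set(sketch_log_callback cb, void * user_data) {
    assert(cb != nullptr);
    g_sketch_callback  = cb;
    g_sketch_user_data = user_data;
    g_sketch_set_calls++;
}

// Placeholder callback that discards the message.
static void sketch_default_callback(int, const char *, void *) {}

// Installs the callback at most once. The initializer of a function-local
// static runs exactly once, and concurrent callers wait until it finishes.
static void sketch_log_init_once() {
    static const bool done = [] {
        sketch_log_set(sketch_default_callback, nullptr);
        return true;
    }();
    (void) done;
}
```

Calling `sketch_log_init_once()` from every init path is then safe: repeated calls leave `g_sketch_set_calls` at 1.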

ggerganov · 2026-05-14T08:48:11Z

@ggml-org/maintainers After this change, the server prints considerably fewer logs by default. If you are used to the old amount of logging, you can restore it by adding -lv 4 to your command.

pwilkin

Good change, I like the added verbosity levels. Previously it was really hard to find a balance between standard mode that hid a lot of things and verbose mode that spammed tons of info on every token generated.

ngxson · 2026-05-14T09:03:48Z

+    server_context_impl() {
+        mtmd_helper_log_set(common_log_default_callback, nullptr);
+    }
+


this should also be fine

CISC

Very nice.

* logs : reduce
* args : fix envs
* server : fix build
* common : print verbosity level at start
* server : clean-up logs
* server : print prompt processing timings + sampling params
* minor : whitespaces

- Opt-in flag that surfaces the per-device free-memory probe table (currently TRACE-only after ggml-org#23021) at INFO when set; default off.

ggerganov requested review from a team and JohannesGaessler as code owners May 13, 2026 19:08

logs : reduce

98ce6ce

ggerganov force-pushed the gg/logs-reduce branch from cfa612e to 98ce6ce Compare May 13, 2026 19:22

args : fix envs

1489bb0

github-actions bot added the examples, server, ggml, and Apple Metal labels May 13, 2026

ggerganov added 4 commits May 14, 2026 08:04

server : fix build

7c3a906

common : print verbosity level at start

bda35e6

server : clean-up logs

8ac7e34

server : print prompt processing timings + sampling params

c044027

ServeurpersoCom approved these changes May 14, 2026

pwilkin approved these changes May 14, 2026

ngxson reviewed May 14, 2026

ngxson approved these changes May 14, 2026

CISC approved these changes May 14, 2026

Comment thread common/log.h Outdated

Comment thread tools/mtmd/mtmd-helper.cpp Outdated

minor : whitespaces

fed555d

ggerganov merged commit 67b2b7f into master May 14, 2026
19 checks passed

ggerganov deleted the gg/logs-reduce branch May 14, 2026 10:05

ServeurpersoCom mentioned this pull request May 16, 2026

server: skip device enumeration in router mode to avoid creating CUDA… #23137

Merged

taronaeo mentioned this pull request May 17, 2026

Misc. bug: Why change output of llama.cpp server? #23162

Closed

Bikkies mentioned this pull request May 17, 2026

fit : add --fit-show-mem to print probe table at INFO #23232

Open

Kangaroux mentioned this pull request May 18, 2026

Misc. bug: -lv missing trace in --help, trace logs use I instead of T #23290

Closed

ggerganov merged 7 commits into master from gg/logs-reduce


5 participants
