server: bump timeout to 3600s by ngxson · Pull Request #23842 · ggml-org/llama.cpp

ngxson · 2026-05-28T21:49:07Z

Overview

IMPORTANT: the server's --timeout works fine. Users who reported timeout-related problems should check their client code first; some HTTP frameworks and browsers have a default client-side timeout.
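To illustrate why a short client-side limit can mask a generous server limit, here is a minimal sketch. It treats both limits as plain wall-clock deadlines, which is a simplifying assumption and not necessarily how the server enforces --timeout. The names `request_end` and `first_to_end` are hypothetical and do not exist in llama.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: which side ends a request first.
enum class request_end { completed, client_timeout, server_timeout };

// task_s:           how long the generation would take, in seconds
// client_timeout_s: limit configured in the HTTP client, in seconds
// server_timeout_s: limit configured on the server (3600 after this PR)
// Assumption: both limits behave like simple deadlines.
inline request_end first_to_end(int64_t task_s,
                                int64_t client_timeout_s,
                                int64_t server_timeout_s) {
    assert(task_s >= 0 && client_timeout_s > 0 && server_timeout_s > 0);
    if (task_s <= client_timeout_s && task_s <= server_timeout_s) {
        return request_end::completed;
    }
    // The smaller limit fires first, so a short client limit wins even
    // when the server would have waited much longer.
    return client_timeout_s < server_timeout_s ? request_end::client_timeout
                                               : request_end::server_timeout;
}
```

Under this model, a 1000-second task with a 30-second client limit ends on the client side even though the server allows 3600 seconds, so raising the server limit alone would not help.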

Ref discussion from #22907

Fix #23832

Fix #22997

Bump the timeout to one hour. This "ought to be enough for anybody".

Also print a message reminding users about the client's timeout.

How I tested this change

Here is how I tested it:

Add this either before or after llama_decode() in server-context.cpp: std::this_thread::sleep_for(std::chrono::seconds(1000000));
This simulates a long blocking task.
Remember to configure the client's timeout. In my case, Postman: https://stackoverflow.com/questions/36355732/how-to-increase-postman-client-request-timeout
Send a request and wait.
17 minutes later, I stopped the request:

0.36.243.889 I srv          load:  - looking for better prompt, base f_keep = -1.000, sim = 0.000
0.36.243.895 I srv        update:  - cache state: 0 prompts, 0.000 MiB (limits: 8192.000 MiB, 131072 tokens, 8589934592 est)
0.36.243.899 I srv  get_availabl: prompt cache update took 0.04 ms
0.36.244.264 I slot launch_slot_: id  3 | task 0 | processing task, is_child = 0
17.11.781.940 W srv          next: request cancelled after 30s, likely a client-side timeout; please check your client's code
17.11.781.948 W srv          stop: cancel task, id_task = 0

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: nope

ngxson · 2026-05-28T21:51:42Z

+                if (time_elapsed_ms > 30000) {
+                    SRV_WRN("%s", "request cancelled after 30s, likely a client-side timeout; please check your client's code\n");
+                }


Note: it would be better to detect whether time_elapsed_ms exceeds the server's --timeout here and log a different message in that case. However, due to the way things are structured, this proved to be quite complicated.
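The check described in the note could look roughly like the sketch below if the server's timeout were available at that point in the code. This is purely illustrative: `cancel_cause`, `classify_cancel`, and the `server_timeout_ms` parameter are hypothetical names, and the PR itself only compares against the fixed 30000 ms threshold.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of why a request was cancelled.
enum class cancel_cause { quick_disconnect, likely_client_timeout, server_timeout };

// time_elapsed_ms:   time since the request started, as in the diff
// server_timeout_ms: the server's --timeout converted to milliseconds
//                    (assumed to be accessible here, which the note says
//                    is not easy in the current structure)
inline cancel_cause classify_cancel(int64_t time_elapsed_ms, int64_t server_timeout_ms) {
    assert(time_elapsed_ms >= 0 && server_timeout_ms > 0);
    if (time_elapsed_ms >= server_timeout_ms) {
        return cancel_cause::server_timeout;        // the server's own limit was reached
    }
    if (time_elapsed_ms > 30000) {                  // same threshold as in the diff
        return cancel_cause::likely_client_timeout; // cancelled before the server limit
    }
    return cancel_cause::quick_disconnect;          // too early to suspect a timeout
}
```

Each result could then map to its own log message, so a cancellation that arrives before the server limit is never reported as a server-side timeout.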

sasa7812 · 2026-05-29

Would the client-set timeout for the request be easier to log? It would have avoided the confusion.

ngxson · 2026-05-29

But how? AFAIK the client never communicates such info to the server.

bonswouar · 2026-05-29

I might be misunderstanding the PR, but isn't the log message "request cancelled after 30s" a bit confusing?
It could appear with a time_elapsed_ms anywhere between 30s and 3600s, but to a user reading the log it looks like exactly 30s.
Why not use the actual time_elapsed_ms value?

sasa7812 · 2026-05-29

> but how? AFAIK client never communicate such info to server

You are right, I should have checked before asking. The whole log message just implied that the request was cancelled on the server's initiative.
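The suggestion to report the actual time_elapsed_ms value could be sketched as follows. This is an assumption about how such a message might be built, not the wording that was merged; `cancel_message` is a hypothetical helper, and a real change would presumably pass the value to SRV_WRN directly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: build the warning text with the measured duration
// instead of the fixed "30s" threshold.
inline std::string cancel_message(int64_t time_elapsed_ms) {
    assert(time_elapsed_ms >= 0);
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "request cancelled after %.1fs, likely a client-side timeout; "
                  "please check your client's code",
                  time_elapsed_ms / 1000.0);  // milliseconds to seconds
    return std::string(buf);
}
```

With this shape, a request cancelled after 995500 ms would be reported as cancelled after 995.5s, which avoids the impression that the cancellation happened at exactly 30 seconds.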

* server: bump timeout to 3600s * nits: change wording

server: bump timeout to 3600s

cb71f50

ngxson requested review from a team as code owners May 28, 2026 21:49

aldehir approved these changes May 28, 2026

github-actions Bot added examples server labels May 28, 2026

nits: change wording

9bc8240

ServeurpersoCom approved these changes May 29, 2026

ngxson merged commit cb47092 into master May 29, 2026
27 checks passed

ngxson mentioned this pull request Jun 2, 2026

server: add SSE ping interval #24013

Merged

ngxson merged 2 commits into master from xsn/server_bump_timeout
