server: handle If-None-Match weak ETags #23916

Merged
aldehir merged 1 commit into ggml-org:master from EZForever:server-weak-etag on May 31, 2026

Conversation

@EZForever EZForever commented May 30, 2026

Overview

See #23849 for details. In short, the current logic for comparing ETags in the If-None-Match HTTP header does not treat "weak" ETags (prefixed with W/) as matching their "strong" counterparts, although the HTTP specification requires If-None-Match to use weak comparison. As a result, reverse proxies that compress HTTP responses (and "weaken" the ETag in the process) break browser cache validation.

This PR provides a "quick" fix, which assumes that llama-server never generates weak ETags itself. While the HTTP specification requires handling more cases (e.g. the * wildcard, or multiple ETags), I don't think they are worth implementing here.
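
To illustrate the idea, here is a minimal sketch of a weak comparison. It is not the code merged in this PR: the helper names `strip_weak_prefix` and `etag_weak_match` are hypothetical, and unlike the fix described above it strips the W/ prefix from both sides for symmetry. Like the fix, it deliberately ignores the * wildcard and comma-separated lists of ETags.

```cpp
#include <string_view>

// Hypothetical helper (not from llama.cpp): drop an optional weak-validator
// prefix "W/" from an entity tag. The prefix is case-sensitive in HTTP.
inline std::string_view strip_weak_prefix(std::string_view etag) {
    if (etag.size() >= 2 && etag[0] == 'W' && etag[1] == '/') {
        etag.remove_prefix(2);
    }
    return etag;
}

// Weak comparison: two entity tags match when their opaque parts
// (including the surrounding quotes) are identical, whether or not
// either side is marked weak.
// Assumption: `if_none_match` holds a single entity tag, not "*" or a list.
inline bool etag_weak_match(std::string_view if_none_match,
                            std::string_view current_etag) {
    return strip_weak_prefix(if_none_match) == strip_weak_prefix(current_etag);
}
```

With this sketch, a request carrying `If-None-Match: W/"abc"` matches a resource whose ETag is `"abc"`, so the server could answer 304 Not Modified even after a proxy has weakened the tag.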

Fixes #23849.

Requirements

@EZForever EZForever requested a review from a team as a code owner May 30, 2026 17:15
@aldehir aldehir merged commit 6f165c1 into ggml-org:master May 31, 2026
25 of 27 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jun 1, 2026
* origin/master: (36 commits)
vendor : update cpp-httplib to 0.46.1 (ggml-org#23980)
llama: limit max outputs of `llama_context` (ggml-org#23861)
metal: template GLU kernels to support f16/f32 (ggml-org#23882)
vulkan: don't hold the device mutex while compiling pipelines (ggml-org#23641)
vulkan: reduce host memory lock contention (ggml-org#23376)
vocab: add normalizer.lowercase support to WPM (ggml-org#23899)
TP: quantized KV cache support (ggml-org#23792)
security : disable private disclosures (ggml-org#23963)
model: Add EXAONE 4.5 implementations (ggml-org#21733)
vulkan: Block-load Q3_K/Q6_K block data and subtract on 32b ints (ggml-org#23056)
vulkan: Removed unused functions (ggml-org#23175)
common : support manually triggering the reasoning budget end sequence (ggml-org#23949)
ci : add missing Linux label to cpu-x64-high-perf runner (ggml-org#23958)
[SYCL] Support Q4_1, Q5_0, Q5_1 in Flash-attention (ggml-org#23812)
[SYCL] Add more types in GET_ROWS OP (ggml-org#23710)
sycl : Optimize Q3_K mul_mat by reorder (ggml-org#23725)
ci: remove redundant or duplicate jobs (ggml-org#23927)
server : handle If-None-Match weak ETags (ggml-org#23916)
ci : limit trigger paths for the CPU workflow (ggml-org#23938)
vocab : add tokenizer support for jina-embeddings-v2-base-zh (ggml-org#18756)
...
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026

Development

Successfully merging this pull request may close these issues.

Misc. bug: server: Insufficient handling of If-None-Match header fails browser cache validation

3 participants