download: add option to skip_download by ngxson · Pull Request #23059 · ggml-org/llama.cpp

ngxson · 2026-05-14T15:15:59Z

Overview

Add a new flag, skip_download, to the common_params_handle_models function. This is a cleanup for the upcoming model download / management API (cc @allozaur). It makes it possible to know whether a download is required before running a model.

Its meaning:

offline = false --> normal case: the ETag is validated and, on mismatch, the GGUF is redownloaded
offline = false and skip_download = true --> validation is performed, but the download is skipped if the ETag mismatches
offline = true --> no validation or download is performed (also implies skip_download)
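
The three cases above can be sketched as a small decision function. This is only an illustration: `download_action`, `download_flags` and `decide` are hypothetical names, not the actual llama.cpp API. It also assumes that a missing file is handled like an ETag mismatch, which the discussion below suggests was the intent. What happens in offline mode when the local file is missing is not stated in the description, so the sketch does not model it.

```cpp
// Hypothetical names for illustration only; not the llama.cpp API.
enum class download_action {
    use_local_unvalidated, // offline: no validation, no download
    use_local_validated,   // ETag was checked and matched
    download,              // fetch (or re-fetch) the GGUF
    skip,                  // a download is needed but was suppressed
};

struct download_flags {
    bool offline       = false;
    bool skip_download = false;
};

// Maps the flag combinations from the PR description to an action.
// Assumption: a missing file is treated the same as an ETag mismatch.
static download_action decide(const download_flags & f, bool file_exists, bool etag_matches) {
    if (f.offline) {
        // offline implies skip_download: nothing is validated or fetched
        return download_action::use_local_unvalidated;
    }
    if (file_exists && etag_matches) {
        return download_action::use_local_validated;
    }
    return f.skip_download ? download_action::skip : download_action::download;
}
```

Under these assumptions, a caller that sets skip_download and gets `download_action::skip` back knows a download would be required, without having triggered it.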

Note:

I wanted to expose this as a new need_download field in /v1/models, but validating the ETag takes too much time, so in the end I did not add it. I will probably revisit this in another PR.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: no

ngxson · 2026-05-19T15:05:10Z

Hey @angt, can you have a quick look at this PR? Thanks!

angt · 2026-05-19T18:59:16Z

I think skip_download should also skip the download if the file doesn't exist?

ngxson · 2026-05-19T21:33:13Z

Hmm, yes, that was my intention: file not found is the same as the ETag mismatch case, right? (Or maybe I missed something?)

angt · 2026-05-20T15:28:22Z

Maybe I tested it badly 🤔

I did this:

$ git df
diff --git a/common/arg.cpp b/common/arg.cpp
index 85ef58296..61e7dfa8c 100644
--- a/common/arg.cpp
+++ b/common/arg.cpp
@@ -347,7 +347,7 @@ static handle_model_result common_params_handle_model(struct common_params_model
     common_download_opts opts;
     opts.bearer_token  = params.hf_token;
     opts.offline       = params.offline;
-    opts.skip_download = params.skip_download;
+    opts.skip_download = true; //params.skip_download;

     if (!model.docker_repo.empty()) {
         model.path = common_docker_resolve_model(model.docker_repo);
diff --git a/common/download.cpp b/common/download.cpp
index 5a5704fe1..aa50a8337 100644
--- a/common/download.cpp
+++ b/common/download.cpp
@@ -287,6 +287,10 @@ static int common_download_file_single_online(const std::string & url,
                                               const std::string & path,
                                               const common_download_opts & opts,
                                               bool skip_etag) {
+    if (opts.skip_download) {
+        LOG_WRN("SKIP DOWNLOAD\n");
+    }
+
     static const int max_attempts        = 3;
     static const int retry_delay_seconds = 2;

and got:

$ LLAMA_CACHE=nothing ./build/bin/llama-server -hf unsloth/Qwen3.5-0.8B-GGUF
0.00.104.406 W SKIP DOWNLOAD
0.00.104.481 W SKIP DOWNLOAD
Downloading mmproj-BF16.gguf ─────────────────────────────────────── 100%
Downloading Qwen3.5-0.8B-Q4_K_M.gguf ─────────────────────────────── 100%

ngxson · 2026-05-23T10:03:32Z

@angt I addressed the problem in ec6a687, could you take a look? Thanks!

ngxson · 2026-05-29T13:38:29Z

I need to merge this to unblock some other task. @angt could you review & give the 2nd approval?

* origin/master:
  vocab : support tokenizer for LFM2.5-8B-A1B (ggml-org#23826)
  graph : ensure DS32 kq_mask_lid is F32 (ggml-org#23864)
  server: remove obsolete scripts (ggml-org#23870)
  ci : update macos release to use macos-26 runner (ggml-org#23878)
  download: add option to skip_download (ggml-org#23059)
  mtmd: Add DeepSeekOCR 2 Support (ggml-org#20975)
  CUDA: Check PTX version on host side to guard PDL dispatch (ggml-org#23530)
  server: bump timeout to 3600s (ggml-org#23842)
  model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (ggml-org#23346)
  llama: use f16 mask for FA to save VRAM (ggml-org#23764)
  sync : ggml
  ggml : bump version to 0.13.1 (ggml/1523)
  ngram-mod : Add missing include (ggml-org#23857)
  llama: add llm_graph_input_mtp (ggml-org#23643)
  app : move licences to llama-app (ggml-org#23824)
  cuda : disables launch_fattn PDL enrollment due to compiler bug (ggml-org#23825)
  meta : Add missing `buffer` set in allreduce fallback !COMPUTE clear (ggml-org#23480)

* download: add option to skip_download
* fix
* fix 2
* if file doesn't exist, respect skip_download flag

ggerganov · 2026-06-04T09:45:56Z

@ngxson This change caused a regression in the use case where we load a separate MTP draft model file. I use this command:

./bin/llama serve \
  -hf ggml-org/Qwen3.6-35B-A3B-GGUF:Q8_0 \
  --spec-type draft-mtp \
  --spec-draft-hf ggml-org/Qwen3.6-35B-A3B-GGUF \
  --spec-draft-model mtp-Qwen3.6-35B-A3B-Q4_0.gguf

After this change, something causes the mtp- model file to be downloaded twice at the same time, which later leads to corruption of the data.
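
To illustrate the failure mode: if two download tasks target the same destination path and run concurrently, both can write to the same file. The sketch below shows one generic way to guard against that by dropping duplicate tasks before any are started. `download_task` and `dedup_by_path` are hypothetical names, and this is not necessarily the cause identified here or the fix adopted in #24128.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical task description; not a llama.cpp type.
struct download_task {
    std::string url;
    std::string path; // destination file on disk
};

// Keep only the first task for each destination path, preserving order,
// so that no two tasks ever write to the same file.
static std::vector<download_task> dedup_by_path(const std::vector<download_task> & tasks) {
    std::unordered_set<std::string> seen;
    std::vector<download_task> out;
    for (const auto & t : tasks) {
        // insert() reports whether the path was newly added
        if (seen.insert(t.path).second) {
            out.push_back(t);
        }
    }
    return out;
}
```

Deduplicating up front is simpler than locking per file, but it only helps when all tasks are known before the downloads start.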

download: add option to skip_download

5de5ce4

ngxson requested a review from angt May 14, 2026 15:15

ngxson requested review from a team as code owners May 14, 2026 15:16

ServeurpersoCom approved these changes May 14, 2026

github-actions Bot added examples server labels May 14, 2026

ngxson added 4 commits May 23, 2026 11:54

Merge branch 'master' into xsn/opt_skip_download

59927a6

fix

57b29c6

fix 2

0c332aa

if file doesn't exist, respect skip_download flag

ec6a687

angt approved these changes May 29, 2026

ngxson merged commit 06d26df into ggml-org:master May 29, 2026
44 of 49 checks passed

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

download: add option to skip_download (ggml-org#23059)

e0c1c38

* download: add option to skip_download
* fix
* fix 2
* if file doesn't exist, respect skip_download flag

turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026

download: add option to skip_download (ggml-org#23059)

dc47990

* download: add option to skip_download
* fix
* fix 2
* if file doesn't exist, respect skip_download flag

ngxson mentioned this pull request Jun 4, 2026

arg: fix double mtp downloads #24128

Merged

ngxson merged 5 commits into ggml-org:master from ngxson:xsn/opt_skip_download
