server: (router) expose child model info from router's /v1/models #22683

Merged: ngxson merged 2 commits into ggml-org:master from ngxson:xsn/router_models_more_info on May 8, 2026

ngxson (Contributor) commented May 4, 2026

Overview

Allow the server router instance to include each child instance's /v1/models info in its own list of all models. This is only possible for models that are currently loaded.

For example:

{
    "data": [
        {
            "id": "ggml-org/GLM-OCR-GGUF:Q8_0",
            "aliases": [],
            "tags": [],
            "object": "model",
            "owned_by": "llamacpp",
            "created": 1777909820,
            "status": {
                "value": "loaded",
                "args": [
                    ....
                ],
                "preset": "[ggml-org/GLM-OCR-GGUF:Q8_0]\nhf-repo = ggml-org/GLM-OCR-GGUF:Q8_0\n\n"
            },
            "meta": {
                "vocab_type": 2,
                "n_vocab": 59392,
                "n_ctx": 131072,
                "n_ctx_train": 131072,
                "n_embd": 1536,
                "n_params": 891138048,
                "size": 947159040
            }
        }
    ]
}
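
As a rough illustration of how a client might use the new "meta" block, the sketch below derives an approximate bits-per-weight figure from "n_params" and "size". It assumes that "size" is reported in bytes and that entries for models that are not loaded carry no "meta" object; the model_meta and model_entry structs and the bits_per_weight helper are hypothetical names made up for this example and are not part of the server code.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical mirror of the fields used from one /v1/models entry.
struct model_meta {
    int64_t n_params = 0; // "meta.n_params"
    int64_t size     = 0; // "meta.size", assumed to be in bytes
};

struct model_entry {
    std::string id;                 // "id"
    std::string status;             // "status.value", e.g. "loaded"
    std::optional<model_meta> meta; // assumed absent unless the model is loaded
};

// Returns the average number of bits stored per parameter,
// or std::nullopt when the entry has no usable meta info.
static std::optional<double> bits_per_weight(const model_entry & e) {
    if (e.status != "loaded" || !e.meta || e.meta->n_params <= 0) {
        return std::nullopt;
    }
    return 8.0 * static_cast<double>(e.meta->size) / static_cast<double>(e.meta->n_params);
}
```

With the values from the example response (n_params = 891138048, size = 947159040), this works out to roughly 8.5 bits per weight.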

Requirements

ngxson requested a review from a team as a code owner May 4, 2026 15:59
  // also handle status report from child process
  if (stdout_file) {
-     char buffer[4096];
+     char buffer[128 * 1024]; // large buffer for storing info
Review comment (Member):

Better allocate this on the heap
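
One way to follow this suggestion is sketched below. It assumes the child's stdout is a FILE * read line by line with fgets; the read_lines helper and the choice of std::vector<char> are illustrative only and are not taken from the PR.

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): reads lines from a child's stdout
// using a heap-allocated buffer instead of a 128 KiB array on the stack.
static std::vector<std::string> read_lines(FILE * stdout_file) {
    std::vector<std::string> lines;
    if (!stdout_file) {
        return lines;
    }
    // std::vector keeps its elements on the heap, so the stack frame
    // only holds the small vector object itself.
    std::vector<char> buffer(128 * 1024);
    while (fgets(buffer.data(), static_cast<int>(buffer.size()), stdout_file) != nullptr) {
        std::string line(buffer.data());
        // strip the trailing newline, if any
        if (!line.empty() && line.back() == '\n') {
            line.pop_back();
        }
        lines.push_back(std::move(line));
    }
    return lines;
}
```

Keeping the buffer off the stack matters most when the reading loop runs on a thread with a small stack, where a 128 KiB local array takes a noticeable share of the available space.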

ngxson merged commit 9dcf835 into ggml-org:master May 8, 2026
45 of 46 checks passed
cetarthoriphros pushed a commit to cetarthoriphros/llama.cpp that referenced this pull request May 9, 2026
…ml-org#22683)

* server: (router) expose child model info from router's /v1/models

* update docs
Commits with the same message were also pushed to the following forks, each referencing this pull request:

meh/llama.cpp (meh, May 10, 2026)
rsenthilkumar6/llama.cpp (rsenthilkumar6, May 19, 2026)
baramofme/llama-cpp-turboquant (baramofme, May 23, 2026)
carlosfundora/llama.cpp-1-bit-turbo (carlosfundora, May 24, 2026; cherry picked from commit 9dcf835)
winstonma/llama.cpp (winstonma, May 27, 2026)
fewtarius/llama.cpp (fewtarius, May 30, 2026)