feat: add configured model context size to models API #548
sammcj wants to merge 3 commits into mostlygeek:main from
Conversation
I like this idea. Is there a set of params that are commonly supported or accepted by clients? Parsing out all the accepted values we can from args to expose in the /models API would be great. The model config currently supports a metadata field that can hold almost anything. That is mostly useful for bespoke clients, though.
The most common ones I've seen applications use are:
I could add support for those into this PR if you'd like?
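As a rough sketch of how such fields might sit alongside the existing metadata support in the config file (the field names below are illustrative assumptions, not the merged schema):

```yaml
# Hypothetical model entry: context_length and supports_vision are the
# fields discussed in this PR; metadata already exists for arbitrary data.
models:
  "qwen-coder":
    cmd: llama-server --port ${PORT} -m qwen-coder.gguf -c 131072
    context_length: 131072   # assumed new field, surfaced via /v1/models
    supports_vision: false   # assumed new field
    metadata:
      notes: "anything bespoke clients need"
```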
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/proxymanager_test.go`:
- Around line 441-457: The test iterates result.Get("data").Array() and checks
properties for known model ids but never fails if a model is missing; update the
test to explicitly assert all expected models were present by tracking seen IDs
(e.g., create a set/map and mark id for each model processed in the loop where
you switch on id for "with-ctx", "with-ctx-short", "vision-model",
"plain-model") and after the loop assert that the set contains all four expected
IDs (or that its length equals 4), so a missing model will fail the test.
📒 Files selected for processing (6)
- config.example.yaml
- docs/configuration.md
- proxy/config/model_config.go
- proxy/config/model_config_test.go
- proxy/proxymanager.go
- proxy/proxymanager_test.go
```go
for _, model := range result.Get("data").Array() {
	id := model.Get("id").String()
	switch id {
	case "with-ctx":
		assert.Equal(t, int64(131072), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "with-ctx-short":
		assert.Equal(t, int64(4096), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "vision-model":
		assert.Equal(t, int64(8192), model.Get("context_length").Int())
		assert.True(t, model.Get("supports_vision").Bool())
	case "plain-model":
		assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
		assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
	}
}
```
Add an explicit “all expected models were present” assertion.
Right now this loop validates known IDs but can still pass if one expected model is missing from data.
✅ Suggested test hardening
```diff
 result := gjson.ParseBytes(w.Body.Bytes())
+expectedSeen := map[string]bool{
+	"with-ctx":       false,
+	"with-ctx-short": false,
+	"vision-model":   false,
+	"plain-model":    false,
+}
 for _, model := range result.Get("data").Array() {
 	id := model.Get("id").String()
 	switch id {
 	case "with-ctx":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(131072), model.Get("context_length").Int())
 		assert.False(t, model.Get("supports_vision").Exists())
 	case "with-ctx-short":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(4096), model.Get("context_length").Int())
 		assert.False(t, model.Get("supports_vision").Exists())
 	case "vision-model":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(8192), model.Get("context_length").Int())
 		assert.True(t, model.Get("supports_vision").Bool())
 	case "plain-model":
+		expectedSeen[id] = true
 		assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
 		assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
+	default:
+		t.Fatalf("unexpected model id in response: %s", id)
 	}
 }
+for id, seen := range expectedSeen {
+	assert.True(t, seen, "expected model %s to be present in /v1/models response", id)
+}
```
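The seen-map pattern the review suggests can also be written as a small helper and tested in isolation; a minimal, self-contained sketch (function and model names here are illustrative, not from the PR):

```go
package main

import "fmt"

// missingIDs returns the expected IDs that were not observed.
// This mirrors the review's suggestion: mark each ID as it is
// processed, then assert afterwards that none are missing.
func missingIDs(expected []string, observed []string) []string {
	seen := make(map[string]bool, len(expected))
	for _, id := range expected {
		seen[id] = false
	}
	for _, id := range observed {
		if _, ok := seen[id]; ok {
			seen[id] = true
		}
	}
	var missing []string
	for _, id := range expected {
		if !seen[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	expected := []string{"with-ctx", "with-ctx-short", "vision-model", "plain-model"}
	observed := []string{"with-ctx", "vision-model", "plain-model"}
	fmt.Println(missingIDs(expected, observed)) // prints [with-ctx-short]
}
```

Without this check, a loop that only switches on known IDs passes vacuously when a model is absent from the response.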
@mostlygeek - What are your thoughts on llamaswap exposing the configured model context size (if detected) (and potentially other information?) on the /models endpoint?
The problem right now is that, as far as I can tell, the only way to get model configuration information is by querying the upstream server, which means the upstream server has to start and load the model, then pass that information through some middleware, e.g.
https://llamaswap.your.domain/upstream/<model_name>/slots
The reason this would be useful is that it would allow clients to properly configure their settings to the model's configured context window (and potentially other parameters).
This change would have the /v1/models endpoint return:
This is roughly in line with what many model providers such as OpenRouter, and LLM proxies such as Bifrost, provide (context_length and other parameters), e.g.
Thoughts?