feat: add configured model context size to models API #548

Open

sammcj wants to merge 3 commits into mostlygeek:main from sammcj:model_listing

Conversation

@sammcj (Contributor) commented Feb 28, 2026

@mostlygeek - What are your thoughts on llamaswap exposing the configured model context size (if detected) (and potentially other information?) on the /models endpoint?

The problem right now is that, as far as I can tell, the only way to get model configuration information is by querying the upstream server. That means the upstream server has to start and load the model, and the information then has to be relayed through some middleware, e.g.:
https://llamaswap.your.domain/upstream/<model_name>/slots

The reason this would be useful is that this would allow clients to properly configure their settings to the models configured context window (and potentially other parameters).

With this change, the /v1/models endpoint would return:

{
  "data": [
    {
      "context_length": 131072,      <-- new
      "created": 1772705823,
      "id": "qwen3-5-4b-ud-q6kxl-128k:general-instruct-reasoning",
      "object": "model",
      "owned_by": "llama-swap",
      "supports_vision": true        <-- new
    },

This is roughly in line with what many model providers (such as OpenRouter) and LLM proxies (such as Bifrost) already expose: context_length and other parameters.

e.g.

curl https://bifrost.your.domain/v1/models | jq

{
  "data": [
    {
      "id": "mistral/pixtral-large-latest",
      "name": "pixtral-large-2411",
      "context_length": 131072,
      ...
    },

Thoughts?

coderabbitai bot commented Feb 28, 2026

Walkthrough

The PR adds automatic detection and exposure of context_length and supports_vision fields in the model list API response by implementing command-line flag parsing logic in ModelConfig and integrating results into ProxyManager's listModelsHandler.

Changes

  • Documentation (config.example.yaml, docs/configuration.md): Added explanatory comments describing the automatically detected fields (context_length, supports_vision) and macro/environment variable support for model configuration.
  • Model Config Implementation (proxy/config/model_config.go): Introduced public methods ContextSize() and SupportsVision() that parse sanitized command strings to extract the context size from various flag formats (--ctx-size/-c/--max-model-len) and detect multimodal projector flags (--mmproj), with parallel adjustment support.
  • Model Config Tests (proxy/config/model_config_test.go): Added three test functions covering ContextSize parsing across flag variants, environment macro resolution, parallel adjustments, and SupportsVision detection with and without multimodal flags.
  • ProxyManager Integration (proxy/proxymanager.go): Extended listModelsHandler to include context_length and supports_vision fields in the model list response when available from ModelConfig methods.
  • ProxyManager Tests (proxy/proxymanager_test.go): Added an integration test validating end-to-end behavior of the context_length and supports_vision fields in the /v1/models endpoint across multiple model configurations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

enhancement, configuration

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title clearly summarizes the main change: exposing the configured model context size on the /v1/models API endpoint. It accurately reflects the primary objective of the changeset.
  • Description check ✅ Passed: The description is directly related to the changeset, explaining the problem statement, proposed solution, and expected API response changes that align with the actual code modifications.


@mostlygeek (Owner) commented

I like this idea.

Is there a set of params that is commonly supported or accepted by clients? Parsing all the accepted values we can out of the args to expose in the /models API would be great.

The model config currently supports a metadata field that can hold almost anything. That is mostly useful for bespoke clients, though.

@sammcj (Contributor, Author) commented Mar 1, 2026

The most common ones I've seen applications use are:

  • context_length (e.g. 192000)
  • description (e.g. "Ministral 3 14B Instruct.")
  • supports_vision (true|false)

I could add support for those into this PR if you'd like.

sammcj marked this pull request as ready for review March 5, 2026 10:18
@sammcj (Contributor, Author) commented Mar 5, 2026

  • Added supports_vision if an mmproj flag is detected
  • Added detection of the context size being set via an environment variable

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/proxymanager_test.go`:
- Around line 441-457: The test iterates result.Get("data").Array() and checks
properties for known model ids but never fails if a model is missing; update the
test to explicitly assert all expected models were present by tracking seen IDs
(e.g., create a set/map and mark id for each model processed in the loop where
you switch on id for "with-ctx", "with-ctx-short", "vision-model",
"plain-model") and after the loop assert that the set contains all four expected
IDs (or that its length equals 4), so a missing model will fail the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f504a96-9dc9-4a39-9bb5-0ff24b33d027

📥 Commits

Reviewing files that changed from the base of the PR and between 390a35b and 9b60fc8.

📒 Files selected for processing (6)
  • config.example.yaml
  • docs/configuration.md
  • proxy/config/model_config.go
  • proxy/config/model_config_test.go
  • proxy/proxymanager.go
  • proxy/proxymanager_test.go

Comment on lines +441 to +457
for _, model := range result.Get("data").Array() {
	id := model.Get("id").String()
	switch id {
	case "with-ctx":
		assert.Equal(t, int64(131072), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "with-ctx-short":
		assert.Equal(t, int64(4096), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "vision-model":
		assert.Equal(t, int64(8192), model.Get("context_length").Int())
		assert.True(t, model.Get("supports_vision").Bool())
	case "plain-model":
		assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
		assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
	}
}

⚠️ Potential issue | 🟡 Minor

Add an explicit “all expected models were present” assertion.

Right now this loop validates known IDs but can still pass if one expected model is missing from data.

✅ Suggested test hardening
 	result := gjson.ParseBytes(w.Body.Bytes())
+	expectedSeen := map[string]bool{
+		"with-ctx":       false,
+		"with-ctx-short": false,
+		"vision-model":   false,
+		"plain-model":    false,
+	}
 
 	for _, model := range result.Get("data").Array() {
 		id := model.Get("id").String()
 		switch id {
 		case "with-ctx":
+			expectedSeen[id] = true
 			assert.Equal(t, int64(131072), model.Get("context_length").Int())
 			assert.False(t, model.Get("supports_vision").Exists())
 		case "with-ctx-short":
+			expectedSeen[id] = true
 			assert.Equal(t, int64(4096), model.Get("context_length").Int())
 			assert.False(t, model.Get("supports_vision").Exists())
 		case "vision-model":
+			expectedSeen[id] = true
 			assert.Equal(t, int64(8192), model.Get("context_length").Int())
 			assert.True(t, model.Get("supports_vision").Bool())
 		case "plain-model":
+			expectedSeen[id] = true
 			assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
 			assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
+		default:
+			t.Fatalf("unexpected model id in response: %s", id)
 		}
 	}
+	for id, seen := range expectedSeen {
+		assert.True(t, seen, "expected model %s to be present in /v1/models response", id)
+	}
