feat: add configured model context size to models API #548
sammcj wants to merge 3 commits into mostlygeek:main from
Conversation
I like this idea. Is there a set of params that are commonly supported or accepted by clients? Parsing out all the accepted values we can from args to expose in the /models API would be great. The model config currently supports a metadata field that can hold almost anything. That is mostly useful for bespoke clients, though.
The most common ones I've seen applications use are:
I could add support for those into this PR if you'd like?
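As a rough sketch of how such fields might sit alongside the existing metadata support in the config file (the field names below are illustrative assumptions, not the merged schema):

```yaml
# Hypothetical model entry: context_length and supports_vision are the
# fields discussed in this PR; metadata already exists for arbitrary data.
models:
  "qwen-coder":
    cmd: llama-server --port ${PORT} -m qwen-coder.gguf -c 131072
    context_length: 131072   # assumed new field, surfaced via /v1/models
    supports_vision: false   # assumed new field
    metadata:
      notes: "anything bespoke clients need"
```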
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/proxymanager_test.go`:
- Around line 441-457: The test iterates result.Get("data").Array() and checks
properties for known model ids but never fails if a model is missing; update the
test to explicitly assert all expected models were present by tracking seen IDs
(e.g., create a set/map and mark id for each model processed in the loop where
you switch on id for "with-ctx", "with-ctx-short", "vision-model",
"plain-model") and after the loop assert that the set contains all four expected
IDs (or that its length equals 4), so a missing model will fail the test.
📒 Files selected for processing (6)
- config.example.yaml
- docs/configuration.md
- proxy/config/model_config.go
- proxy/config/model_config_test.go
- proxy/proxymanager.go
- proxy/proxymanager_test.go
```go
for _, model := range result.Get("data").Array() {
	id := model.Get("id").String()
	switch id {
	case "with-ctx":
		assert.Equal(t, int64(131072), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "with-ctx-short":
		assert.Equal(t, int64(4096), model.Get("context_length").Int())
		assert.False(t, model.Get("supports_vision").Exists())
	case "vision-model":
		assert.Equal(t, int64(8192), model.Get("context_length").Int())
		assert.True(t, model.Get("supports_vision").Bool())
	case "plain-model":
		assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
		assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
	}
}
```
Add an explicit “all expected models were present” assertion.
Right now this loop validates known IDs but can still pass if one expected model is missing from data.
✅ Suggested test hardening
```diff
 result := gjson.ParseBytes(w.Body.Bytes())
+expectedSeen := map[string]bool{
+	"with-ctx":       false,
+	"with-ctx-short": false,
+	"vision-model":   false,
+	"plain-model":    false,
+}
 for _, model := range result.Get("data").Array() {
 	id := model.Get("id").String()
 	switch id {
 	case "with-ctx":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(131072), model.Get("context_length").Int())
 		assert.False(t, model.Get("supports_vision").Exists())
 	case "with-ctx-short":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(4096), model.Get("context_length").Int())
 		assert.False(t, model.Get("supports_vision").Exists())
 	case "vision-model":
+		expectedSeen[id] = true
 		assert.Equal(t, int64(8192), model.Get("context_length").Int())
 		assert.True(t, model.Get("supports_vision").Bool())
 	case "plain-model":
+		expectedSeen[id] = true
 		assert.False(t, model.Get("context_length").Exists(), "plain-model should not have context_length")
 		assert.False(t, model.Get("supports_vision").Exists(), "plain-model should not have supports_vision")
+	default:
+		t.Fatalf("unexpected model id in response: %s", id)
 	}
 }
+for id, seen := range expectedSeen {
+	assert.True(t, seen, "expected model %s to be present in /v1/models response", id)
+}
```
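The seen-map pattern the review suggests can also be written as a small helper and tested in isolation; a minimal, self-contained sketch (function and model names here are illustrative, not from the PR):

```go
package main

import "fmt"

// missingIDs returns the expected IDs that were not observed.
// This mirrors the review's suggestion: mark each ID as it is
// processed, then assert afterwards that none are missing.
func missingIDs(expected []string, observed []string) []string {
	seen := make(map[string]bool, len(expected))
	for _, id := range expected {
		seen[id] = false
	}
	for _, id := range observed {
		if _, ok := seen[id]; ok {
			seen[id] = true
		}
	}
	var missing []string
	for _, id := range expected {
		if !seen[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	expected := []string{"with-ctx", "with-ctx-short", "vision-model", "plain-model"}
	observed := []string{"with-ctx", "vision-model", "plain-model"}
	fmt.Println(missingIDs(expected, observed)) // prints [with-ctx-short]
}
```

Without this check, a loop that only switches on known IDs passes vacuously when a model is absent from the response.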
@mostlygeek - What are your thoughts on llamaswap exposing the configured model context size (if detected) (and potentially other information?) on the /models endpoint?
The problem right now is that, as far as I can tell, the only way to get model configuration information is by querying the upstream server, which means the upstream server has to start and load the model, then pass that information through some middleware, e.g.
https://llamaswap.your.domain/upstream/<model_name>/slots
The reason this would be useful is that it would allow clients to properly configure their settings to the model's configured context window (and potentially other parameters).
This change would have the /v1/models endpoint return:
This is roughly in line with what many model providers such as OpenRouter, and LLM proxies such as Bifrost, provide (context_length and other parameters), e.g.
Thoughts?