Add template_fill config type to generate model variants from one template#550
Add template_fill config type to generate model variants from one template#550kamjin3086 wants to merge 5 commits intomostlygeek:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds template-based model variants: new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
proxy/config/template_fill_test.go (1)
238-301: Add a regression case for mixed flag override formats.Please add a table case where
baseuses--temp 0.8andadduses--temp=1.0so this bug class is covered going forward.➕ Suggested test case
@@ { name: "equals format override", base: "server --temp=0.8 --ctx=4096", add: "--temp=1.0", expected: "server --temp=1.0 --ctx=4096", }, + { + name: "mixed format override", + base: "server --temp 0.8 --ctx 4096", + add: "--temp=1.0", + expected: "server --temp=1.0 --ctx 4096", + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/config/template_fill_test.go` around lines 238 - 301, Add a regression test in TestMergeCommands_Basic that covers mixed flag formats: add a test case entry where name is like "mixed flag override", base is "server --temp 0.8" and add is "--temp=1.0" with expected "server --temp=1.0"; place it alongside the other cases in the tests slice so mergeCommands is exercised for overriding a space-separated flag with an equals-form flag.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/config/config.go`:
- Around line 211-215: ExpandVariants currently replaces template entries in
config.Models which can break references in groups.members and
hooks.on_startup.preload; change the flow in the code that calls ExpandVariants
to either (a) return a mapping from a template ID to the generated variant IDs
and then replace any occurrences of that template ID in groups.members and
hooks.on_startup.preload with the appropriate variant IDs, or (b) detect when a
template ID is still referenced after expansion and return a clear validation
error. Concretely: modify ExpandVariants (or wrap it) to produce both the new
[]Model and a map[string][]string of templateID->variantIDs, then scan
config.Groups[*].Members and config.Hooks.OnStartup.Preload to substitute or
expand template IDs using that map (or collect missing references and return an
error), so no references remain dangling after config.Models is replaced.
In `@proxy/config/template_fill.go`:
- Around line 126-141: When replacing a separate-value form in template_fill.go
(variables: baseTokens, addTokens, baseIdx, i, flag, token, embeddedValue, and
helper isArgument), if you convert to "--flag=value" you must also remove the
stale following positional token from baseTokens (delete baseTokens[baseIdx+1]
or splice it out) so you don't leave the old value behind; adjust any indexes
accordingly and ensure i is advanced consistently. Apply the same fix to the
similar branch around the other occurrence (lines handling baseIdx+1 at the
other block) so both the "--flag=value overwrites --flag value" conversion
splices out the old value token rather than leaving it.
- Around line 13-26: ExpandVariants currently silently overwrites entries in
result when a generated variant ID collides with an existing key; change
ExpandVariants to return (map[string]ModelConfig, error), and before writing
into result (both the no-variant branch and when adding variantModelID created
by expandVariant), check if that key already exists in result and if so return a
descriptive error (include modelID and variantSuffix for collisions); ensure
callers are updated to handle the error.
---
Nitpick comments:
In `@proxy/config/template_fill_test.go`:
- Around line 238-301: Add a regression test in TestMergeCommands_Basic that
covers mixed flag formats: add a test case entry where name is like "mixed flag
override", base is "server --temp 0.8" and add is "--temp=1.0" with expected
"server --temp=1.0"; place it alongside the other cases in the tests slice so
mergeCommands is exercised for overriding a space-separated flag with an
equals-form flag.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
config-schema.jsonconfig.example.yamlproxy/config/config.goproxy/config/model_config.goproxy/config/template_fill.goproxy/config/template_fill_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
proxy/config/template_fill_test.go (1)
303-342: Test name should include a descriptive suffix.Per the coding guidelines, test names should follow the pattern
Test<Component>_<TestCase>. Consider renaming to something likeTestTokenizeCommand_ParsingorTestTokenizeCommand_BasicCases.Suggested rename
-func TestTokenizeCommand(t *testing.T) { +func TestTokenizeCommand_Parsing(t *testing.T) {As per coding guidelines: "
proxy/**/*_test.go: Follow test naming conventions likeTestProxyManager_<test name>,TestProcessGroup_<test name>, etc."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/config/template_fill_test.go` around lines 303 - 342, Rename the test function TestTokenizeCommand to follow the project's test naming convention (e.g., TestTokenizeCommand_Parsing or TestTokenizeCommand_BasicCases) so it matches the required pattern; update the test declaration for TestTokenizeCommand and any references to it while keeping the same test body and calls to tokenizeCommand and assert.Equal.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@proxy/config/template_fill_test.go`:
- Around line 303-342: Rename the test function TestTokenizeCommand to follow
the project's test naming convention (e.g., TestTokenizeCommand_Parsing or
TestTokenizeCommand_BasicCases) so it matches the required pattern; update the
test declaration for TestTokenizeCommand and any references to it while keeping
the same test body and calls to tokenizeCommand and assert.Equal.
Apply fixes from code review: collision detection, stale command tokens, and template refs in groups/preload. - ExpandVariants now returns (ExpandVariantsResult, error); detect duplicate model IDs and variant collisions and fail fast with clear errors. Expose TemplateToVariants for config substitution. - mergeCommands: when add uses --flag=value and base had --flag value, clear the base's value slot and compact baseTokens so the final command has no leftover positional value (e.g. no "--temp=1.0 0.8"). - substituteTemplateRefsInConfig: replace template model IDs in groups.members and hooks.on_startup.preload with the expanded variant IDs so refs stay valid after expansion. - Tests: add mixed-format override case to TestMergeCommands_Basic; rename TestTokenizeCommand to TestTokenizeCommand_Parsing.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
proxy/config/template_fill.go (2)
201-233: Tokenizer doesn't handle escape sequences within quotes.The tokenizer handles quoted strings but doesn't process escape sequences like
\"or\\inside quotes. For example,--arg "value with \"escaped\" quotes"would not be parsed correctly. This may be acceptable if such edge cases aren't expected in practice, but consider documenting this limitation.Consider adding a comment to document this limitation:
// tokenizeCommand splits a command string into tokens, handling quoted strings +// Note: Does not handle escape sequences within quoted strings (e.g., \") func tokenizeCommand(cmd string) []string {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/config/template_fill.go` around lines 201 - 233, The tokenizer function tokenizeCommand does not handle escape sequences inside quoted strings (e.g., \" or \\), so update tokenizeCommand to recognize backslash as an escape character when inQuote: on seeing '\' consume the next rune and append it literally (handling end-of-string safely), ensuring sequences like \" and \\ are preserved inside the token; alternatively (if you prefer not to implement escapes), add a clear comment above tokenizeCommand documenting the limitation and examples of unsupported inputs so callers know quoted-escape sequences are not handled.
68-70: Consider copyingMacrosto avoid shared slice mutation.
base.Macrosis assigned directly toexpanded.Macros. SinceMacroListis a slice type ([]MacroEntry), this shares the underlying array. If code elsewhere mutates the expanded model's macros, it would affect the base model's macros too. While this may not be an issue in the current flow (since templates are discarded), it's safer to copy.♻️ Suggested fix
Filters: base.Filters, - Macros: base.Macros, + Macros: copyMacroList(base.Macros), Metadata: copyMetadata(base.Metadata),Add the helper function:
// copyMacroList creates a copy of a MacroList func copyMacroList(ml MacroList) MacroList { if ml == nil { return nil } result := make(MacroList, len(ml)) copy(result, ml) return result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/config/template_fill.go` around lines 68 - 70, The expanded model currently assigns base.Macros directly to expanded.Macros which shares the underlying slice and risks unintended mutation; add a helper like copyMacroList(ml MacroList) that returns a newly allocated MacroList copy (preserving nil semantics) and use it when populating expanded.Macros (replace assignment of base.Macros with copyMacroList(base.Macros)) to avoid shared-slice aliasing.proxy/config/template_fill_test.go (1)
366-389: Consider adding a test for template ID substitution in groups.This test verifies that explicitly named variant IDs work in groups. However, it doesn't test the
substituteTemplateRefsInConfigfunctionality where a template ID (e.g.,template-model) in groups.members is automatically expanded to its variant IDs.🧪 Suggested additional test
func TestExpandVariants_TemplateRefSubstitutionInGroups(t *testing.T) { content := ` models: template-model: cmd: server --port ${PORT} variants: v1: cmdAdd: --v1 v2: cmdAdd: --v2 groups: mygroup: members: - template-model ` config, err := LoadConfigFromReader(strings.NewReader(content)) assert.NoError(t, err) // Template ID should be substituted with variant IDs group := config.Groups["mygroup"] assert.Contains(t, group.Members, "template-model-v1") assert.Contains(t, group.Members, "template-model-v2") assert.NotContains(t, group.Members, "template-model") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/config/template_fill_test.go` around lines 366 - 389, Add a new unit test named TestExpandVariants_TemplateRefSubstitutionInGroups that loads a config where a group member is the template ID "template-model" (not explicit variant IDs), calls LoadConfigFromReader (so substituteTemplateRefsInConfig runs during config load), and asserts that group := config.Groups["mygroup"] contains "template-model-v1" and "template-model-v2" and does NOT contain "template-model"; this verifies substituteTemplateRefsInConfig correctly expands template IDs into their variant IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@proxy/config/template_fill_test.go`:
- Around line 366-389: Add a new unit test named
TestExpandVariants_TemplateRefSubstitutionInGroups that loads a config where a
group member is the template ID "template-model" (not explicit variant IDs),
calls LoadConfigFromReader (so substituteTemplateRefsInConfig runs during config
load), and asserts that group := config.Groups["mygroup"] contains
"template-model-v1" and "template-model-v2" and does NOT contain
"template-model"; this verifies substituteTemplateRefsInConfig correctly expands
template IDs into their variant IDs.
In `@proxy/config/template_fill.go`:
- Around line 201-233: The tokenizer function tokenizeCommand does not handle
escape sequences inside quoted strings (e.g., \" or \\), so update
tokenizeCommand to recognize backslash as an escape character when inQuote: on
seeing '\' consume the next rune and append it literally (handling end-of-string
safely), ensuring sequences like \" and \\ are preserved inside the token;
alternatively (if you prefer not to implement escapes), add a clear comment
above tokenizeCommand documenting the limitation and examples of unsupported
inputs so callers know quoted-escape sequences are not handled.
- Around line 68-70: The expanded model currently assigns base.Macros directly
to expanded.Macros which shares the underlying slice and risks unintended
mutation; add a helper like copyMacroList(ml MacroList) that returns a newly
allocated MacroList copy (preserving nil semantics) and use it when populating
expanded.Macros (replace assignment of base.Macros with
copyMacroList(base.Macros)) to avoid shared-slice aliasing.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
proxy/config/config.goproxy/config/template_fill.goproxy/config/template_fill_test.go
- tokenizeCommand: handle backslash escapes inside quoted strings (\ " \' \n \t)
so --chat-template-kwargs "{\"enable_thinking\": false}" parses correctly.
- copyMacroList: avoid expanded variants sharing base.Macros slice.
- Tests: tokenizer double/single-quoted JSON, mergeCommands chat-template-kwargs,
TestExpandVariants_TemplateRefSubstitutionInGroups, TestLoadTestVariantsConfig_FromFile.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/config/template_fill.go`:
- Line 68: The Filters field is being assigned directly from base.Filters which
shares underlying map references; add a deep-copy helper (e.g., copyFilters(f
ModelFilters) ModelFilters) that clones SetParams and SetParamsByID (copying
inner maps) and use it where Filters is set (replace direct assignment of
base.Filters with copyFilters(base.Filters)) so variants get their own
independent maps; reference ModelFilters, SetParams, and SetParamsByID when
implementing the helper and wiring it into the code that constructs the variant.
- copyFilters: deep-copy ModelFilters (SetParams, SetParamsByID maps) so variants do not share map references with base. - gofmt template_fill_test.go (slice literal spacing).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/config/template_fill.go`:
- Around line 289-299: The copy operation for SetParamsByID currently omits keys
whose value is nil, changing semantics; update the loop in the function that
populates result.SetParamsByID so that when f.SetParamsByID[k] (v) is nil you
still assign result.SetParamsByID[k] = nil instead of skipping it, and when v !=
nil continue to deep-copy the inner map as done now (references: SetParamsByID,
result.SetParamsByID).
| if f.SetParamsByID != nil { | ||
| result.SetParamsByID = make(map[string]map[string]any, len(f.SetParamsByID)) | ||
| for k, v := range f.SetParamsByID { | ||
| if v != nil { | ||
| copied := make(map[string]any, len(v)) | ||
| for kk, vv := range v { | ||
| copied[kk] = vv | ||
| } | ||
| result.SetParamsByID[k] = copied | ||
| } | ||
| } |
There was a problem hiding this comment.
Preserve SetParamsByID keys when inner map is nil.
On Line 292, nil entries are skipped, which drops keys from the copied map. That changes semantics (key exists with nil vs key absent). copyFilters should retain those keys to stay lossless.
🛠️ Suggested fix
if f.SetParamsByID != nil {
result.SetParamsByID = make(map[string]map[string]any, len(f.SetParamsByID))
for k, v := range f.SetParamsByID {
- if v != nil {
- copied := make(map[string]any, len(v))
- for kk, vv := range v {
- copied[kk] = vv
- }
- result.SetParamsByID[k] = copied
- }
+ if v == nil {
+ result.SetParamsByID[k] = nil
+ continue
+ }
+ copied := make(map[string]any, len(v))
+ for kk, vv := range v {
+ copied[kk] = vv
+ }
+ result.SetParamsByID[k] = copied
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if f.SetParamsByID != nil { | |
| result.SetParamsByID = make(map[string]map[string]any, len(f.SetParamsByID)) | |
| for k, v := range f.SetParamsByID { | |
| if v != nil { | |
| copied := make(map[string]any, len(v)) | |
| for kk, vv := range v { | |
| copied[kk] = vv | |
| } | |
| result.SetParamsByID[k] = copied | |
| } | |
| } | |
| if f.SetParamsByID != nil { | |
| result.SetParamsByID = make(map[string]map[string]any, len(f.SetParamsByID)) | |
| for k, v := range f.SetParamsByID { | |
| if v == nil { | |
| result.SetParamsByID[k] = nil | |
| continue | |
| } | |
| copied := make(map[string]any, len(v)) | |
| for kk, vv := range v { | |
| copied[kk] = vv | |
| } | |
| result.SetParamsByID[k] = copied | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proxy/config/template_fill.go` around lines 289 - 299, The copy operation for
SetParamsByID currently omits keys whose value is nil, changing semantics;
update the loop in the function that populates result.SetParamsByID so that when
f.SetParamsByID[k] (v) is nil you still assign result.SetParamsByID[k] = nil
instead of skipping it, and when v != nil continue to deep-copy the inner map as
done now (references: SetParamsByID, result.SetParamsByID).
|
Thanks for the work on this. I won't be accepting it into llama-swap. Please feel free to maintain it in your own fork. As mentioned in my comment in #549, I recommend using the new SetParamsByID filter for model variants. It's not quite as powerful as this design but it is balance of simple and functional that I want. |
No problem. My situation is indeed quite specific, and I understand your decision. |
Implements #549.
Intent:
One base config holds shared settings; fill_enum holds per-variant overrides. The server expands this into multiple models (e.g. Qwen3.5-35B-A3B-thinking-normal, -thinking-coding, etc.), so users can switch variants in the UI or by model ID without duplicating config. This is especially useful when controlling parameters on the client is inconvenient or not possible.
Compatibility:
New config type only; existing configs are unchanged. It also works with setParamById without conflict—both can be used in the same setup; they do not interfere with each other.
Example:
GET /v1/models response (after expanding variants)
With the config above, the template Qwen3.5-35B-A3B is expanded into four variant model IDs. A request to GET /v1/models returns something like:
{ "object": "list", "data": [ { "id": "Gemma-3-4B-IT", "object": "model", "created": 1709251200, "owned_by": "llama-swap" }, { "id": "Qwen3.5-35B-A3B-nothinking_coding", "object": "model", "created": 1709251200, "owned_by": "llama-swap", "name": "Qwen3.5 35B (Instruct, Coding)", "description": "For coding without thinking, temp=0.6" }, { "id": "Qwen3.5-35B-A3B-nothinking_normal", "object": "model", "created": 1709251200, "owned_by": "llama-swap", "name": "Qwen3.5 35B (Instruct, Normal)", "description": "For general tasks without thinking, temp=0.7" }, { "id": "Qwen3.5-35B-A3B-thinking_coding", "object": "model", "created": 1709251200, "owned_by": "llama-swap", "name": "Qwen3.5 35B (Thinking, Coding)", "description": "For coding tasks with thinking enabled, temp=0.6" }, { "id": "Qwen3.5-35B-A3B-thinking_normal", "object": "model", "created": 1709251200, "owned_by": "llama-swap", "name": "Qwen3.5 35B (Thinking, Normal)", "description": "For general tasks with thinking enabled, temp=1.0" } ] }Entries are sorted by id. Clients can use these model IDs (e.g. in model for chat completions) or the configured aliases (e.g. qwen35-thinking, qwen35-code) where supported.