Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 5 additions & 4 deletions config.example.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ macros:
# - required
# - each key is the model's ID, used in API requests
# - model settings have default values that are used if they are not defined here
# - the model's ID is available in the ${MODEL_ID} macro, also available in macros defined above
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Clarify where ${MODEL_ID} expands (cmd/cmdStop only).

As implemented, ${MODEL_ID} is substituted in model.cmd and model.cmdStop (not in proxy or checkEndpoint). Update the comment to avoid implying broader scope.

Apply this diff:

-# - the model's ID is available in the ${MODEL_ID} macro, also available in macros defined above
+# - the model's ID is available as ${MODEL_ID} in model.cmd and model.cmdStop
+# - macros that are referenced by cmd/cmdStop may also include ${MODEL_ID}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# - the model's ID is available in the ${MODEL_ID} macro, also available in macros defined above
# - the model's ID is available as ${MODEL_ID} in model.cmd and model.cmdStop
# - macros that are referenced by cmd/cmdStop may also include ${MODEL_ID}
🤖 Prompt for AI Agents
In config.example.yaml around line 52, the comment incorrectly implies
${MODEL_ID} is available broadly; clarify that ${MODEL_ID} is only substituted
into model.cmd and model.cmdStop. Update the comment text to state explicitly
that the ${MODEL_ID} macro is available only for model.cmd and model.cmdStop
(and not in proxy or checkEndpoint), so users won’t assume it expands elsewhere.

# - below are examples of the various settings a model can have:
# - available model settings: env, cmd, cmdStop, proxy, aliases, checkEndpoint, ttl, unlisted
models:
Expand Down Expand Up @@ -148,12 +149,12 @@ models:
cmd: llama-server --port ${PORT} -m Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 0

# Docker example:
# container run times like Docker and Podman can be used reliably with a
# a combination of cmd and cmdStop.
# container runtimes like Docker and Podman can be used reliably with
# a combination of cmd, cmdStop, and ${MODEL_ID}
"docker-llama":
Comment on lines +152 to 154
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid suggesting ${MODEL_ID} for container names; many model IDs are not Docker/Podman-safe.

Model IDs can contain '/', ':' (e.g., author/model:variant), which are invalid in container names. Recommend ${PORT} (numeric, unique per model using it) for names and stopping commands. This is a documentation/runtime correctness fix; MODEL_ID remains trusted input per project learnings.

Apply this diff:

-  # container runtimes like Docker and Podman can be used reliably with
-  # a combination of cmd, cmdStop, and ${MODEL_ID}
+  # container runtimes like Docker and Podman can be used with cmd/cmdStop.
+  # NOTE: prefer ${PORT} for container names; ${MODEL_ID} may contain '/' or ':' and won't be a valid container name.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# container runtimes like Docker and Podman can be used reliably with
# a combination of cmd, cmdStop, and ${MODEL_ID}
"docker-llama":
# container runtimes like Docker and Podman can be used with cmd/cmdStop.
# NOTE: prefer ${PORT} for container names; ${MODEL_ID} may contain '/' or ':' and won't be a valid container name.
"docker-llama":
🤖 Prompt for AI Agents
In config.example.yaml around lines 152 to 154, the docs currently suggest using
${MODEL_ID} in container names/stop commands which is unsafe because model IDs
may contain characters invalid for Docker/Podman; update the text and any
examples to recommend using ${PORT} (a numeric, unique per-model identifier) for
container names and stop commands instead, ensure guidance notes "${PORT} must
be unique per model instance and numeric", and adjust any example command
snippets to use ${PORT} in place of ${MODEL_ID}.

proxy: "http://127.0.0.1:${PORT}"
cmd: |
docker run --name dockertest
docker run --name ${MODEL_ID}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use a PORT-based container name to ensure validity across all model IDs.

Prevents failures when MODEL_ID contains invalid characters for container names.

Apply this diff:

-      docker run --name ${MODEL_ID}
+      docker run --name llama-${PORT}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
docker run --name ${MODEL_ID}
docker run --name llama-${PORT}
🤖 Prompt for AI Agents
In config.example.yaml around line 157, the docker run command uses --name
${MODEL_ID} which can fail if MODEL_ID contains invalid container-name
characters; change the container name to use a PORT-based, always-valid
identifier (for example --name model_${PORT} or --name model-${PORT}) and update
any related references/env docs to use the PORT-based name so container naming
is deterministic and valid across all model IDs.

--init --rm -p ${PORT}:8080 -v /mnt/nvme/models:/models
ghcr.io/ggml-org/llama.cpp:server
--model '/models/Qwen2.5-Coder-0.5B-Instruct-Q4_K_M.gguf'
Expand All @@ -167,7 +168,7 @@ models:
# - on POSIX systems: a SIGTERM signal is sent
# - on Windows, calls taskkill to stop the process
# - processes have 5 seconds to shutdown until forceful termination is attempted
cmdStop: docker stop dockertest
cmdStop: docker stop ${MODEL_ID}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Align cmdStop with the PORT-based container name.

Keeps start/stop symmetric and robust to arbitrary model IDs.

Apply this diff:

-    cmdStop: docker stop ${MODEL_ID}
+    cmdStop: docker stop llama-${PORT}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cmdStop: docker stop ${MODEL_ID}
cmdStop: docker stop llama-${PORT}
🤖 Prompt for AI Agents
In config.example.yaml at line 171, cmdStop currently uses docker stop
${MODEL_ID}; change it to stop the container using the same PORT-based container
name pattern used by cmdStart so start/stop are symmetric—replace ${MODEL_ID}
with the exact PORT-based container name expression used in cmdStart (for
example ${PORT}_${MODEL_ID} or whatever pattern is used elsewhere in this file).


# groups: a dictionary of group settings
# - optional, default: empty dictionary
Expand Down
8 changes: 7 additions & 1 deletion proxy/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -237,7 +237,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {

- name must fit the regex ^[a-zA-Z0-9_-]+$
- names must be less than 64 characters (no reason, just cause)
- name can not be any reserved macros: PORT
- name can not be any reserved macros: PORT, MODEL_ID
- macro values must be less than 1024 characters
*/
macroNameRegex := regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)
Expand All @@ -253,6 +253,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
}
switch macroName {
case "PORT":
case "MODEL_ID":
return Config{}, fmt.Errorf("macro name '%s' is reserved and cannot be used", macroName)
}
}
Expand Down Expand Up @@ -296,6 +297,11 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
nextPort++
}

if strings.Contains(modelConfig.Cmd, "${MODEL_ID}") || strings.Contains(modelConfig.CmdStop, "${MODEL_ID}") {
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, "${MODEL_ID}", modelId)
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${MODEL_ID}", modelId)
}

// make sure there are no unknown macros that have not been replaced
macroPattern := regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)
fieldMap := map[string]string{
Expand Down
41 changes: 41 additions & 0 deletions proxy/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -440,3 +440,44 @@ models:
expectedCmd := "/user/llama.cpp/build/bin/llama-server --port 9990 --model /path/to/model.gguf -ngl 99"
assert.Equal(t, expectedCmd, cmdStr, "Final command does not match expected structure")
}

func TestConfig_MacroModelId(t *testing.T) {
content := `
startPort: 9000
macros:
"docker-llama": docker run --name ${MODEL_ID} -p ${PORT}:8080 docker_img
"docker-stop": docker stop ${MODEL_ID}

models:
model1:
cmd: /path/to/server -p ${PORT} -hf ${MODEL_ID}

model2:
cmd: ${docker-llama}
cmdStop: ${docker-stop}

author/model:F16:
cmd: /path/to/server -p ${PORT} -hf ${MODEL_ID}
cmdStop: stop
`

config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
sanitizedCmd, err := SanitizeCommand(config.Models["model1"].Cmd)
assert.NoError(t, err)
assert.Equal(t, "/path/to/server -p 9001 -hf model1", strings.Join(sanitizedCmd, " "))

assert.Equal(t, "docker stop ${MODEL_ID}", config.Macros["docker-stop"])

sanitizedCmd2, err := SanitizeCommand(config.Models["model2"].Cmd)
assert.NoError(t, err)
assert.Equal(t, "docker run --name model2 -p 9002:8080 docker_img", strings.Join(sanitizedCmd2, " "))

sanitizedCmdStop, err := SanitizeCommand(config.Models["model2"].CmdStop)
assert.NoError(t, err)
assert.Equal(t, "docker stop model2", strings.Join(sanitizedCmdStop, " "))

sanitizedCmd3, err := SanitizeCommand(config.Models["author/model:F16"].Cmd)
assert.NoError(t, err)
assert.Equal(t, "/path/to/server -p 9000 -hf author/model:F16", strings.Join(sanitizedCmd3, " "))
}
Loading