fix: enforce tool_choice parameter across all API endpoints (#28)

krystophny merged 1 commit into main
Conversation
Strip tools from chat template when tool_choice="none" and skip tool call parsing on the output side. Inject system hints for "required" and named function modes. Fixes #23.
Review Summary by Qodo

Enforce tool_choice parameter across all API endpoints
Description

• Add _apply_tool_choice() helper enforcing tool_choice semantics across all API endpoints:
- tool_choice="none": strips tools and skips parsing
- tool_choice="required": injects system message requiring tool call
- tool_choice={"function": {"name": "X"}}: filters tools and injects targeted message
• Guard all tool call parsing sites with should_parse_tools flag
• Fix pre-existing bug: reject reasoning input items in Responses API with 400 error
• Add comprehensive test suite: 6 unit tests + 5 integration tests via /v1/chat/completions
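The three enforced modes correspond to request values like the following (a hedged sketch: the field shapes follow the OpenAI-compatible chat completions schema, and `get_weather` is a made-up function name, not one from this PR):

```python
# Illustrative tool_choice values for /v1/chat/completions.
choice_none = "none"          # tools stripped, output never parsed for tool calls
choice_required = "required"  # system hint injected: model must call some tool
choice_named = {"type": "function", "function": {"name": "get_weather"}}

# The helper's return value guards every parse site; conceptually:
def should_parse(tool_choice):
    # Only tool_choice == "none" disables output-side tool call parsing.
    return tool_choice != "none"
```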
Diagram

```mermaid
flowchart LR
    A["tool_choice parameter"] --> B["_apply_tool_choice helper"]
    B --> C["Modify chat_kwargs & messages"]
    B --> D["Return should_parse_tools flag"]
    C --> E["Strip tools for none"]
    C --> F["Inject system message for required"]
    C --> G["Filter tools for named function"]
    D --> H["Guard parsing at all sites"]
    H --> I["Chat completion endpoint"]
    H --> J["Responses API endpoint"]
    H --> K["Anthropic messages endpoint"]
    H --> L["Streaming endpoints"]
```
File Changes

1. tests/test_tool_choice.py
Code Review by Qodo
```python
# Add tools if provided
if request.tools:
    chat_kwargs["tools"] = convert_tools_for_template(request.tools)
should_parse_tools = _apply_tool_choice(
    request.tool_choice, chat_kwargs, messages
)
```
1. tool_choice='none' still converts tools 📎 Requirement gap ✓ Correctness
For tool_choice="none", the code still calls convert_tools_for_template() before stripping tools, violating the requirement that tool conversion/injection not run at all. This can cause unintended side effects and breaks the stated success criteria for the none mode.
Agent Prompt
## Issue description
When `tool_choice="none"`, the server still calls `convert_tools_for_template()` and only later removes `tools` from `chat_kwargs`. Compliance requires that tool injection/conversion does not run at all in this mode.
## Issue Context
This affects `/v1/chat/completions` (and similarly structured endpoints) because conversion happens before `_apply_tool_choice()` is applied.
## Fix Focus Areas
- vllm_mlx/server.py[2402-2407]
- vllm_mlx/server.py[856-930]
- vllm_mlx/server.py[2524-2646]
- vllm_mlx/server.py[2680-2840]
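A minimal sketch of the reordering this prompt asks for: decide the tool_choice policy before any tool conversion runs. The `convert` callable stands in for `convert_tools_for_template()`; the surrounding names are illustrative, not the PR's actual code.

```python
def build_tool_kwargs(tools, tool_choice, chat_kwargs, convert=list):
    """Short-circuit on tool_choice == "none" so conversion never runs."""
    if tool_choice == "none":
        chat_kwargs.pop("tools", None)  # nothing converted, nothing injected
        return False                    # output-side tool parsing is skipped
    if tools:
        chat_kwargs["tools"] = convert(tools)  # conversion only when needed
    return True
```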
```python
if isinstance(tool_choice, dict):
    func_info = tool_choice.get("function", {})
    fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
    if fname:
        messages.append(
            {
                "role": "system",
                "content": f"You MUST call the function: {fname}",
            }
        )
        template_tools = chat_kwargs.get("tools")
        if template_tools:
            filtered = [
                t
                for t in template_tools
                if t.get("function", {}).get("name") == fname
            ]
            if filtered:
                chat_kwargs["tools"] = filtered
    return True
```
2. Named tool_choice not forced 📎 Requirement gap ✓ Correctness
For named-function tool_choice (e.g., {function:{name:"X"}}), the implementation filters the tool list and adds a system message but does not force the tool-call opening to include the specified function name. This does not prevent calling other functions or producing no tool call at all.
Agent Prompt
## Issue description
Named-function `tool_choice` is not enforced via a parser-format prefix that includes the function name; only tool filtering and a system message are applied.
## Issue Context
Compliance requires constraining generation so the specified function name appears in the tool-call opening per the active tool parser’s format.
## Fix Focus Areas
- vllm_mlx/server.py[451-502]
- vllm_mlx/server.py[480-499]
- vllm_mlx/server.py[2402-2447]
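One way to implement the requested enforcement, sketched under the assumption of a Hermes-style `<tool_call>` JSON format (the actual tool parser formats in this repo may differ): pre-fill the generation with the tool-call opening plus the required function name, so the model can only complete that specific call's arguments.

```python
import json

def forced_call_prefix(fname: str) -> str:
    # Seed text appended to the prompt so decoding continues inside a
    # tool call for `fname`. The surrounding tag format is an assumption.
    return '<tool_call>\n{"name": ' + json.dumps(fname) + ', "arguments": '
```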
```python
def _apply_tool_choice(
    tool_choice: str | dict | None,
    chat_kwargs: dict,
    messages: list[dict],
) -> bool:
    """Apply tool_choice policy to chat kwargs and messages.

    Modifies *chat_kwargs* and *messages* in place so that the chat template
    and downstream parsing honour the caller's tool_choice setting.

    Returns ``True`` when the model output should be parsed for tool calls,
    ``False`` when tool-call parsing must be skipped (``tool_choice="none"``).
    """
    if tool_choice == "none":
        chat_kwargs.pop("tools", None)
        return False

    if tool_choice == "required":
        messages.append(
            {
                "role": "system",
                "content": (
                    "You MUST call one of the provided tools. "
                    "Do not respond with plain text."
                ),
            }
        )
        return True

    if isinstance(tool_choice, dict):
        func_info = tool_choice.get("function", {})
        fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
        if fname:
            messages.append(
                {
                    "role": "system",
                    "content": f"You MUST call the function: {fname}",
                }
            )
            template_tools = chat_kwargs.get("tools")
            if template_tools:
                filtered = [
                    t
                    for t in template_tools
                    if t.get("function", {}).get("name") == fname
                ]
                if filtered:
                    chat_kwargs["tools"] = filtered
        return True

    # "auto" or None — no changes needed
    return True
```
3. Tool parser missing TOOL_CALL_PREFIX 📎 Requirement gap ⚙ Maintainability
The tool parser interface does not expose a standard TOOL_CALL_PREFIX attribute needed for required/named tool_choice prefix enforcement across parsers. Without this, server enforcement must hardcode per-parser strings or cannot implement consistent prefix injection.
Agent Prompt
## Issue description
The abstract tool parser interface lacks a standard `TOOL_CALL_PREFIX` attribute, so server-side enforcement cannot reliably inject the correct tool-call opening prefix across different parsers.
## Issue Context
Compliance requires a canonical per-parser prefix/tag to enable consistent enforcement for `tool_choice="required"` and named-function tool_choice.
## Fix Focus Areas
- vllm_mlx/tool_parsers/abstract_tool_parser.py[40-67]
- vllm_mlx/tool_parsers/*.py
- vllm_mlx/server.py[451-502]
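The proposed interface change could look like this sketch: the class layout follows the review's file paths, while the concrete prefix string and method name are assumptions, not the repo's actual API.

```python
from abc import ABC, abstractmethod

class AbstractToolParser(ABC):
    # Canonical string that opens a tool call in this parser's format,
    # so the server can inject it for "required"/named enforcement.
    TOOL_CALL_PREFIX: str = ""

    @abstractmethod
    def extract_tool_calls(self, text: str) -> list:
        ...

class HermesStyleParser(AbstractToolParser):
    TOOL_CALL_PREFIX = "<tool_call>"  # assumed format, for illustration

    def extract_tool_calls(self, text: str) -> list:
        return []  # parsing elided in this sketch
```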
```python
if isinstance(tool_choice, dict):
    func_info = tool_choice.get("function", {})
    fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
    if fname:
        messages.append(
            {
                "role": "system",
                "content": f"You MUST call the function: {fname}",
            }
        )
        template_tools = chat_kwargs.get("tools")
        if template_tools:
            filtered = [
                t
                for t in template_tools
                if t.get("function", {}).get("name") == fname
            ]
            if filtered:
                chat_kwargs["tools"] = filtered
    return True
```
4. Nonexistent tool forced 🐞 Bug ✓ Correctness
When tool_choice selects a specific function name that is not present in chat_kwargs['tools'], _apply_tool_choice() still injects a system instruction requiring that function, while keeping the original tools list unchanged. This creates contradictory constraints ("MUST call X" but X is unavailable) and can produce unusable tool_calls, since tool-call parsing does not validate tool names against the provided tools.
Agent Prompt
## Issue description
When `tool_choice` is a dict selecting a function name that does not exist in `chat_kwargs["tools"]`, `_apply_tool_choice()` still injects a system message requiring that function. This can cause the model to attempt to call an unavailable tool and the server may still surface that call because tool parsing does not validate against the provided tools.
## Issue Context
Current behavior keeps all tools when no match is found (per unit test), but still adds a targeted system instruction. This is internally inconsistent.
## Fix Focus Areas
- vllm_mlx/server.py[451-502]
- tests/test_tool_choice.py[106-134]
## Suggested fix
- In `_apply_tool_choice()` for dict tool_choice:
- Compute whether `fname` exists in `chat_kwargs.get("tools")`.
- Only inject the targeted system message (and filter tools) if a match exists.
- If no match exists, either:
- treat it as `"auto"` (no system message, no filtering) to preserve current “keeps all” behavior, OR
- raise `HTTPException(status_code=400, ...)` if the project prefers strict validation.
- Update/add a unit test to assert the chosen behavior (especially that no targeted system message is injected when no match exists if you choose the lenient path).
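The lenient path from the suggested fix might look like the following sketch (`resolve_named_choice` is a hypothetical name, not a function in the PR):

```python
def resolve_named_choice(fname, tools, messages):
    """Only inject/filter when `fname` actually exists; else act like "auto"."""
    names = {t.get("function", {}).get("name") for t in (tools or [])}
    if fname not in names:
        return tools  # lenient: no targeted message, keep all tools
    messages.append({"role": "system",
                     "content": f"You MUST call the function: {fname}"})
    return [t for t in tools if t.get("function", {}).get("name") == fname]
```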
The reasoning input item rejection (HTTPException 400) was added in PR #28 but conflicts with the earlier fix that converts reasoning items to assistant messages in _responses_input_to_chat_messages. The rejection ran first, crashing the SSE stream mid-flight. Also downgrade reasoning config rejection to a debug log since raising inside the streaming generator causes "response already started" crashes.
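The downgrade described above amounts to replacing the raise with a log call once streaming has begun; a sketch under assumed names (the `logger` setup and function shape are illustrative, not the server's actual code):

```python
import logging

logger = logging.getLogger("server")

def handle_reasoning_config(reasoning_config, response_started: bool) -> bool:
    """Return True if processing may continue, False if the caller should reject."""
    if reasoning_config is None:
        return True
    if response_started:
        # Raising after the SSE response has started cannot yield a clean 400
        # ("response already started"), so mid-stream we only log and continue.
        logger.debug("Ignoring reasoning config mid-stream: %r", reasoning_config)
        return True
    return False  # pre-stream: caller may still reject with HTTP 400
```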
Summary

- _apply_tool_choice() helper that enforces tool_choice semantics before chat template rendering and after output generation
- tool_choice="none": strip tools from chat kwargs and skip all tool call parsing on output
- tool_choice="required": inject a system message instructing the model to call a tool
- tool_choice={"function": {"name": "X"}}: filter tools to the named function and inject a targeted system message
- Guard all _parse_tool_calls_with_parser call sites and all streaming tool parser initializations with the should_parse_tools flag

Fixes #23.

Files changed

- vllm_mlx/server.py -- _apply_tool_choice() helper + enforcement at all 4 convert_tools_for_template sites + guards at all parse sites + reasoning input item validation
- tests/test_tool_choice.py -- 11 tests: 6 unit tests for the helper, 5 integration tests via /v1/chat/completions

Verification

- Full test suite passes after fix
- New tests pass
Test plan

- tool_choice="none" strips tools from chat_kwargs and produces no tool_calls in response
- tool_choice="required" injects system message instructing model to call a tool
- tool_choice={"function": {"name": "X"}} filters tools and injects targeted system message
- tool_choice="auto" and None leave everything unchanged