fix: enforce tool_choice parameter across all API endpoints #28

Merged
krystophny merged 1 commit into main from fix/tool-choice-enforcement
Mar 26, 2026

Conversation

@krystophny
Collaborator

Summary

  • Add _apply_tool_choice() helper that enforces tool_choice semantics before chat template rendering and after output generation
  • When tool_choice="none": strip tools from chat kwargs and skip all tool call parsing on output
  • When tool_choice="required": inject a system message instructing the model to call a tool
  • When tool_choice={"function": {"name": "X"}}: filter tools to the named function and inject a targeted system message
  • Guard all 6 _parse_tool_calls_with_parser call sites and all streaming tool parser initializations with the should_parse_tools flag
  • Fix pre-existing bug: reject reasoning input items in Responses API with 400 (test was failing)

Fixes #23.

Files changed

  • vllm_mlx/server.py -- _apply_tool_choice() helper + enforcement at all 4 convert_tools_for_template sites + guards at all parse sites + reasoning input item validation
  • tests/test_tool_choice.py -- 11 tests: 6 unit tests for the helper, 5 integration tests via /v1/chat/completions

Verification

Full test suite passes after fix

$ python -m pytest tests/ --timeout=120 -x
=============== 1021 passed, 9 skipped, 20 deselected in 26.96s ================

New tests pass

$ python -m pytest tests/test_tool_choice.py -v
tests/test_tool_choice.py::TestApplyToolChoice::test_none_strips_tools_and_returns_false PASSED
tests/test_tool_choice.py::TestApplyToolChoice::test_required_adds_system_message PASSED
tests/test_tool_choice.py::TestApplyToolChoice::test_dict_filters_tools_and_adds_system_message PASSED
tests/test_tool_choice.py::TestApplyToolChoice::test_dict_with_no_matching_tool_keeps_all PASSED
tests/test_tool_choice.py::TestApplyToolChoice::test_auto_returns_true_no_changes PASSED
tests/test_tool_choice.py::TestApplyToolChoice::test_none_value_returns_true_no_changes PASSED
tests/test_tool_choice.py::TestToolChoiceOpenAIEndpoint::test_tool_choice_none_strips_tools_and_skips_parsing PASSED
tests/test_tool_choice.py::TestToolChoiceOpenAIEndpoint::test_tool_choice_required_injects_system_message PASSED
tests/test_tool_choice.py::TestToolChoiceOpenAIEndpoint::test_tool_choice_named_filters_tools PASSED
tests/test_tool_choice.py::TestToolChoiceOpenAIEndpoint::test_tool_choice_auto_no_changes PASSED
tests/test_tool_choice.py::TestToolChoiceOpenAIEndpoint::test_tool_choice_omitted_behaves_as_auto PASSED
=============================== 11 passed in 3.76s ================================

Test plan

  • tool_choice="none" strips tools from chat_kwargs and produces no tool_calls in response
  • tool_choice="required" injects system message instructing model to call a tool
  • tool_choice={"function": {"name": "X"}} filters tools and injects targeted system message
  • tool_choice="auto" and None leave everything unchanged
  • All existing tests continue to pass (1021 passed)
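For reference, the enforced tool_choice values follow the OpenAI chat-completions convention. A minimal sketch of the three request shapes exercised by the test plan (the model name, tool definition, and endpoint are illustrative placeholders, not taken from this PR):

```python
# Illustrative request bodies for the enforced tool_choice modes.
# Tool definition and model name are placeholders.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

base = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
}

# "none": tools are stripped before templating, output parsing is skipped
req_none = {**base, "tools": [tool], "tool_choice": "none"}

# "required": a system message instructing the model to call a tool is injected
req_required = {**base, "tools": [tool], "tool_choice": "required"}

# named function: tools are filtered to the named function,
# and a targeted system message is injected
req_named = {
    **base,
    "tools": [tool],
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

Omitting tool_choice entirely behaves as "auto", per the last integration test above.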

Strip tools from chat template when tool_choice="none" and skip
tool call parsing on the output side. Inject system hints for
"required" and named function modes. Fixes #23.
@qodo-code-review
Review Summary by Qodo

Enforce tool_choice parameter across all API endpoints

🐞 Bug fix ✨ Enhancement 🧪 Tests


Walkthroughs

Description
• Add _apply_tool_choice() helper enforcing tool_choice semantics across all API endpoints
  - tool_choice="none": strips tools and skips parsing
  - tool_choice="required": injects system message requiring tool call
  - tool_choice={"function": {"name": "X"}}: filters tools and injects targeted message
• Guard all tool call parsing sites with should_parse_tools flag
• Fix pre-existing bug: reject reasoning input items in Responses API with 400 error
• Add comprehensive test suite: 6 unit tests + 5 integration tests via /v1/chat/completions
Diagram
flowchart LR
  A["tool_choice parameter"] --> B["_apply_tool_choice helper"]
  B --> C["Modify chat_kwargs & messages"]
  B --> D["Return should_parse_tools flag"]
  C --> E["Strip tools for none"]
  C --> F["Inject system message for required"]
  C --> G["Filter tools for named function"]
  D --> H["Guard parsing at all sites"]
  H --> I["Chat completion endpoint"]
  H --> J["Responses API endpoint"]
  H --> K["Anthropic messages endpoint"]
  H --> L["Streaming endpoints"]


File Changes

1. tests/test_tool_choice.py 🧪 Tests +289/-0

Add tool_choice enforcement tests

• New comprehensive test file with 11 tests covering tool_choice enforcement
• 6 unit tests for _apply_tool_choice() helper covering all modes (none, required, dict, auto, None)
• 5 integration tests via /v1/chat/completions endpoint with FakeEngine
• Tests verify tools are stripped, system messages injected, and parsing skipped appropriately

tests/test_tool_choice.py


2. vllm_mlx/server.py 🐞 Bug fix +122/-16

Implement tool_choice enforcement and parsing guards

• Add _apply_tool_choice() helper function that modifies chat_kwargs and messages in place based on tool_choice value
• Apply _apply_tool_choice() at 4 sites: chat completion, Responses API, Anthropic messages, and streaming
• Guard all tool call parsing with should_parse_tools flag returned from helper
• Add validation in _responses_request_to_chat_request() to reject reasoning input items with 400 error
• Update _prepare_responses_request() return signature to include should_parse_tools flag
• Update stream_chat_completion() signature to accept should_parse_tools parameter

vllm_mlx/server.py



@qodo-code-review

qodo-code-review bot commented Mar 26, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (3) 📐 Spec deviations (0)



Action required

1. tool_choice='none' converts tools 📎 Requirement gap ✓ Correctness
Description
For tool_choice="none", the code still calls convert_tools_for_template() before stripping
tools, violating the requirement that tool conversion/injection not run at all. This can cause
unintended side effects and breaks the stated success criteria for the none mode.
Code

vllm_mlx/server.py[R2402-2407]

    # Add tools if provided
    if request.tools:
        chat_kwargs["tools"] = convert_tools_for_template(request.tools)
+    should_parse_tools = _apply_tool_choice(
+        request.tool_choice, chat_kwargs, messages
+    )
Evidence
Compliance ID 1 requires that when tool_choice="none", tools are not injected and
convert_tools_for_template() is not invoked. The code converts tools unconditionally when
request.tools is present, and only later _apply_tool_choice() removes them for none.

Enforce tool_choice='none' by disabling tool injection and tool-call parsing
vllm_mlx/server.py[2402-2407]
vllm_mlx/server.py[464-466]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When `tool_choice="none"`, the server still calls `convert_tools_for_template()` and only later removes `tools` from `chat_kwargs`. Compliance requires that tool injection/conversion does not run at all in this mode.

## Issue Context
This affects `/v1/chat/completions` (and similarly structured endpoints) because conversion happens before `_apply_tool_choice()` is applied.

## Fix Focus Areas
- vllm_mlx/server.py[2402-2407]
- vllm_mlx/server.py[856-930]
- vllm_mlx/server.py[2524-2646]
- vllm_mlx/server.py[2680-2840]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Named tool_choice not forced 📎 Requirement gap ✓ Correctness
Description
For named-function tool_choice (e.g., {function:{name:"X"}}), the implementation filters the
tool list and adds a system message but does not force the tool-call opening to include the
specified function name. This does not prevent calling other functions or producing no tool call at
all.
Code

vllm_mlx/server.py[R480-499]

+    if isinstance(tool_choice, dict):
+        func_info = tool_choice.get("function", {})
+        fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
+        if fname:
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"You MUST call the function: {fname}",
+                }
+            )
+            template_tools = chat_kwargs.get("tools")
+            if template_tools:
+                filtered = [
+                    t
+                    for t in template_tools
+                    if t.get("function", {}).get("name") == fname
+                ]
+                if filtered:
+                    chat_kwargs["tools"] = filtered
+        return True
Evidence
Compliance ID 3 requires prefixing generation with the parser’s opening prefix plus the specified
function name to constrain outputs to calling function X. The current code only appends a system
message and optionally filters chat_kwargs["tools"], with no generation prefix injection that
enforces the function name in the emitted tool call.

Enforce named-function tool_choice by forcing a specific function to be called
vllm_mlx/server.py[480-499]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Named-function `tool_choice` is not enforced via a parser-format prefix that includes the function name; only tool filtering and a system message are applied.

## Issue Context
Compliance requires constraining generation so the specified function name appears in the tool-call opening per the active tool parser’s format.

## Fix Focus Areas
- vllm_mlx/server.py[451-502]
- vllm_mlx/server.py[480-499]
- vllm_mlx/server.py[2402-2447]



3. ToolParser missing TOOL_CALL_PREFIX 📎 Requirement gap ⚙ Maintainability
Description
The tool parser interface does not expose a standard TOOL_CALL_PREFIX attribute needed for
required/named tool_choice prefix enforcement across parsers. Without this, server enforcement must
hardcode per-parser strings or cannot implement consistent prefix injection.
Code

vllm_mlx/server.py[R451-502]

+def _apply_tool_choice(
+    tool_choice: str | dict | None,
+    chat_kwargs: dict,
+    messages: list[dict],
+) -> bool:
+    """Apply tool_choice policy to chat kwargs and messages.
+
+    Modifies *chat_kwargs* and *messages* in place so that the chat template
+    and downstream parsing honour the caller's tool_choice setting.
+
+    Returns ``True`` when the model output should be parsed for tool calls,
+    ``False`` when tool-call parsing must be skipped (``tool_choice="none"``).
+    """
+    if tool_choice == "none":
+        chat_kwargs.pop("tools", None)
+        return False
+
+    if tool_choice == "required":
+        messages.append(
+            {
+                "role": "system",
+                "content": (
+                    "You MUST call one of the provided tools. "
+                    "Do not respond with plain text."
+                ),
+            }
+        )
+        return True
+
+    if isinstance(tool_choice, dict):
+        func_info = tool_choice.get("function", {})
+        fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
+        if fname:
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"You MUST call the function: {fname}",
+                }
+            )
+            template_tools = chat_kwargs.get("tools")
+            if template_tools:
+                filtered = [
+                    t
+                    for t in template_tools
+                    if t.get("function", {}).get("name") == fname
+                ]
+                if filtered:
+                    chat_kwargs["tools"] = filtered
+        return True
+
+    # "auto" or None — no changes needed
+    return True
Evidence
Compliance ID 5 requires abstract_tool_parser.py (or equivalent) to define a TOOL_CALL_PREFIX
requirement for all parsers. The abstract ToolParser only declares SUPPORTS_NATIVE_TOOL_FORMAT
and contains no TOOL_CALL_PREFIX, preventing standardized enforcement by server code such as
_apply_tool_choice().

Tool parser interface exposes a standard TOOL_CALL_PREFIX used for enforcement
vllm_mlx/tool_parsers/abstract_tool_parser.py[40-52]
vllm_mlx/server.py[451-502]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The abstract tool parser interface lacks a standard `TOOL_CALL_PREFIX` attribute, so server-side enforcement cannot reliably inject the correct tool-call opening prefix across different parsers.

## Issue Context
Compliance requires a canonical per-parser prefix/tag to enable consistent enforcement for `tool_choice="required"` and named-function tool_choice.

## Fix Focus Areas
- vllm_mlx/tool_parsers/abstract_tool_parser.py[40-67]
- vllm_mlx/tool_parsers/*.py
- vllm_mlx/server.py[451-502]

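The reviewer's suggestion in issues 2 and 3 can be sketched as follows. This is a hypothetical illustration of the proposed interface, not the actual vllm_mlx code: each parser would declare the opening token(s) of its tool-call format, and the server would build a generation prefix that forces a call to the named function.

```python
# Hypothetical sketch: parsers expose TOOL_CALL_PREFIX so the server can
# force generation to open with a call to a specific function.
# Class and attribute names are illustrative, not the real vllm_mlx API.
from abc import ABC


class ToolParser(ABC):
    SUPPORTS_NATIVE_TOOL_FORMAT = False
    TOOL_CALL_PREFIX: str = ""  # opening text of this parser's tool-call format


class HermesStyleParser(ToolParser):
    # Hermes-style parsers open tool calls with a <tool_call> tag and a JSON
    # object whose first key is the function name.
    TOOL_CALL_PREFIX = '<tool_call>\n{"name": "'


def forced_prefix(parser: ToolParser, fname: str) -> str:
    """Text to prepend to generation so the output opens a call to fname.

    Returns an empty string when the parser declares no prefix, in which
    case the server would fall back to system-message enforcement only.
    """
    if not parser.TOOL_CALL_PREFIX:
        return ""
    return parser.TOOL_CALL_PREFIX + fname
```

The prefix would be appended to the rendered prompt (and echoed back into the parsed output), so the model can only continue with the arguments of the named function.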


4. Nonexistent tool forced 🐞 Bug ✓ Correctness
Description
When tool_choice selects a specific function name that is not present in chat_kwargs['tools'],
_apply_tool_choice() still injects a system instruction requiring that function, while keeping the
original tools list unchanged. This creates contradictory constraints ("MUST call X" but X is
unavailable) and can produce unusable tool_calls since tool-call parsing does not validate tool
names against the provided tools.
Code

vllm_mlx/server.py[R480-499]

+    if isinstance(tool_choice, dict):
+        func_info = tool_choice.get("function", {})
+        fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
+        if fname:
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"You MUST call the function: {fname}",
+                }
+            )
+            template_tools = chat_kwargs.get("tools")
+            if template_tools:
+                filtered = [
+                    t
+                    for t in template_tools
+                    if t.get("function", {}).get("name") == fname
+                ]
+                if filtered:
+                    chat_kwargs["tools"] = filtered
+        return True
Evidence
_apply_tool_choice() appends the "MUST call the function" system message whenever tool_choice is a
dict with function.name, but only filters tools if a match exists; when no match exists, tools
remain unchanged while the instruction still demands the missing function. The existing generic
tool-call parser does not validate parsed tool names against any tool allowlist, and the chat
endpoint will return parsed tool_calls directly to clients.

vllm_mlx/server.py[451-502]
tests/test_tool_choice.py[123-134]
vllm_mlx/api/tool_calling.py[85-264]
vllm_mlx/server.py[2442-2467]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When `tool_choice` is a dict selecting a function name that does not exist in `chat_kwargs["tools"]`, `_apply_tool_choice()` still injects a system message requiring that function. This can cause the model to attempt to call an unavailable tool and the server may still surface that call because tool parsing does not validate against the provided tools.

## Issue Context
Current behavior keeps all tools when no match is found (per unit test), but still adds a targeted system instruction. This is internally inconsistent.

## Fix Focus Areas
- vllm_mlx/server.py[451-502]
- tests/test_tool_choice.py[106-134]

## Suggested fix
- In `_apply_tool_choice()` for dict tool_choice:
 - Compute whether `fname` exists in `chat_kwargs.get("tools")`.
 - Only inject the targeted system message (and filter tools) if a match exists.
 - If no match exists, either:
   - treat it as `"auto"` (no system message, no filtering) to preserve current “keeps all” behavior, OR
   - raise `HTTPException(status_code=400, ...)` if the project prefers strict validation.
- Update/add a unit test to assert the chosen behavior (especially that no targeted system message is injected when no match exists if you choose the lenient path).
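The lenient variant of this fix can be sketched as below. This is a simplified stand-in for the dict branch of _apply_tool_choice(), not the PR's exact code: the targeted system message is only injected when the named function actually exists in the tool list.

```python
# Sketch of the lenient fix: inject the targeted system message and filter
# tools only when the named function exists; otherwise behave like "auto".
# Simplified stand-in for the dict branch of _apply_tool_choice().
def apply_named_tool_choice(fname: str, chat_kwargs: dict, messages: list) -> None:
    tools = chat_kwargs.get("tools") or []
    matching = [t for t in tools if t.get("function", {}).get("name") == fname]
    if not matching:
        return  # no match: no message, no filtering (treat as "auto")
    chat_kwargs["tools"] = matching
    messages.append(
        {"role": "system", "content": f"You MUST call the function: {fname}"}
    )
```

The strict variant would instead raise an HTTP 400 on the no-match path.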




Remediation recommended

5. System message appended 🐞 Bug ⚙ Maintainability
Description
_apply_tool_choice() appends enforcement system messages to the end of the messages list, which is
inconsistent with other server logic that prepends/merges system instructions at the beginning. This
can change prompt structure (especially in Responses where system messages are explicitly normalized
to the front) and risks reduced adherence or template-specific misbehavior.
Code

vllm_mlx/server.py[R468-489]

+    if tool_choice == "required":
+        messages.append(
+            {
+                "role": "system",
+                "content": (
+                    "You MUST call one of the provided tools. "
+                    "Do not respond with plain text."
+                ),
+            }
+        )
+        return True
+
+    if isinstance(tool_choice, dict):
+        func_info = tool_choice.get("function", {})
+        fname = func_info.get("name", "") if isinstance(func_info, dict) else ""
+        if fname:
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"You MUST call the function: {fname}",
+                }
+            )
Evidence
The new helper uses messages.append(...) for tool_choice enforcement, while existing utilities
prepend system instructions (or merge them into the leading system message). Responses request
conversion explicitly consolidates system messages into a single leading system message, but
_apply_tool_choice() later appends another system message at the end, breaking that normalization.

vllm_mlx/server.py[451-489]
vllm_mlx/server.py[730-769]
vllm_mlx/server.py[2510-2539]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_apply_tool_choice()` adds tool-choice enforcement by appending a new `{"role": "system" ...}` message at the end of the conversation. Elsewhere, the server consistently prepends/merges system instructions at the start, and Responses conversion normalizes system messages to the front.

## Issue Context
Appending a system message after user content may change how some chat templates or clients interpret role ordering, and it undermines the server’s existing “system-first” normalization approach.

## Fix Focus Areas
- vllm_mlx/server.py[451-489]
- vllm_mlx/server.py[2510-2539]
- vllm_mlx/server.py[730-769]

## Suggested fix
- Change `_apply_tool_choice()` to inject tool-enforcement instructions like `_inject_json_instruction()`:
 - If a system message exists, append the enforcement text to that system message’s content.
 - Otherwise insert a new system message at index 0.
- Keep the rest of tool_choice behavior the same (including `tool_choice="none"` returning `False`).
- Update `tests/test_tool_choice.py` assertions that currently require the injected system message to be `messages[-1]`.




@krystophny krystophny merged commit 1b23891 into main Mar 26, 2026
8 of 9 checks passed
krystophny added a commit that referenced this pull request Mar 27, 2026
The reasoning input item rejection (HTTPException 400) was added in
PR #28 but conflicts with the earlier fix that converts reasoning
items to assistant messages in _responses_input_to_chat_messages.
The rejection ran first, crashing the SSE stream mid-flight.

Also downgrade reasoning config rejection to a debug log since
raising inside the streaming generator causes "response already
started" crashes.
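The "response already started" failure mode can be illustrated in isolation: once a streaming generator has yielded its first chunk, the HTTP status line is already on the wire, so a later exception can only truncate the stream rather than become a clean 400. A minimal self-contained illustration (not the actual server code):

```python
# Once the first chunk is yielded, the response has started; raising after
# that point aborts the stream mid-flight instead of producing a 400.
def sse_stream():
    yield "data: first chunk\n\n"  # status line and headers already sent here
    raise RuntimeError("reasoning input item rejected")  # too late for a 400


chunks = []
try:
    for chunk in sse_stream():
        chunks.append(chunk)
except RuntimeError:
    pass  # the client sees a truncated stream, not an error response
```

Hence the follow-up commit validates (or logs) before the generator starts yielding, instead of raising inside it.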

Development

Successfully merging this pull request may close these issues.

tool_choice parameter is accepted but never enforced

1 participant