Skip to content

Enable tool calling for MLLM/VLM chat paths by passing tools and tool_choice into chat templates#116

Closed
laywens wants to merge 1 commit intowaybarrios:mainfrom
laywens:codex/upstream-i7-tool-calling
Closed

Enable tool calling for MLLM/VLM chat paths by passing tools and tool_choice into chat templates#116
laywens wants to merge 1 commit intowaybarrios:mainfrom
laywens:codex/upstream-i7-tool-calling

Conversation

@laywens
Copy link
Copy Markdown

@laywens laywens commented Feb 25, 2026

VLM/MLLM Tool Calling Support

Problem

Multimodal models served with --mllm accepted tools in API payloads,
but tool metadata was not consistently passed into MLLM chat-template paths.
This left many VLM requests without structured tool_calls output even when
parser flags were enabled.

Fix

Pass tools and tool_choice through MLLM chat/template paths in both engines:

  • SimpleEngine
    • forwards tools + tool_choice in MLLM chat and stream_chat
    • keeps LLM behavior safe by not leaking unsupported kwargs to model calls
  • BatchedEngine
    • _apply_chat_template now accepts tool_choice
    • propagates tool_choice through prompt/prefix-boundary flows
  • MLXMultimodalLM
    • includes tools + tool_choice in chat-template invocation with safe fallback
  • server.py
    • propagates tool_choice when tools are supplied

Validation

  • Added regression tests for tool/tool_choice passthrough and LLM guard behavior
    in tests/test_simple_engine.py.
  • Added docs updates for MLLM/VLM tool-calling usage.

Usage

vllm-mlx serve <vlm-model-id> \
  --mllm \
  --enable-auto-tool-choice \
  --tool-call-parser auto

Notes

tool_choice support is best-effort: templates that don't support tool_choice
fall back cleanly without raising template errors.

@Thump604
Copy link
Copy Markdown
Collaborator

Thump604 commented Apr 8, 2026

@waybarrios, @swaylenhayes: status note plus coordination.

This PR passes tools and tool_choice through MLLM chat-template paths in both SimpleEngine and BatchedEngine. The fix shape is correct for the underlying gap (MLLM-loaded models silently dropping tool definitions from the template).

PR currently shows CONFLICTING merge status. Last activity Feb 25 (~6 weeks ago).

Coordination note: PR #139 (kargarisaac, "fix: pass tools to chat template in MLLM path") targets the same problem with a narrower scope (just tools, not also tool_choice). #139 is also CONFLICTING. The two PRs are not currently cross-linked. This PR (#116) covers both tools and tool_choice so the scope is broader. Either could land first.

Worth a rebase status update if you are still active on this branch.

@laywens
Copy link
Copy Markdown
Author

laywens commented Apr 8, 2026

@Thump604 @waybarrios

Still active. I agree #116 is stale as opened and should not be revived blindly.

Given that #139 covers the narrower core bug (tools propagation through the MLLM path) and overlaps with the same SimpleEngine / MLXMultimodalLM surface, the cleanest path is probably...

  1. to land the smallest fix first
  2. then follow with a fresh narrow PR for any remaining scope from Enable tool calling for MLLM/VLM chat paths by passing tools and tool_choice into chat templates #116 that fix: pass tools to chat template in MLLM path #139 does not cover (e.g. tool_choice, batched-path propagation) + server-side wiring.

If you would prefer one consolidated branch instead, I can rebase and refresh the broader path, but my bias is toward the smaller landing.

@Thump604
Copy link
Copy Markdown
Collaborator

Thump604 commented Apr 8, 2026

@swaylenhayes thanks for the constructive response. The smallest-fix-first approach is the right call. The order I would suggest:

  1. fix: pass tools to chat template in MLLM path #139 (kargarisaac) gets a rebase on current main since it is currently CONFLICTING
  2. Wayne lands fix: pass tools to chat template in MLLM path #139 with the narrow tools propagation fix
  3. A fresh narrow PR (yours, mine, or anyone) covers the remaining scope from Enable tool calling for MLLM/VLM chat paths by passing tools and tool_choice into chat templates #116: tool_choice propagation, batched-path equivalents, server-side wiring

That gives smaller focused reviews for waybarrios and bounds the failure mode of any one piece. If #139 stalls on the rebase, your consolidated-#116 fallback is also fine.

Tagging @waybarrios for the path-forward call. Happy to help with the follow-up narrow PR if useful.

@janhilgard
Copy link
Copy Markdown
Collaborator

Hey @swaylenhayes — thanks for this PR! All the functional changes are now in main:

Change Status in main
SimpleEnginetools + tool_choice into MLLM chat convert_tools_for_template() in all paths
BatchedEnginetool_choice in _apply_chat_template ✅ Propagated through prompt/prefix flows
MLXMultimodalLMtools in chat-template invocation tools param with safe fallback
server.pytool_choice propagation ✅ Including tool_choice="none" support

These landed via PR #278 (production backport) and PR #258 (sampling params).

The documentation guide (docs/guides/tool-calling.md) isn't in main yet — if you'd like to resubmit just the docs, that would be a welcome addition.

Closing as superseded. Thanks for the contribution!

@janhilgard janhilgard closed this Apr 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants