fix(server): integrate tool call parser into reasoning parser streaming path#253

Merged
waybarrios merged 2 commits into waybarrios:main from mxl:fix/reasoning-parser-tool-call-streaming
Apr 11, 2026

Conversation

@mxl (Contributor) commented Apr 4, 2026

Running a Qwen 3.5 model prints tool calls as raw XML instead of parsing them into structured tool calls, so the requested tools are never invoked.

Summary

- run the streaming tool-call parser in the reasoning-parser path after reasoning content has been stripped from streamed output
- suppress tool-call markup from normal content chunks and emit structured tool_calls chunks when parsed tool calls are detected
- preserve the tool_calls finish reason on the final streamed chunk when generation ends with parsed tool calls
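The three steps above can be sketched as a single post-processing function. This is a simplification for clarity: the actual fix runs the parsers incrementally on each streamed delta, and the `<think>`/`<tool_call>` markup, the regexes, and the function name here are illustrative assumptions, not the project's real API.

```python
import json
import re

# Assumed markup formats; the real parsers are model-specific.
REASONING_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def postprocess(generated: str):
    """Order matters: strip reasoning first, then extract tool calls,
    then suppress the tool-call markup from user-visible content."""
    text = REASONING_RE.sub("", generated)                            # 1. drop reasoning spans
    tool_calls = [json.loads(m) for m in TOOL_CALL_RE.findall(text)]  # 2. parse structured calls
    content = TOOL_CALL_RE.sub("", text).strip()                      # 3. suppress raw markup
    finish_reason = "tool_calls" if tool_calls else "stop"            # preserve finish reason
    return content, tool_calls, finish_reason
```

Before this fix, step 1 ran but steps 2 and 3 were skipped in the reasoning path, which is why the raw XML leaked into content chunks.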

@mxl mxl force-pushed the fix/reasoning-parser-tool-call-streaming branch 7 times, most recently from 4bff328 to 89303c8 on April 4, 2026 at 07:13
@mxl mxl force-pushed the fix/reasoning-parser-tool-call-streaming branch from 89303c8 to 1ce4107 on April 4, 2026 at 07:40
@waybarrios (Owner) left a comment

Clean fix for a real bug — tool calls were leaking as raw XML when the reasoning parser was active. The approach is correct and tests are solid.

One minor bug:

request.model vs _model_name inconsistency

The new tool_chunk in the reasoning path uses model=request.model, but every other chunk in the function (including the equivalent tool_calls chunk in the standard path) uses model=_model_name:

```python
# New code (reasoning path):
tool_chunk = ChatCompletionChunk(
    id=response_id,
    model=request.model,      # ← client-provided value
    ...
)

# Standard path (line ~2252) and all other chunks:
chunk = ChatCompletionChunk(
    id=response_id,
    model=_model_name,         # ← actual served model name
    ...
)
```

These can differ when --served-model-name is set; the new chunk should use _model_name for consistency.

@waybarrios (Owner) commented

Pushed a small fix (1d16507) — was using request.model instead of _model_name in the reasoning tool chunk. Consistent with the rest of the function now.

@waybarrios waybarrios merged commit 660552e into waybarrios:main Apr 11, 2026
7 checks passed


2 participants