fix: streaming tool calls drop for Qwen3.6 bracket format #374
Open
mikepixelmagic-dev wants to merge 1 commit into waybarrios:main
Two bugs caused Qwen3.6 [Calling tool: name({...})] streaming tool calls
to leak into text content instead of emitting structured tool_calls:
1. server.py _stream_responses_request: the fast-path gate checked
`"<" not in delta_text`, which skips the tool parser for bracket-format
deltas (they start with "["). Refactored to use the existing
`_streaming_tool_markup_possible()` helper, matching the 4 other
streaming paths that already use it.
2. qwen_tool_parser.extract_tool_calls_streaming: the closing-marker
check looked for `</tool_call>` or `)]` in `delta_text` only. Those
markers routinely span token boundaries (e.g. `)` and `]` arrive in
separate deltas), so the check never fires and the parser returns
None for every chunk, suppressing the whole call. Check `current_text`
(accumulated) instead so the close is detected reliably.
Reproduction: multi-turn tool-calling session with Qwen3.6-35B-A3B-8bit
and --tool-call-parser qwen --reasoning-parser qwen3. Without these
fixes, streaming emits `[Calling tool: create_file({...})]` as content.
With fixes, structured tool_calls are emitted and a 40-turn drift test
passes cleanly (was failing at turn 5 before).
Summary
Fixes two bugs that cause Qwen3.6 `[Calling tool: name({...})]` streaming tool calls to leak into text content instead of emitting structured `tool_calls`.

Bug 1: `server.py` `_stream_responses_request` fast-path gate

The Responses API streaming path still uses the old `"<" not in delta_text` gate to decide whether to engage the tool parser. That gate only matches `<tool_call>` / `<function=` shapes. Bracket-format deltas start with `[`, so they skip the parser entirely and are emitted as plain text.

The other 4 streaming paths (Anthropic messages and OpenAI chat completions, reasoning and non-reasoning branches) were already refactored to use `_streaming_tool_markup_possible()` in prior PRs (see refs below), but this path was missed.

Fix: use `_streaming_tool_markup_possible(tool_accumulated_text + delta_text)`, matching the other 4 paths.

Bug 2: `qwen_tool_parser.extract_tool_calls_streaming` closing-marker check

Detection of a completed bracket-format tool call looks for `)]` or `</tool_call>` in `delta_text` only. In practice those closing markers routinely span token boundaries (e.g. `)` arrives in one delta and `]` in the next), so the check never fires, the parser returns `None` for every chunk, and the entire call is suppressed without ever being emitted as structured `tool_calls`.

Fix: check `current_text` (accumulated) instead of `delta_text`, so the close is detected reliably regardless of token splits.

Reproduction
Serve Qwen3.6-35B-A3B-8bit with `--tool-call-parser qwen --reasoning-parser qwen3`.

Run a streaming chat completion with multiple tools (e.g. a `create_file` tool with `path` and `content` parameters). The model emits a bracket-format tool call. Without these fixes, the client receives `[Calling tool: create_file({"path": "...", "content": "..."})]` as `content` with `finish_reason: null` and zero `tool_calls` deltas. With these fixes, a proper structured `tool_calls` delta is emitted and `finish_reason: tool_calls` is set.

A 40-turn drift test cycling through 5 tools (`get_weather`, `read_file`, `web_search`, `calculator`, `create_file`) held 4 turns before drifting pre-fix, and held all 40 turns post-fix with no drift.

Related work
This PR completes the bracket-format coverage by fixing the one remaining Responses API path and the qwen tool parser closing-marker detection.
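
For reviewers who want to see both failure modes in isolation, here is a minimal sketch. It does not reproduce the project's code: `old_gate_skips_parser`, `markup_possible`, and `close_seen` are hypothetical stand-ins, and the real `_streaming_tool_markup_possible()` may check more than the two characters assumed here. The delta split is also invented for illustration.

```python
# Illustrative only: these helpers are stand-ins, not the project's code.

def old_gate_skips_parser(delta_text: str) -> bool:
    """Old fast path: bypass the tool parser unless the delta contains '<'."""
    return "<" not in delta_text


def markup_possible(text: str) -> bool:
    """Hypothetical stand-in for _streaming_tool_markup_possible()."""
    return "<" in text or "[" in text


def close_seen(text: str) -> bool:
    """True once a closing marker for either format appears in text."""
    return "</tool_call>" in text or ")]" in text


# Assumed token split: ')' and ']' arrive in separate deltas.
deltas = ["[Calling tool: ", "create_file(", '{"path": "a.txt"}', ")", "]"]

# Bug 1: no delta contains '<', so the old gate bypasses the parser every time.
old_gate_results = [old_gate_skips_parser(d) for d in deltas]

accumulated = ""
new_gate_results = []    # Fix 1: gate on accumulated text plus the new delta
delta_close = []         # Bug 2: closing marker searched in the delta only
accumulated_close = []   # Fix 2: closing marker searched in accumulated text
for d in deltas:
    new_gate_results.append(markup_possible(accumulated + d))
    accumulated += d
    delta_close.append(close_seen(d))
    accumulated_close.append(close_seen(accumulated))
```

In this sketch the delta-only check never sees `)]`, while the accumulated check fires on the final chunk. A substring check on accumulated text could in principle also match `)]` inside a string argument; whether the real parser guards against that is not covered by this PR description.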
Test plan
`create_file`-style tool emits structured `tool_calls`
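
A check along these lines could exercise the item above without a model in the loop. This is a sketch under assumptions: the regular expression is inferred from the `[Calling tool: name({...})]` examples in this PR, and `parse_bracket_call` and `stream` are hypothetical helpers, not functions from `qwen_tool_parser`.

```python
import json
import re

# Assumed shape of the bracket format, inferred from the examples in this PR.
BRACKET_CALL = re.compile(r"\[Calling tool: (\w+)\((\{.*\})\)\]", re.DOTALL)


def parse_bracket_call(text: str):
    """Return {'name', 'arguments'} for a completed bracket call, else None."""
    match = BRACKET_CALL.search(text)
    if match is None:
        return None
    try:
        arguments = json.loads(match.group(2))
    except json.JSONDecodeError:
        return None
    return {"name": match.group(1), "arguments": arguments}


def stream(deltas):
    """Accumulate deltas and return the first completed call, or None."""
    current_text = ""
    for delta_text in deltas:
        current_text += delta_text
        call = parse_bracket_call(current_text)
        if call is not None:
            return call
    return None


# The closing ')' and ']' are deliberately split across two chunks.
chunks = ["[Calling", " tool: create_file(", '{"path": "a.txt", ',
          '"content": "hi"}', ")", "]"]
result = stream(chunks)
truncated = stream(chunks[:-1])  # stream cut off before the final ']'
```

Because the parse runs against the accumulated text, the call is recognized only once the final `]` arrives, and the truncated stream yields no call.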