Skip to content

[Bugfix] Qwen3Coder streaming tool call JSON missing opening brace in arguments#35716

Closed
KrxGu wants to merge 3 commits intovllm-project:mainfrom
KrxGu:fix/35266-qwen3coder-streaming-brace
Closed

[Bugfix] Qwen3Coder streaming tool call JSON missing opening brace in arguments#35716
KrxGu wants to merge 3 commits intovllm-project:mainfrom
KrxGu:fix/35266-qwen3coder-streaming-brace

Conversation

@KrxGu
Copy link
Copy Markdown
Contributor

@KrxGu KrxGu commented Mar 2, 2026

Fixes #35266

Problem

When streaming tool calls with Qwen3CoderToolParser, the function.arguments field was sometimes missing the opening {, producing invalid JSON like:
"command": "echo hi"}
instead of:
{"command": "echo hi"}

This caused json.loads(arguments) to fail in any client that parses arguments on completion (opencode, claudecode, etc.).

Root Cause

Two bugs in extract_tool_calls_streaming:

  1. { emission was gated on self.parameter_prefix not in delta_text, so when the first <parameter=> tag arrived in the same delta as the function header (a valid chunk boundary), the opening brace was silently skipped.
  2. A separate fallback immediately after set json_started = True regardless, so the state machine believed JSON had started even though { was never emitted.
  3. When </function> arrived after a parameter was processed, the closing } had no subsequent call to live in, leaving the stream permanently unclosed.

Fix

  • Remove the parameter_prefix not in delta_text gate { is now always emitted once, unconditionally, on first entry into function body.
  • Remove the premature json_started = True fallback.
  • When the last parameter is processed and </function> is already present in tool_text, append } to the same delta fragment instead of deferring it to a call that may never come.

Testing

Added a regression test in tests/tool_parsers/test_qwen3coder_tool_parser.py that directly calls extract_tool_calls_streaming with hand-crafted chunk boundaries reproducing the exact failure mode (function header and first <parameter=> in the same delta). No GPU or model download required.

Verified locally with a mock-tokenizer script against the fixed parser, both the bug-trigger split and normal split produce valid JSON.

When the first <parameter=> tag arrived in the same streaming delta as
the function header, the opening brace was skipped but json_started was
still flipped True. This caused arguments like `"cmd": "..."}` instead
of `{"cmd": "..."}`, breaking any client that calls json.loads() on
function.arguments. Fixes vllm-project#35266.

- Remove the `parameter_prefix not in delta_text` gate that suppressed
  `{` emission
- Remove the premature json_started=True fallback that ran without emitting `{`
- Append closing `}` to the last param fragment when `</function>` is
  already present in tool_text so the stream is never left unclosed
- Add regression test reproducing the exact chunk-boundary failure mode
  (no GPU or real tokenizer required)

Signed-off-by: KrxGu <krishom70@gmail.com>
Copilot AI review requested due to automatic review settings March 2, 2026 07:10
@mergify mergify bot added the qwen Related to Qwen models label Mar 2, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a bug in Qwen3CoderToolParser where streaming tool calls could produce invalid JSON by omitting the opening brace for arguments. The fix correctly ensures the opening brace is always emitted and handles edge cases around chunk boundaries. A comprehensive regression test has been added to prevent this issue from recurring.

My review focuses on the implementation of the fix. While the logic is correct, I've identified a significant amount of duplicated code that was introduced. I've suggested refactoring this into a helper method to improve maintainability. I also pointed out that silently catching all exceptions is risky and should be replaced with logging to provide better visibility into potential parsing errors.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes malformed JSON in streamed tool-call arguments produced by Qwen3CoderToolParser (missing { / unclosed }) so downstream clients can reliably json.loads() the concatenated function.arguments.

Changes:

  • Make the opening { emission unconditional on first entry to the function body (remove the prior gating + fallback state corruption).
  • Reorder/gate function-end handling so parameters aren’t skipped when </function> arrives in the same accumulated text.
  • Add a regression test intended to reproduce the missing-opening-brace failure mode (#35266).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
vllm/tool_parsers/qwen3coder_tool_parser.py Adjusts the streaming state machine to always emit { once, avoid premature JSON-start state, and ensure } is emitted even in “last-chunk” boundary cases.
tests/tool_parsers/test_qwen3coder_tool_parser.py Adds a regression test for the missing-opening-brace streaming bug.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

KrxGu added 2 commits March 2, 2026 07:19
…nto helper

Addresses review feedback: the duplicated logic for updating prev_tool_call_arr
with final parsed arguments was extracted into _update_final_tool_arguments(),
called from both the deferred-close path and the inline-close path. Also
replaced bare except-pass with logger.warning(exc_info=True) so failures
are visible in logs without interrupting the stream.

Signed-off-by: KrxGu <krishom70@gmail.com>
…ct#35266 bug

The 4-chunk split had <tool_call><function=bash> as one delta. When
tool_call_start_token is in delta_text the parser returns None immediately,
so the header was emitted on the next chunk whose delta was <parameter=...>.
The opening-brace logic then fired on </function> which has no <parameter=,
meaning the old gate (parameter_prefix not in delta_text) would pass and the
test gave a false negative on the unfixed code.

Fix: split into 5 chunks so <tool_call> and <function=bash> are separate
deltas. The header is now emitted when delta_text is <function=bash>, and
the very next delta is <parameter=command>... which contains <parameter= and
is the exact condition the old gate suppressed. Adds detailed inline comment
explaining the state machine transitions.

Signed-off-by: KrxGu <krishom70@gmail.com>
@KrxGu
Copy link
Copy Markdown
Contributor Author

KrxGu commented Mar 2, 2026

This PR is ready for review.

CI looks good, Good To Merge!!

@KrxGu KrxGu changed the title [Tool Parser] Fix Qwen3Coder streaming tool calls missing opening brace in arguments [Bugfix] Qwen3Coder streaming tool call JSON missing opening brace in arguments Mar 2, 2026
@mergify mergify bot added the bug Something isn't working label Mar 2, 2026
@chaunceyjiang
Copy link
Copy Markdown
Collaborator

@KrxGu Could you help test #35615?

@KrxGu
Copy link
Copy Markdown
Contributor Author

KrxGu commented Mar 2, 2026

@KrxGu Could you help test #35615?

Sure, Tested #35615 locally (GPU-free, mock tokenizer).

Burst failure details:
Chunk sequence used:
["<tool_call>", "<function=write>",
"<parameter=filePath>/tmp/a.txt<parameter=content>HELLO",
"</tool_call>"]

Actual fragments emitted: ['{', '"filePath": "/tmp/a.txt", "content": "HELLO"']
Expected: ['{', '"filePath": "/tmp/a.txt", "content": "HELLO"', '}']

Root cause: On the burst delta, json_started=False so { is returned immediately (correct). On the next call (</tool_call> delta), the param loop processes both params and hits if json_fragments: return DeltaMessage(...) — returning before the } closing check which lives after that block. Since there are no further chunks, } is never emitted and the streamed JSON is left unclosed.

Also can you please lmk how are we planning to move ahead with my PR(35716)?

@mergify
Copy link
Copy Markdown

mergify bot commented Mar 3, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @KrxGu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 3, 2026
@KrxGu
Copy link
Copy Markdown
Contributor Author

KrxGu commented Mar 5, 2026

@chaunceyjiang am i supposed to close this PR as #35615 is already merged?

@ccdv-ai
Copy link
Copy Markdown

ccdv-ai commented Mar 5, 2026

Tool calls still fail for me (nvfp4)

@KrxGu
Copy link
Copy Markdown
Contributor Author

KrxGu commented Mar 23, 2026

@chaunceyjiang closing this PR.

@KrxGu KrxGu closed this Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working needs-rebase qwen Related to Qwen models tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: Missing opening brace for Qwen3.5 streaming tool calls

4 participants