[Bugfix][Tool Parser] Fix Qwen3 Coder parser to stream tool call arguments #32536

Open
karanb192 wants to merge 1 commit into vllm-project:main from karanb192:fix/qwen3coder-streaming-args

Conversation

@karanb192

Summary

  • Fixes an issue where the Qwen3 Coder tool parser did not stream tool call arguments incrementally
  • With long parameter values (such as code blocks), the stream halted until the entire parameter was complete
  • This change streams tool call arguments in real time as tokens are generated

Changes

  • When a parameter is detected but incomplete, the parser now enters streaming mode (self.in_param = True) instead of waiting
  • Streams the parameter key header (e.g., "code": ") immediately
  • Incrementally streams JSON-escaped parameter values as they arrive
  • Properly detects multiple end markers: </parameter>, <parameter=, and </function>
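
A minimal sketch of the streaming idea described in the list above, not the code from this PR. The helper names `escape_fragment` and `stream_string_param` are hypothetical. It assumes the value is a JSON string, so each chunk can be escaped independently and concatenated:

```python
import json


def escape_fragment(fragment: str) -> str:
    """JSON-escape a piece of a string value, without the surrounding quotes."""
    # json.dumps wraps the text in quotes; strip them to get the escaped body.
    return json.dumps(fragment, ensure_ascii=False)[1:-1]


def stream_string_param(name: str, chunks, first: bool = True):
    """Yield JSON argument fragments for one string parameter (hypothetical helper)."""
    prefix = "" if first else ", "
    # Key header, e.g. "code": " (note the opening quote of the value).
    yield f'{prefix}{json.dumps(name)}: "'
    for chunk in chunks:
        if chunk:
            yield escape_fragment(chunk)
    # Closing quote once an end marker has been seen.
    yield '"'


chunks = ['print("hi")', "\n", "x = 1"]
fragments = ["{", *stream_string_param("code", chunks), "}"]
assembled = "".join(fragments)
```

Joining the fragments gives a single JSON object, so a client that concatenates the streamed argument deltas can parse the result once the call is complete.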

Test Plan

  • Manual testing with Qwen3 Coder model for tool calls with long arguments
  • Verify streaming chunks are sent incrementally instead of waiting for complete parameters
  • Existing tool parser tests should continue to pass

Fixes #30439

…ments

When using Qwen3 Coder for tool calls, the parser previously waited until
the entire parameter value was complete before sending any stream chunks.
This caused the stream to halt for long parameters like code blocks, with
no indication of progress.

This change enables incremental streaming of tool call arguments by:
- Setting `self.in_param = True` when a parameter starts but isn't complete
- Streaming the parameter key header and partial values as they arrive
- Properly handling multiple end markers (</parameter>, <parameter=, </function>)
- JSON-escaping streamed values correctly

Fixes vllm-project#30439

Signed-off-by: Karan Bansal <karanb192@gmail.com>
@mergify mergify Bot added the qwen (Related to Qwen models) and bug (Something isn't working) labels on Jan 18, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request successfully addresses a bug where the Qwen3 Coder tool parser would halt when streaming tool calls with long parameter values. The changes enable incremental streaming of tool call arguments, which is a significant improvement. The implementation correctly handles the complexities of parsing the XML-like format and generating JSON argument chunks on the fly. However, I have identified a critical performance issue in the logic for streaming parameter values. The current approach has quadratic time complexity, which will negatively impact performance for the very long parameter values this change is intended to support. My review comment provides details on this issue.
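
The summary above does not show the code it refers to, so the following is only an illustration of one pattern that produces quadratic cost, under the assumption that the whole accumulated value is re-escaped on every delta. The function names and the `work` counter are hypothetical:

```python
import json


def escape(s: str) -> str:
    """JSON-escape a string body without the surrounding quotes."""
    return json.dumps(s, ensure_ascii=False)[1:-1]


def stream_reescape_all(deltas):
    """Quadratic: re-escapes the whole accumulated value on every delta."""
    value, sent, out, work = "", 0, [], 0
    for d in deltas:
        value += d
        escaped = escape(value)      # cost grows with the total length so far
        work += len(value)
        out.append(escaped[sent:])   # emit only the part not yet sent
        sent = len(escaped)
    return "".join(out), work


def stream_escape_delta(deltas):
    """Linear: escapes only the newly arrived delta."""
    out, work = [], 0
    for d in deltas:
        out.append(escape(d))
        work += len(d)
    return "".join(out), work


deltas = ['a"b\n'] * 1000
slow_out, slow_work = stream_reescape_all(deltas)
fast_out, fast_work = stream_escape_delta(deltas)
```

Both functions produce the same output, but the first touches every earlier character again on each step, which matters most for the long values this PR targets.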

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

else:
    # Still streaming, wait for more content
    return None
# Parameter is incomplete - start streaming

Missing string close when function end detected during streaming

Medium Severity

When </function> is detected in the accumulated tool_text while in streaming mode (in_param = True), the code outputs } to close the JSON but never closes the open string value with a closing quote. This produces malformed JSON like {"param": "partial_value} because in_param is not reset and no " is emitted before the closing brace. The new streaming code path at lines 636-680 can now trigger this scenario when the function end marker is split across tokens.
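
A small sketch of the failure described above and one possible remedy. The helper `close_arguments` is hypothetical and is not taken from the PR. It assumes the parser knows, via a flag like `in_param`, whether a string value is still open:

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def close_arguments(in_param: bool) -> str:
    """Fragment that finishes the arguments object (hypothetical helper).

    If a string value is still open, its closing quote must be emitted
    before the closing brace.
    """
    return '"}' if in_param else "}"


streamed = '{"param": "partial_value'
broken = streamed + "}"                              # what the comment describes
fixed = streamed + close_arguments(in_param=True)    # quote first, then brace
```

A real fix would also need to reset the streaming state after closing the string, which this sketch leaves out.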

idx = delta_text.find(marker)
if idx != -1 and (end_idx == -1 or idx < end_idx):
    end_idx = idx
    end_marker = marker
The reason will be displayed to describe this comment to others. Learn more.

Split XML markers cause corrupted parameter values in stream

Medium Severity

When in streaming mode, end markers (</parameter>, <parameter=, </function>) are only detected if complete within a single delta_text. If a marker is tokenized across multiple deltas (e.g., </para then meter>), the partial marker content is streamed as part of the parameter value. This corrupts the output—for example, a value hello could become hello</parametermeter> in the streamed JSON. The code at lines 800-821 unconditionally adds the entire delta_text to current_param_value when no complete marker is found.
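
One common way to handle markers split across deltas is to hold back any trailing text that could still turn into a marker. The sketch below illustrates that technique. `ValueStreamer` and `held_back_len` are hypothetical names, and the code is not the parser's actual implementation:

```python
END_MARKERS = ("</parameter>", "<parameter=", "</function>")


def held_back_len(buffer: str, markers=END_MARKERS) -> int:
    """Length of the longest buffer suffix that is a proper prefix of a marker."""
    longest = 0
    for marker in markers:
        for k in range(min(len(marker) - 1, len(buffer)), 0, -1):
            if buffer.endswith(marker[:k]):
                longest = max(longest, k)
                break
    return longest


class ValueStreamer:
    """Emits parameter text while withholding possible partial end markers."""

    def __init__(self):
        self.pending = ""   # text received but not yet safe to emit
        self.done = False   # True once a complete end marker was seen

    def feed(self, delta: str) -> str:
        if self.done:
            return ""
        self.pending += delta
        hits = [i for i in (self.pending.find(m) for m in END_MARKERS) if i != -1]
        if hits:
            end = min(hits)               # earliest complete marker wins
            safe = self.pending[:end]
            self.pending = self.pending[end:]
            self.done = True
            return safe
        cut = len(self.pending) - held_back_len(self.pending)
        safe, self.pending = self.pending[:cut], self.pending[cut:]
        return safe


streamer = ValueStreamer()
emitted = "".join(streamer.feed(d) for d in ["hel", "lo</para", "meter>"])
```

The held-back text is released on the next delta if it turns out not to be a marker, so a literal `<` inside a value is delayed by one step rather than lost.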

if self.param_count == 0:
    key_fragment = f'"{self.current_param_name}": "'
else:
    key_fragment = f', "{self.current_param_name}": "'
The reason will be displayed to describe this comment to others. Learn more.

Streamed parameters always output as strings ignoring type

Low Severity

When entering streaming mode, key_fragment hardcodes an opening quote for a string value (e.g., "param": "), committing the output to JSON string type. The non-streaming path at lines 700-711 calls _convert_param_value() for proper type conversion (integers, booleans, objects). This creates inconsistent output: the same parameter value could be "count": 42 (number) if it completes quickly or "count": "42" (string) if streaming is triggered.
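
One possible way to avoid the inconsistency is to stream only parameters whose declared type is a string and to buffer everything else until it is complete. The sketch below assumes the tool schema exposes a type name per parameter. `should_stream` and `render_complete` are hypothetical helpers and do not reproduce `_convert_param_value()`:

```python
import json
from typing import Optional


def should_stream(schema_type: Optional[str]) -> bool:
    """Stream only string-typed values.

    Treating an unknown type (None) as a string is an assumption of this sketch.
    """
    return schema_type in (None, "string")


def render_complete(name: str, raw: str, schema_type: Optional[str]) -> str:
    """Render a fully buffered parameter with simple type conversion."""
    if schema_type == "integer":
        value = int(raw)
    elif schema_type == "boolean":
        value = raw.strip().lower() == "true"
    else:
        value = raw
    return f"{json.dumps(name)}: {json.dumps(value)}"


buffered = render_complete("count", "42", "integer")
as_string = render_complete("count", "42", "string")
```

With this split, a value such as `42` for an integer parameter is always emitted as a number, regardless of how its tokens arrive. The trade-off is that long non-string values, such as large objects, would still be delivered in one piece.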

@karanb192
Author

Hi @DarkLight1337 - could you please review this tool parser fix? It addresses issue #30439 where Qwen3 Coder parser wasn't streaming arguments correctly. Thanks!

@DarkLight1337
Member

Please stop pinging me, I'm not able to review tool parsers. You can ask @chaunceyjiang instead.

@mergify
Contributor

mergify Bot commented Mar 20, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @karanb192.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Labels

bug (Something isn't working), needs-rebase, qwen (Related to Qwen models), tool-calling

Development

Successfully merging this pull request may close these issues.

[Bug]: Qwen3 Coder parser does not stream tool call arguments

2 participants