WIP: Fix glm-4.6 tool call streaming parse by tonylt · Pull Request #11951 · sgl-project/sglang

tonylt · 2025-10-22T03:02:16Z

Motivation

Summary

This PR fixes GitHub issue #11888: in SGLang, GLM-4.6 tool-call arguments are not streamed.

Problem Analysis

The issue was that GLM-4.6 tool calls were being returned all at once rather than being streamed progressively. The original implementation waited for complete tool calls (until it found </tool_call>) before parsing and streaming them, which caused arguments to appear in a single chunk after a long wait.
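To make the symptom concrete, here is a minimal sketch of the buffering pattern described above. It is not the actual SGLang code; `emit_when_complete` and the sample chunks are hypothetical and only illustrate why nothing reaches the client until `</tool_call>` arrives.

```python
TOOL_CALL_END = "</tool_call>"


def emit_when_complete(chunks):
    """Return what would be emitted after each chunk when the parser
    waits for the closing tag (hypothetical illustration)."""
    buffer = ""
    outputs = []
    for chunk in chunks:
        buffer += chunk
        end = buffer.find(TOOL_CALL_END)
        if end == -1:
            # Closing tag not seen yet: nothing is emitted for this chunk.
            outputs.append(None)
            continue
        cut = end + len(TOOL_CALL_END)
        outputs.append(buffer[:cut])  # the whole call arrives in one piece
        buffer = buffer[cut:]
    return outputs


chunks = [
    "<tool_call>read-file\n",
    "<arg_key>target_file</arg_key>\n",
    "<arg_value>a.txt</arg_value>\n",
    "</tool_call>",
]
outputs = emit_when_complete(chunks)
```

In this sketch the first three entries of `outputs` are `None` and the last one holds the entire tool call, which matches the "single chunk after a long wait" behavior reported in the issue.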

Modifications

Solution Implemented

I implemented incremental streaming support for GLM-4.6 tool call arguments by modifying both the Rust and Python implementations:

Key Changes Made:

Added streaming state tracking:

Added current_tool_name_sent field to track whether the tool name has been streamed
Enhanced the streaming logic to handle partial tool calls

Implemented incremental parsing:

Created parse_partial_tool_call() method that can parse incomplete tool calls
Added logic to detect and stream tool names first, then arguments incrementally

Enhanced argument streaming:

Modified the streaming logic to calculate differences between current and previously streamed arguments
Implemented proper diff calculation to stream only new argument content
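The name-first, arguments-after ordering listed above can be sketched roughly as follows. Only the field name `current_tool_name_sent` comes from this PR; `StreamState`, `streamed_args`, and `next_events` are hypothetical names, and the sketch assumes the caller already has a best-effort parse of the partial tool call.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    current_tool_name_sent: bool = False  # field name taken from the PR description
    streamed_args: str = ""               # hypothetical: argument text already sent


def next_events(state, name, args_so_far):
    """Return the events to emit for the current partial parse."""
    events = []
    # Step 1: send the tool name exactly once, with empty arguments.
    if name is not None and not state.current_tool_name_sent:
        events.append({"name": name, "arguments": ""})
        state.current_tool_name_sent = True
    # Step 2: send only the argument text that has not been sent before.
    if state.current_tool_name_sent and args_so_far.startswith(state.streamed_args):
        delta = args_so_far[len(state.streamed_args):]
        if delta:
            events.append({"name": None, "arguments": delta})
            state.streamed_args = args_so_far
    return events


state = StreamState()
first = next_events(state, "read-file", "")
second = next_events(state, "read-file", '{"target_file": "a')
third = next_events(state, "read-file", '{"target_file": "a.txt"}')
```

Here `first` carries only the name, while `second` and `third` carry successive argument fragments. The reset method mentioned in this PR would correspond to replacing `state` with a fresh `StreamState()` between tool calls.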

Files Modified:

Rust Implementation (sgl-router/src/tool_parser/parsers/glm4_moe_parser.rs):

Added current_tool_name_sent field to the parser struct
Implemented parse_partial_tool_call() method for incremental parsing
Enhanced parse_incremental() method to support streaming tool names and arguments
Updated the reset method to include the new field

Python Implementation (python/sglang/srt/function_call/glm4_moe_detector.py):

Added current_tool_name_sent field to track streaming state
Implemented _parse_partial_tool_call() method for incremental parsing
Enhanced parse_streaming_increment() method to support streaming
Added _find_common_prefix() helper method for diff calculation
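The body of `_find_common_prefix()` is not shown in this description, so the following is only a guess at how a common-prefix helper could drive the diff calculation. `find_common_prefix`, `stable_delta`, and the snapshot strings are hypothetical. The idea is to send only the text on which two successive serializations of the arguments agree, since the tail of a partial serialization (for example a closing `"}`) can still change.

```python
def find_common_prefix(a: str, b: str) -> str:
    """Return the longest string that both a and b start with."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return a[:n]


def stable_delta(prev_snapshot: str, cur_snapshot: str, sent_len: int) -> str:
    """Return the not-yet-sent part of the prefix shared by two snapshots."""
    stable = find_common_prefix(prev_snapshot, cur_snapshot)
    return stable[sent_len:]


prev = '{"path": "a"}'      # serialization after an earlier chunk
cur = '{"path": "a.txt"}'   # serialization after a later chunk
delta_1 = stable_delta(prev, cur, 0)
delta_2 = stable_delta(cur, cur, len(delta_1))  # final pass flushes the rest
```

Concatenating `delta_1` and `delta_2` reproduces `cur`, and the unstable `"}` from `prev` is never sent.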

Tests (sgl-router/tests/tool_parser_glm4_moe.rs):

Added comprehensive tests for streaming functionality
Tests verify that tool names are streamed first, followed by incremental argument streaming

Expected Behavior After Fix

With this implementation, GLM-4.6 tool calls now support proper streaming:
Tool name streaming: The function name is streamed first as soon as it's detected
Incremental argument streaming: Arguments are streamed progressively as they are parsed from the XML format
Better user experience: Users will see tool calls building up incrementally rather than waiting for complete tool calls

Testing

I created and ran comprehensive tests that verify:

Tool names are streamed immediately when detected
Arguments are streamed incrementally as they are parsed
The streaming behavior matches the expected format for GLM-4.6 models

The fix is intended to give GLM-4.6 tool calls the same streaming experience as other model formats in SGLang, addressing the responsiveness concern raised in the issue.

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to "Format code with pre-commit".
Add unit tests according to "Run and add unit tests".
Update documentation according to "Write documentations".
Provide accuracy and speed benchmark results according to "Test the accuracy" and "Benchmark the speed".

gaoganlsz · 2025-10-22T09:23:37Z

tonylt, keep it up! Waiting for the fix.

gaoganlsz · 2025-10-22T12:22:41Z

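tonylt, keep it up! Waiting for the fix.

The current implementation has a problem. For example, with the following sequence of new_text calls, the tool name is parsed as "read", when it should be "read-file":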

1:<tool_call>read
2:-file\n
3:<arg_key>target_file
4:</arg_key>\n<arg_value>
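One possible way to avoid the truncated name is to treat the tool name as complete only once a terminating delimiter has arrived. The sketch below is hypothetical (`extract_complete_name` is not part of this PR) and assumes the name ends at the first newline or at the start of the next tag.

```python
TOOL_CALL_START = "<tool_call>"


def extract_complete_name(buffer: str):
    """Return the tool name once its terminator has arrived, else None."""
    start = buffer.find(TOOL_CALL_START)
    if start == -1:
        return None
    rest = buffer[start + len(TOOL_CALL_START):].lstrip()
    # Assumption: the name ends at the first newline or at the next '<'.
    ends = [i for i in (rest.find("\n"), rest.find("<")) if i != -1]
    if not ends:
        # No terminator yet: the name may still grow ("read" -> "read-file").
        return None
    name = rest[:min(ends)].strip()
    return name or None


chunks = ["<tool_call>read", "-file\n", "<arg_key>target_file", "</arg_key>\n<arg_value>"]
buffer = ""
seen = []
for chunk in chunks:
    buffer += chunk
    seen.append(extract_complete_name(buffer))
```

With the four chunks from this comment, `seen` is `None` after the first chunk and `"read-file"` from the second chunk onward, so "read" is never emitted on its own.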

gaoganlsz · 2025-10-22T12:40:47Z

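Parsing complete key/value nodes with json.loads(prev_args_str) does not meet the requirement. For example, when the tool call creates a large file, the value is very large, so the client ends up waiting for one huge streamed chunk.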
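A rough sketch of the alternative: forward the value text as it arrives instead of waiting for the whole node. The helpers `split_safe_text` and `stream_value` are hypothetical, the closing tag `</arg_value>` is assumed from the format shown earlier in this thread, and the value is treated as raw text (no JSON escaping or type conversion). The only subtlety is holding back a tail that might be the beginning of the closing tag.

```python
ARG_VALUE_END = "</arg_value>"


def split_safe_text(pending: str):
    """Split pending text into (emit_now, keep_for_later, finished)."""
    end = pending.find(ARG_VALUE_END)
    if end != -1:
        # Closing tag found: emit everything before it, keep what follows.
        return pending[:end], pending[end + len(ARG_VALUE_END):], True
    # Hold back the longest suffix that could be the start of the closing tag.
    for k in range(min(len(pending), len(ARG_VALUE_END) - 1), 0, -1):
        if pending.endswith(ARG_VALUE_END[:k]):
            return pending[:-k], pending[-k:], False
    return pending, "", False


def stream_value(chunks):
    """Return the value fragments that would be emitted, chunk by chunk."""
    pending = ""
    emitted = []
    for chunk in chunks:
        pending += chunk
        emit, pending, finished = split_safe_text(pending)
        if emit:
            emitted.append(emit)
        if finished:
            break
    return emitted


fragments = stream_value(["line 1\nli", "ne 2</arg", "_value>\n"])
```

In this run the value is emitted in two fragments, and the partial `</arg` is held back until the rest of the closing tag confirms it is not part of the value.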

Leoyzen · 2025-10-23T03:33:06Z

Maybe consider reusing the streaming XML parser from this PR: #10035.

Or a more general streaming XML parser for all kinds of LLMs that use XML as their tool-use template?

tonylt · 2025-10-29T03:28:04Z

Maybe consider reusing the streaming XML parser from this PR: #10035.

Or a more general streaming XML parser for all kinds of LLMs that use XML as their tool-use template?

@Leoyzen Yes, a unified parser is much better. I'll take a look, thanks.

cynial · 2025-12-05T04:20:40Z

@tonylt @gaoganlsz I'm trying to fix this issue. Could you take a look?
#13989

tonylt force-pushed the fix-glm-4.6-tool-call-streaming-output branch from 11bcdbd to 8e1a949 Compare October 22, 2025 03:06

JustinTong0323 self-assigned this Oct 22, 2025

fix glm-4.6 tool call streaming parse

e3faf3d

tonylt force-pushed the fix-glm-4.6-tool-call-streaming-output branch from 8e1a949 to e3faf3d Compare October 22, 2025 03:18

tonylt marked this pull request as ready for review October 22, 2025 08:26

tonylt requested review from ByronHsu, CatherineSue, JustinTong0323 and slin1237 as code owners October 22, 2025 08:26

tonylt changed the title ~~Fix glm-4.6 tool call streaming parse~~ WIP: Fix glm-4.6 tool call streaming parse Oct 22, 2025

tonylt marked this pull request as draft October 22, 2025 08:54

tonylt wants to merge 1 commit into sgl-project:main from tonylt:fix-glm-4.6-tool-call-streaming-output


5 participants
