Update xml tool parser for qwen series model #10035
wenmengzhou wants to merge 23 commits into sgl-project:main from
Conversation
Summary of Changes
Hello @wenmengzhou, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request integrates a sophisticated XML tool parser to enhance the function calling capabilities for Qwen series models. The core objective is to provide a more reliable and accurate mechanism for interpreting tool calls from model outputs, particularly in streaming contexts, by addressing common parsing challenges and ensuring proper data type handling.
Highlights
- New XML Tool Parser: Introduced a brand-new XML-based tool parser, StreamingXMLToolCallParser, specifically designed for Qwen series models to handle function calls robustly.
- Enhanced Streaming Capabilities: The new parser significantly improves streaming parsing by handling partial XML chunks, ensuring correct type conversion for parameters, and managing complex scenarios like multiple tool calls and malformed XML.
- Dedicated Test Suite: A separate and comprehensive test file (test_function_call_qwen_xml_paser.py) has been added to rigorously test the new XML parser, covering various edge cases and ensuring its reliability.
- Code Refactoring: Existing test cases for the Qwen3 Coder Detector were refactored and moved to the new dedicated test file, improving test organization and maintainability.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
This pull request introduces a new, comprehensive streaming XML parser for Qwen model tool calls, along with a dedicated suite of unit tests. The implementation is thorough and handles many corner cases. My review focuses on improving robustness, error handling, and maintainability. The key suggestions are to replace broad exception handling with more specific catches and proper logging, simplify the parsing logic by relying more on the expat library's native streaming capabilities instead of manual pre-parsing, and remove some unused code. Overall, this is a solid contribution that will be even better with these refinements.
```python
            except Exception:
                pass
```
Catching a broad except Exception: and then using pass can hide critical bugs in the fallback logic, making the system harder to debug and maintain. If an exception occurs here, it should be logged to provide visibility into potential issues during parsing.
```diff
-            except Exception:
-                pass
+            except Exception as e:
+                logger.warning(f"Error during fallback completion: {e}")
```
```python
            try:
                param_value = int(param_value)
            except Exception:
                pass
            return param_value
        elif param_type.startswith("num") or param_type.startswith("float"):
            try:
                float_param_value = float(param_value)
                param_value = (
                    float_param_value
                    if float_param_value - int(float_param_value) != 0
                    else int(float_param_value)
                )
            except Exception:
                pass
            return param_value
        elif param_type in ["boolean", "bool", "binary"]:
            param_value = param_value.lower()
            return param_value == "true"
        elif param_type in ["array"]:
            try:
                # First try ast.literal_eval for safe evaluation of Python literals
                param_value = ast.literal_eval(param_value)
                if not isinstance(param_value, list):
                    param_value = list(param_value)
            except (ValueError, SyntaxError):
                # If literal_eval fails, try json.loads for JSON array format
                try:
                    param_value = json.loads(param_value)
                    if not isinstance(param_value, list):
                        param_value = list(param_value)
                except (json.JSONDecodeError, TypeError):
                    # If both parsing methods fail, keep as string
                    pass
```
The function uses broad except Exception: pass blocks (e.g., lines 173, 184, 196) to handle type conversion errors. This approach silently ignores failures, causing the function to return the original string value instead of the expected type. This can lead to downstream errors when the calling code expects a specific data type (e.g., an integer or a list). It would be more robust to log these conversion failures as warnings to aid in debugging malformed model outputs.
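A minimal sketch of what this comment suggests: conversion helpers that log failures instead of silently passing, so malformed model outputs become visible in the logs. The function name `convert_param_value` and the exact type groups here are illustrative, not the PR's actual code.

```python
import json
import logging

logger = logging.getLogger(__name__)


def convert_param_value(param_value: str, param_type: str):
    """Convert a string parameter to its declared type, logging failures."""
    param_type = param_type.strip().lower()
    if param_type in ("integer", "int"):
        try:
            return int(param_value)
        except ValueError:
            logger.warning("Failed to convert %r to int; keeping string", param_value)
            return param_value
    if param_type in ("number", "num", "float"):
        try:
            f = float(param_value)
            # Collapse whole-valued floats (e.g. "3.0") to int, as the PR does.
            return int(f) if f == int(f) else f
        except (ValueError, OverflowError):
            logger.warning("Failed to convert %r to float; keeping string", param_value)
            return param_value
    if param_type in ("boolean", "bool", "binary"):
        return param_value.lower() == "true"
    if param_type == "array":
        try:
            v = json.loads(param_value)
            return v if isinstance(v, list) else [v]
        except (json.JSONDecodeError, TypeError):
            logger.warning("Failed to parse %r as array; keeping string", param_value)
            return param_value
    return param_value
```

On a conversion failure the caller still gets the original string back, but the warning leaves a trail for debugging.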
```python
    def _process_complete_xml_elements(self) -> bool:
        """
        Process complete XML elements in the buffer

        Returns:
            bool: whether complete elements were found and processed
        """
        found_any = False

        while self.last_processed_pos < len(self.streaming_buffer):
            # Find next complete element
            element, end_pos = self._find_next_complete_element(self.last_processed_pos)
            if element is None:
                # No complete element found, wait for more data
                break

            # Check if this element should be skipped
            if self._should_skip_element(element):
                # print(f"Skip non-XML text: {repr(element)}")
                self.last_processed_pos = end_pos
                continue

            # Found complete XML element, process it
            try:
                # Preprocess XML chunk
                preprocessed_element = self._preprocess_xml_chunk(element)
                # Check if this is the first tool_call start
                if (
                    preprocessed_element.strip().startswith("<tool_call>")
                    and self.tool_call_index == 0
                ):
                    # First tool_call starts, output previously collected text content first
                    if self.text_content_buffer:
                        text_delta = DeltaMessage(
                            role=None,
                            content=self.text_content_buffer,
                            reasoning_content=None,
                            tool_calls=[],
                        )
                        self._emit_delta(text_delta)
                        # Clear buffer for potential subsequent text content
                        self.text_content_buffer = ""

                # Check if this is a new tool_call start and there's already a completed tool_call
                if (
                    preprocessed_element.strip().startswith("<tool_call>")
                    and self.tool_call_index > 0
                ):
                    # New tool_call starts, reset parser state but keep generated deltas
                    # print(f"reset parser for new tool call")
                    self._reset_parser_for_new_tool_call()

                # Parse preprocessed element
                self.parser.Parse(preprocessed_element, False)
                found_any = True

            except Exception as e:
                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
                pass

            # Update processed position
            self.last_processed_pos = end_pos

        return found_any
```
The current implementation uses several helper functions (_find_next_complete_element, _extract_complete_xml_chunks, _preprocess_xml_chunk) to manually find and preprocess XML elements from the input stream before feeding them to the expat parser. This approach is complex and can be brittle, as manually parsing XML-like structures with string manipulation and regex is a known anti-pattern.
The xml.parsers.expat library is designed for streaming and can handle incomplete chunks of data gracefully. A simpler and more robust design would be to feed the incoming chunks directly to self.parser.Parse() and let expat manage element boundaries and parsing events. This would eliminate the need for the custom element-finding logic and make the overall parser more reliable.
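A sketch of the streaming behavior the reviewer is pointing at: expat accepts arbitrary partial chunks, so element boundaries can be delegated to the parser's callbacks rather than found manually. Note this only covers chunk handling; the real Qwen output uses non-XML constructs like `<function=name>`, which would still need preprocessing before reaching expat.

```python
from xml.parsers.expat import ParserCreate

events = []

parser = ParserCreate()
parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
parser.EndElementHandler = lambda name: events.append(("end", name))
parser.CharacterDataHandler = lambda data: events.append(("text", data))

# Feed the document in fragments that split tags mid-name; expat buffers
# internally and fires callbacks only when a construct is complete.
for chunk in ["<tool_call><func", "tion>get_weather</funct", "ion></tool_call>"]:
    parser.Parse(chunk, False)  # isfinal=False: more data may follow
parser.Parse("", True)  # signal end of document
```

Running this, the start/end events arrive in document order regardless of where the chunk boundaries fall, which is exactly the property the manual pre-chunking tries to reproduce.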
```python
        except Exception:
            try:
                return ast.literal_eval(raw)
            except Exception:
```
Using a broad except Exception: can hide unexpected errors and make debugging difficult. It's better to catch the specific exceptions that you expect json.loads and ast.literal_eval to raise, such as json.JSONDecodeError, ValueError, and SyntaxError.
```diff
-        except Exception:
-            try:
-                return ast.literal_eval(raw)
-            except Exception:
+        except (json.JSONDecodeError, TypeError):
+            try:
+                return ast.literal_eval(raw)
+            except (ValueError, SyntaxError):
```
```python
            except Exception as e:
                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
                pass
```
Using print() for logging exceptions is not ideal in a library, as it writes to standard output and can't be easily controlled or redirected. It's better to use the logging module. Additionally, using pass after catching a broad exception can leave the parser in an inconsistent state and hide critical errors.
```diff
-            except Exception as e:
-                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
-                pass
+            except Exception as e:
+                logger.warning(f"Exception during XML parsing: {e}, element: {repr(element)}")
+                pass
```
```python
    def _extract(self, text: str, tools: List[Tool]) -> Tuple[str, List[ToolCallItem]]:
        normal_parts: List[str] = []
        calls: List[ToolCallItem] = []
        cursor = 0
        while True:
            s = text.find(self.tool_call_start_token, cursor)
            if s == -1:
                normal_parts.append(text[cursor:])
                break
            normal_parts.append(text[cursor:s])
            e = text.find(self.tool_call_end_token, s)
            if e == -1:
                normal_parts.append(text[s:])
                break
            block = text[s : e + len(self.tool_call_end_token)]
            cursor = e + len(self.tool_call_end_token)
            calls.extend(self._parse_block(block, tools))
        return "".join(normal_parts), calls

    def _parse_block(self, block: str, tools: List[Tool]) -> List[ToolCallItem]:
        res: List[ToolCallItem] = []
        for m in self.tool_call_function_regex.findall(block):
            txt = m[0] if m[0] else m[1]
            if ">" not in txt:
                continue
            idx = txt.index(">")
            fname = txt[:idx].strip()
            body = txt[idx + 1 :]
            params: Dict[str, Any] = {}
            for pm in self.tool_call_parameter_regex.findall(body):
                ptxt = pm[0] if pm[0] else pm[1]
                if ">" not in ptxt:
                    continue
                pidx = ptxt.index(">")
                pname = ptxt[:pidx].strip()
                pval = ptxt[pidx + 1 :].lstrip("\n").rstrip("\n")
                params[pname] = _safe_val(pval)
            raw = {"name": fname, "arguments": params}
            try:
                # TODO: fix idx in function call, the index for a function
                # call will always be -1 in parse_base_json
                res.extend(self.parse_base_json(raw, tools))
            except Exception:
                logger.warning("invalid tool call for %s dropped", fname)
        return res
```
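The scanning loop in `_extract` can be illustrated on its own. `split_tool_blocks` below is a hypothetical reduction that returns the raw `<tool_call>` blocks instead of parsed `ToolCallItem`s, but the cursor logic matches the code above, including the fallback that keeps an unterminated block as normal text.

```python
from typing import List, Tuple

START, END = "<tool_call>", "</tool_call>"


def split_tool_blocks(text: str) -> Tuple[str, List[str]]:
    """Split model output into normal text and complete tool-call blocks."""
    normal: List[str] = []
    blocks: List[str] = []
    cursor = 0
    while True:
        s = text.find(START, cursor)
        if s == -1:
            # No more start tokens: the rest is plain text.
            normal.append(text[cursor:])
            break
        normal.append(text[cursor:s])
        e = text.find(END, s)
        if e == -1:
            # Unterminated block: keep it as normal text, as _extract does.
            normal.append(text[s:])
            break
        blocks.append(text[s : e + len(END)])
        cursor = e + len(END)
    return "".join(normal), blocks
```

For example, `split_tool_blocks("hi <tool_call>A</tool_call> bye")` separates the surrounding text from the single complete block.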
Force-pushed from 8dfcc7f to ce008d3
Thanks for your great contribution!
PopSoda2002 left a comment
It's a good PR, but I think it needs changes in several aspects:
- Try to divide the PR; maybe start with the core content as a small PR. Currently it's 2200+ lines, which is really hard to review, and it's hard to tell whether the unit tests cover all the changes.
- Try to use a more descriptive name, not just new_detector; "new" can become old in the future, and then the next one would have to be new_new_detector. I suggest using {feature}_detector to distinguish it from the old one.
- Try to put your unit test results in the PR description.
- Try to clean up the code.
```python
        """Set tool configuration information"""
        self.tools = tools

    def _get_param_type(self, param_name: str) -> str:
```
Try to create a param type enum to hold the different types.
This method extracts the different param types set by the user when calling the API. By default the value types are expected to be valid JSON Schema, but the user may set a wrong type, such as str instead of string, or strings instead of string. So there are many possible types, and this function only extracts the type info and lets _convert_param_value deal with it.
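One way to reconcile the two comments is a small alias table that normalizes loose user-supplied type names to canonical JSON Schema types before conversion. The names and groupings below are illustrative assumptions, not the PR's actual code.

```python
# Canonical JSON Schema type -> accepted aliases (including common typos
# like "str" or "strings" that users put in tool definitions).
CANONICAL_TYPES = {
    "string": ("string", "str", "strings", "text", "varchar", "char", "enum"),
    "integer": ("integer", "int", "int32", "int64"),
    "number": ("number", "num", "float", "double"),
    "boolean": ("boolean", "bool", "binary"),
    "array": ("array", "list"),
    "object": ("object", "dict"),
}

# Invert into a flat alias -> canonical lookup table.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in CANONICAL_TYPES.items()
    for alias in aliases
}


def normalize_param_type(raw_type: str) -> str:
    """Map a loose type name to a canonical type; unknown types fall back
    to "string" so the raw text value is preserved unchanged."""
    return ALIAS_TO_CANONICAL.get(raw_type.strip().lower(), "string")
```

This keeps the alias knowledge in one place, so the conversion function only has to switch on six canonical names.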
```python
        if param_value.lower() == "null":
            return None

        param_type = param_type.strip().lower()
```
Can you assign a name to the group? Like STRING_TYPE_GROUP = ["string", "str", "text", "varchar", "char", "enum"], then use if param_type in STRING_TYPE_GROUP?
```python
        except Exception:
            pass
        return param_value
    elif param_type in ["boolean", "bool", "binary"]:
```
```python
        escaped text
        """
        # XML special character escape mapping
        xml_escapes = {
```
A constant variable can use uppercase, like XML_ESCAPES_MAP.
@CatherineSue @JustinTong0323 could you help to review?
```python
    def supports_structural_tag(self) -> bool:
        return False

    def structure_info(self) -> _GetInfoFunc:
```
Why add a not-implemented method here? It seems no one will inherit this class.
```python
    def test_parse_streaming_multiple_tools(self):
        """Test streaming with multiple tool calls."""
        model_output = """<tool_call>
<function=get_current_weather>
```
These strings should strictly follow the model output indentation, so no additional indent can be added here.
```python
                type="function",
                function=Function(
                    name="read_file",
                    description="Reads and returns the content of a specified file from the local filesystem. Handles text, images (PNG, JPG, GIF, WEBP, SVG, BMP), and PDF files. For text files, it can read specific line ranges.",
```
Could we directly replace old one with it?
@CatherineSue Could you have a look at the code review?
/tag-and-rerun-ci
I see lots of
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Closing this PR; using the existing implementation instead.
Motivation
Contribute the internal tool parser used by the Qwen API Service, which uses a standard XML parser to parse text in a streaming fashion and handles many corner cases:
Modifications
Accuracy Tests
No Need
Benchmarking and Profiling
No Need
Checklist