Skip to content

Update xml tool parser for qwen series model#10035

Closed
wenmengzhou wants to merge 23 commits intosgl-project:mainfrom
wenmengzhou:feat/xml_parser
Closed

Update xml tool parser for qwen series model#10035
wenmengzhou wants to merge 23 commits intosgl-project:mainfrom
wenmengzhou:feat/xml_parser

Conversation

@wenmengzhou
Copy link

@wenmengzhou wenmengzhou commented Sep 4, 2025

Motivation

Contribute the internal tool parser used at Qwen API Service, which use a standard xml parser to parse text streamingly, and handles a lot of corner cases:

  1. make sure the params of corresponding type are returned
  2. handle function format error such as missing } for params
  3. to be continued further...

Modifications

  • Provide a brandnew tool parser which implement the streaming parser in file

    python/sglang/srt/function_call/qwen3_coder_new_detector.py

  • seperate the test cases for qwen xml parser in a single file in order to continusly add badcase and maintain this parser to handle as many badcases as possible

    test/srt/test_function_call_qwen_xml_paser.py

Accuracy Tests

No Need

Benchmarking and Profiling

No Need

Checklist

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @wenmengzhou, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates a sophisticated XML tool parser to enhance the function calling capabilities for Qwen series models. The core objective is to provide a more reliable and accurate mechanism for interpreting tool calls from model outputs, particularly in streaming contexts, by addressing common parsing challenges and ensuring proper data type handling.

Highlights

  • New XML Tool Parser: Introduced a brand-new XML-based tool parser, StreamingXMLToolCallParser, specifically designed for Qwen series models to handle function calls robustly.
  • Enhanced Streaming Capabilities: The new parser significantly improves streaming parsing by handling partial XML chunks, ensuring correct type conversion for parameters, and managing complex scenarios like multiple tool calls and malformed XML.
  • Dedicated Test Suite: A separate and comprehensive test file (test_function_call_qwen_xml_paser.py) has been added to rigorously test the new XML parser, covering various edge cases and ensuring its reliability.
  • Code Refactoring: Existing test cases for the Qwen3 Coder Detector were refactored and moved to the new dedicated test file, improving test organization and maintainability.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new, comprehensive streaming XML parser for Qwen model tool calls, along with a dedicated suite of unit tests. The implementation is thorough and handles many corner cases. My review focuses on improving robustness, error handling, and maintainability. The key suggestions are to replace broad exception handling with more specific catches and proper logging, simplify the parsing logic by relying more on the expat library's native streaming capabilities instead of manual pre-parsing, and remove some unused code. Overall, this is a solid contribution that will be even better with these refinements.

Comment on lines +351 to +352
except Exception:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Catching a broad except Exception: and then using pass can hide critical bugs in the fallback logic, making the system harder to debug and maintain. If an exception occurs here, it should be logged to provide visibility into potential issues during parsing.

Suggested change
except Exception:
pass
except Exception as e:
logger.warning(f"Error during fallback completion: {e}")

Comment on lines +171 to +204
try:
param_value = int(param_value)
except Exception:
pass
return param_value
elif param_type.startswith("num") or param_type.startswith("float"):
try:
float_param_value = float(param_value)
param_value = (
float_param_value
if float_param_value - int(float_param_value) != 0
else int(float_param_value)
)
except Exception:
pass
return param_value
elif param_type in ["boolean", "bool", "binary"]:
param_value = param_value.lower()
return param_value == "true"
elif param_type in ["array"]:
try:
# First try ast.literal_eval for safe evaluation of Python literals
param_value = ast.literal_eval(param_value)
if not isinstance(param_value, list):
param_value = list(param_value)
except (ValueError, SyntaxError):
# If literal_eval fails, try json.loads for JSON array format
try:
param_value = json.loads(param_value)
if not isinstance(param_value, list):
param_value = list(param_value)
except (json.JSONDecodeError, TypeError):
# If both parsing methods fail, keep as string
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The function uses broad except Exception: pass blocks (e.g., lines 173, 184, 196) to handle type conversion errors. This approach silently ignores failures, causing the function to return the original string value instead of the expected type. This can lead to downstream errors when the calling code expects a specific data type (e.g., an integer or a list). It would be more robust to log these conversion failures as warnings to aid in debugging malformed model outputs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +445 to +508
def _process_complete_xml_elements(self) -> bool:
"""
Process complete XML elements in the buffer

Returns:
bool: whether complete elements were found and processed
"""
found_any = False

while self.last_processed_pos < len(self.streaming_buffer):
# Find next complete element
element, end_pos = self._find_next_complete_element(self.last_processed_pos)
if element is None:
# No complete element found, wait for more data
break

# Check if this element should be skipped
if self._should_skip_element(element):
# print(f"Skip non-XML text: {repr(element)}")
self.last_processed_pos = end_pos
continue

# Found complete XML element, process it
try:
# Preprocess XML chunk
preprocessed_element = self._preprocess_xml_chunk(element)
# Check if this is the first tool_call start
if (
preprocessed_element.strip().startswith("<tool_call>")
and self.tool_call_index == 0
):
# First tool_call starts, output previously collected text content first
if self.text_content_buffer:
text_delta = DeltaMessage(
role=None,
content=self.text_content_buffer,
reasoning_content=None,
tool_calls=[],
)
self._emit_delta(text_delta)
# Clear buffer for potential subsequent text content
self.text_content_buffer = ""

# Check if this is a new tool_call start and there's already a completed tool_call
if (
preprocessed_element.strip().startswith("<tool_call>")
and self.tool_call_index > 0
):
# New tool_call starts, reset parser state but keep generated deltas
# print(f"reset parser for new tool call")
self._reset_parser_for_new_tool_call()

# Parse preprocessed element
self.parser.Parse(preprocessed_element, False)
found_any = True

except Exception as e:
print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
pass

# Update processed position
self.last_processed_pos = end_pos

return found_any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation uses several helper functions (_find_next_complete_element, _extract_complete_xml_chunks, _preprocess_xml_chunk) to manually find and preprocess XML elements from the input stream before feeding them to the expat parser. This approach is complex and can be brittle, as manually parsing XML-like structures with string manipulation and regex is a known anti-pattern.

The xml.parsers.expat library is designed for streaming and can handle incomplete chunks of data gracefully. A simpler and more robust design would be to feed the incoming chunks directly to self.parser.Parse() and let expat manage element boundaries and parsing events. This would eliminate the need for the custom element-finding logic and make the overall parser more reliable.

Comment on lines +31 to +34
except Exception:
try:
return ast.literal_eval(raw)
except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a broad except Exception: can hide unexpected errors and make debugging difficult. It's better to catch the specific exceptions that you expect json.loads and ast.literal_eval to raise, such as json.JSONDecodeError, ValueError, and SyntaxError.

Suggested change
except Exception:
try:
return ast.literal_eval(raw)
except Exception:
except (json.JSONDecodeError, TypeError):
try:
return ast.literal_eval(raw)
except (ValueError, SyntaxError):

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +501 to +503
except Exception as e:
print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using print() for logging exceptions is not ideal in a library, as it writes to standard output and can't be easily controlled or redirected. It's better to use the logging module. Additionally, passing after catching a broad exception can leave the parser in an inconsistent state and hide critical errors.

Suggested change
except Exception as e:
print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
pass
except Exception as e:
logger.warning(f"Exception during XML parsing: {e}, element: {repr(element)}")
pass

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +1488 to +1532
def _extract(self, text: str, tools: List[Tool]) -> Tuple[str, List[ToolCallItem]]:
normal_parts: List[str] = []
calls: List[ToolCallItem] = []
cursor = 0
while True:
s = text.find(self.tool_call_start_token, cursor)
if s == -1:
normal_parts.append(text[cursor:])
break
normal_parts.append(text[cursor:s])
e = text.find(self.tool_call_end_token, s)
if e == -1:
normal_parts.append(text[s:])
break
block = text[s : e + len(self.tool_call_end_token)]
cursor = e + len(self.tool_call_end_token)
calls.extend(self._parse_block(block, tools))
return "".join(normal_parts), calls

def _parse_block(self, block: str, tools: List[Tool]) -> List[ToolCallItem]:
res: List[ToolCallItem] = []
for m in self.tool_call_function_regex.findall(block):
txt = m[0] if m[0] else m[1]
if ">" not in txt:
continue
idx = txt.index(">")
fname = txt[:idx].strip()
body = txt[idx + 1 :]
params: Dict[str, Any] = {}
for pm in self.tool_call_parameter_regex.findall(body):
ptxt = pm[0] if pm[0] else pm[1]
if ">" not in ptxt:
continue
pidx = ptxt.index(">")
pname = ptxt[:pidx].strip()
pval = ptxt[pidx + 1 :].lstrip("\n").rstrip("\n")
params[pname] = _safe_val(pval)
raw = {"name": fname, "arguments": params}
try:
# TODO: fix idx in function call, the index for a function
# call will always be -1 in parse_base_json
res.extend(self.parse_base_json(raw, tools))
except Exception:
logger.warning("invalid tool call for %s dropped", fname)
return res
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The methods _extract and _parse_block appear to be unused. The main parsing logic is handled by parse_streaming_increment, and the non-streaming detect_and_parse method that would have used them is commented out. This dead code should be removed to improve code clarity and maintainability.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

@PopSoda2002
Copy link
Contributor

Thanks for your great contribution!

Copy link
Contributor

@PopSoda2002 PopSoda2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good PR, but I think it needs to be changed in these several aspects:

  1. Try to divide the PR, may start from core content first as a small PR, currently it's 2200+ lines, really hard for reviewing, and it's hard to tell whether the unit test covers the all changes
  2. Try to use more descriptive name not just new_detector, new can be old in the future, otherwise, it needs to be created using new_new_detector, I suggest to use {feature}_detector to distinguish from old one
  3. Try to put your unit test result in the PR description
  4. Try to clean the code better

"""Set tool configuration information"""
self.tools = tools

def _get_param_type(self, param_name: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to create a param type enum to hold different type

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method extract different param types set by user when calling the api, by default the value types are expected to be valid json schema, but user may set the wrong type such as using str instread of string, or strings instead of string, so there are so many types and this function only extract the type info and let the _convert_param_value to deal with

if param_value.lower() == "null":
return None

param_type = param_type.strip().lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you assign a name for the group? Like STRING_TYPE _GROUP= ["string", "str", "text", "varchar", "char", "enum"], then use if param_type in STRING_TYPE _GROUP?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be more clear

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

except Exception:
pass
return param_value
elif param_type in ["boolean", "bool", "binary"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar with above

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

escaped text
"""
# XML special character escape mapping
xml_escapes = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constant variable can use upper character like XML_ESCAPES_MAP

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zhaochenyang20
Copy link
Collaborator

@CatherineSue @JustinTong0323 could you help to reivew?

def supports_structural_tag(self) -> bool:
return False

def structure_info(self) -> _GetInfoFunc:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding an not implement method here, it seems no one will inherit this class

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

def test_parse_streaming_multiple_tools(self):
"""Test streaming with multiple tool calls."""
model_output = """<tool_call>
<function=get_current_weather>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intent

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these string should be strictly followed the model output indentation, so here can not add additional indent

type="function",
function=Function(
name="read_file",
description="Reads and returns the content of a specified file from the local filesystem. Handles text, images (PNG, JPG, GIF, WEBP, SVG, BMP), and PDF files. For text files, it can read specific line ranges.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intent

@JustinTong0323 JustinTong0323 self-requested a review September 12, 2025 20:52
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we directly replace old one with it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for sure

@cynial
Copy link
Contributor

cynial commented Nov 4, 2025

@CatherineSue Could you have a look at the code review?

@JustinTong0323
Copy link
Collaborator

/tag-and-rerun-ci

@JustinTong0323
Copy link
Collaborator

I see lots of Error during fallback completion: asynchronous entity: line 7, column 0 in server log when testing with qwen-coder cli, is this expected?

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@wenmengzhou
Copy link
Author

close this pr using existing implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants