Update xml tool parser for qwen series model #10035
wenmengzhou wants to merge 23 commits into sgl-project:main from
Conversation
Summary of Changes
Hello @wenmengzhou, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request integrates a sophisticated XML tool parser to enhance the function calling capabilities for Qwen series models. The core objective is to provide a more reliable and accurate mechanism for interpreting tool calls from model outputs, particularly in streaming contexts, by addressing common parsing challenges and ensuring proper data type handling.
Highlights
- New XML Tool Parser: Introduced a brand-new XML-based tool parser, StreamingXMLToolCallParser, specifically designed for Qwen series models to handle function calls robustly.
- Enhanced Streaming Capabilities: The new parser significantly improves streaming parsing by handling partial XML chunks, ensuring correct type conversion for parameters, and managing complex scenarios like multiple tool calls and malformed XML.
- Dedicated Test Suite: A separate and comprehensive test file (test_function_call_qwen_xml_paser.py) has been added to rigorously test the new XML parser, covering various edge cases and ensuring its reliability.
- Code Refactoring: Existing test cases for the Qwen3 Coder Detector were refactored and moved to the new dedicated test file, improving test organization and maintainability.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
This pull request introduces a new, comprehensive streaming XML parser for Qwen model tool calls, along with a dedicated suite of unit tests. The implementation is thorough and handles many corner cases. My review focuses on improving robustness, error handling, and maintainability. The key suggestions are to replace broad exception handling with more specific catches and proper logging, simplify the parsing logic by relying more on the expat library's native streaming capabilities instead of manual pre-parsing, and remove some unused code. Overall, this is a solid contribution that will be even better with these refinements.
```python
            except Exception:
                pass
```
Catching a broad except Exception: and then using pass can hide critical bugs in the fallback logic, making the system harder to debug and maintain. If an exception occurs here, it should be logged to provide visibility into potential issues during parsing.
```diff
-            except Exception:
-                pass
+            except Exception as e:
+                logger.warning(f"Error during fallback completion: {e}")
```
```python
            try:
                param_value = int(param_value)
            except Exception:
                pass
            return param_value
        elif param_type.startswith("num") or param_type.startswith("float"):
            try:
                float_param_value = float(param_value)
                param_value = (
                    float_param_value
                    if float_param_value - int(float_param_value) != 0
                    else int(float_param_value)
                )
            except Exception:
                pass
            return param_value
        elif param_type in ["boolean", "bool", "binary"]:
            param_value = param_value.lower()
            return param_value == "true"
        elif param_type in ["array"]:
            try:
                # First try ast.literal_eval for safe evaluation of Python literals
                param_value = ast.literal_eval(param_value)
                if not isinstance(param_value, list):
                    param_value = list(param_value)
            except (ValueError, SyntaxError):
                # If literal_eval fails, try json.loads for JSON array format
                try:
                    param_value = json.loads(param_value)
                    if not isinstance(param_value, list):
                        param_value = list(param_value)
                except (json.JSONDecodeError, TypeError):
                    # If both parsing methods fail, keep as string
                    pass
```
The function uses broad except Exception: pass blocks (e.g., lines 173, 184, 196) to handle type conversion errors. This approach silently ignores failures, causing the function to return the original string value instead of the expected type. This can lead to downstream errors when the calling code expects a specific data type (e.g., an integer or a list). It would be more robust to log these conversion failures as warnings to aid in debugging malformed model outputs.
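A minimal sketch of what this comment suggests: conversion helpers that log failures instead of silently passing, so malformed model outputs become visible in the logs. The function name `convert_param_value` and the exact type groups here are illustrative, not the PR's actual code.

```python
import json
import logging

logger = logging.getLogger(__name__)


def convert_param_value(param_value: str, param_type: str):
    """Convert a string parameter to its declared type, logging failures."""
    param_type = param_type.strip().lower()
    if param_type in ("integer", "int"):
        try:
            return int(param_value)
        except ValueError:
            logger.warning("Failed to convert %r to int; keeping string", param_value)
            return param_value
    if param_type in ("number", "num", "float"):
        try:
            f = float(param_value)
            # Collapse whole-valued floats (e.g. "3.0") to int, as the PR does.
            return int(f) if f == int(f) else f
        except (ValueError, OverflowError):
            logger.warning("Failed to convert %r to float; keeping string", param_value)
            return param_value
    if param_type in ("boolean", "bool", "binary"):
        return param_value.lower() == "true"
    if param_type == "array":
        try:
            v = json.loads(param_value)
            return v if isinstance(v, list) else [v]
        except (json.JSONDecodeError, TypeError):
            logger.warning("Failed to parse %r as array; keeping string", param_value)
            return param_value
    return param_value
```

On a conversion failure the caller still gets the original string back, but the warning leaves a trail for debugging.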
```python
    def _process_complete_xml_elements(self) -> bool:
        """
        Process complete XML elements in the buffer

        Returns:
            bool: whether complete elements were found and processed
        """
        found_any = False

        while self.last_processed_pos < len(self.streaming_buffer):
            # Find next complete element
            element, end_pos = self._find_next_complete_element(self.last_processed_pos)
            if element is None:
                # No complete element found, wait for more data
                break

            # Check if this element should be skipped
            if self._should_skip_element(element):
                # print(f"Skip non-XML text: {repr(element)}")
                self.last_processed_pos = end_pos
                continue

            # Found complete XML element, process it
            try:
                # Preprocess XML chunk
                preprocessed_element = self._preprocess_xml_chunk(element)
                # Check if this is the first tool_call start
                if (
                    preprocessed_element.strip().startswith("<tool_call>")
                    and self.tool_call_index == 0
                ):
                    # First tool_call starts, output previously collected text content first
                    if self.text_content_buffer:
                        text_delta = DeltaMessage(
                            role=None,
                            content=self.text_content_buffer,
                            reasoning_content=None,
                            tool_calls=[],
                        )
                        self._emit_delta(text_delta)
                        # Clear buffer for potential subsequent text content
                        self.text_content_buffer = ""

                # Check if this is a new tool_call start and there's already a completed tool_call
                if (
                    preprocessed_element.strip().startswith("<tool_call>")
                    and self.tool_call_index > 0
                ):
                    # New tool_call starts, reset parser state but keep generated deltas
                    # print(f"reset parser for new tool call")
                    self._reset_parser_for_new_tool_call()

                # Parse preprocessed element
                self.parser.Parse(preprocessed_element, False)
                found_any = True

            except Exception as e:
                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
                pass

            # Update processed position
            self.last_processed_pos = end_pos

        return found_any
```
The current implementation uses several helper functions (_find_next_complete_element, _extract_complete_xml_chunks, _preprocess_xml_chunk) to manually find and preprocess XML elements from the input stream before feeding them to the expat parser. This approach is complex and can be brittle, as manually parsing XML-like structures with string manipulation and regex is a known anti-pattern.
The xml.parsers.expat library is designed for streaming and can handle incomplete chunks of data gracefully. A simpler and more robust design would be to feed the incoming chunks directly to self.parser.Parse() and let expat manage element boundaries and parsing events. This would eliminate the need for the custom element-finding logic and make the overall parser more reliable.
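A sketch of the streaming behavior the reviewer is pointing at: expat accepts arbitrary partial chunks, so element boundaries can be delegated to the parser's callbacks rather than found manually. Note this only covers chunk handling; the real Qwen output uses non-XML constructs like `<function=name>`, which would still need preprocessing before reaching expat.

```python
from xml.parsers.expat import ParserCreate

events = []

parser = ParserCreate()
parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
parser.EndElementHandler = lambda name: events.append(("end", name))
parser.CharacterDataHandler = lambda data: events.append(("text", data))

# Feed the document in fragments that split tags mid-name; expat buffers
# internally and fires callbacks only when a construct is complete.
for chunk in ["<tool_call><func", "tion>get_weather</funct", "ion></tool_call>"]:
    parser.Parse(chunk, False)  # isfinal=False: more data may follow
parser.Parse("", True)  # signal end of document
```

Running this, the start/end events arrive in document order regardless of where the chunk boundaries fall, which is exactly the property the manual pre-chunking tries to reproduce.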
```python
        except Exception:
            try:
                return ast.literal_eval(raw)
            except Exception:
```
Using a broad except Exception: can hide unexpected errors and make debugging difficult. It's better to catch the specific exceptions that you expect json.loads and ast.literal_eval to raise, such as json.JSONDecodeError, ValueError, and SyntaxError.
```diff
-        except Exception:
-            try:
-                return ast.literal_eval(raw)
-            except Exception:
+        except (json.JSONDecodeError, TypeError):
+            try:
+                return ast.literal_eval(raw)
+            except (ValueError, SyntaxError):
```
```python
            except Exception as e:
                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
                pass
```
Using print() for logging exceptions is not ideal in a library, as it writes to standard output and can't be easily controlled or redirected. It's better to use the logging module. Additionally, using pass after catching a broad exception can leave the parser in an inconsistent state and hide critical errors.
```diff
-            except Exception as e:
-                print(f"exception occurs: {e}, preprocessed_element: {repr(element)}")
-                pass
+            except Exception as e:
+                logger.warning(f"Exception during XML parsing: {e}, element: {repr(element)}")
+                pass
```
```python
    def _extract(self, text: str, tools: List[Tool]) -> Tuple[str, List[ToolCallItem]]:
        normal_parts: List[str] = []
        calls: List[ToolCallItem] = []
        cursor = 0
        while True:
            s = text.find(self.tool_call_start_token, cursor)
            if s == -1:
                normal_parts.append(text[cursor:])
                break
            normal_parts.append(text[cursor:s])
            e = text.find(self.tool_call_end_token, s)
            if e == -1:
                normal_parts.append(text[s:])
                break
            block = text[s : e + len(self.tool_call_end_token)]
            cursor = e + len(self.tool_call_end_token)
            calls.extend(self._parse_block(block, tools))
        return "".join(normal_parts), calls

    def _parse_block(self, block: str, tools: List[Tool]) -> List[ToolCallItem]:
        res: List[ToolCallItem] = []
        for m in self.tool_call_function_regex.findall(block):
            txt = m[0] if m[0] else m[1]
            if ">" not in txt:
                continue
            idx = txt.index(">")
            fname = txt[:idx].strip()
            body = txt[idx + 1 :]
            params: Dict[str, Any] = {}
            for pm in self.tool_call_parameter_regex.findall(body):
                ptxt = pm[0] if pm[0] else pm[1]
                if ">" not in ptxt:
                    continue
                pidx = ptxt.index(">")
                pname = ptxt[:pidx].strip()
                pval = ptxt[pidx + 1 :].lstrip("\n").rstrip("\n")
                params[pname] = _safe_val(pval)
            raw = {"name": fname, "arguments": params}
            try:
                # TODO: fix idx in function call, the index for a function
                # call will always be -1 in parse_base_json
                res.extend(self.parse_base_json(raw, tools))
            except Exception:
                logger.warning("invalid tool call for %s dropped", fname)
        return res
```
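The scanning loop in `_extract` can be illustrated on its own. `split_tool_blocks` below is a hypothetical reduction that returns the raw `<tool_call>` blocks instead of parsed `ToolCallItem`s, but the cursor logic matches the code above, including the fallback that keeps an unterminated block as normal text.

```python
from typing import List, Tuple

START, END = "<tool_call>", "</tool_call>"


def split_tool_blocks(text: str) -> Tuple[str, List[str]]:
    """Split model output into normal text and complete tool-call blocks."""
    normal: List[str] = []
    blocks: List[str] = []
    cursor = 0
    while True:
        s = text.find(START, cursor)
        if s == -1:
            # No more start tokens: the rest is plain text.
            normal.append(text[cursor:])
            break
        normal.append(text[cursor:s])
        e = text.find(END, s)
        if e == -1:
            # Unterminated block: keep it as normal text, as _extract does.
            normal.append(text[s:])
            break
        blocks.append(text[s : e + len(END)])
        cursor = e + len(END)
    return "".join(normal), blocks
```

For example, `split_tool_blocks("hi <tool_call>A</tool_call> bye")` separates the surrounding text from the single complete block.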
Force-pushed from 8dfcc7f to ce008d3
Thanks for your great contribution!
PopSoda2002 left a comment
It's a good PR, but I think it needs changes in several aspects:
- Try to divide the PR; maybe start with the core content as a small PR. Currently it's 2200+ lines, which is really hard to review, and it's hard to tell whether the unit tests cover all the changes.
- Try to use a more descriptive name, not just new_detector; "new" can become old in the future, and then the next one would have to be new_new_detector. I suggest using {feature}_detector to distinguish it from the old one.
- Try to put your unit test results in the PR description.
- Try to clean up the code.
```python
        """Set tool configuration information"""
        self.tools = tools

    def _get_param_type(self, param_name: str) -> str:
```
Try to create a param type enum to hold the different types.
This method extracts the different param types set by the user when calling the API. By default the value types are expected to be valid JSON Schema, but the user may set a wrong type, such as str instead of string, or strings instead of string. So there are many possible types, and this function only extracts the type info and lets _convert_param_value deal with it.
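One way to reconcile the two comments is a small alias table that normalizes loose user-supplied type names to canonical JSON Schema types before conversion. The names and groupings below are illustrative assumptions, not the PR's actual code.

```python
# Canonical JSON Schema type -> accepted aliases (including common typos
# like "str" or "strings" that users put in tool definitions).
CANONICAL_TYPES = {
    "string": ("string", "str", "strings", "text", "varchar", "char", "enum"),
    "integer": ("integer", "int", "int32", "int64"),
    "number": ("number", "num", "float", "double"),
    "boolean": ("boolean", "bool", "binary"),
    "array": ("array", "list"),
    "object": ("object", "dict"),
}

# Invert into a flat alias -> canonical lookup table.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in CANONICAL_TYPES.items()
    for alias in aliases
}


def normalize_param_type(raw_type: str) -> str:
    """Map a loose type name to a canonical type; unknown types fall back
    to "string" so the raw text value is preserved unchanged."""
    return ALIAS_TO_CANONICAL.get(raw_type.strip().lower(), "string")
```

This keeps the alias knowledge in one place, so the conversion function only has to switch on six canonical names.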
```python
        if param_value.lower() == "null":
            return None

        param_type = param_type.strip().lower()
```
Can you assign a name to the group? Like STRING_TYPE_GROUP = ["string", "str", "text", "varchar", "char", "enum"], then use if param_type in STRING_TYPE_GROUP?
```python
        except Exception:
            pass
        return param_value
    elif param_type in ["boolean", "bool", "binary"]:
```
```python
        escaped text
        """
        # XML special character escape mapping
        xml_escapes = {
```
A constant variable can use uppercase, like XML_ESCAPES_MAP.
@CatherineSue @JustinTong0323 could you help to review?
```python
    def supports_structural_tag(self) -> bool:
        return False

    def structure_info(self) -> _GetInfoFunc:
```
Why add a not-implemented method here? It seems no one will inherit this class.
```python
    def test_parse_streaming_multiple_tools(self):
        """Test streaming with multiple tool calls."""
        model_output = """<tool_call>
<function=get_current_weather>
```
These strings should strictly follow the model output indentation, so no additional indent can be added here.
```python
                type="function",
                function=Function(
                    name="read_file",
                    description="Reads and returns the content of a specified file from the local filesystem. Handles text, images (PNG, JPG, GIF, WEBP, SVG, BMP), and PDF files. For text files, it can read specific line ranges.",
```
Could we directly replace old one with it?
@CatherineSue Could you have a look at the code review?
/tag-and-rerun-ci
I see lots of
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Closing this PR; using the existing implementation instead.
Motivation
Contribute the internal tool parser used by the Qwen API Service, which uses a standard XML parser to parse text in a streaming fashion and handles many corner cases:
Modifications
Accuracy Tests
No Need
Benchmarking and Profiling
No Need
Checklist