support new qwen3_coder_detector #16744
Conversation
/tag-and-rerun-ci
Code Review
This pull request refactors the Qwen3CoderDetector to use a more robust cursor-based streaming parser, which is a significant improvement over the previous regex-based implementation. It also introduces a comprehensive suite of tests to validate the new parser's functionality for both streaming and non-streaming scenarios.
While the changes are a positive step forward, I've identified a few critical and high-severity issues that need to be addressed:
- The `Qwen3CoderDetector` class is currently not instantiable due to an unimplemented abstract method.
- There's a debug `logger.critical` statement that should be removed.
- The streaming state is not correctly reset, which will lead to issues when the detector instance is reused.
Additionally, there are some medium-severity issues related to maintainability, such as comments in Chinese and the use of bare except blocks. Please see the detailed comments for suggestions on how to resolve these issues.
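For context on the approach under review: a cursor-based streaming parser keeps an index into an append-only buffer and only consumes input once a complete token is visible. The following is a minimal, self-contained sketch of that idea (simplified; the class name, method names, and the restriction to top-level `<tool_call>` tags are illustrative, not the PR's actual code):

```python
TOOL_START = "<tool_call>"
TOOL_END = "</tool_call>"

class CursorParser:
    """Minimal cursor-based scanner: plain text streams out immediately,
    tool-call blocks are held back until their closing tag arrives."""

    def __init__(self):
        self.buffer = ""
        self.parsed_pos = 0          # index of the next unconsumed character
        self.inside_tool_call = False

    def feed(self, chunk: str):
        self.buffer += chunk
        out_text, out_calls = "", []
        while True:
            rest = self.buffer[self.parsed_pos:]
            if not self.inside_tool_call:
                start = rest.find(TOOL_START)
                if start == -1:
                    # Emit text, but hold back a possible partial start tag
                    # dangling at the end of the buffer.
                    safe = len(rest)
                    for k in range(1, len(TOOL_START)):
                        if rest.endswith(TOOL_START[:k]):
                            safe = len(rest) - k
                    out_text += rest[:safe]
                    self.parsed_pos += safe
                    break
                out_text += rest[:start]
                self.parsed_pos += start + len(TOOL_START)
                self.inside_tool_call = True
            else:
                end = rest.find(TOOL_END)
                if end == -1:
                    break  # wait for more chunks
                out_calls.append(rest[:end])
                self.parsed_pos += end + len(TOOL_END)
                self.inside_tool_call = False
        return out_text, out_calls
```

Holding back partial start tags means plain text is never emitted and then retracted; the real detector presumably applies the same idea to the inner `<function=...>` and `<parameter=...>` tags as well.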
```python
logger.critical(
    f"[xixi.yjx] PARSER: 1231 Try to port from vLLM parser: Using cursor-based streaming parser."
)
```
```python
def _reset_streaming_state(self):
    """Reset internal streaming cursors."""
    self.parsed_pos = 0
    self.current_tool_param_count = 0
    self.json_started = False
    self.is_inside_tool_call = False  # [FIX] Reset state

    # Base class state reset is handled by base class logic mostly,
    # but we ensure our cursor aligns with buffer resets.
    if hasattr(self, "_buffer") and not self._buffer:
        self.parsed_pos = 0
```
The _reset_streaming_state method is incomplete. It fails to reset state variables inherited from BaseFormatDetector, such as current_tool_id and current_tool_name_sent. Since these are modified during parsing, not resetting them will cause state to leak between streaming sessions on the same detector instance, leading to incorrect behavior. The method should reset all streaming-related state from both the base and child class.
```python
def _reset_streaming_state(self):
    """Reset internal streaming cursors and all streaming state."""
    # Reset state for this class
    self.parsed_pos = 0
    self.current_tool_param_count = 0
    self.json_started = False
    self.is_inside_tool_call = False
    self.current_func_name = None
    # Reset state from BaseFormatDetector
    # Note: _buffer is also part of state that should be reset.
    self._buffer = ""
    self.prev_tool_call_arr = []
    self.current_tool_id = -1
    self.current_tool_name_sent = False
    self.streamed_args_for_tool = []
```

```python
# Streaming State
# Overrides the parent class's _buffer management, or works alongside it.
# SGLang BaseFormatDetector usually has its own _buffer, but we manage it
# explicitly here to keep the logic clear.
if not hasattr(self, "_buffer"):
    self._buffer = ""

# Index of the next character in the buffer still to be processed
self.parsed_pos = 0
# Parameter count inside the tool currently being parsed, used to decide whether to add a comma
self.current_tool_param_count = 0
# Flag marking whether the current tool has already emitted '{'
self.json_started = False

# [FIX] New state flag: marks whether we are inside a tool_call block
self.is_inside_tool_call = False
```
This `__init__` method has a couple of issues:
- The instance attribute `self.current_func_name` is used in `parse_streaming_increment` but it's not initialized here. This could lead to an `AttributeError`. It should be initialized, for example to `None`.
- There are several comments in Chinese (e.g., lines 44-45, 49, 51, 53, 56). For consistency and maintainability in this codebase, they should be translated to English.
- The comment `[FIX]` on line 56 should be removed.
```python
def _convert_param_value(
    self, param_value: str, param_name: str, param_config: dict, func_name: str
) -> Any:
    """Convert parameter value based on its type in the schema."""
    # Handle null value for any type
    if param_value.lower() == "null":
        return None

    if param_name not in param_config:
        if param_config != {}:
            logger.warning(
                f"Parsed parameter '{param_name}' is not defined in the tool "
                f"parameters for tool '{func_name}', directly returning the string value."
            )
        return param_value

    if (
        isinstance(param_config[param_name], dict)
        and "type" in param_config[param_name]
    ):
        param_type = str(param_config[param_name]["type"]).strip().lower()
    else:
        param_type = "string"
    if param_type in ["string", "str", "text", "varchar", "char", "enum"]:
        return param_value
    elif (
        param_type.startswith("int")
        or param_type.startswith("uint")
        or param_type.startswith("long")
        or param_type.startswith("short")
        or param_type.startswith("unsigned")
    ):
        try:
            param_value = int(param_value)
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
                f"'{func_name}', degenerating to string."
            )
        return param_value
    elif param_type.startswith("num") or param_type.startswith("float"):
        try:
            float_param_value = float(param_value)
            param_value = (
                float_param_value
                if float_param_value - int(float_param_value) != 0
                else int(float_param_value)
            )
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
                f"'{func_name}', degenerating to string."
            )
        return param_value
    elif param_type in ["boolean", "bool", "binary"]:
        param_value = param_value.lower()
        if param_value not in ["true", "false"]:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not a boolean (`true` or `false`) in tool '{func_name}', degenerating to false."
            )
        return param_value == "true"
    else:
        if (
            param_type in ["object", "array", "arr"]
            or param_type.startswith("dict")
            or param_type.startswith("list")
        ):
            try:
                param_value = json.loads(param_value)
                return param_value
            except:
                logger.warning(
                    f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
                    f"'{func_name}', will try other methods to parse it."
                )
        try:
            param_value = ast.literal_eval(param_value)  # safer
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
            )
        return param_value
```
The try...except blocks in this method use a bare except:, which is too broad and can mask unexpected errors. It's better to catch specific exceptions that you expect to occur during type conversion, such as ValueError for int() and float(), and json.JSONDecodeError or SyntaxError for json.loads() and ast.literal_eval(). This makes the error handling more robust and predictable.
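As a hedged sketch of that advice (the helper name and fallback order here are illustrative, not the PR's code), each conversion step can catch only the exceptions it is documented to raise:

```python
import ast
import json

def convert_scalar(param_value: str):
    """Try int, then float, then JSON, then a Python literal,
    catching only the exceptions each step is documented to raise."""
    try:
        return int(param_value)
    except ValueError:
        pass
    try:
        return float(param_value)
    except ValueError:
        pass
    try:
        return json.loads(param_value)
    except json.JSONDecodeError:
        pass
    try:
        return ast.literal_eval(param_value)
    except (ValueError, SyntaxError):
        # Degenerate to the raw string, as the PR's code does.
        return param_value
```

An unexpected `KeyboardInterrupt` or `SystemExit` now propagates instead of being silently swallowed by a bare `except:`.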
| """ | ||
| 运行单个测试用例 | ||
|
|
||
| Args: | ||
| test_name: 测试名称 | ||
| response_text: 响应文本 | ||
| mode: 流式生成模式 | ||
| expected: 期望的解析结果,包含 'text' 和 'tools' 字段 | ||
| """ |
| """ | ||
| 运行单个流式测试 | ||
|
|
||
| Args: | ||
| test_name: 测试名称 | ||
| response_text: 完整的响应文本 | ||
| mode: 流式生成模式 ('char', 'atomic_tags', 或其他) | ||
| tools: 工具列表,如果为 None 则使用默认工具 | ||
| verbose: 是否打印详细日志,如果为 None 则使用实例属性 self.verbose | ||
| compare_with_non_streaming: 是否与非流式解析结果对比,如果为 None 则根据 parser_mode 自动决定 | ||
| expected: 期望的解析结果,包含 'text' 和 'tools' 字段 | ||
|
|
||
| Returns: | ||
| StreamingTestResult: 测试结果 | ||
| """ |
Force-pushed from ba3597a to fa40bac
/rerun-failed-ci

/gemini review
Code Review
This pull request introduces a significant refactoring of the Qwen3CoderDetector to support the new Qwen3 Coder model's tool call format. The implementation is completely rewritten, moving from a simpler regex-based parser to a more robust, cursor-based streaming parser. This new approach improves handling of various edge cases and incremental JSON construction. Additionally, it adds sophisticated parameter type conversion based on the tool's schema. The accompanying tests have also been completely rewritten to provide comprehensive coverage for the new implementation, including basic functionality, streaming, parameter types, and edge cases.
The changes are a great improvement in terms of robustness and functionality. I have a few suggestions to further improve the code quality, mainly around exception handling and a small bug in type conversion logic.
```python
    return raw


class Qwen3CoderDetector(BaseFormatDetector):
```
The docstring for Qwen3CoderDetector was removed. The new implementation is significantly more complex than the previous one, involving a cursor-based streaming parser and type conversion logic. Adding a new, detailed docstring explaining the class's purpose, its state variables, and the expected tool call format would greatly improve maintainability. For example:
"""
Detector for Qwen3 Coder models.
This detector uses a cursor-based streaming parser to handle the XML-like
tool call format. It supports incremental parsing of tool calls and converts
parameter values to their appropriate types based on the provided tool schema.
Assumed format:
<tool_call>
<function=function_name>
<parameter=param_name1>value1</parameter>
<parameter=param_name2>value2</parameter>
</function>
</tool_call>
"""| except: | ||
| logger.warning( | ||
| f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool " | ||
| f"'{func_name}', degenerating to string." | ||
| ) |
Using a bare except: is generally discouraged as it can catch unexpected exceptions like SystemExit or KeyboardInterrupt, making the program harder to debug and control. It's better to catch Exception if you want to catch all general exceptions. This applies to the other try...except: blocks in this function as well (lines 143, 165, and 172).
```diff
-except:
+except Exception:
 logger.warning(
     f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
     f"'{func_name}', degenerating to string."
 )
```
```python
maybe_convert = (
    False if "." in param_value or "e" in param_value.lower() else True
)
param_value: float = float(param_value)
if maybe_convert and param_value.is_integer():
    param_value = int(param_value)
```
The logic to convert a float-like string to an integer seems to have a flaw. The maybe_convert flag prevents conversion for strings containing a . like "3.0", which will remain a float instead of being converted to an integer. This is likely not the intended behavior. You can simplify the logic to correctly handle all cases where a float represents a whole number.
```diff
-maybe_convert = (
-    False if "." in param_value or "e" in param_value.lower() else True
-)
-param_value: float = float(param_value)
-if maybe_convert and param_value.is_integer():
-    param_value = int(param_value)
+val = float(param_value)
+if val.is_integer():
+    param_value = int(val)
+else:
+    param_value = val
```
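A quick self-contained check of the simplified rule (the helper name `to_number` is illustrative, not from the PR): whole-number floats collapse to `int` regardless of how the string is written.

```python
def to_number(s: str):
    """Convert a numeric string, collapsing whole-number floats to int."""
    val = float(s)
    return int(val) if val.is_integer() else val

# "3.0" and "1e2" both represent whole numbers, so they become ints;
# "3.5" stays a float.
```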
```python
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
        f"'{func_name}', degenerating to string."
    )
```
Using a bare except: is generally discouraged as it can hide unexpected errors. It's better to catch Exception to avoid catching system-exiting exceptions.
```diff
-except:
+except Exception:
 logger.warning(
     f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
     f"'{func_name}', degenerating to string."
 )
```
```python
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
        f"'{func_name}', will try other methods to parse it."
    )
```
Using a bare except: can catch more exceptions than intended. Please specify the exception type, or use except Exception: for a general catch.
```diff
-except:
+except Exception:
 logger.warning(
     f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
     f"'{func_name}', will try other methods to parse it."
 )
```
```python
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
    )
```
A bare except: is too broad. It's recommended to catch Exception to avoid suppressing critical system-level exceptions.
```diff
-except:
+except Exception:
 logger.warning(
     f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
 )
```
```python
def _reset_streaming_state(self):
    """Reset internal streaming cursors."""
    self.parsed_pos = 0
    self.current_tool_param_count = 0
    self.json_started = False
    self.is_inside_tool_call = False  # [FIX] Reset state

    # Base class state reset is handled by base class logic mostly,
    # but we ensure our cursor aligns with buffer resets.
    if hasattr(self, "_buffer") and not self._buffer:
```
If a stream is interrupted mid-generation (e.g. the user disconnects), `self.current_func_name` might retain a stale value. When the detector is reused, it could corrupt the next request.
Please ensure `self.current_func_name = None` is explicitly set in `_reset_streaming_state`.
```python
# -------------------------------------------------------
if current_slice.startswith(self.tool_call_start_token):
    self.parsed_pos += len(self.tool_call_start_token)
    self.is_inside_tool_call = True  # [FIX] Enter tool call region
```
Please clean up the comments.
```python
if not hasattr(self, "_buffer"):
    self._buffer = ""

# Index pointing to the next character to be processed in buffer
self.parsed_pos = 0
# Parameter count inside the current tool being processed, used to determine whether to add comma
self.current_tool_param_count = 0
# Flag indicating whether current tool has already sent '{'
```
```python
class Qwen3CoderDetector(BaseFormatDetector):
    def __init__(self):
        super().__init__()
        # Explicitly define all attributes with type hints.
        # No 'hasattr' checks needed. We control the lifecycle here.
        self._buffer: str = ""
        self.parsed_pos: int = 0
        self.current_tool_param_count: int = 0
        self.json_started: bool = False
        self.is_inside_tool_call: bool = False
        # Initialize attributes that were missing in the original PR
        self.current_func_name: Optional[str] = None
```

Force-pushed from be3a864 to bf57b75
/rerun-failed-ci
Force-pushed from f55b7fe to c41c78b
/rerun-failed-ci

/rerun-failed-ci
* support new qwen3_coder_detector (sgl-project#16744) Co-authored-by: liugaoji.lgj <liugaoji.lgj@alibaba-inc.com>
UT
E2E Test
The test results, provided by Zeyu Cui from the Qwen Team (@cyente), have been confirmed to meet expectations despite some jitter.