
support new qwen3_coder_detector #16744

Merged
Kangyan-Zhou merged 11 commits into sgl-project:main from attack204:feature/gaoji_support_new_qwen3_coder_detector
Jan 19, 2026

Conversation


@attack204 attack204 commented Jan 8, 2026

UT

(screenshot)

E2E Test

(screenshot)

The test results, provided by Zeyu Cui from the Qwen Team @cyente, have been confirmed to meet expectations despite the presence of some jitter.

@gemini-code-assist
Contributor

Summary of Changes

Hello @attack204, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significantly improved Qwen3CoderDetector for handling function calls within Qwen3 models. The core change is a complete overhaul of the parsing logic, moving to a more robust, cursor-based streaming approach that can accurately extract tool names and their arguments, including complex data types, even when input is fragmented. This enhancement is backed by a comprehensive new test framework designed to rigorously validate the parser's behavior across various scenarios.

Highlights

  • New Qwen3 Coder Detector: Introduced a new Qwen3CoderDetector to support function call parsing for Qwen3 models, replacing previous detection logic.
  • Robust Streaming Parser: Implemented a new cursor-based streaming parser in Qwen3CoderDetector for incremental and reliable processing of tool calls and their arguments.
  • Schema-Aware Parameter Conversion: Added functionality to convert parameter values to appropriate Python types (e.g., int, bool, list, dict) based on the provided tool schemas.
  • Comprehensive Test Suite: Included a new, extensive test suite with a dedicated test framework, fixtures, and various test cases to validate both streaming and non-streaming parsing, covering edge cases and complex data structures.
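The schema-aware parameter conversion highlighted above can be sketched as follows. This is a minimal illustration with a hypothetical helper name (`convert_param`), not the PR's actual `_convert_param_value`, which handles many more type aliases and fallbacks:

```python
import json

def convert_param(value: str, schema_type: str):
    """Convert a raw string parameter to a Python value per its schema type.

    Hypothetical helper illustrating the idea; the real implementation
    in the PR covers more aliases (e.g. "uint", "varchar") and fallbacks.
    """
    if value.lower() == "null":
        return None
    if schema_type in ("integer", "int"):
        try:
            return int(value)
        except ValueError:
            return value  # degrade to string on failure
    if schema_type in ("boolean", "bool"):
        return value.lower() == "true"
    if schema_type in ("object", "array"):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value  # default: keep the string

print(convert_param("42", "integer"))       # 42
print(convert_param("true", "boolean"))     # True
print(convert_param('{"a": 1}', "object"))  # {'a': 1}
```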


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@attack204
Contributor Author

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Jan 8, 2026
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the Qwen3CoderDetector to use a more robust cursor-based streaming parser, which is a significant improvement over the previous regex-based implementation. It also introduces a comprehensive suite of tests to validate the new parser's functionality for both streaming and non-streaming scenarios.

While the changes are a positive step forward, I've identified a few critical and high-severity issues that need to be addressed:

  • The Qwen3CoderDetector class is currently not instantiable due to an unimplemented abstract method.
  • There's a debug logger.critical statement that should be removed.
  • The streaming state is not correctly reset, which will lead to issues when the detector instance is reused.

Additionally, there are some medium-severity issues related to maintainability, such as comments in Chinese and the use of bare except blocks. Please see the detailed comments for suggestions on how to resolve these issues.

Comment on lines +16 to +18
logger.critical(
    f"[xixi.yjx] PARSER: 1231 Try to port from vLLM parser: Using cursor-based streaming parser."
)

high

This logger.critical call appears to be a debug statement. Such statements should be removed from production code to avoid polluting logs with irrelevant information, especially at a critical level.

Comment on lines +59 to +69
def _reset_streaming_state(self):
    """Reset internal streaming cursors."""
    self.parsed_pos = 0
    self.current_tool_param_count = 0
    self.json_started = False
    self.is_inside_tool_call = False  # [FIX] Reset state

    # Base class state reset is handled by base class logic mostly,
    # but we ensure our cursor aligns with buffer resets.
    if hasattr(self, "_buffer") and not self._buffer:
        self.parsed_pos = 0

high

The _reset_streaming_state method is incomplete. It fails to reset state variables inherited from BaseFormatDetector, such as current_tool_id and current_tool_name_sent. Since these are modified during parsing, not resetting them will cause state to leak between streaming sessions on the same detector instance, leading to incorrect behavior. The method should reset all streaming-related state from both the base and child class.

    def _reset_streaming_state(self):
        """Reset internal streaming cursors and all streaming state."""
        # Reset state for this class
        self.parsed_pos = 0
        self.current_tool_param_count = 0
        self.json_started = False
        self.is_inside_tool_call = False
        self.current_func_name = None

        # Reset state from BaseFormatDetector
        # Note: _buffer is also part of state that should be reset.
        self._buffer = ""
        self.prev_tool_call_arr = []
        self.current_tool_id = -1
        self.current_tool_name_sent = False
        self.streamed_args_for_tool = []

Comment on lines +43 to +58
# Streaming State
# Overrides the parent class's _buffer management, or works alongside it.
# SGLang BaseFormatDetector usually has its own _buffer, but we manage it explicitly here to keep the logic clear
if not hasattr(self, "_buffer"):
    self._buffer = ""

# Index of the next character in the buffer to be processed
self.parsed_pos = 0
# Parameter count inside the current tool, used to decide whether to add a comma
self.current_tool_param_count = 0
# Flag indicating whether the current tool has already sent '{'
self.json_started = False

# [FIX] New state flag: marks whether we are inside a tool_call block
self.is_inside_tool_call = False


medium

This __init__ method has a couple of issues:

  1. The instance attribute self.current_func_name is used in parse_streaming_increment but it's not initialized here. This could lead to an AttributeError. It should be initialized, for example to None.
  2. There are several comments in Chinese (e.g., lines 44-45, 49, 51, 53, 56). For consistency and maintainability in this codebase, they should be translated to English.
  3. The comment [FIX] on line 56 should be removed.

Comment on lines +98 to +179
def _convert_param_value(
    self, param_value: str, param_name: str, param_config: dict, func_name: str
) -> Any:
    """Convert parameter value based on its type in the schema."""
    # Handle null value for any type
    if param_value.lower() == "null":
        return None

    if param_name not in param_config:
        if param_config != {}:
            logger.warning(
                f"Parsed parameter '{param_name}' is not defined in the tool "
                f"parameters for tool '{func_name}', directly returning the string value."
            )
        return param_value

    if (
        isinstance(param_config[param_name], dict)
        and "type" in param_config[param_name]
    ):
        param_type = str(param_config[param_name]["type"]).strip().lower()
    else:
        param_type = "string"
    if param_type in ["string", "str", "text", "varchar", "char", "enum"]:
        return param_value
    elif (
        param_type.startswith("int")
        or param_type.startswith("uint")
        or param_type.startswith("long")
        or param_type.startswith("short")
        or param_type.startswith("unsigned")
    ):
        try:
            param_value = int(param_value)
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
                f"'{func_name}', degenerating to string."
            )
        return param_value
    elif param_type.startswith("num") or param_type.startswith("float"):
        try:
            float_param_value = float(param_value)
            param_value = (
                float_param_value
                if float_param_value - int(float_param_value) != 0
                else int(float_param_value)
            )
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
                f"'{func_name}', degenerating to string."
            )
        return param_value
    elif param_type in ["boolean", "bool", "binary"]:
        param_value = param_value.lower()
        if param_value not in ["true", "false"]:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' is not a boolean (`true` or `false`) in tool '{func_name}', degenerating to false."
            )
        return param_value == "true"
    else:
        if (
            param_type in ["object", "array", "arr"]
            or param_type.startswith("dict")
            or param_type.startswith("list")
        ):
            try:
                param_value = json.loads(param_value)
                return param_value
            except:
                logger.warning(
                    f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
                    f"'{func_name}', will try other methods to parse it."
                )
        try:
            param_value = ast.literal_eval(param_value)  # safer
        except:
            logger.warning(
                f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
            )
        return param_value

medium

The try...except blocks in this method use a bare except:, which is too broad and can mask unexpected errors. It's better to catch specific exceptions that you expect to occur during type conversion, such as ValueError for int() and float(), and json.JSONDecodeError or SyntaxError for json.loads() and ast.literal_eval(). This makes the error handling more robust and predictable.
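The reviewer's suggestion can be sketched as follows, using a hypothetical helper `parse_structured` to show which specific exceptions each parser raises:

```python
import ast
import json
import logging

logger = logging.getLogger(__name__)

def parse_structured(value: str):
    """Parse a string as JSON first, then as a Python literal.

    Sketch of the review suggestion: catch the specific exceptions each
    parser raises instead of using a bare `except:`.
    """
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        logger.warning("json.loads failed; falling back to ast.literal_eval")
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        logger.warning("ast.literal_eval failed; returning the raw string")
    return value

print(parse_structured("[1, 2, 3]"))   # [1, 2, 3]
print(parse_structured("(1, 2)"))      # (1, 2)
print(parse_structured("not a list"))  # not a list
```

Note that `ast.literal_eval` raises `ValueError` for malformed literals and `SyntaxError` for unparsable input, so both are caught; system-exiting exceptions such as `KeyboardInterrupt` still propagate.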

Comment on lines +50 to +58
"""
运行单个测试用例

Args:
test_name: 测试名称
response_text: 响应文本
mode: 流式生成模式
expected: 期望的解析结果,包含 'text' 和 'tools' 字段
"""

medium

The docstrings and comments in this test file are in Chinese. To maintain consistency and improve accessibility for all contributors, please translate them to English.

Comment on lines +49 to +63
"""
运行单个流式测试

Args:
test_name: 测试名称
response_text: 完整的响应文本
mode: 流式生成模式 ('char', 'atomic_tags', 或其他)
tools: 工具列表,如果为 None 则使用默认工具
verbose: 是否打印详细日志,如果为 None 则使用实例属性 self.verbose
compare_with_non_streaming: 是否与非流式解析结果对比,如果为 None 则根据 parser_mode 自动决定
expected: 期望的解析结果,包含 'text' 和 'tools' 字段

Returns:
StreamingTestResult: 测试结果
"""

medium

The docstrings and comments in this test framework file are in Chinese. For better maintainability and to ensure the code is accessible to all contributors, please translate these to English.

@attack204 attack204 force-pushed the feature/gaoji_support_new_qwen3_coder_detector branch 5 times, most recently from ba3597a to fa40bac on January 10, 2026
@attack204
Contributor Author

/rerun-failed-ci

@zhaochenyang20
Collaborator

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant refactoring of the Qwen3CoderDetector to support the new Qwen3 Coder model's tool call format. The implementation is completely rewritten, moving from a simpler regex-based parser to a more robust, cursor-based streaming parser. This new approach improves handling of various edge cases and incremental JSON construction. Additionally, it adds sophisticated parameter type conversion based on the tool's schema. The accompanying tests have also been completely rewritten to provide comprehensive coverage for the new implementation, including basic functionality, streaming, parameter types, and edge cases.

The changes are a great improvement in terms of robustness and functionality. I have a few suggestions to further improve the code quality, mainly around exception handling and a small bug in type conversion logic.

return raw


class Qwen3CoderDetector(BaseFormatDetector):

medium

The docstring for Qwen3CoderDetector was removed. The new implementation is significantly more complex than the previous one, involving a cursor-based streaming parser and type conversion logic. Adding a new, detailed docstring explaining the class's purpose, its state variables, and the expected tool call format would greatly improve maintainability. For example:

    """
    Detector for Qwen3 Coder models.

    This detector uses a cursor-based streaming parser to handle the XML-like
    tool call format. It supports incremental parsing of tool calls and converts
    parameter values to their appropriate types based on the provided tool schema.

    Assumed format:
        <tool_call>
        <function=function_name>
        <parameter=param_name1>value1</parameter>
        <parameter=param_name2>value2</parameter>
        </function>
        </tool_call>
    """

Comment on lines +129 to +133
except:
logger.warning(
f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
f"'{func_name}', degenerating to string."
)

medium

Using a bare except: is generally discouraged as it can catch unexpected exceptions like SystemExit or KeyboardInterrupt, making the program harder to debug and control. It's better to catch Exception if you want to catch all general exceptions. This applies to the other try...except: blocks in this function as well (lines 143, 165, and 172).

Suggested change
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
        f"'{func_name}', degenerating to string."
    )
except Exception:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
        f"'{func_name}', degenerating to string."
    )

Comment on lines +137 to +142
maybe_convert = (
    False if "." in param_value or "e" in param_value.lower() else True
)
param_value: float = float(param_value)
if maybe_convert and param_value.is_integer():
    param_value = int(param_value)

medium

The logic to convert a float-like string to an integer seems to have a flaw. The maybe_convert flag prevents conversion for strings containing a . like "3.0", which will remain a float instead of being converted to an integer. This is likely not the intended behavior. You can simplify the logic to correctly handle all cases where a float represents a whole number.

Suggested change
maybe_convert = (
    False if "." in param_value or "e" in param_value.lower() else True
)
param_value: float = float(param_value)
if maybe_convert and param_value.is_integer():
    param_value = int(param_value)
val = float(param_value)
if val.is_integer():
    param_value = int(val)
else:
    param_value = val
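The behavior of the suggested fix can be demonstrated with a small sketch (hypothetical helper `normalize_number`): whole-number floats collapse to `int` regardless of how the string was formatted, while true fractions stay `float`.

```python
def normalize_number(raw: str):
    """Parse a numeric string, collapsing whole-number floats to int.

    Mirrors the suggested fix: "3.0" and "3" both become the integer 3,
    while "3.5" stays a float. Raises ValueError on non-numeric input,
    which the caller is expected to handle.
    """
    val = float(raw)
    return int(val) if val.is_integer() else val

print(normalize_number("3"))    # 3
print(normalize_number("3.0"))  # 3
print(normalize_number("3.5"))  # 3.5
print(normalize_number("1e2"))  # 100
```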

Comment on lines +143 to +147
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
        f"'{func_name}', degenerating to string."
    )

medium

Using a bare except: is generally discouraged as it can hide unexpected errors. It's better to catch Exception to avoid catching system-exiting exceptions.

Suggested change
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
        f"'{func_name}', degenerating to string."
    )
except Exception:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
        f"'{func_name}', degenerating to string."
    )

Comment on lines +165 to +169
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
        f"'{func_name}', will try other methods to parse it."
    )

medium

Using a bare except: can catch more exceptions than intended. Please specify the exception type, or use except Exception: for a general catch.

Suggested change
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
        f"'{func_name}', will try other methods to parse it."
    )
except Exception:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
        f"'{func_name}', will try other methods to parse it."
    )

Comment on lines +172 to +175
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
    )

medium

A bare except: is too broad. It's recommended to catch Exception to avoid suppressing critical system-level exceptions.

Suggested change
except:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
    )
except Exception:
    logger.warning(
        f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
    )

Comment on lines +56 to +65
def _reset_streaming_state(self):
    """Reset internal streaming cursors."""
    self.parsed_pos = 0
    self.current_tool_param_count = 0
    self.json_started = False
    self.is_inside_tool_call = False  # [FIX] Reset state

    # Base class state reset is handled by base class logic mostly,
    # but we ensure our cursor aligns with buffer resets.
    if hasattr(self, "_buffer") and not self._buffer:
Collaborator

If a stream is interrupted mid-generation (user disconnects), self.current_func_name might retain a stale value. When the detector is reused, it could corrupt the next request.

Please ensure self.current_func_name = None is explicitly set in _reset_streaming_state.

# -------------------------------------------------------
if current_slice.startswith(self.tool_call_start_token):
    self.parsed_pos += len(self.tool_call_start_token)
    self.is_inside_tool_call = True  # [FIX] Enter tool call region
Collaborator

Please clean up these comments (the [FIX] markers and the separator line).

Comment on lines +43 to +50
if not hasattr(self, "_buffer"):
    self._buffer = ""

# Index pointing to the next character to be processed in buffer
self.parsed_pos = 0
# Parameter count inside the current tool being processed, used to determine whether to add comma
self.current_tool_param_count = 0
# Flag indicating whether current tool has already sent '{'
Collaborator

In our style guide, we try to avoid hasattr as much as possible, as in these two examples:

  • Avoid Dynamic Attributes: Minimize the use of getattr or setattr. Code should be explicit for better traceability. (Example, Example 2)

Collaborator

class Qwen3CoderDetector(BaseFormatDetector):
    def __init__(self):
        super().__init__()
        
        # Explicitly define all attributes with Type Hints
        # No 'hasattr' checks needed. We control the lifecycle here.
        self._buffer: str = "" 
        self.parsed_pos: int = 0
        self.current_tool_param_count: int = 0
        self.json_started: bool = False
        self.is_inside_tool_call: bool = False
        
        # Initialize attributes that were missing in the original PR
        self.current_func_name: Optional[str] = None

@attack204 attack204 force-pushed the feature/gaoji_support_new_qwen3_coder_detector branch from be3a864 to bf57b75 on January 11, 2026
@attack204
Contributor Author

/rerun-failed-ci

@attack204 attack204 force-pushed the feature/gaoji_support_new_qwen3_coder_detector branch from f55b7fe to c41c78b on January 18, 2026
@zhaochenyang20
Collaborator

/rerun-failed-ci

@zhaochenyang20
Collaborator

/rerun-failed-ci

@zhaochenyang20 zhaochenyang20 left a comment

You are great

@Kangyan-Zhou Kangyan-Zhou merged commit 858a4d6 into sgl-project:main Jan 19, 2026
226 of 241 checks passed
DotSlash-A pushed a commit to DotSlash-A/sglang that referenced this pull request Jan 19, 2026