Skip to content

Enables force reasoning based on chat template for Qwen3-Thinking#8369

Merged
zhyncs merged 25 commits intosgl-project:mainfrom
JustinTong0323:feat-auto-detect-force-reasoning-qwen3
Aug 7, 2025
Merged

Enables force reasoning based on chat template for Qwen3-Thinking#8369
zhyncs merged 25 commits intosgl-project:mainfrom
JustinTong0323:feat-auto-detect-force-reasoning-qwen3

Conversation

@JustinTong0323
Copy link
Copy Markdown
Collaborator

@JustinTong0323 JustinTong0323 commented Jul 25, 2025

Motivation

Adds a mechanism to enforce reasoning based on patterns detected in the chat template.

The changes introduce a force_reasoning flag to the TemplateManager and ReasoningParser. This flag is set if reasoning patterns are detected in the chat template, such as <think> tags.

This ensures that reasoning is enforced even when the user does not explicitly enable it.

Modifications

Checklist

Adds a mechanism to enforce reasoning based on patterns detected in the chat template.

The changes introduce a `force_reasoning` flag to the `TemplateManager` and `ReasoningParser`. This flag is set if reasoning patterns are detected in the chat template, such as `<think>` tags.

This ensures that reasoning is enforced even when the user does not explicitly enable it.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @JustinTong0323, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to how reasoning is handled within the system, particularly for models like Qwen3-Thinking. It establishes a new capability to automatically detect if a loaded chat template implies a need for reasoning (e.g., through specific tags like <think>) and, if so, enforces this reasoning behavior. This ensures that the model's output consistently includes reasoning steps when the template design dictates it, improving adherence to model-specific output formats without requiring explicit user configuration.

Highlights

  • Automatic Reasoning Enforcement: Implemented a mechanism to automatically detect and enforce reasoning (e.g., via <think> tags) based on the chat template itself, ensuring reasoning is applied even if not explicitly enabled by the user. This is particularly relevant for models like Qwen3-Thinking.
  • Template Manager Enhancements: Added a force_reasoning property and a _detect_reasoning_pattern method to the TemplateManager. This new method scans the loaded chat template for specific patterns (like <think>, </think>, or reasoning_content) to determine if reasoning should be implicitly enforced.
  • Reasoning Parser Integration: Modified the ReasoningParser and its specific model detectors (including DeepSeekR1Detector, Qwen3Detector, and KimiDetector) to accept and utilize the new force_reasoning flag. This flag, propagated from the TemplateManager, dictates whether reasoning parsing should be active regardless of explicit user settings.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@zhyncs
Copy link
Copy Markdown
Collaborator

zhyncs commented Jul 25, 2025

After #8363 do we still need this

gemini-code-assist[bot]

This comment was marked as outdated.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@CatherineSue
Copy link
Copy Markdown
Collaborator

Thanks! Might be a better implementation than #8363. Can we integrate the UTs here?

@JustinTong0323
Copy link
Copy Markdown
Collaborator Author

After #8363 do we still need this在 #8363 之后,我们是否还需要这个

I talked with Qwen team and they think remaining one reasoning parser is better for users, and avoid confusion.

JustinTong0323 and others added 2 commits July 25, 2025 16:24
Simplifies the reasoning parser for Qwen3 models by removing the separate "qwen3-thinking" parser.

Now, the "qwen3" parser handles both standard Qwen3 models (with `enable_thinking`) and Qwen3-Thinking models. This simplifies configuration and usage.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@zhyncs
Copy link
Copy Markdown
Collaborator

zhyncs commented Jul 25, 2025

hold on please

JustinTong0323 and others added 5 commits July 26, 2025 01:02
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Refines the detection of force reasoning patterns within chat templates
by using regular expressions for more accurate identification.

This change enhances the system's ability to recognize and handle
reasoning prompts, leading to improved model behavior.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Adds a dedicated `qwen3-thinking` reasoning parser option.

This commit introduces `qwen3-thinking` as a valid option for the `--reasoning-parser` argument. It allows users to explicitly specify the Qwen3-Thinking model and ensures correct handling of reasoning content, as these models always generate thinking content.

This change clarifies documentation and improves the flexibility and accuracy of reasoning extraction for Qwen3-Thinking models.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323
Copy link
Copy Markdown
Collaborator Author

At present, the behavior is consistent with that of the main branch. Users can choose to use either "qwen3" or "qwen3-thinking". Regarding the Qwen3-Thinking model, these two options exhibit the same behavior.

Copy link
Copy Markdown
Collaborator

@CatherineSue CatherineSue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Left two nit comments.

@JustinTong0323 JustinTong0323 requested a review from slin1237 as a code owner July 27, 2025 23:51
@JustinTong0323 JustinTong0323 mentioned this pull request Aug 7, 2025
6 tasks
@zhyncs zhyncs merged commit 3fa3c6c into sgl-project:main Aug 7, 2025
260 of 288 checks passed
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

narutolhy pushed a commit to narutolhy/sglang that referenced this pull request Aug 17, 2025
…l-project#8369)

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Co-authored-by: Chang Su <csu272@usc.edu>
MahmoudAshraf97 pushed a commit to MahmoudAshraf97/sglang that referenced this pull request Sep 8, 2025
…l-project#8369)

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Co-authored-by: Chang Su <csu272@usc.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants