Enables force reasoning based on chat template for Qwen3-Thinking by JustinTong0323 · Pull Request #8369 · sgl-project/sglang

JustinTong0323 · 2025-07-25T23:12:09Z

Motivation

Adds a mechanism to enforce reasoning based on patterns detected in the chat template.

The changes introduce a force_reasoning flag to the TemplateManager and ReasoningParser. This flag is set if reasoning patterns are detected in the chat template, such as <think> tags.

This ensures that reasoning is enforced even when the user does not explicitly enable it.


Adds a mechanism to enforce reasoning based on patterns detected in the chat template. The changes introduce a `force_reasoning` flag to the `TemplateManager` and `ReasoningParser`. This flag is set if reasoning patterns are detected in the chat template, such as `<think>` tags. This ensures that reasoning is enforced even when the user does not explicitly enable it. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

gemini-code-assist

Summary of Changes

Hello @JustinTong0323, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to how reasoning is handled within the system, particularly for models like Qwen3-Thinking. It establishes a new capability to automatically detect if a loaded chat template implies a need for reasoning (e.g., through specific tags like <think>) and, if so, enforces this reasoning behavior. This ensures that the model's output consistently includes reasoning steps when the template design dictates it, improving adherence to model-specific output formats without requiring explicit user configuration.

Highlights

Automatic Reasoning Enforcement: Implemented a mechanism to automatically detect and enforce reasoning (e.g., via <think> tags) based on the chat template itself, ensuring reasoning is applied even if not explicitly enabled by the user. This is particularly relevant for models like Qwen3-Thinking.
Template Manager Enhancements: Added a force_reasoning property and a _detect_reasoning_pattern method to the TemplateManager. This new method scans the loaded chat template for specific patterns (like <think>, </think>, or reasoning_content) to determine if reasoning should be implicitly enforced.
Reasoning Parser Integration: Modified the ReasoningParser and its specific model detectors (including DeepSeekR1Detector, Qwen3Detector, and KimiDetector) to accept and utilize the new force_reasoning flag. This flag, propagated from the TemplateManager, dictates whether reasoning parsing should be active regardless of explicit user settings.
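
To make the effect of the flag concrete, here is a minimal sketch of how a parser might treat output when `force_reasoning` is set. It assumes the chat template has already emitted the opening `<think>` tag in the prompt, so the model's output starts inside the reasoning section without its own opening tag. `SketchReasoningParser` and its `parse` method are hypothetical names for illustration, not sglang's actual `ReasoningParser` API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SketchReasoningParser:
    """Hypothetical stand-in for a reasoning parser; not sglang's real class."""

    think_start: str = "<think>"
    think_end: str = "</think>"
    # Assumed to be propagated from the template manager after template inspection.
    force_reasoning: bool = False

    def parse(self, text: str) -> Tuple[str, str]:
        """Split model output into (reasoning, answer)."""
        body = text.lstrip()
        has_open_tag = body.startswith(self.think_start)
        if not (self.force_reasoning or has_open_tag):
            # Nothing indicates reasoning: the whole output is the answer.
            return "", text
        if has_open_tag:
            # Drop the opening tag when the model emitted it itself.
            body = body[len(self.think_start):]
        reasoning, closing_tag, answer = body.partition(self.think_end)
        if not closing_tag:
            # No closing tag yet: everything so far counts as reasoning.
            return reasoning.strip(), ""
        return reasoning.strip(), answer.strip()
```

With `force_reasoning=True`, the output `step one</think>Answer` splits into reasoning `step one` and answer `Answer`. With the flag off, the same string is returned unchanged as the answer, because no opening tag is present.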


zhyncs · 2025-07-25T23:13:17Z

After #8363, do we still need this?

CatherineSue · 2025-07-25T23:18:15Z

Thanks! Might be a better implementation than #8363. Can we integrate the UTs here?

JustinTong0323 · 2025-07-25T23:19:57Z

> After #8363 do we still need this

I talked with the Qwen team, and they think keeping a single reasoning parser is better for users and avoids confusion.

Simplifies the reasoning parser for Qwen3 models by removing the separate "qwen3-thinking" parser. Now, the "qwen3" parser handles both standard Qwen3 models (with `enable_thinking`) and Qwen3-Thinking models. This simplifies configuration and usage. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

zhyncs · 2025-07-25T23:42:35Z

hold on please

Refines the detection of force reasoning patterns within chat templates by using regular expressions for more accurate identification. This change enhances the system's ability to recognize and handle reasoning prompts, leading to improved model behavior. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
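
The regex-based detection described in this commit could look roughly like the sketch below. The pattern, the function name `detect_force_reasoning`, and the decision to treat any match as "force reasoning" are assumptions for illustration. The regular expression actually used in the PR is not shown on this page and is probably stricter, since a template that merely mentions `<think>` behind an `enable_thinking` switch would also match here.

```python
import re
from typing import Optional

# Hypothetical pattern: an opening or closing think tag (tolerating inner
# whitespace) or a reference to a reasoning_content field in the template.
_REASONING_PATTERN = re.compile(r"<\s*/?\s*think\s*>|reasoning_content")


def detect_force_reasoning(chat_template: Optional[str]) -> bool:
    """Return True if the template text contains a reasoning marker.

    A missing or empty template is treated as "no reasoning", so the caller
    never has to special-case models that ship without a chat template.
    """
    if not chat_template:
        return False
    return _REASONING_PATTERN.search(chat_template) is not None
```

A caller would run this once when the template is loaded and hand the boolean to the reasoning parser. Returning `False` for `None` mirrors the "fix none chat-template" commit later in this PR, although the handling there may differ.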

Adds a dedicated `qwen3-thinking` reasoning parser option. This commit introduces `qwen3-thinking` as a valid option for the `--reasoning-parser` argument. It allows users to explicitly specify the Qwen3-Thinking model and ensures correct handling of reasoning content, as these models always generate thinking content. This change clarifies documentation and improves the flexibility and accuracy of reasoning extraction for Qwen3-Thinking models. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

JustinTong0323 · 2025-07-26T23:11:31Z

At present, the behavior is consistent with the main branch. Users can choose either "qwen3" or "qwen3-thinking"; for the Qwen3-Thinking model, the two options behave identically.

CatherineSue

Overall LGTM. Left two nit comments.

docs/backend/openai_api_completions.ipynb

docs/backend/separate_reasoning.ipynb

Updates the default behavior for enabling thinking in chat templates. It now disables reasoning unless explicitly enabled, providing more control over the model's behavior. This change also addresses an issue where reasoning was unintentionally triggered. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

…l-project#8369) Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Signed-off-by: Xinyuan Tong <justinning0323@outlook.com> Co-authored-by: Chang Su <csu272@usc.edu>

JustinTong0323 requested review from ByronHsu, CatherineSue, Ying1123, hnyls2002, ispobock, merrymercy, xiezhq-hermann, zhaochenyang20 and zhyncs as code owners July 25, 2025 23:12

gemini-code-assist bot reviewed Jul 25, 2025

zhyncs assigned CatherineSue Jul 25, 2025

update

1e62a55

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

JustinTong0323 and others added 2 commits July 25, 2025 16:24

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

b6b087a

JustinTong0323 and others added 5 commits July 26, 2025 01:02

fix none chat-template

548651c

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

fix: tokenizer is none

2630d3a

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

c940b77

CatherineSue approved these changes Jul 27, 2025

JustinTong0323 and others added 3 commits July 27, 2025 15:08

Update docs/backend/openai_api_completions.ipynb

371b5fc

Co-authored-by: Chang Su <csu272@usc.edu>

Update docs/backend/separate_reasoning.ipynb

037b469

Co-authored-by: Chang Su <csu272@usc.edu>

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

994947e

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

af32c23

JustinTong0323 requested a review from slin1237 as a code owner July 27, 2025 23:51

JustinTong0323 and others added 12 commits July 28, 2025 00:09

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

7683956

feat: add thinking prompt for request in OpenAIServingChat

5005b07

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

273d252

fix: update thinking token comment and method signature in OpenAIServ…

be24e17

…ingChat Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

75eb213

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

68d9567

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

ba8da26

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

d1b4f00

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

f557d10

Merge branch 'main' into feat-auto-detect-force-reasoning-qwen3

f1f0c58

fix: tokenizer is none when skip init

b05b795

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

JustinTong0323 mentioned this pull request Aug 7, 2025

chore: bump v0.5.0 #8885

Closed

6 tasks

zhyncs merged commit 3fa3c6c into sgl-project:main Aug 7, 2025
260 of 288 checks passed

zhyncs merged 25 commits into sgl-project:main from JustinTong0323:feat-auto-detect-force-reasoning-qwen3
