[Bugfix] fix the default reasoning mode in Reasoner Grammar.#10831
[Bugfix] fix the default reasoning mode in Reasoner Grammar.#10831jeremyzhang866 wants to merge 22 commits intosgl-project:mainfrom
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
@minleminzui please have a look |
|
@hnyls2002 I reopened the PR — please review when convenient. Thanks! |
@hnyls2002 have a look.thanks |
|
@zhyncs Can you help review this? |
|
@jeremyzhang866 sglang/test/srt/test_reasoning_parser.py,could you please verify whether you have passed this test |
@minleminzui I tested it and found no issues. Do you have any suggestions? Thanks. |
|
@hnyls2002 @zhyncs Could you help review it when you have time |
|
@xiezhq-hermann Could you help review it when you have time |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request addresses a bug where grammar constraints were not correctly applied for reasoning models when 'thinking mode' was disabled. The fix involves introducing a may_can_reasoning flag that is passed from the scheduler to the ReasonerGrammarObject to correctly initialize its reasoning state. The overall approach is sound and effectively resolves the issue. My review includes a few suggestions to improve code clarity and maintainability by restoring type hints that were removed in base_grammar_backend.py and correcting an inconsistent type hint for the cache.
JustinTong0323
left a comment
There was a problem hiding this comment.
Thanks for the contribution! Overall LGTM, could you help resolve gemini's comments?
|
|
@JustinTong0323 Could you help review it again? Thanks. |
|
@JustinTong0323 are there any other comments, or can it be merged? thanks :) |
We need to wait for "required" CI green but there seems some issue with huggingface side blocking the PR. I would keep an eye on this PR, thanks for your effort! |
Thank you for your reply. |
|
@hnyls2002 @xiezhq-hermann please have a look. thanks |
Sorry for the late reply, just got a similar issue, and it could be better to add a test for this corner case.. |
| super().__init__() | ||
| self.grammar = grammar | ||
| self.think_end_id = think_end_id | ||
| self.is_in_reasoning = True |
There was a problem hiding this comment.
Why don't we simpily set this field to False?
@JustinTong0323 Sorry Could you please clarify what I should do? I thought the example above already showed the issue. Thanks. |
May you check the PR that just mentioned? Maybe you could discuss and find out how to combine your PRs... (and sorry I am OOO this week so reply maybe late... |
Thanks for your reply. I just checked that PR — the motivation behind it is the same as mine. That PR might have better extensibility, but my changes are a bit simpler. |
Motivation
When we start a hybrid reasoning model on the server side, such as DeepSeek V3.1 or Qwen-14B, and enable both the reasoning parser and speculative decoding, then on the client side disable “thinking mode” in
chat_template_kwargswhile also enabling JSON-constrained decoding, we observe that SGLang produces abnormal results.#10789
Modifications
We found that this may be related to the
ReasonerGrammarBackend, where theReasonerGrammarObjectdefaults tois_in_reasoning = true. Therefore, we are considering some actions to take when “thinking mode” is disabled inchat_template_kwargs.sglang/python/sglang/srt/constrained/reasoner_grammar_backend.py
Line 28 in d21c352
script
Benchmarking and Profiling
Checklist