
Unify think_end_id to model_config as single source of truth #22148

Merged: hnyls2002 merged 6 commits into main from lsyin/unify-think-end-id on Apr 5, 2026
Conversation

@hnyls2002
Collaborator

Stacked on #22146.

think_end_id was stored in three redundant locations: dynamically patched onto the tokenizer, as a self._think_end_id private field on the Scheduler, and on model_config.

This consolidates to model_config.think_end_id as the single canonical source. The tokenizer patch and scheduler private field are removed. create_grammar_backend now takes think_end_id as an explicit parameter instead of reading it from the tokenizer.

Changes (4 files, net -3 lines):

  • scheduler.py: compute once, store only on model_config
  • scheduler_output_processor_mixin.py: read from self.model_config
  • base_grammar_backend.py: accept think_end_id param, drop hasattr guard
  • grammar_manager.py: pass model_config.think_end_id to grammar backend
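The consolidated flow in the list above can be sketched roughly as follows. This is an illustrative stand-in, not the actual sglang code: `ModelConfig`, `compute_think_end_id`, and the dict-returning `create_grammar_backend` are simplified placeholders.

```python
# Illustrative sketch only: these names are stand-ins, not the real
# sglang classes or signatures.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class ModelConfig:
    think_end_id: Optional[int] = None  # the single canonical source

def compute_think_end_id(encode: Callable[[str], List[int]],
                         think_end_token: str) -> Optional[int]:
    # Computed once, at scheduler init time.
    ids = encode(think_end_token)
    return ids[0] if ids else None

def create_grammar_backend(think_end_id: Optional[int]) -> dict:
    # Takes the id as an explicit parameter instead of reading a
    # dynamically patched attribute off the tokenizer.
    return {"think_end_id": think_end_id}

# Wiring: compute once, store only on model_config, pass explicitly.
model_config = ModelConfig()
fake_encode = lambda s: [151668]  # stand-in for tokenizer.encode(..., add_special_tokens=False)
model_config.think_end_id = compute_think_end_id(fake_encode, "</think>")
backend = create_grammar_backend(model_config.think_end_id)
```

Every downstream reader then consults `model_config.think_end_id`, so there is no second copy to drift out of sync.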

Move reasoning token tracking into the V1 verify phase (eagle_info, ngram_info) so it stays alongside output_ids/check_finished/grammar. Isolate V1's post-processing block with an early continue, making the V1-specific code easy to locate and delete when deprecating V1.

Previously think_end_id was stored in three places: dynamically patched onto the tokenizer, as self._think_end_id on the Scheduler, and on model_config. Consolidate to model_config.think_end_id as the single source of truth. Remove the tokenizer patch and the scheduler private field. Pass think_end_id explicitly to create_grammar_backend instead of reading it from the tokenizer.
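The "isolate with an early continue" pattern described above can be sketched in minimal form; the function and the batch handling here are hypothetical, chosen only to show the shape of the control flow.

```python
# Minimal sketch of isolating version-specific post-processing behind an
# early `continue`; `is_v1` and the batch contents are hypothetical.
def process_batches(batches, is_v1):
    results = []
    for batch in batches:
        if is_v1:
            # --- V1-only post-processing: one contiguous block that can
            # be deleted wholesale when V1 is deprecated ---
            results.append(("v1", batch))
            continue
        # Common path, untouched by the V1 block above.
        results.append(("v2", batch))
    return results
```

Because the V1 branch ends in `continue`, none of its logic interleaves with the common path, which is what makes later deletion a one-block change.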
Contributor

@gemini-code-assist Bot left a comment


Code Review

This pull request refactors the handling of the think_end_id by moving it from the tokenizer and scheduler instance variables into the model_config. This change streamlines how reasoning tokens are identified across the grammar backend and scheduler output processor. Feedback was provided regarding a potential IndexError in scheduler.py when encoding the think_end_token, suggesting a safety check for cases where the tokenizer might return an empty list.

Comment on lines +552 to 554

```python
self.model_config.think_end_id = self.tokenizer.encode(
    reasoning_parser.detector.think_end_token, add_special_tokens=False
)[0]
```
Contributor


Severity: medium

The current implementation assumes that self.tokenizer.encode will always return at least one token ID. If the tokenizer fails to encode the think_end_token (e.g., due to an empty string or a tokenizer-specific quirk), this will raise an IndexError. Additionally, if the token is split into multiple IDs, only the first one is captured, which might lead to incorrect reasoning detection later. Consider adding a safety check.

Suggested change:

```diff
-self.model_config.think_end_id = self.tokenizer.encode(
-    reasoning_parser.detector.think_end_token, add_special_tokens=False
-)[0]
+ids = self.tokenizer.encode(
+    reasoning_parser.detector.think_end_token, add_special_tokens=False
+)
+self.model_config.think_end_id = ids[0] if ids else None
```
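A quick standalone illustration of the failure mode being flagged, using generic helpers rather than sglang code: indexing `[0]` on an empty encode result raises IndexError, while the guarded form degrades to None.

```python
# Generic helpers demonstrating the review's point; not sglang code.
def first_id_unsafe(ids):
    return ids[0]  # raises IndexError on an empty encode result

def first_id_safe(ids):
    return ids[0] if ids else None  # degrades to None instead

try:
    first_id_unsafe([])
    raised = False
except IndexError:
    raised = True
```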

Base automatically changed from lsyin/isolate-spec-v1-post-processing to main April 5, 2026 10:16
@hnyls2002
Collaborator Author

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label Apr 5, 2026
@hnyls2002
Collaborator Author

/rerun-test test_reasoning_tokens.py

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

test_reasoning_tokens.py: No test file found matching test_reasoning_tokens.py under test/registered/.

@hnyls2002
Collaborator Author

/rerun-test test_eagle_infer_a.py

@hnyls2002
Collaborator Author

/rerun-test test_eagle_constrained_decoding.py

@hnyls2002
Collaborator Author

/rerun-test test_ngram_speculative_decoding.py

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

1-gpu-h100 (1 test): View workflow run

```shell
cd test/ && python3 registered/spec/eagle/test_eagle_infer_a.py
```

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

1-gpu-h100 (1 test): View workflow run

```shell
cd test/ && python3 registered/spec/eagle/test_eagle_constrained_decoding.py
```

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

1-gpu-h100 (1 test): View workflow run

```shell
cd test/ && python3 registered/spec/test_ngram_speculative_decoding.py
```

@hnyls2002
Collaborator Author

/rerun-test test_reasoning.py

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

1-gpu-h100 (1 test): View workflow run

```shell
cd test/ && python3 registered/reasoning/test_reasoning.py
```

@hnyls2002
Collaborator Author

/rerun-test test_reasoning.py test_eagle_infer_a.py test_eagle_constrained_decoding.py test_ngram_speculative_decoding.py

@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

1-gpu-h100 (4 tests): View workflow run

```shell
cd test/ && python3 registered/reasoning/test_reasoning.py
cd test/ && python3 registered/spec/eagle/test_eagle_infer_a.py
cd test/ && python3 registered/spec/eagle/test_eagle_constrained_decoding.py
cd test/ && python3 registered/spec/test_ngram_speculative_decoding.py
```

@hnyls2002 hnyls2002 merged commit df9c831 into main Apr 5, 2026
81 of 134 checks passed
@hnyls2002 hnyls2002 deleted the lsyin/unify-think-end-id branch April 5, 2026 10:35
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026
