
feat: migrate Qwen3 reasoning parser to token-level OutputRouter (closes #64) #97

Closed
Ai-chan-0411 wants to merge 1 commit into raullenchai:main from Ai-chan-0411:feat/qwen3-output-router

@Ai-chan-0411

Closes #64

Summary

Migrate Qwen3's <think>/</think> reasoning parser from regex-based BaseThinkingReasoningParser to the token-level OutputRouter state machine.

Changes

vllm_mlx/output_router.py

  • Add <think>/</think> token handling in feed(): think_start enters the THINKING state, think_end switches to CONTENT
  • Add Qwen3/DeepSeek auto-detection in from_tokenizer() via <think> + </think> vocabulary entries
  • Token map already had think_start/think_end fields (added for future migration) — now wired up
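For readers who have not opened the diff, the transitions described above can be sketched roughly as follows. This is a minimal illustration, not the code in vllm_mlx/output_router.py: the class name MiniRouter, the State enum, the channel strings, and all token IDs are hypothetical, and implicit-thinking handling is deliberately left out.

```python
from enum import Enum, auto


class State(Enum):
    CONTENT = auto()
    THINKING = auto()


class MiniRouter:
    """Hypothetical stand-in for OutputRouter; not the PR's actual code."""

    def __init__(self, think_start, think_end, control_ids=()):
        self.think_start = think_start
        self.think_end = think_end
        self.control_ids = frozenset(control_ids)
        self.state = State.CONTENT

    def feed(self, token_id):
        """Return the channel for a token, or None if the token is suppressed."""
        if token_id == self.think_start:
            self.state = State.THINKING
            return None  # the tag itself is never emitted
        if token_id == self.think_end:
            self.state = State.CONTENT
            return None
        if token_id in self.control_ids:
            return None  # bos/eos and similar control tokens are dropped
        return "reasoning" if self.state is State.THINKING else "content"

    def reset(self):
        self.state = State.CONTENT


# Made-up IDs: 0 = bos, 1 = <think>, 2 = </think>, everything else is text.
router = MiniRouter(think_start=1, think_end=2, control_ids={0})
channels = [router.feed(t) for t in [0, 1, 10, 11, 2, 20]]
```

Because each decision depends only on a single token ID and the current state, the router needs no text buffering or lookahead.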

tests/test_output_router.py

  • Add Qwen3 vocabulary fixtures and qwen3_router fixture
  • Add TestQwen3ThinkRouting class with 9 tests:
    • <think> enters THINKING state
    • </think> switches to CONTENT state
    • Tokens between tags routed to REASONING channel
    • Content tokens after </think> routed to CONTENT channel
    • Full <think>reasoning</think>content sequence via feed_sequence()
    • Implicit thinking (no <think>, only </think>) handled
    • No-tag output passes through as pure content
    • Control tokens (bos/eos) suppressed
    • Reset clears thinking state
  • Add test_qwen3_detected in TestFromTokenizer

31 tests total, all passing.

Design

Follows the same pattern as the existing Gemma 4 implementation — token-level state machine with no text-level regex matching. This eliminates partial-token split issues that affect the current regex-based parser.
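To make the partial-token split issue concrete, here is a small illustration of the general failure mode. It is not taken from the existing parser: the chunk boundaries and token IDs are invented, and it assumes </think> is a single token in the vocabulary.

```python
import re

THINK_END = re.compile(r"</think>")

# Text deltas as a streaming consumer might receive them, with the closing
# tag split across two chunks (hypothetical chunk boundaries).
chunks = ["reasoning</thi", "nk>answer"]
per_chunk_hits = [bool(THINK_END.search(c)) for c in chunks]

# Token-level view of the same output: the tag is one ID (IDs are made up).
THINK_END_ID = 2
token_ids = [10, THINK_END_ID, 20]
token_hit = THINK_END_ID in token_ids
```

A regex applied chunk by chunk never sees the full tag unless the text is buffered across chunks, whereas the ID comparison cannot be split.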

The detection uses <think> + </think> in the tokenizer vocabulary, which covers both Qwen3 and DeepSeek R1 model families.
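As a rough sketch of that vocabulary check, assuming the vocabulary is available as a plain token-to-ID dict (for Hugging Face tokenizers, get_vocab() returns one). The helper name find_think_tokens and the IDs below are hypothetical; the actual logic lives in from_tokenizer().

```python
def find_think_tokens(vocab):
    """Return (start_id, end_id) if both tags are vocabulary entries, else None."""
    start = vocab.get("<think>")
    end = vocab.get("</think>")
    if start is None or end is None:
        return None
    return start, end


# Toy vocabularies with made-up IDs.
with_tags = {"<think>": 1, "</think>": 2, "hello": 10}
without_tags = {"hello": 10, "world": 11}

detected = find_think_tokens(with_tags)
not_detected = find_think_tokens(without_tags)
```

Requiring both entries avoids enabling the THINKING path for a tokenizer that only happens to contain one of the two strings.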

 raullenchai#64)

Add <think>/</think> token detection and routing to OutputRouter,
replacing fragile regex-based text matching with token-level state
machine transitions. Supports explicit, implicit, and no-tag scenarios.

- Detect Qwen3/DeepSeek tokenizers via <think> + </think> vocab entries
- Route tokens between <think>/</think> to REASONING channel
- 9 new tests covering all Qwen3 scenarios (31 total, all passing)
@Ai-chan-0411

Closing as maintainer has not reviewed after 168h. Thank you for the opportunity!

Development

Successfully merging this pull request may close these issues.

feat: migrate Qwen3 reasoning parser to OutputRouter (token-level)
