Add support for Olmo 3 reasoning parser#26039
Add support for Olmo 3 reasoning parser#26039soldni wants to merge 7 commits intovllm-project:mainfrom soldni:main
Conversation
Signed-off-by: Luca Soldaini <luca@soldaini.net>
Signed-off-by: Luca Soldaini <luca@soldaini.net>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
There was a problem hiding this comment.
Code Review
This pull request adds a reasoning parser for Olmo 3 models. The implementation includes a new parser class, Olmo3ReasoningParser, which handles both streaming and non-streaming outputs by operating on strings to find <think> and </think> tags. Unit tests are also included to verify the parser's behavior. The overall approach is sound, but I've found a critical issue in the string overlap detection logic that could lead to incorrect parsing under certain tokenization scenarios. The rest of the changes, including the test cases and module exports, look good.
| for i in range(len(a) - 1, 1, -1): | ||
| if a[-i:] == b[:i]: | ||
| ind_a = Indices(len(a) - i, len(a)) | ||
| ind_b = Indices(0, i) | ||
| return (ind_b, ind_a) if swap else (ind_a, ind_b) | ||
|
|
||
| # third check: does the beginning of a overlap with | ||
| # the end of b? | ||
| for i in range(len(a) - 1, 1, -1): | ||
| if b[-i:] == a[:i]: | ||
| ind_a = Indices(0, i) | ||
| ind_b = Indices(len(b) - i, len(b)) | ||
| return (ind_b, ind_a) if swap else (ind_a, ind_b) |
There was a problem hiding this comment.
The loops for detecting string overlaps miss single-character overlaps. The range function's stop parameter is exclusive. Using 1 as the stop value in range(len(a) - 1, 1, -1) means the loop will not execute for an overlap of length 1.
For example, if checking for an overlap between "...a" and "b..." where the overlap is a single character, the current implementation would fail. This could cause the parser to miss <think> or </think> tags if they are split across tokens in a way that results in a single-character overlap.
To fix this, the stop value in the range calls should be 0 to include checks for overlaps of length 1.
| for i in range(len(a) - 1, 1, -1): | |
| if a[-i:] == b[:i]: | |
| ind_a = Indices(len(a) - i, len(a)) | |
| ind_b = Indices(0, i) | |
| return (ind_b, ind_a) if swap else (ind_a, ind_b) | |
| # third check: does the beginning of a overlap with | |
| # the end of b? | |
| for i in range(len(a) - 1, 1, -1): | |
| if b[-i:] == a[:i]: | |
| ind_a = Indices(0, i) | |
| ind_b = Indices(len(b) - i, len(b)) | |
| return (ind_b, ind_a) if swap else (ind_a, ind_b) | |
| for i in range(len(a) - 1, 0, -1): | |
| if a[-i:] == b[:i]: | |
| ind_a = Indices(len(a) - i, len(a)) | |
| ind_b = Indices(0, i) | |
| return (ind_b, ind_a) if swap else (ind_a, ind_b) | |
| # third check: does the beginning of a overlap with | |
| # the end of b? | |
| for i in range(len(a) - 1, 0, -1): | |
| if b[-i:] == a[:i]: | |
| ind_a = Indices(0, i) | |
| ind_b = Indices(len(b) - i, len(b)) | |
| return (ind_b, ind_a) if swap else (ind_a, ind_b) |
|
Will open a new one with DSO. |
Purpose
This PR adds support for parsing reasoning traces for the Olmo 3 family of models. (support for models has been already merged in #24534).
Olmo models use
<think>and</think>string to bracket their reasoning traces; however, unlike Qwen or Deepseek families, the strings are not special tokens in the vocabulary, therefore requiring a new parser.Test Plan
This PR includes tests to verify that the parser behaves as expected. Existing tests should not be impacted.
Test Result
Run tests:
Test output:
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.