[Bugfix] Fix step3p5 reasoning with interleaved thinking #34211
chaunceyjiang merged 2 commits into vllm-project:main from
Conversation
Signed-off-by: mariohong <mariohong128@gmail.com>
Code Review
This pull request fixes a bug in the step3p5 reasoning parser where it failed to correctly detect the end of a reasoning block in multi-turn conversations. The change introduces more robust logic to identify the last <think> or </think> token to determine the reasoning state, which correctly handles prompts containing artifacts from previous turns. A comprehensive set of tests has been added to cover various scenarios, including the multi-turn case that triggered the bug. My review found a gap in the new tests where the streaming logic for is_reasoning_end_streaming is not properly exercised. I've provided a suggestion to improve the test coverage.
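The core idea of the fix — deciding the reasoning state from the position of the *last* `<think>` or `</think>` token rather than the mere presence of an end token — can be sketched as follows. This is a minimal illustration with placeholder token ids, not the actual parser code or vLLM's real token ids:

```python
# Placeholder ids for illustration only; the real parser looks these up
# from the tokenizer's vocabulary.
THINK_START_ID = 1001  # assumed id for <think>
THINK_END_ID = 1002    # assumed id for </think>

def is_reasoning_end(token_ids: list[int]) -> bool:
    """Reasoning has ended only if the most recent think tag is a closing one.

    A check for the mere presence of THINK_END_ID would misfire when a
    </think> left over from a previous turn appears earlier in the sequence.
    """
    last_start = max(
        (i for i, t in enumerate(token_ids) if t == THINK_START_ID), default=-1
    )
    last_end = max(
        (i for i, t in enumerate(token_ids) if t == THINK_END_ID), default=-1
    )
    return last_end > last_start

# Multi-turn sequence: a stale </think> followed by a new, still-open
# <think> block -> reasoning has NOT ended yet.
print(is_reasoning_end([1002, 5, 1001, 7]))        # False
print(is_reasoning_end([1002, 5, 1001, 7, 1002]))  # True
```

A presence-only check would return True for the first sequence, which is exactly the multi-turn bug this PR addresses.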
    if streaming:
        is_reasoning_end = parser.is_reasoning_end(output_ids)
        assert is_reasoning_end == param_dict["is_reasoning_end"]
The test for is_reasoning_end in streaming mode is not correctly testing the streaming behavior. It calls the non-streaming parser.is_reasoning_end(output_ids) method, which evaluates the entire output at once. The stateful, incremental logic of is_reasoning_end_streaming which operates on delta_ids is not being exercised. This is a significant testing gap for a critical part of the bug fix.
To properly test the streaming logic, you should simulate the streaming process and call is_reasoning_end_streaming at each step. Here is an example of how you could structure such a test:
    def run_is_reasoning_end_streaming(parser, output_ids):
        is_end = False
        current_ids = []
        for token_id in output_ids:
            delta_ids = [token_id]
            current_ids.append(token_id)
            is_end = parser.is_reasoning_end_streaming(current_ids, delta_ids)
        return is_end

    # In test_reasoning, under the `if streaming:` block:
    streaming_parser = ReasoningParserManager.get_reasoning_parser(parser_name)(step3p5_tokenizer)
    is_reasoning_end = run_is_reasoning_end_streaming(streaming_parser, output_ids)
    assert is_reasoning_end == param_dict["is_reasoning_end"]

This would ensure the state transitions of the parser are correctly handled in streaming mode.
Much appreciated for this PR, hope it gets merged soon!
Posted about this fix in r/BlackwellPerformance. Someone said this patch still does not work: "Sadly it is still broken with that patch and tool calls do not work." You can easily try the patch yourself: assuming you use a venv, the patched file lives in:
Can you provide more details, such as the input request? I tested locally and it works. Note that it does not support
I'm the Reddit poster and yes, you nailed it: I'm using
Interesting, I didn't know the parsers were so tightly coupled to the API type (openai vs anthropic). That's... unfortunate.
Because
I don't understand. How are you proxying the claude CLI into vLLM such that Steps can even make tool calls? I've tried with LiteLLM, but vLLM barfs with
I believe cc-switch works as a proxy on the machine that you point Claude Code to; it handles these streaming calls the way CC expects for tool calls, if I'm not mistaken. Maybe the issue is with how vLLM parses this? I might give it a go tomorrow if I have time, testing this reasoning parser fix plus that other PR fix with vLLM/LiteLLM to see how it does.
I'll try to test this later today (GMT+1) with the regular openai endpoints (opencode). Edit: Tried it and can confirm the reasoning parser works now. Tested in opencode with the openai endpoint.
…t#34211) Signed-off-by: mariohong <mariohong128@gmail.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
…t#34211) Signed-off-by: mariohong <mariohong128@gmail.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com> Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Purpose
When there are multiple rounds of conversation, the prompt contains `</think>` from the previous round, and the step3p5 reasoning parser failed to correctly determine the end of reasoning.

Test Plan
Add tests: tests/reasoning/test_step3p5_reasoning_parser.py

Test Result
All passed.
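To illustrate the multi-turn scenario behind the bug, here is a minimal, hypothetical prompt layout — not the actual Step3.5 chat template — showing how a stale `</think>` from round one ends up in the round-two prompt:

```python
# Hypothetical multi-turn layout (illustrative only, not the real template).
round1_reply = "<think>plan the answer</think>The answer is 42."
round2_prompt = (
    "User: hi\n"
    f"Assistant: {round1_reply}\n"
    "User: why?\n"
    "Assistant: "
)

# A naive check for the presence of </think> would conclude that reasoning
# has already ended before the new turn even starts generating.
print("</think>" in round2_prompt)  # True
```

This is why the parser must key off the *last* think tag it sees rather than any occurrence of `</think>`.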
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model.