
[Bugfix] Fix step3p5 reasoning with interleaved thinking#34211

Merged
chaunceyjiang merged 2 commits intovllm-project:mainfrom
mariohong128:fix_step3p5_reasoning
Feb 25, 2026

Conversation


@mariohong128 mariohong128 commented Feb 10, 2026

Purpose

When there are multiple rounds of conversation, the prompt contains </think> from the previous round, and the step3p5 reasoning parser failed to correctly determine the end of reasoning.
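The end-of-reasoning check this PR describes can be sketched as follows. This is an illustrative stand-in, not vLLM's actual step3p5 parser code: the token IDs and the helper name are assumptions. The idea is to scan the tokens from the end, so that stray `</think>` markers carried over from earlier conversation rounds cannot be mistaken for the end of the current reasoning block.

```python
# Hypothetical sketch of the fix; THINK_START_ID / THINK_END_ID are
# made-up token IDs standing in for <think> and </think>.
THINK_START_ID = 1001
THINK_END_ID = 1002

def is_reasoning_end(output_ids: list[int]) -> bool:
    """Return True only if the most recent thinking marker is </think>.

    Scanning from the end makes the check robust to </think> tokens
    left over from previous turns, which a naive "</think> appears
    anywhere" check would misread as the end of reasoning.
    """
    for tid in reversed(output_ids):
        if tid == THINK_END_ID:
            return True
        if tid == THINK_START_ID:
            return False
    return False
```

A naive check such as `THINK_END_ID in output_ids` would report the reasoning as finished as soon as any `</think>` appears, which is the multi-turn failure mode this PR targets.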

Test Plan

Add tests: tests/reasoning/test_step3p5_reasoning_parser.py

Test Result

All passed.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: mariohong <mariohong128@gmail.com>
@mergify mergify bot added the bug Something isn't working label Feb 10, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a bug in the step3p5 reasoning parser where it failed to correctly detect the end of a reasoning block in multi-turn conversations. The change introduces more robust logic to identify the last <think> or </think> token to determine the reasoning state, which correctly handles prompts containing artifacts from previous turns. A comprehensive set of tests has been added to cover various scenarios, including the multi-turn case that triggered the bug. My review found a gap in the new tests where the streaming logic for is_reasoning_end_streaming is not properly exercised. I've provided a suggestion to improve the test coverage.

Comment on lines +304 to +306
if streaming:
    is_reasoning_end = parser.is_reasoning_end(output_ids)
    assert is_reasoning_end == param_dict["is_reasoning_end"]


high

The test for is_reasoning_end in streaming mode is not correctly testing the streaming behavior. It calls the non-streaming parser.is_reasoning_end(output_ids) method, which evaluates the entire output at once. The stateful, incremental logic of is_reasoning_end_streaming which operates on delta_ids is not being exercised. This is a significant testing gap for a critical part of the bug fix.

To properly test the streaming logic, you should simulate the streaming process and call is_reasoning_end_streaming at each step. Here is an example of how you could structure such a test:

def run_is_reasoning_end_streaming(parser, output_ids):
    is_end = False
    current_ids = []
    for token_id in output_ids:
        delta_ids = [token_id]
        current_ids.append(token_id)
        is_end = parser.is_reasoning_end_streaming(current_ids, delta_ids)
    return is_end

# In test_reasoning, under the `if streaming:` block:
streaming_parser = ReasoningParserManager.get_reasoning_parser(parser_name)(step3p5_tokenizer)
is_reasoning_end = run_is_reasoning_end_streaming(streaming_parser, output_ids)
assert is_reasoning_end == param_dict["is_reasoning_end"]

This would ensure the state transitions of the parser are correctly handled in streaming mode.


jnargi commented Feb 10, 2026

Much appreciated for this PR, hope it gets merged soon!


mudaaaa commented Feb 10, 2026

Posted about this fix in r/BlackwellPerformance. Someone said this patch still does not work:

Sadly it is still broken with that patch and tool calls do not work. You can easily try the patch yourself: assuming you use a venv, the patched file lives in:
.venv/lib/python3.12/site-packages/vllm/reasoning/step3p5_reasoning_parser.py. I applied the patch to vLLM 0.15.2rc1.dev184+gf0ca0671c and it still can't call tools. 100% fail. Claude Code reports: 2026-02-10T22:50:33.948Z [DEBUG] Bash tool input error: Bash failed due to the following issue: The required parameter 'command' is missing


mariohong128 commented Feb 11, 2026


Can you provide more details, such as the input request? I tested locally and it works. Note that v1/messages is not supported. Thank you for your comment.

@windswand


I'm the Reddit poster and yes, you nailed it: I'm using v1/messages with claude cli.

Interesting, I didn't know the parsers were so tightly coupled to the API type (openai vs anthropic). That's... unfortunate.


mariohong128 commented Feb 11, 2026


Because v1/messages in vLLM does not currently support multiple tool calls in one chunk. Other models like Qwen3 Coder have the same problem. This PR is needed: #33671. Or use cc-switch.

@windswand


I don't understand. How are you proxying claude cli into vLLM such that Steps can even make tool calls? I've tried with LiteLLM, but vLLM barfs with (APIServer pid=42087) ERROR 02-10 21:03:50 [serving.py:315] NotImplementedError: Unknown part type: tool_use = tool_use


mudaaaa commented Feb 11, 2026


I believe cc-switch works as a proxy on the machine you point Claude Code to; it handles the streaming calls the way CC expects for tool calls, if I'm not mistaken. So the issue is with how vLLM parses this? I might give it a go tomorrow if I have time, combining this reasoning parser fix with that other PR fix, and test it out with vLLM/LiteLLM to see how it does.


Maximilian-Staab commented Feb 11, 2026

I'll try to test this later today (GMT+1) with the regular openai-endpoints (opencode).

Edit: Tried it and can confirm the reasoning parser works now. Tested in opencode with the openai endpoint.

@mariohong128 mariohong128 marked this pull request as ready for review February 12, 2026 03:04

@chaunceyjiang chaunceyjiang left a comment


Thanks~

@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 25, 2026
@chaunceyjiang chaunceyjiang enabled auto-merge (squash) February 25, 2026 07:20
@chaunceyjiang chaunceyjiang merged commit af5e6af into vllm-project:main Feb 25, 2026
49 checks passed
haanjack pushed a commit to haanjack/vllm that referenced this pull request Feb 26, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
tom-zju pushed a commit to tom-zju/vllm that referenced this pull request Feb 26, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
askliar pushed a commit to askliar/vllm that referenced this pull request Mar 9, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
…t#34211)

Signed-off-by: mariohong <mariohong128@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>

Labels

bug Something isn't working ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants