Skip to content

[Frontend] Skip stop in reasoning content#24941

Open
chaunceyjiang wants to merge 17 commits intovllm-project:mainfrom
chaunceyjiang:stop
Open

[Frontend] Skip stop in reasoning content#24941
chaunceyjiang wants to merge 17 commits intovllm-project:mainfrom
chaunceyjiang:stop

Conversation

@chaunceyjiang
Copy link
Collaborator

@chaunceyjiang chaunceyjiang commented Sep 16, 2025

Purpose

When the stop="" parameter is set, it causes the reasoning phase to stop when encountering a stop token, which interrupts the process and prevents the user from seeing any content.

Test Plan

from openai import OpenAI

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model=model,
    messages=messages,
    stop="9.8",
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

print("reasoning_content for Round 1:", reasoning_content)
print("-" * 80)
print("content for Round 1:", content)

Test Result

reasoning_content for Round 1: 
Okay, so I need to figure out whether 9.11 is greater than 9.8 or not. Let me start by recalling how to compare decimal numbers. I think the general rule is to look at the digits from left to right and compare them one by one. 

First, both numbers are in the ones place. Let me write them down to visualize better:

9.11 and 9.8
...
...
...
So, in conclusion, 9.8 is greater than 9.11.

--------------------------------------------------------------------------------
content for Round 1: 

To determine which number is greater between **9.11** and **


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@chaunceyjiang chaunceyjiang marked this pull request as ready for review September 16, 2025 07:35
@chaunceyjiang chaunceyjiang changed the title [Frontend] Skip stop in reasoning content [Frontend] Skip stop in reasoning content Sep 16, 2025
@chaunceyjiang chaunceyjiang changed the title [Frontend] Skip stop in reasoning content [Frontend] Skip stop in reasoning content Sep 16, 2025
@chaunceyjiang chaunceyjiang changed the title [Frontend] Skip stop in reasoning content [Frontend] Skip stop in reasoning content Sep 16, 2025
@chaunceyjiang
Copy link
Collaborator Author

/cc @mgoin @njhill @aarnphm PTAL.

@mergify
Copy link

mergify bot commented Sep 17, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaunceyjiang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify
Copy link

mergify bot commented Sep 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaunceyjiang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@chaunceyjiang
Copy link
Collaborator Author

/cc @njhill PTAL.

@gaocegege
Copy link
Contributor

I think this only addresses part of the performance problem though. For the other part, I think changes would be needed on the reasoning parser side. There should be a way of doing incremental checking, rather than searching the entire prompt + output tokens for every generated token (while reasoning)...

@njhill I agree with your point. I’ll look into this issue and then submit a new PR. Currently, is_reasoning_end is used in multiple places.

I’ll explore how to perform incremental checks.

Could you please create an issue to keep track? It is a potential problem since day 1, I am also interested in it.

@mergify
Copy link

mergify bot commented Oct 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaunceyjiang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 1, 2025
@chaunceyjiang
Copy link
Collaborator Author

I think this only addresses part of the performance problem though. For the other part, I think changes would be needed on the reasoning parser side. There should be a way of doing incremental checking, rather than searching the entire prompt + output tokens for every generated token (while reasoning)...

Hi, @gaocegege @njhill PTAL. #25735

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
@mergify
Copy link

mergify bot commented Oct 13, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaunceyjiang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 13, 2025
@bbartels
Copy link
Contributor

@chaunceyjiang is there something still missing from having this merged?

@mergify
Copy link

mergify bot commented Dec 5, 2025

Hi @chaunceyjiang, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

@github-actions
Copy link

github-actions bot commented Mar 6, 2026

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions github-actions bot added the stale Over 90 days of inactivity label Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-rebase ready ONLY add when PR is ready to merge/full CI is needed stale Over 90 days of inactivity tool-calling v1

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants