[Feature] default --extra-body param to disable thinking in vllm bench serve by lengrongfu · Pull Request #26784 · vllm-project/vllm

lengrongfu · 2025-10-14T08:42:03Z

Purpose

FIX: #26760

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

gemini-code-assist

Code Review

This pull request refactors the benchmark serving script by renaming sampling_params to extra_body for better clarity, as it now includes more than just sampling parameters. It also introduces a change to disable 'thinking' by default in chat templates during benchmarks. My review focuses on a key aspect of this new feature: while disabling thinking by default is a good goal, the current implementation hardcodes this setting, which limits the benchmark's flexibility. I've suggested making this configurable via a command-line argument to maintain the tool's versatility.

vllm/benchmarks/serve.py

DarkLight1337

I think this should not be applied by default. Users should be able to pass --extra-body explicitly via CLI which is merged with the sampling params
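A minimal sketch of the behaviour suggested here, assuming a flag that takes a JSON object and merges it over the sampling params. The names `build_parser`, `merge_extra_body`, and `main`, the `--temperature` option, and the rule that CLI-supplied keys win are illustrative assumptions. They are not the code in vllm/benchmarks/serve.py.

```python
import argparse
import json
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the benchmark CLI; only --extra-body comes from the discussion.
    parser = argparse.ArgumentParser(prog="bench-sketch")
    parser.add_argument("--temperature", type=float, default=None)
    # json.loads raises a ValueError subclass on bad input, which argparse reports as a usage error.
    parser.add_argument(
        "--extra-body",
        type=json.loads,
        default=None,
        help="JSON object merged into the request body",
    )
    return parser


def merge_extra_body(sampling_params: dict, extra_body: Optional[dict]) -> dict:
    # Copy first so the caller's dict is not mutated; keys from extra_body take precedence (assumption).
    merged = dict(sampling_params)
    if extra_body:
        merged.update(extra_body)
    return merged


def main(argv=None) -> dict:
    args = build_parser().parse_args(argv)
    sampling_params = {}
    if args.temperature is not None:
        sampling_params["temperature"] = args.temperature
    # Nothing is applied by default: with no --extra-body, only the sampling params remain.
    return merge_extra_body(sampling_params, args.extra_body)
```

With this shape, disabling thinking is opt-in: the user passes the setting explicitly and the benchmark does not hardcode it.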

lengrongfu · 2025-10-14T11:41:44Z

I think this should not be applied by default. Users should be able to pass --extra-body explicitly via CLI which is merged with the sampling params

OK, I will add an --extra-body param. Test command:

$ vllm bench serve --model Qwen/Qwen3-0.6B --backend openai --max-concurrency 1 --num-prompts 10 --extra-body '{"chat_template_kwargs":{"enable_thinking":false}}'
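When the command is launched from a script, the JSON value can be built with `json.dumps` rather than hand-quoted. The sketch below only assembles the argument list, using the flags shown in the test command above. The helper name `bench_command` is hypothetical, and nothing here starts a benchmark.

```python
import json
import shlex


def bench_command(model: str, enable_thinking: bool) -> list[str]:
    # Build the --extra-body value as a dict, then serialize it compactly.
    extra_body = {"chat_template_kwargs": {"enable_thinking": enable_thinking}}
    return [
        "vllm", "bench", "serve",
        "--model", model,
        "--backend", "openai",
        "--max-concurrency", "1",
        "--num-prompts", "10",
        "--extra-body", json.dumps(extra_body, separators=(",", ":")),
    ]


cmd = bench_command("Qwen/Qwen3-0.6B", enable_thinking=False)
# shlex.join quotes the JSON argument so the printed line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Passing `cmd` as a list to `subprocess.run` would avoid shell quoting altogether, since each element is delivered as one argument.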


DarkLight1337

Thanks!

lengrongfu · 2025-10-15T01:19:05Z

The CI failure is not related to the current modification.


[Feature] default disable thinking in vllm bench serve

0fc8388

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

mergify bot added the performance label (Performance-related issues) Oct 14, 2025

gemini-code-assist bot reviewed Oct 14, 2025

vllm/benchmarks/serve.py (outdated, resolved)

DarkLight1337 reviewed Oct 14, 2025

lengrongfu requested review from aarnphm and chaunceyjiang as code owners October 14, 2025 11:40

mergify bot added the frontend label Oct 14, 2025

add --extra-body pararm to cli

500bc66

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

lengrongfu force-pushed the feat/opt-bench branch from 99f7e32 to 500bc66 Compare October 14, 2025 11:42

lengrongfu changed the title ~~[Feature] default disable thinking in vllm bench serve~~ [Feature] default --extra-body param to disable thinking in vllm bench serve Oct 14, 2025

DarkLight1337 reviewed Oct 14, 2025

vllm/benchmarks/serve.py (outdated, resolved)

still call this sampling_params

d4c9ecd

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

DarkLight1337 reviewed Oct 14, 2025

vllm/benchmarks/serve.py (outdated, resolved)

still call this sampling_params

12ceb67

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

lengrongfu force-pushed the feat/opt-bench branch from 6338d57 to 12ceb67 Compare October 14, 2025 16:09

DarkLight1337 approved these changes Oct 14, 2025

DarkLight1337 enabled auto-merge (squash) October 14, 2025 16:10

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 14, 2025

Merge branch 'main' into feat/opt-bench

47be117

DarkLight1337 merged commit a27b288 into vllm-project:main Oct 15, 2025
46 checks passed

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[Feature] default --extra-body param to disable thinking in vllm bench serve (vllm-project#26784)

4cd1a99

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

lengrongfu deleted the feat/opt-bench branch October 21, 2025 02:54

alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025

[Feature] default --extra-body param to disable thinking in vllm bench serve (vllm-project#26784)

96a2589

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

[Feature] default --extra-body param to disable thinking in vllm bench serve (vllm-project#26784)

c6260f6

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[Feature] default --extra-body param to disable thinking in vllm bench serve (vllm-project#26784)

25f1727

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>

DarkLight1337 merged 5 commits into vllm-project:main from lengrongfu:feat/opt-bench


3 participants
