[Feature] default --extra-body param to disable thinking in vllm bench serve #26784
DarkLight1337 merged 5 commits into vllm-project:main
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Code Review
This pull request refactors the benchmark serving script by renaming sampling_params to extra_body for better clarity, as it now includes more than just sampling parameters. It also introduces a change to disable 'thinking' by default in chat templates during benchmarks. My review focuses on a key aspect of this new feature: while disabling thinking by default is a good goal, the current implementation hardcodes this setting, which limits the benchmark's flexibility. I've suggested making this configurable via a command-line argument to maintain the tool's versatility.
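The suggestion above can be illustrated with a minimal sketch. This is not the code merged in this PR: the helper names `build_parser` and `resolve_extra_body` are hypothetical, and the default payload assumes that thinking is turned off through a `chat_template_kwargs` entry with `enable_thinking` set to false, which only has an effect for models whose chat template reads that key. The idea is to keep "thinking disabled" as the default while letting a JSON `--extra-body` argument override it.

```python
import argparse
import copy
import json
from typing import Optional

# Assumed default (not taken from the PR diff): disable thinking via chat template kwargs.
DEFAULT_EXTRA_BODY = {"chat_template_kwargs": {"enable_thinking": False}}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the flag discussed in the review."""
    parser = argparse.ArgumentParser(description="benchmark CLI sketch")
    parser.add_argument(
        "--extra-body",
        type=json.loads,  # parse the argument as a JSON object
        default=None,
        help="JSON object merged into the body of every request.",
    )
    return parser


def resolve_extra_body(user_extra_body: Optional[dict]) -> dict:
    """Start from the default and let user-supplied top-level keys win."""
    merged = copy.deepcopy(DEFAULT_EXTRA_BODY)
    if user_extra_body:
        merged.update(user_extra_body)  # shallow merge: top-level keys are replaced
    return merged


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(json.dumps(resolve_extra_body(args.extra_body)))
```

With no flag, the resolved body keeps thinking disabled; passing `--extra-body '{"chat_template_kwargs": {"enable_thinking": true}}'` replaces the default entry. The merge is deliberately shallow, so a user who overrides `chat_template_kwargs` must supply the whole object.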
OK, I will add
99f7e32 to 500bc66
6338d57 to 12ceb67
The CI failure is not related to the current modification.
…h serve (vllm-project#26784) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Signed-off-by: bbartels <benjamin@bartels.dev>
…h serve (vllm-project#26784) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
…h serve (vllm-project#26784) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
FIX: #26760
Test Plan
Test Result