[CI/Build][Doc] Fully deprecate old bench scripts for serving / throughput / latency #24411
Code Review
This pull request effectively deprecates the old benchmark scripts (benchmark_latency.py, benchmark_serving.py, benchmark_throughput.py) by replacing their contents with a helpful message pointing to the new vLLM CLI commands. The documentation in benchmarks/README.md is also updated accordingly. My main feedback is to improve the deprecation scripts to exit with a non-zero status code and print to stderr, which is crucial for automation and CI/CD pipelines to correctly detect that the old scripts are no longer functional.
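To illustrate the feedback above, here is a minimal sketch of a deprecation stub that writes to stderr and reports a non-zero status. The message wording and the `main` helper are hypothetical and are not taken from this PR's diff; `vllm bench latency` is assumed to be the replacement command for `benchmark_latency.py`.

```python
import sys

# Hypothetical wording; the message in the actual scripts may differ.
DEPRECATION_MESSAGE = (
    "DEPRECATED: benchmarks/benchmark_latency.py is no longer functional.\n"
    "Please use the vLLM CLI instead, e.g. `vllm bench latency` (assumed name)."
)


def main() -> int:
    """Print the deprecation notice to stderr and return a failing exit code."""
    # stderr keeps the notice out of any stdout that a pipeline may be parsing.
    print(DEPRECATION_MESSAGE, file=sys.stderr)
    # A non-zero code lets CI/CD detect that the old entry point is gone.
    return 1
```

In the stub file itself, the last line would be `sys.exit(main())`, so that a shell or CI job calling the old script sees exit status 1 instead of a silent success.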
This pull request has merge conflicts that must be resolved before it can be |
benchmarks/README.md
Outdated
Maybe something for a follow-up PR, but could the content of this README be moved to docs/contributing/benchmarks.md? Right now the information about benchmarking in the docs is quite sparse, and this README contains loads of useful information.
that's a good idea. possibly i can follow up with @ywang96 to see if we can take a more systematic approach about all these random benchmark scripts. :D
7c5cf6e to 77f9f70
Signed-off-by: Ye (Charlotte) Qi <[email protected]>
77f9f70 to cefc4da
hmellor left a comment
LGTM!
…ghput / latency (vllm-project#24411) Signed-off-by: Ye (Charlotte) Qi <[email protected]>
…ghput / latency (vllm-project#24411) Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
Purpose
Follow-up to #21355. Delete the old benchmarks/benchmark_(latency|throughput|serving).py scripts so that contributors do not mistakenly add changes to them.
Given that CI has been running on the new script for a month now, it should be safe to delete these scripts.
Test Plan
Make sure there are no remaining references in: https://github.com/search?q=repo%3Avllm-project%2Fvllm+%2Fbenchmark_%28latency%7Cserving%7Cthroughput%29.py%2F&type=code
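The same check can be run against a local checkout. The sketch below is an assumption about how one might do it, not part of this PR: `find_references` is a hypothetical helper, and unlike the search query above it escapes the dot before `py` so that only literal file names match.

```python
import re
from pathlib import Path

# Same alternation as the GitHub search, with the dot escaped.
PATTERN = re.compile(r"benchmark_(latency|serving|throughput)\.py")


def find_references(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for every match under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

Calling `find_references(Path("."))` from the repository root returns an empty list once no references remain.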
Test current output:
Test Result
The terminal shows that it prints the deprecation message and exits with status 1.
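A check like this could also be automated. The following is only a sketch under stated assumptions: `STUB_SOURCE` is a hypothetical stand-in for one of the deprecated scripts (not the file from this PR), and `run_script` and `is_deprecated` are illustrative helper names.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a deprecated script, not the PR's actual file.
STUB_SOURCE = (
    "import sys\n"
    "print('benchmark_latency.py is deprecated', file=sys.stderr)\n"
    "sys.exit(1)\n"
)


def run_script(path: Path) -> subprocess.CompletedProcess:
    """Run a Python script with the current interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is the expected outcome here
    )


def is_deprecated(path: Path) -> bool:
    """True if the script exits with status 1 and mentions deprecation."""
    result = run_script(path)
    output = (result.stdout + result.stderr).lower()
    return result.returncode == 1 and "deprecated" in output


with tempfile.TemporaryDirectory() as tmp:
    stub = Path(tmp) / "benchmark_latency.py"
    stub.write_text(STUB_SOURCE, encoding="utf-8")
    ok = is_deprecated(stub)
```

Pointing `is_deprecated` at the real scripts under `benchmarks/` would turn the manual terminal check into a repeatable one.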
