[Benchmark] Add plot utility for parameter sweep #27168
vllm-bot merged 50 commits into vllm-project:main
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Documentation preview: https://vllm--27168.org.readthedocs.build/en/27168/
```bash
python -m vllm.benchmarks.sweep.serve_sla \
    --serve-cmd 'vllm serve meta-llama/Llama-2-7b-chat-hf' \
```
If we want to use an existing vllm serve API address, how should we configure that? Maybe we should add a --serve-host param so the user can point at a vllm server that is already online; in that case the --serve-params param would be ignored.
You can set the server's host via --serve-cmd. And for resetting the server cache after each benchmark run, you can use --after-bench-cmd.
If you mean that the benchmark should not be responsible for launching the server, you can just use a dummy command that sleeps infinitely and adjust --bench-cmd to access the real server. Of course, you should also set --after-bench-cmd in this case.
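To make the dummy-command approach concrete, here is a minimal sketch of what such an invocation might look like. The flag names (--serve-cmd, --bench-cmd, --after-bench-cmd) come from the discussion above; the host/port values and the cache-reset script name are assumptions for illustration only, not part of this PR.

```shell
# Hypothetical sketch: benchmark a server that is already running elsewhere.
# The sweep still requires a serve command, so we give it a no-op that
# sleeps forever, and point the bench command at the real server instead.
python -m vllm.benchmarks.sweep.serve_sla \
    --serve-cmd 'sleep infinity' \
    --bench-cmd 'vllm bench serve --model meta-llama/Llama-2-7b-chat-hf --backend openai --host my-server --port 8000' \
    --after-bench-cmd './reset_server_cache.sh'  # hypothetical script to clear server state between runs
```

Since the sweep never actually manages the remote server's lifecycle here, resetting its cache between runs (via --after-bench-cmd) is the user's responsibility, as noted above.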
I see, so maybe I don't need to set the --serve-cmd param at all; using the --bench-cmd param to run `vllm bench serve --model meta-llama/Llama-2-7b-chat-hf --backend openai` is enough.
Fixed now


Purpose
Follow-up to #27085
Move the parameter sweep scripts into vllm/benchmarks/sweep, abstracting away common code. Add a plot utility in vllm/benchmarks/sweep/plot.py.
cc @lengrongfu
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
(Update) supported_models.md and examples for a new model.