Expose vLLM Metrics to serve.llm API #52719
Conversation
Force-pushed 33cca0b to 56e7858
kouroshHakha
left a comment
Just some V0 vs. V1 stuff. Could you also ask the observability team to review as well?
python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_engine.py
Outdated
python/ray/dashboard/modules/metrics/dashboards/serve_dashboard_panels.py
Outdated
python/ray/dashboard/modules/metrics/dashboards/serve_llm_dashboard_panels.py
Outdated
kouroshHakha
left a comment
The changes to server_models and vllm_engine look good to me. Thanks a ton.
kouroshHakha
left a comment
Could you create docs for logging?
Basically you want to cover:
- How to enable logging.
- What logging gives you: i.e., engine-emitted metrics (vLLM metrics such as cache hit rate, speculative decoding hit rate, etc.) plus service-level metrics (number of input tokens served, output tokens, etc.).
Maybe with some nice screenshots.
You don't need to create an extensive list of all metrics.
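As background for those docs: the engine metrics discussed above are exported in Prometheus text format, so any scrape of the node's metrics endpoint can be filtered for the vLLM series. A minimal sketch (the metric names and label values in the sample payload are illustrative, not captured from a real run):

```python
def parse_vllm_metrics(text):
    """Return {metric_name: value} for Prometheus text-format lines
    whose metric name starts with 'vllm'."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]  # drop the label set, keep the name
        if name.startswith("vllm"):
            metrics[name] = float(value)
    return metrics

# Illustrative sample of what a scrape might contain.
sample = """\
# HELP vllm:gpu_cache_usage_perc GPU KV-cache usage.
# TYPE vllm:gpu_cache_usage_perc gauge
vllm:gpu_cache_usage_perc{model_name="llama"} 0.42
vllm:num_requests_running{model_name="llama"} 3
ray_serve_num_http_requests_total 17
"""

print(parse_vllm_metrics(sample))
# → {'vllm:gpu_cache_usage_perc': 0.42, 'vllm:num_requests_running': 3.0}
```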
python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_loggers.py
Outdated
kouroshHakha
left a comment
Some minor change requests:
dstrodtman
left a comment
Some suggestions, mostly for clarity and to improve readability and SEO.
Thanks @dstrodtman for the comments!
angelinalg
left a comment
Just some nits. Thanks for doing the tech writer review, Douglas and the quick resolutions, @eicherseiji!
Thanks @angelinalg!
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Adding back some default panel configurations that were accidentally removed in a prior PR #52719 Signed-off-by: Alan Guo <aguo@anyscale.com>
Why are these changes needed?
This change provides visibility into Ray Serve LLM deployments, including vLLM-specific statistics.
Dashboard panels: [screenshots]
Docs: [screenshots]
Related issue number
JR-1864
Checks
- I've signed off every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
- Tested following the steps on https://docs.ray.io/en/latest/cluster/metrics.html
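To complement the cluster-metrics testing steps above, a quick smoke test is to scrape a node's Prometheus endpoint and filter for the vLLM series. The snippet below is a hedged sketch: it runs the filter against a saved sample file (the metric names and values are illustrative); in a live cluster you would pipe `curl -s http://<node>:<metrics-export-port>/metrics` into the same grep.

```shell
# Write an illustrative sample of a Prometheus scrape to a temp file.
cat > /tmp/metrics_sample.txt <<'EOF'
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="llama"} 3
ray_serve_num_http_requests_total 17
EOF

# Keep only the vLLM engine metric series.
grep '^vllm' /tmp/metrics_sample.txt
```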