Expose vLLM Metrics to serve.llm API#52719

Merged
kouroshHakha merged 12 commits into ray-project:master from eicherseiji:JR_1864 on May 13, 2025

@eicherseiji eicherseiji commented May 1, 2025

Why are these changes needed?

This change provides visibility into Ray Serve LLM deployments, including vLLM-specific statistics.

Dashboard panels:

[Screenshots: four dashboard panels, captured 2025-05-08]

Docs:
[Screenshots: two docs images, captured 2025-05-12]

Related issue number

JR-1864

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Tested by following the steps at https://docs.ray.io/en/latest/cluster/metrics.html

@eicherseiji eicherseiji self-assigned this May 1, 2025
@eicherseiji eicherseiji force-pushed the JR_1864 branch 2 times, most recently from 33cca0b to 56e7858 May 7, 2025 02:52
@hainesmichaelc hainesmichaelc added the community-contribution (Contributed by the community) label May 7, 2025

@kouroshHakha kouroshHakha left a comment

Just some V0 vs. V1 stuff. Could you also ask the observability team to review as well?

@kouroshHakha kouroshHakha requested a review from alanwguo May 8, 2025 16:58
@kouroshHakha kouroshHakha removed the community-contribution (Contributed by the community) label May 8, 2025
@hainesmichaelc hainesmichaelc added the community-contribution (Contributed by the community) label May 8, 2025
@eicherseiji eicherseiji marked this pull request as ready for review May 9, 2025 00:46
@eicherseiji eicherseiji requested a review from a team as a code owner May 9, 2025 00:46

@kouroshHakha kouroshHakha left a comment

The changes to server_models and vllm_engine look good to me. Thanks a ton.

@eicherseiji eicherseiji added the go (add ONLY when ready to merge, run all tests) label May 10, 2025

@kouroshHakha kouroshHakha left a comment

Could you create docs for logging?

Basically you want to cover:

  1. How to enable logging.
  2. What logging gives you: engine-emitted metrics (for example, vLLM metrics about cache hit rate, spec decoding hit rate, etc.) plus service-level metrics (number of input tokens served, output tokens, etc.), maybe with some nice screenshots.

You don't need to create an extensive list of all metrics.
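
A minimal sketch of the engine-level versus service-level split mentioned above, assuming the metrics are scraped in Prometheus text exposition format. The metric names, the `example_engine_` prefix, and the sample values are hypothetical placeholders chosen for illustration; they are not the names this PR exports, and the helper functions are not part of Ray or vLLM.

```python
from collections import defaultdict

# Hypothetical sample in Prometheus text exposition format. The metric names
# and values are illustrative assumptions, not output from this PR.
SAMPLE = """\
# HELP example_engine_cache_hit_rate Illustrative engine-level gauge.
# TYPE example_engine_cache_hit_rate gauge
example_engine_cache_hit_rate{model="demo"} 0.75
# TYPE example_service_input_tokens_total counter
example_service_input_tokens_total{model="demo"} 1200
example_service_output_tokens_total{model="demo"} 340
"""


def parse_samples(text):
    """Return {metric_name: value} for each sample line, skipping comments.

    Simplification: assumes no trailing timestamps and one sample per name.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        name_and_labels, _, value = line.rpartition(" ")
        # Drop the {label="..."} part, if any, to get the bare metric name.
        name = name_and_labels.split("{", 1)[0]
        samples[name] = float(value)
    return samples


def group_by_level(samples, engine_prefix="example_engine_"):
    """Split metrics into 'engine' and 'service' groups by name prefix."""
    groups = defaultdict(dict)
    for name, value in samples.items():
        level = "engine" if name.startswith(engine_prefix) else "service"
        groups[level][name] = value
    return dict(groups)


if __name__ == "__main__":
    for level, metrics in group_by_level(parse_samples(SAMPLE)).items():
        print(level, sorted(metrics))
```

Grouping by prefix is only one possible convention; a real deployment may distinguish the two kinds of metrics by label or by dashboard instead.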

@eicherseiji eicherseiji requested review from a team, akshay-anyscale, edoakes and zcin as code owners May 12, 2025 20:48

@kouroshHakha kouroshHakha left a comment

some minor change requests:

@kouroshHakha kouroshHakha left a comment

LGTM

@kouroshHakha kouroshHakha enabled auto-merge (squash) May 12, 2025 22:16
@mascharkh mascharkh added the serve (Ray Serve Related Issue) and usability labels May 12, 2025

@dstrodtman dstrodtman left a comment

Some suggestions, mostly for clarity and to improve readability and SEO.

@eicherseiji

Thanks @dstrodtman for comments!

@angelinalg angelinalg left a comment

Just some nits. Thanks for doing the tech writer review, Douglas, and for the quick resolutions, @eicherseiji!

@eicherseiji

Thanks @angelinalg!

eicherseiji and others added 12 commits May 13, 2025 18:17
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…1 only

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
@kouroshHakha kouroshHakha merged commit 881cd91 into ray-project:master May 13, 2025
5 checks passed
matthewdeng pushed a commit that referenced this pull request May 20, 2025
Adding back some default panel configurations that were accidentally
removed in a prior PR #52719


Signed-off-by: Alan Guo <aguo@anyscale.com>
Labels

  • community-backlog
  • community-contribution (Contributed by the community)
  • go (add ONLY when ready to merge, run all tests)
  • serve (Ray Serve Related Issue)
  • usability

7 participants