[Misc] Add Ray Prometheus logger to V1#17925
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
|
This pull request has merge conflicts that must be resolved before it can be |
05b63be to
1c773e2
Compare
fc1838a to
b793da4
Compare
markmc
left a comment
There was a problem hiding this comment.
No fundamental objection to adding a vllm.v1.metrics.ray.RayPrometheusStatLogger custom logger
I'd prefer not to add the extra PrometheusMetrics indirection
And ~90% of the diff is unnecessary reformatting of code which makes it more difficult to review
|
Thanks for the review @markmc! Apologies for reformatting changes. My pre-commit hook may have been configured incorrectly. |
a923f87 to
a41f879
Compare
|
Hey @eicherseiji I made some changes to the interface of PrometheusStatLogger that I think will conflict your stuff, but at the same time it will make this PR easier to write and review if you incorporate those changes. for easier review https://github.com/njhill/vllm/pull/6/files but this one is actually against master (which includes other stuff) #18053 |
|
Done. Thanks @kouroshHakha |
ece396d to
3079a87
Compare
|
Starting with a rebase due to test failure: |
markmc
left a comment
There was a problem hiding this comment.
Pretty happy to merge the ray stuff, but there's still a lot of unrelated changes I'd like to see removed
eicherseiji
left a comment
There was a problem hiding this comment.
Thanks again @markmc for the review! Going to remove mentions to multiprocess_mode and unnecessary type hint changes for now.
Let me know what the recommendation is for Ray wrapper class injection (should we prefer static class variables for now?)
What you had the first time around with the static class variables was fine - I just asked to remove the extra |
858f20b to
a0f1b00
Compare
markmc
left a comment
There was a problem hiding this comment.
Looks great, thanks for your patience. Just some questions on the test you added
tests/metrics/test_ray_metrics.py
Outdated
There was a problem hiding this comment.
You're not actually verifying metrics behavior?
There was a problem hiding this comment.
Just a smoke test for now; added a comment.
There was a problem hiding this comment.
In the future it would be helpful to verify output to avoid regressions. But setting up a ray cluster in the test environment could be non-trivial.
So I think that ROI is okay with this to start, given the current design (class injection with a wrapper that's verified in ray-project).
eicherseiji
left a comment
There was a problem hiding this comment.
Responded to comments @markmc! 🙏
tests/metrics/test_ray_metrics.py
Outdated
There was a problem hiding this comment.
Just a smoke test for now; added a comment.
tests/metrics/test_ray_metrics.py
Outdated
There was a problem hiding this comment.
In the future it would be helpful to verify output to avoid regressions. But setting up a ray cluster in the test environment could be non-trivial.
So I think that ROI is okay with this to start, given the current design (class injection with a wrapper that's verified in ray-project).
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
4af2dd1 to
ffc3331
Compare
markmc
left a comment
There was a problem hiding this comment.
Thanks again for your patience!
Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
Implements the following item from #10582:
Tested with V1 as follows (with ray-project/ray#52719):
VLLM_USE_V1=1 python serve.pyRay dashboard output:
