[V1][Metrics] Deprecate metrics with gpu_ prefix for non GPU specific metrics.#18354

Merged
DarkLight1337 merged 8 commits into vllm-project:main from krai:remove_gpu_prefix
Jun 14, 2025
Conversation

@sahelib25 (Contributor) commented May 19, 2025

This PR deprecates the gpu_ prefix on the following existing non-GPU-specific metrics:

  • gpu_cache_usage
  • gpu_prefix_cache_queries
  • gpu_prefix_cache_hits

and introduces renamed replacements:

  • kv_cache_usage
  • prefix_cache_queries
  • prefix_cache_hits
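As a rough sketch of the dual-emission pattern behind this change (hypothetical helper names, not vLLM's actual metrics code), the renamed metric can mirror its value to the deprecated name during the transition window, so existing dashboards keep working until the old name is removed:

```python
# Hypothetical sketch: emit each value under both the new name and its
# deprecated gpu_-prefixed alias during the deprecation window.

DEPRECATED_ALIASES = {
    "gpu_cache_usage": "kv_cache_usage",
    "gpu_prefix_cache_queries": "prefix_cache_queries",
    "gpu_prefix_cache_hits": "prefix_cache_hits",
}

def record(metrics: dict, name: str, value: float) -> None:
    """Record a metric under its new name and any deprecated alias."""
    metrics[name] = value
    # Mirror the value to the old name so it stays visible until removal.
    for old, new in DEPRECATED_ALIASES.items():
        if new == name:
            metrics[old] = value

metrics: dict = {}
record(metrics, "kv_cache_usage", 0.42)
```

Scrapers watching the old gpu_cache_usage name would see the same value as kv_cache_usage until the alias is dropped in a later release.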

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run further CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the v1 label May 19, 2025

Renaming existing metrics will break anyone relying on them. The right way to do this would be to add new metrics with the new names and deprecate the old ones, so we give enough notice before removing them in a future release. I'm not sure of the exact deprecation policy with vLLM, but it would be good to follow it here.

Contributor Author

Thanks @achandrasekar, makes sense!
Referring to the metrics deprecation policy here:

Note: when metrics are deprecated in version X.Y, they are hidden in version X.Y+1 but can be re-enabled using the --show-hidden-metrics-for-version=X.Y escape hatch, and are then removed in version X.Y+2.

I have declared gpu_prefix_cache_queries and gpu_prefix_cache_hits as deprecated, and introduced the new ones. Could you please take a look at it?

It looks like we need separate pull requests for hiding and then removing the metrics once this one is merged?


Is there a plan to rename gpu_cache_usage too?

Contributor Author

Hi @achandrasekar,
it looks like gpu_cache_usage is calculated in BlockPool.get_usage() as 1.0 - (self.get_num_free_blocks() / self.num_gpu_blocks), so could it be a GPU-specific metric?
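For illustration, the quoted formula behaves like this minimal mock (the real BlockPool lives in vLLM's KV-cache manager; only the arithmetic is reproduced here):

```python
# Minimal mock of the usage formula quoted above: the fraction of
# KV-cache blocks currently in use.

class BlockPool:
    def __init__(self, num_gpu_blocks: int, num_free_blocks: int):
        self.num_gpu_blocks = num_gpu_blocks
        self._num_free = num_free_blocks

    def get_num_free_blocks(self) -> int:
        return self._num_free

    def get_usage(self) -> float:
        # 1.0 minus the free fraction gives the in-use fraction.
        return 1.0 - (self.get_num_free_blocks() / self.num_gpu_blocks)

pool = BlockPool(num_gpu_blocks=100, num_free_blocks=25)
assert pool.get_usage() == 0.75
```

Nothing in the formula itself is GPU-specific; only the num_gpu_blocks attribute name ties it to GPUs, which is why the same accounting applies to TPUs.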


This applies to TPUs too. We should probably name this kv_cache_usage instead of gpu_cache_usage.

Contributor Author

I've updated the script, please have a look. Thanks!

Add metrics prefix_cache_queries and prefix_cache_hits

Signed-off-by: Saheli Bhattacharjee <saheli@krai.ai>
@sahelib25 sahelib25 force-pushed the remove_gpu_prefix branch from aba4fa0 to 50e4828 on May 22, 2025 at 11:37
@sahelib25 sahelib25 changed the title from "[V1][Metrics] Remove gpu_ prefix from non GPU specific metrics." to "[V1][Metrics] Deprecate metrics with gpu_ prefix from non GPU specific metrics." on May 22, 2025
@sahelib25 sahelib25 changed the title from "[V1][Metrics] Deprecate metrics with gpu_ prefix from non GPU specific metrics." to "[V1][Metrics] Deprecate metrics with gpu_ prefix for non GPU specific metrics." on May 22, 2025
Signed-off-by: Saheli Bhattacharjee <saheli@krai.ai>
Member

@markmc markmc left a comment

Thank you for following the deprecation policy, lgtm

@mergify

mergify bot commented May 30, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @sahelib25.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 30, 2025
@mergify mergify bot removed the needs-rebase label May 30, 2025
Signed-off-by: Saheli Bhattacharjee <saheli@krai.ai>
@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 13, 2025
@DarkLight1337 DarkLight1337 merged commit d1e34cc into vllm-project:main Jun 14, 2025
70 checks passed
@psyhtest psyhtest deleted the remove_gpu_prefix branch August 8, 2025 10:48
markmc added a commit to markmc/vllm that referenced this pull request Sep 9, 2025
In vllm-project#18354, these metrics were deprecated, and the change
was included in the v0.9.2 release.

We probably should only deprecate things in a v0.N.0 minor
release, so let's say these were deprecated in v0.10.0.

According to https://docs.vllm.ai/en/latest/usage/metrics.html:

> Note: when metrics are deprecated in version X.Y, they are hidden in
>  version X.Y+1 but can be re-enabled using the
>  --show-hidden-metrics-for-version=X.Y escape hatch, and are then
>  removed in version X.Y+2.

The deprecated metrics should be hidden in the v0.11.0 release,
but with a --show-hidden-metrics-for-version=0.10 escape hatch.

They should then be removed in the v0.12.0 release.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
markmc added a commit to markmc/vllm that referenced this pull request Nov 24, 2025
The following are due for removal:

- `vllm:gpu_cache_usage_perc`
- `vllm:gpu_prefix_cache_queries`
- `vllm:gpu_prefix_cache_hits`

See vllm-project#18354

And the following is due to be hidden:

- `vllm:time_per_output_token_seconds`

See vllm-project#24110

The deprecation policy is documented [here](https://docs.vllm.ai/en/latest/usage/metrics/)

> when metrics are deprecated in version X.Y, they are
> hidden in version X.Y+1 but can be re-enabled using
> the --show-hidden-metrics-for-version=X.Y escape hatch,
> and are then removed in version X.Y+2.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1
