[Metrics] Complete removal of deprecated vllm:time_per_output_token_seconds metric by carlory · Pull Request #32661 · vllm-project/vllm

carlory · 2026-01-20T10:34:26Z

Summary

This PR completes the removal of the vllm:time_per_output_token_seconds metric, which was deprecated in v0.11, hidden in v0.12, and scheduled for removal in v0.13.

Changes Made

1. Code Removal (vllm/v1/metrics/loggers.py)

Removed deprecated histogram definition (39 lines)
Removed conditional observation to deprecated metric

2. Test Updates (tests/entrypoints/instrumentator/test_metrics.py)

Removed from HIDDEN_DEPRECATED_METRICS list
Updated _get_expected_values() to use vllm:inter_token_latency_seconds
Removed from EXPECTED_METRICS_V1 list

3. Dashboard Updates

Grafana: 10 references updated to vllm:inter_token_latency_seconds
Perses: 10 references updated to vllm:inter_token_latency_seconds
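
For anyone maintaining their own dashboards, the same rename can be applied to PromQL expressions mechanically. The following is a minimal sketch, not code from this PR: the helper name `migrate_query` and the sample queries are hypothetical. It assumes the histogram suffixes (`_bucket`, `_sum`, `_count`) carry over unchanged, and it leaves vllm:request_time_per_output_token_seconds untouched.

```python
import re

OLD = "vllm:time_per_output_token_seconds"
NEW = "vllm:inter_token_latency_seconds"

# Match the old name only when it is not the tail of a longer identifier.
# Any suffix such as _bucket, _sum or _count stays in place after the rename.
_PATTERN = re.compile(r"(?<![A-Za-z0-9_:])" + re.escape(OLD))


def migrate_query(expr: str) -> tuple[str, int]:
    """Return the rewritten expression and the number of replacements made."""
    return _PATTERN.subn(NEW, expr)


if __name__ == "__main__":
    # Hypothetical dashboard queries, for illustration only.
    queries = [
        "histogram_quantile(0.99, sum by(le) "
        "(rate(vllm:time_per_output_token_seconds_bucket[5m])))",
        "rate(vllm:request_time_per_output_token_seconds_sum[5m])",
    ]
    for q in queries:
        rewritten, count = migrate_query(q)
        print(count, rewritten)
```

The second query names a different metric, so it should pass through unchanged.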

4. Documentation

Updated metrics.md with correct metric reference

Test Validation

✅ Python syntax checks passed
✅ JSON validation passed
✅ YAML validation passed
✅ No deprecated metric references remain
✅ 34+ replacement metric references confirmed
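
A similar check can be run against a server's Prometheus text output. The sketch below is an assumption-laden illustration, not the validation used in this PR: the function name `metric_families` is hypothetical, and the sample payload (including its HELP text and label values) is made up.

```python
OLD = "vllm:time_per_output_token_seconds"
NEW = "vllm:inter_token_latency_seconds"

# Made-up sample of Prometheus text exposition output, for illustration only.
SAMPLE = """\
# HELP vllm:inter_token_latency_seconds Inter-token latency in seconds.
# TYPE vllm:inter_token_latency_seconds histogram
vllm:inter_token_latency_seconds_bucket{le="0.01",model_name="m"} 3.0
vllm:inter_token_latency_seconds_sum{model_name="m"} 0.02
vllm:inter_token_latency_seconds_count{model_name="m"} 3.0
vllm:request_time_per_output_token_seconds_count{model_name="m"} 1.0
"""


def metric_families(exposition: str) -> set[str]:
    """Collect metric family names from sample lines, ignoring comments."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The sample name ends at '{' (labels) or at the first whitespace.
        name = line.split("{", 1)[0].split()[0]
        # Fold histogram series back into their family name. This is a
        # simplification: a non-histogram metric ending in one of these
        # suffixes would be truncated too.
        for suffix in ("_bucket", "_sum", "_count"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                break
        names.add(name)
    return names


if __name__ == "__main__":
    families = metric_families(SAMPLE)
    print("deprecated metric present:", OLD in families)
    print("replacement metric present:", NEW in families)
```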

Notes

vllm:request_time_per_output_token_seconds (different metric) preserved
Replacement metric has identical functionality and buckets
Complete removal following v0.13 deprecation policy

mergify · 2026-01-20T10:35:04Z

Documentation preview: https://vllm--32661.org.readthedocs.build/en/32661/

gemini-code-assist

Code Review

This pull request completes the removal of the deprecated vllm:time_per_output_token_seconds metric. The changes are comprehensive, covering code, tests, documentation, and dashboard configurations. The deprecated metric is consistently replaced with vllm:inter_token_latency_seconds. The removal of the old metric's definition and observation logic in vllm/v1/metrics/loggers.py is clean. The corresponding test updates in tests/entrypoints/instrumentator/test_metrics.py correctly reflect this removal. The updates to Grafana and Perses dashboards, as well as the documentation, are also correct. The changes are well-executed and I have no issues to report.

…econds metric

This commit completes the removal of the deprecated metrics that were:
- Deprecated in v0.11 (replaced by vllm:inter_token_latency_seconds)
- Hidden in v0.12 (behind --show-hidden-metrics-for-version=0.11 flag)
- Completely removed in v0.13 (this commit)

Changes:
1. Removed deprecated histogram definition from PrometheusStatLogger
2. Updated test files to use replacement metric vllm:inter_token_latency_seconds
3. Updated Grafana dashboard with replacement metric (10 references)
4. Updated Perses dashboard with replacement metric (10 references)
5. Updated design documentation to reflect current metrics

The replacement metric vllm:inter_token_latency_seconds has identical functionality and bucket definitions. The different metric vllm:request_time_per_output_token_seconds is preserved as it is still actively used.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: carlory <baofa.fan@daocloud.io>

markmc · 2026-01-20T10:46:33Z

Thank you!

This duplicates #30992 and #31675 but looks more comprehensive

carlory · 2026-01-20T10:54:06Z

This duplicates #30992 and #31675

Sorry about that, I didn't know there were other PRs.

…econds metric (vllm-project#32661)

This PR completes the removal of the deprecated vllm:time_per_output_token_seconds metric that was deprecated in v0.11, hidden in v0.12, scheduled for removal in v0.13, but delayed until v0.15.

Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

…econds metric (vllm-project#32661)

This PR completes the removal of the deprecated vllm:time_per_output_token_seconds metric that was deprecated in v0.11, hidden in v0.12, scheduled for removal in v0.13, but delayed until v0.15.

Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

…econds metric (vllm-project#32661)

This PR completes the removal of the deprecated vllm:time_per_output_token_seconds metric that was deprecated in v0.11, hidden in v0.12, scheduled for removal in v0.13, but delayed until v0.15.

Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: mohammad najafi <mohammad.najafi@amd.com>

carlory requested review from DarkLight1337, NickLucche, aarnphm, markmc and robertgshaw2-redhat as code owners January 20, 2026 10:34

mergify bot added the documentation and v1 labels Jan 20, 2026

gemini-code-assist bot reviewed Jan 20, 2026

carlory force-pushed the metrics-0-13-removal branch from 7d9f21f to 3c455d7 January 20, 2026 10:36

markmc added this to Metrics & Tracing Jan 20, 2026

github-project-automation bot moved this to Backlog in Metrics & Tracing Jan 20, 2026

markmc moved this from Backlog to Ready in Metrics & Tracing Jan 20, 2026

markmc added the ready label Jan 20, 2026

markmc enabled auto-merge (squash) January 20, 2026 10:47

This was referenced Jan 20, 2026

[Misc] Remove deprecated metric vllm:time_per_output_token_seconds for v0.13 release #30992

Closed

[Cleanup] Remove deprecated vllm:time_per_output_token_seconds metric #31675

Closed

markmc approved these changes Jan 20, 2026

markmc merged commit bb91720 into vllm-project:main Jan 20, 2026
49 checks passed

github-project-automation bot moved this from Ready to Done in Metrics & Tracing Jan 20, 2026

carlory deleted the metrics-0-13-removal branch January 21, 2026 02:04

markmc moved this from Done to Done - 0.15 in Metrics & Tracing Feb 4, 2026
