[Metrics] Deprecate TPOT in favor of ITL #24110
Merged
DarkLight1337 merged 2 commits into vllm-project:main on Sep 2, 2025
Conversation
The only case where we don't want to assert the existence of a metric is when it is deprecated and we're not showing hidden deprecated metrics.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
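A minimal sketch of that rule as a test helper; the names and signature here are illustrative, not vLLM's actual test code:

```python
def assert_metric_exists(
    exported: set[str],
    name: str,
    deprecated: set[str],
    show_hidden_deprecated: bool,
) -> None:
    """Assert that a metric is exported, except in the one case where it
    is deprecated and hidden deprecated metrics are not being shown."""
    if name in deprecated and not show_hidden_deprecated:
        return  # hidden deprecated metric: existence is not asserted
    assert name in exported, f"expected metric {name!r} to be exported"
```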
Contributor
Code Review
This pull request correctly deprecates the vllm:time_per_output_token_seconds (TPOT) metric in favor of the more accurately named vllm:inter_token_latency_seconds (ITL). The changes are consistently applied across the codebase, including metrics definitions, logging, tests, and the Grafana dashboard example. The deprecation strategy of retaining the old metric for backward compatibility while introducing the new one is sound. I've found one minor issue with the documentation of the new metric, which appears to be a copy-paste error.
As per vllm-project#24015, what we currently call TPOT should instead be called ITL, since what we are actually measuring is the time between iterations, and a single iteration can produce multiple tokens.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
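A rough sketch of the resulting metric pair, assuming prometheus_client-style histograms; the helper name and documentation strings are illustrative, not copied from vLLM:

```python
from prometheus_client import Histogram

# New, accurately named metric.
inter_token_latency = Histogram(
    "vllm:inter_token_latency_seconds",
    "Inter-token latency in seconds (time between engine iterations).",
)
# Old name retained for backward compatibility during the deprecation period.
time_per_output_token = Histogram(
    "vllm:time_per_output_token_seconds",
    "DEPRECATED: use vllm:inter_token_latency_seconds instead.",
)

def observe_iteration_gap(seconds: float) -> None:
    # A single iteration can emit multiple tokens (e.g. speculative
    # decoding), so this measures inter-iteration time, i.e. ITL,
    # not true per-token time.
    inter_token_latency.observe(seconds)
    time_per_output_token.observe(seconds)
```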
Force-pushed from b176439 to 09dbc43
DarkLight1337 approved these changes Sep 2, 2025
Member
LGTM, thanks for updating
845473182 pushed a commit to 845473182/vllm that referenced this pull request Sep 3, 2025
* 'main' of https://github.com/845473182/vllm: (457 commits)
  [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  [Doc]: fix typos in Python comments (vllm-project#24115)
  [Doc]: fix typos in Python comments (vllm-project#24093)
  [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  fix some typos (vllm-project#24071)
  [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  Fix weights loading for Apertus (vllm-project#24100)
  [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  Run ruff format on a few files. (vllm-project#24075)
  [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  ...
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
markmc added a commit to markmc/vllm that referenced this pull request Nov 24, 2025
The following are due for removal:

- `vllm:gpu_cache_usage_perc`
- `vllm:gpu_prefix_cache_queries`
- `vllm:gpu_prefix_cache_hits`

See vllm-project#18354

And the following is due to be hidden:

- `vllm:time_per_output_token_seconds`

See vllm-project#24110

The deprecation policy is documented [here](https://docs.vllm.ai/en/latest/usage/metrics/):

> when metrics are deprecated in version X.Y, they are hidden in version X.Y+1 but can be re-enabled using the --show-hidden-metrics-for-version=X.Y escape hatch, and are then removed in version X.Y+2.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
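A small sketch of how that X.Y / X.Y+1 / X.Y+2 policy could be gated in code; the dataclass and function are hypothetical, and only the flag name and the policy itself come from the quote above:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    deprecated_in: str | None = None  # e.g. "0.11"

def is_hidden(metric: Metric, current_version: str,
              show_hidden_for: str | None) -> bool:
    """Deprecated in X.Y -> hidden in X.Y+1, unless re-enabled with
    --show-hidden-metrics-for-version=X.Y; removed entirely in X.Y+2."""
    if metric.deprecated_in is None:
        return False
    if show_hidden_for == metric.deprecated_in:
        return False  # escape hatch re-enables the hidden metric
    dep = tuple(map(int, metric.deprecated_in.split(".")))
    cur = tuple(map(int, current_version.split(".")))
    return cur >= (dep[0], dep[1] + 1)
```

Under this sketch, `is_hidden(Metric("vllm:time_per_output_token_seconds", "0.11"), "0.12", None)` is `True`, and passing `show_hidden_for="0.11"` flips it back to visible.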
As per #24015, what we currently call TPOT should instead be called ITL, since what we are actually measuring is the time between iterations, and a single iteration can produce multiple tokens.
I'm flagging the TPOT metric as deprecated from 0.11. Even if this lands in a 0.10.x release, I think the deprecation period should only start when it ships in a new minor 0.N.0 release.