[Feature][gpt-oss] Add support for num_cached_tokens and num_reasoning_tokens tracking #23460

NagyGeorge · 2025-08-23T03:41:07Z

Purpose

This implements tracking for num_cached_tokens and num_reasoning_tokens in the Response API's ResponseUsage object as requested in issue #23363.

Before/After:

Before: num_cached_tokens and num_reasoning_tokens were always 0
After: These fields accurately reflect the actual cached and reasoning token usage

Fixes #23363

Test Plan

Pre-commit Validation: All pre-commit hooks pass
Existing Test Coverage: Rely on existing CI pipeline tests for HarmonyContext and StreamingHarmonyContext to validate no regressions

- Add _update_num_cached_tokens() method to track cached tokens from RequestOutput
- Add _update_num_reasoning_tokens() method to track reasoning tokens based on:
  - Analysis channel content (parser.current_channel == 'analysis')
  - Tool directed messages
- Integrate token tracking into append_output() methods for both context types
- Cached tokens only tracked on first token in streaming mode

Signed-off-by: George Nagy II <[email protected]>
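The tracking described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code merged in vllm/entrypoints/context.py: FakeRequestOutput, FakeParser, TokenTrackingContext, append_streaming_output, the first_token_seen flag, and the use of a current_recipient attribute to detect tool-directed messages are all assumptions made for the example. Only the two method names and the parser.current_channel == 'analysis' check come from the PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRequestOutput:
    """Hypothetical stand-in for the engine output; only the field used here."""
    num_cached_tokens: Optional[int] = None


@dataclass
class FakeParser:
    """Hypothetical stand-in for the Harmony parser state."""
    current_channel: Optional[str] = None
    current_recipient: Optional[str] = None  # assumed marker for tool-directed messages


class TokenTrackingContext:
    """Illustrative context that accumulates cached and reasoning token counts."""

    def __init__(self, parser: FakeParser) -> None:
        self.parser = parser
        self.num_cached_tokens = 0
        self.num_reasoning_tokens = 0
        self.first_token_seen = False  # hypothetical flag for streaming mode

    def _update_num_cached_tokens(self, output: FakeRequestOutput) -> None:
        # Copy the cached-token count reported by the output, if any.
        if output.num_cached_tokens is not None:
            self.num_cached_tokens = output.num_cached_tokens

    def _update_num_reasoning_tokens(self, token_count: int) -> None:
        # Count tokens as reasoning when they belong to the analysis channel
        # or to a tool-directed message (detection here is an assumption).
        if (self.parser.current_channel == "analysis"
                or self.parser.current_recipient is not None):
            self.num_reasoning_tokens += token_count

    def append_streaming_output(self, output: FakeRequestOutput,
                                new_token_count: int = 1) -> None:
        # In streaming mode, cached tokens are recorded only on the first token.
        if not self.first_token_seen:
            self._update_num_cached_tokens(output)
            self.first_token_seen = True
        self._update_num_reasoning_tokens(new_token_count)
```

In this sketch, later streaming outputs never overwrite the cached-token count, and tokens emitted outside the analysis channel with no recipient set leave the reasoning count unchanged.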

github-actions · 2025-08-23T03:41:14Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

gemini-code-assist

Code Review

This pull request adds tracking for num_cached_tokens and num_reasoning_tokens to HarmonyContext and StreamingHarmonyContext. The changes look good and correctly implement the token counting logic. I have one suggestion to improve the readability of a complex condition in the new _update_num_reasoning_tokens method. By breaking down the condition into smaller, named variables, the code becomes easier to understand and maintain.
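The kind of refactor the review suggests can be illustrated with a small before/after pair. The condition below is a hypothetical stand-in, not the actual expression from _update_num_reasoning_tokens; the function names and the recipient parameter are assumptions chosen only to show the pattern of naming each sub-condition.

```python
from typing import Optional


def counts_as_reasoning_inline(channel: Optional[str],
                               recipient: Optional[str]) -> bool:
    # Before: the whole decision is packed into one boolean expression.
    return channel == "analysis" or recipient is not None


def counts_as_reasoning_named(channel: Optional[str],
                              recipient: Optional[str]) -> bool:
    # After: each sub-condition gets a descriptive name, so the final
    # expression reads as a sentence. Behavior is unchanged.
    is_analysis_channel = channel == "analysis"
    is_tool_directed = recipient is not None  # assumed meaning of "tool directed"
    return is_analysis_channel or is_tool_directed
```

Both functions return the same result for every input; the named version only makes the intent of each clause explicit.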

vllm/entrypoints/context.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: George Nagy II <[email protected]>

heheda12345

LGTM! Thank you very much.

heheda12345 · 2025-09-03T00:22:48Z

@NagyGeorge Can you fix the pre-commit error?

NagyGeorge · 2025-09-03T07:51:42Z

> @NagyGeorge Can you fix the pre-commit error?

@heheda12345 Yes. I'm out of town right now, so I should be able to within a couple of days.

Signed-off-by: Chen Zhang <[email protected]>

heheda12345 · 2025-09-03T18:56:53Z

Thanks for letting me know. I've formatted it.

…g_tokens tracking (vllm-project#23460) Signed-off-by: George Nagy II <[email protected]> Signed-off-by: Chen Zhang <[email protected]>

NagyGeorge requested a review from aarnphm as a code owner August 23, 2025 03:41

mergify bot added the frontend label Aug 23, 2025

gemini-code-assist bot reviewed Aug 23, 2025

vllm/entrypoints/context.py (outdated, resolved)

NagyGeorge and others added 2 commits August 23, 2025 03:25

Update vllm/entrypoints/context.py

c4ca23f

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: George Nagy II <[email protected]>

Merge branch 'main' into feature/work-branch

01c423e

heheda12345 approved these changes Sep 3, 2025


heheda12345 changed the title ~~[Feature] Add support for num_cached_tokens and num_reasoning_tokens tracking~~ [Feature][gpt-oss] Add support for num_cached_tokens and num_reasoning_tokens tracking Sep 3, 2025

heheda12345 added the gpt-oss (Related to GPT-OSS models) label Sep 3, 2025

NagyGeorge and others added 2 commits September 3, 2025 03:14

Merge branch 'vllm-project:main' into feature/work-branch

1b1d343

format

6c86103

Signed-off-by: Chen Zhang <[email protected]>

heheda12345 enabled auto-merge (squash) September 3, 2025 18:57

github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 3, 2025

heheda12345 merged commit 36c260d into vllm-project:main Sep 3, 2025
45 of 47 checks passed

NagyGeorge deleted the feature/work-branch branch September 4, 2025 18:27
