fix: prefill stats added to EngineCoreOutput #133

Merged
njhill merged 1 commit into main from nick/fix-output-msgpack on Apr 24, 2026

Conversation

@njhill njhill (Member) commented Apr 24, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request replaces the num_cached_tokens and num_external_computed_tokens fields in EngineCoreOutput with a new PrefillStats struct, providing a more granular breakdown of prefill token sources. The metrics tracking logic has been updated to use this new structure, and the vllm:prompt_tokens_recomputed metric has been removed. The feedback below suggests adding a fallback in the metrics tracker that records the total prompt tokens when the prefill_stats object is absent, keeping cumulative counters consistent with request histograms.
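The PR does not show the PrefillStats definition itself, but the breakdown it describes can be sketched as a small struct. This is a minimal illustration only: the field names (num_cached_tokens, num_external_computed_tokens, num_computed_tokens) and the total() helper are assumptions inferred from the removed fields, not the actual vLLM definition.

```rust
// Hypothetical sketch of a PrefillStats breakdown; field names are
// assumptions based on the fields this PR replaces, not the real code.
#[derive(Debug, Default, Clone)]
struct PrefillStats {
    num_cached_tokens: u64,            // prompt tokens served from the prefix cache
    num_external_computed_tokens: u64, // prompt tokens computed by an external prefill
    num_computed_tokens: u64,          // prompt tokens freshly computed locally
}

impl PrefillStats {
    /// Total prompt tokens accounted for by this breakdown.
    fn total(&self) -> u64 {
        self.num_cached_tokens + self.num_external_computed_tokens + self.num_computed_tokens
    }
}

fn main() {
    let stats = PrefillStats {
        num_cached_tokens: 96,
        num_external_computed_tokens: 0,
        num_computed_tokens: 32,
    };
    assert_eq!(stats.total(), 128);
    println!("total prompt tokens: {}", stats.total());
}
```

Keeping the per-source counts in one struct lets the metrics layer increment each counter from a single message field instead of several loose integers on EngineCoreOutput.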

Comment on lines +100 to +102
if let Some(prefill_stats) = &output.prefill_stats {
    record_prompt_tokens(&self.model_name, engine_index, prefill_stats);
}

Severity: medium

If prefill_stats is missing from the first output (e.g., due to an unexpected backend response or protocol mismatch), the vllm:prompt_tokens counter will not be incremented at all, while the vllm:request_prompt_tokens histogram in record_finished will still be updated using self.prompt_len. This creates a discrepancy between cumulative counters and request histograms. Consider providing a fallback that records the total prompt tokens even if the breakdown is missing.

Suggested change

-if let Some(prefill_stats) = &output.prefill_stats {
-    record_prompt_tokens(&self.model_name, engine_index, prefill_stats);
-}
+if let Some(prefill_stats) = &output.prefill_stats {
+    record_prompt_tokens(&self.model_name, engine_index, prefill_stats);
+} else {
+    metrics()
+        .prompt_tokens
+        .get_or_create(&engine_labels(&self.model_name, engine_index))
+        .inc_by(self.prompt_len as u64);
+}
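The discrepancy the reviewer describes can be illustrated with a minimal model. Plain integers stand in for the Prometheus counter and histogram here; Tracker, record_first_output, and record_finished are illustrative names, not the actual request_metrics.rs code.

```rust
// Minimal model of the metrics flow described in the review.
// Plain integers stand in for the Prometheus metrics; all names
// are illustrative, not the actual vLLM tracker implementation.
struct Tracker {
    prompt_tokens_counter: u64, // models the vllm:prompt_tokens counter
    histogram_sum: u64,         // models the vllm:request_prompt_tokens histogram sum
    prompt_len: u64,
}

impl Tracker {
    // First-output path: without a fallback, a missing breakdown
    // would leave the cumulative counter behind the histogram.
    fn record_first_output(&mut self, prefill_total: Option<u64>) {
        match prefill_total {
            Some(total) => self.prompt_tokens_counter += total,
            // The suggested fallback: still account for the full prompt.
            None => self.prompt_tokens_counter += self.prompt_len,
        }
    }

    // Finish path: the histogram always observes the prompt length.
    fn record_finished(&mut self) {
        self.histogram_sum += self.prompt_len;
    }
}

fn main() {
    let mut t = Tracker { prompt_tokens_counter: 0, histogram_sum: 0, prompt_len: 128 };
    t.record_first_output(None); // prefill_stats missing, fallback fires
    t.record_finished();
    // With the fallback, counter and histogram stay consistent.
    assert_eq!(t.prompt_tokens_counter, t.histogram_sum);
}
```

Without the else branch, the same run would leave prompt_tokens_counter at 0 while the histogram still records 128, which is exactly the inconsistency the review flags.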

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c72e817046

Comment thread src/llm/src/request_metrics.rs
@njhill njhill merged commit 49f7b3f into main Apr 24, 2026
3 checks passed
@njhill njhill deleted the nick/fix-output-msgpack branch April 24, 2026 00:46