
Add spec_verify_calls_total metric for speculative decoding #25689

Merged: merrymercy merged 1 commit into main from add-spec-verify-calls-metric on May 19, 2026

@merrymercy commented May 18, 2026

Summary

  • Add a new Prometheus counter sglang:spec_verify_calls_total that tracks the number of speculative decoding verification calls per request.
  • The counter is incremented in observe_one_finished_request when spec_verify_ct > 0.
  • The spec_verify_ct field is extracted from the recv object in tokenizer_manager, with safe fallback to 0 if unavailable.
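
The flow described in the summary can be sketched roughly as follows. This is a minimal, stdlib-only illustration, not the code from the PR: `CounterStandIn` is a hypothetical stand-in for a Prometheus counter, `RecvWithSpec` and `RecvWithoutSpec` are made-up stand-ins for the `recv` object, and `get_spec_verify_ct` is an illustrative helper name. Incrementing by the value of `spec_verify_ct` (rather than by 1) is an assumption; the summary only states that the counter is incremented when `spec_verify_ct > 0`.

```python
from dataclasses import dataclass
from typing import Any


class CounterStandIn:
    """Hypothetical stand-in with the same inc() shape as a Prometheus counter."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        # Counters are monotonic: they may only go up.
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


@dataclass
class RecvWithSpec:
    """Made-up recv object that carries the field."""
    spec_verify_ct: int


class RecvWithoutSpec:
    """Made-up recv object that lacks the field entirely."""


def get_spec_verify_ct(recv: Any) -> int:
    # Safe fallback: a missing attribute or a None value both become 0.
    return getattr(recv, "spec_verify_ct", 0) or 0


spec_verify_calls_total = CounterStandIn("sglang:spec_verify_calls_total")


def observe_one_finished_request(spec_verify_ct: int) -> None:
    # Only touch the counter when speculative verification actually ran.
    if spec_verify_ct > 0:
        # Assumption: add the per-request count, not a flat 1.
        spec_verify_calls_total.inc(spec_verify_ct)


observe_one_finished_request(get_spec_verify_ct(RecvWithSpec(spec_verify_ct=3)))
observe_one_finished_request(get_spec_verify_ct(RecvWithoutSpec()))
print(spec_verify_calls_total.value)
```

With the fallback in place, a request that never went through speculative decoding leaves the counter untouched, which matches the "no impact" case in the test plan.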

Original commits

  • 318a5f14e

Test plan

  • Verify metric is registered and visible at /metrics endpoint
  • Confirm counter increments correctly during speculative decoding
  • Confirm no impact when speculative decoding is not in use (counter stays at 0)
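
One way to carry out the checks above is to scrape the `/metrics` output and read the counter value from the Prometheus text exposition format. The sketch below is a simplified, stdlib-only parser under stated assumptions: `read_metric_total` is a hypothetical helper, and the sample text (including the `model_name` label, the HELP string, and `other_metric_total`) is invented for illustration rather than captured from a real server.

```python
from typing import Optional


def read_metric_total(metrics_text: str, name: str) -> Optional[float]:
    """Sum all samples of `name` in Prometheus text-format output.

    Returns None when no sample with that name is present. This is a
    simplified parser: it ignores escaping rules inside label values.
    """
    total: Optional[float] = None
    for raw_line in metrics_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labeled sample: name{label="value"} 12.0
            sample_name = line[: line.index("{")]
            value_part = line[line.rindex("}") + 1 :]
        else:
            # Unlabeled sample: name 12.0
            sample_name, _, value_part = line.partition(" ")
        if sample_name != name:
            continue
        fields = value_part.split()
        if not fields:
            continue
        total = (total or 0.0) + float(fields[0])
    return total


# Invented sample output, for illustration only.
SAMPLE_METRICS = """\
# HELP sglang:spec_verify_calls_total Hypothetical help text.
# TYPE sglang:spec_verify_calls_total counter
sglang:spec_verify_calls_total{model_name="demo"} 7.0
other_metric_total{model_name="demo"} 2.0
"""

print(read_metric_total(SAMPLE_METRICS, "sglang:spec_verify_calls_total"))
print(read_metric_total(SAMPLE_METRICS, "not_registered_total"))
```

Comparing the value before and after a batch of requests gives the increment; a `None` result means no sample for that name appeared in the scraped text at all.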

CI States

Latest PR Test (Base): ❌ Run #26062371070
Latest PR Test (Extra): ⚠️ Not enabled -- add run-ci-extra label to opt in.

Add a new Prometheus counter `sglang:spec_verify_calls_total` that tracks
the number of speculative decoding verification calls per request. The
metric is incremented in `observe_one_finished_request` when `spec_verify_ct`
is greater than zero.

Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>

@merrymercy
Contributor Author

/tag-and-rerun-ci

@merrymercy merged commit b45b52e into main on May 19, 2026
198 of 236 checks passed
@merrymercy deleted the add-spec-verify-calls-metric branch on May 19, 2026, 01:35
Shunkangz pushed a commit to Shunkangz/sglang that referenced this pull request May 27, 2026
alphabetc1 pushed a commit to alphabetc1/sglang that referenced this pull request Jun 4, 2026