fix: standardize all planner ttft/itl units to float ms and fix docs #3673

Merged
Aphoh merged 1 commit into main from warnold/standardize-planner-units on Oct 17, 2025

Conversation

@Aphoh
Contributor

@Aphoh Aphoh commented Oct 16, 2025

Overview:

Convert all planner units to ms. Previously, different tools took float seconds, float milliseconds, or integer milliseconds. Let's just use float ms everywhere.
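
As a rough illustration of the normalization described above, the three legacy input styles can all be mapped onto float milliseconds. This is only a sketch: `TimeUnit` and `to_float_ms` are hypothetical names invented for this example and are not part of the planner code.

```python
from enum import Enum


class TimeUnit(Enum):
    """Hypothetical unit tag; each member's value is its factor to milliseconds."""

    SECONDS = 1000.0
    MILLISECONDS = 1.0


def to_float_ms(value: float, unit: TimeUnit) -> float:
    """Return `value` expressed as float milliseconds."""
    return float(value) * unit.value


# The three legacy styles all normalize to the same representation.
print(to_float_ms(0.2, TimeUnit.SECONDS))         # was: float seconds
print(to_float_ms(200, TimeUnit.MILLISECONDS))    # was: integer milliseconds
print(to_float_ms(200.0, TimeUnit.MILLISECONDS))  # was: float milliseconds
```

With one representation, a value such as 200 means the same thing in every tool, and callers no longer need to know which unit a given entry point expects.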

Summary by CodeRabbit

  • Chores

    • Updated profiler and planner timing metric units from seconds to milliseconds for TTFT (Time to First Token) and ITL (Inter-Token Latency). Configuration values and command-line arguments now accept milliseconds.
  • Documentation

    • Updated profiling and deployment guides to reflect timing metrics now expressed in milliseconds.
    • Updated example commands, configurations, and output samples with millisecond-based values.

@Aphoh Aphoh requested a review from a team as a code owner October 16, 2025 17:32
@Aphoh Aphoh requested a review from a team October 16, 2025 17:32
@coderabbitai
Contributor

coderabbitai Bot commented Oct 16, 2025

Walkthrough

This pull request converts timing metrics TTFT (Time to First Token) and ITL (Inter-Token Latency) from seconds to milliseconds across the profiler and planner codebase. Changes include updating argument parsers, default values, configuration files, core logic conversions, and documentation to consistently express these metrics in milliseconds.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Argument Parsers: benchmarks/profiler/utils/profiler_argparse.py, components/src/dynamo/planner/utils/planner_argparse.py | Updated TTFT and ITL argument definitions to accept floats with millisecond units; changed default values and help text to reflect ms semantics. |
| Defaults and Configuration: components/src/dynamo/planner/defaults.py | Updated SLAPlannerDefaults.ttft and SLAPlannerDefaults.itl from second-based values (0.5, 0.05) to millisecond-based values (500.0, 50.0). |
| Core Logic: components/src/dynamo/planner/utils/perf_interpolation.py | Removed divisions by 1000; treated TTFT/ITL values natively as milliseconds throughout interpolation scaffolding and CLI output; added clarifying comments for unit semantics. |
| Metrics Conversion: components/src/dynamo/planner/utils/planner_core.py | Added conversion from Prometheus-sourced seconds to milliseconds (multiply by 1000) in observe_metrics; updated log messages to display values in ms. |
| Documentation: docs/benchmarks/pre_deployment_profiling.md, docs/kubernetes/sla_planner_quickstart.md, tests/planner/README.md | Updated example values, expected outputs, and explanatory text to reflect millisecond units for TTFT and ITL (e.g., 0.2s → 200ms). |
| Test Configuration: tests/planner/perf_test_configs/disagg_8b_planner.yaml, tests/planner/scaling/disagg_planner.yaml | Updated CLI flag values for --ttft and --itl from second-based decimals (0.2, 0.1, 0.01) to millisecond integers (200, 100, 10). |
| Test Fixtures: tests/planner/test_replica_calculation.py | Updated planner fixture TTFT and ITL literals to floats (80 → 80.0, 10 → 10.0) for consistency with millisecond semantics. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The changes involve systematic unit conversion across 11 files with primarily repetitive numeric updates and documentation adjustments. However, the logic modifications in perf_interpolation.py and planner_core.py (removal of 1000 divisors and conversion operations) require careful verification to ensure consistent application across prefill/decode paths and metric capture points.

Poem

🐰 Hops through metrics with glee,
From seconds to milliseconds, wild and free!
TTFT and ITL, now in tiny ms bits,
Your benchmarks shall sparkle—what a perfect fit!

Pre-merge checks

❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description is largely incomplete when compared against the required template structure. While an Overview section is provided and conveys the main purpose (converting planner units to use float milliseconds), the description is missing three critical sections: Details (which should describe specific changes made), Where should the reviewer start (identifying files to review), and Related Issues (to link any associated GitHub issues). The current description provides only a high-level summary without the structured information expected by the template. | |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The PR title "standardize all planner ttft/itl units to float ms and fix docs" is specific, clear, and directly summarizes the main objective of the changeset. It accurately conveys that the primary change involves standardizing TTFT/ITL units across the planner to use float milliseconds and updating documentation accordingly. The title is neither vague nor misleading, and it aligns with the actual changes evident in the raw summary across multiple files. |

@Aphoh Aphoh changed the title from "Standardize all planner ttft/itl units to float ms" to "fix: standardize all planner ttft/itl units to float ms and fix docs" Oct 16, 2025
@github-actions github-actions Bot added the fix label Oct 16, 2025
@Aphoh Aphoh requested a review from tedzhouhk October 16, 2025 17:33
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f793b4 and b4c4dd8.

📒 Files selected for processing (11)
  • benchmarks/profiler/utils/profiler_argparse.py (2 hunks)
  • components/src/dynamo/planner/defaults.py (1 hunks)
  • components/src/dynamo/planner/utils/perf_interpolation.py (6 hunks)
  • components/src/dynamo/planner/utils/planner_argparse.py (1 hunks)
  • components/src/dynamo/planner/utils/planner_core.py (2 hunks)
  • docs/benchmarks/pre_deployment_profiling.md (2 hunks)
  • docs/kubernetes/sla_planner_quickstart.md (1 hunks)
  • tests/planner/README.md (2 hunks)
  • tests/planner/perf_test_configs/disagg_8b_planner.yaml (1 hunks)
  • tests/planner/scaling/disagg_planner.yaml (1 hunks)
  • tests/planner/test_replica_calculation.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
components/src/dynamo/planner/utils/planner_argparse.py (1)
components/src/dynamo/planner/defaults.py (1)
  • SLAPlannerDefaults (85-96)
components/src/dynamo/planner/utils/planner_core.py (1)
components/src/dynamo/planner/utils/prometheus.py (2)
  • get_avg_time_to_first_token (103-109)
  • get_avg_inter_token_latency (95-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (13)
components/src/dynamo/planner/utils/planner_core.py (2)

252-266: LGTM! Conversion from seconds to milliseconds is correct.

The conversion logic properly handles Prometheus metrics (which return seconds) by multiplying by 1000 to convert to milliseconds. The clarifying comment is helpful for future maintainers.
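
For readers who want to see the shape of this conversion, a minimal sketch follows. It assumes only what the review states: the Prometheus-backed getters return seconds and the planner multiplies by 1000. The helper name `seconds_to_ms` and the pass-through handling of `None` and NaN are assumptions made for this example, not the actual code in `observe_metrics`.

```python
import math
from typing import Optional

MS_PER_SECOND = 1000.0


def seconds_to_ms(value_s: Optional[float]) -> Optional[float]:
    """Convert a seconds reading to float milliseconds.

    Missing (None) or NaN readings are returned unchanged so the caller can
    decide how to treat an empty metrics window (an assumption of this sketch).
    """
    if value_s is None or math.isnan(value_s):
        return value_s
    return value_s * MS_PER_SECOND


# Hypothetical readings standing in for the Prometheus client's results.
ttft_ms = seconds_to_ms(0.25)
itl_ms = seconds_to_ms(0.02)
print(f"ttft={ttft_ms:.2f}ms, itl={itl_ms:.2f}ms")
```

Doing the conversion once, at the point where metrics enter the planner, means everything downstream can assume milliseconds without further scaling.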


294-294: LGTM! Log message correctly reflects millisecond units.

The log output now properly displays TTFT and ITL values with "ms" units, consistent with the conversion applied above.

components/src/dynamo/planner/defaults.py (1)

92-93: LGTM! Default values correctly converted to milliseconds.

The conversions are mathematically correct:

  • TTFT: 0.5s → 500.0ms
  • ITL: 0.05s → 50.0ms

The float type provides appropriate precision for millisecond values.

tests/planner/test_replica_calculation.py (1)

52-53: LGTM! Test fixture values correctly use float milliseconds.

The conversion from integer to float (80 → 80.0, 10 → 10.0) aligns with the standardization to float millisecond units across the codebase while maintaining the same test semantics.

tests/planner/scaling/disagg_planner.yaml (1)

60-61: LGTM! Configuration values correctly converted to milliseconds.

The conversions are accurate:

  • TTFT: 0.1s → 100ms
  • ITL: 0.01s → 10ms
tests/planner/perf_test_configs/disagg_8b_planner.yaml (1)

90-91: LGTM! Test configuration values correctly converted to milliseconds.

The conversions are accurate:

  • TTFT: 0.2s → 200ms
  • ITL: 0.01s → 10ms
docs/kubernetes/sla_planner_quickstart.md (1)

209-209: LGTM! Documentation correctly reflects millisecond units.

The example log output now accurately shows the "ms" units that match the actual planner output after the conversion changes.

benchmarks/profiler/utils/profiler_argparse.py (1)

182-190: LGTM! Argument types and defaults correctly updated to float milliseconds.

The changes properly standardize TTFT and ITL arguments:

  • Type change from int to float allows fractional (sub-millisecond) values
  • Defaults updated to float values (50.0ms, 10.0ms)
  • Help text clearly documents the expected unit and type
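
A sketch of what float-millisecond arguments of this kind might look like with `argparse` is shown below. The flag names `--ttft` and `--itl` and the defaults (50.0, 10.0) are taken from this PR's summary; the parser description, help wording, and `build_parser` function are illustrative assumptions and are not copied from `profiler_argparse.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser; only the flag names and defaults mirror the PR."""
    parser = argparse.ArgumentParser(description="Example TTFT/ITL flags in float ms")
    parser.add_argument(
        "--ttft",
        type=float,
        default=50.0,
        help="Target time to first token, in milliseconds (float)",
    )
    parser.add_argument(
        "--itl",
        type=float,
        default=10.0,
        help="Target inter-token latency, in milliseconds (float)",
    )
    return parser


# Integer-looking input such as "200" is still parsed to a float.
args = build_parser().parse_args(["--ttft", "200", "--itl", "12.5"])
print(type(args.ttft).__name__, args.ttft, args.itl)
```

Because `type=float` also accepts integer-looking strings, existing configurations that pass whole-millisecond values such as `200` keep working.
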
docs/benchmarks/pre_deployment_profiling.md (1)

122-124: LGTM! Documentation correctly updated to reflect float millisecond units.

The comments and examples now accurately document that TTFT and ITL values should be provided as floats in milliseconds, consistent with the profiler argument parser changes.

Also applies to: 293-294

components/src/dynamo/planner/utils/planner_argparse.py (1)

93-93: LGTM! Clear documentation of millisecond units.

The help text updates correctly document that TTFT and ITL are now in milliseconds. The multi-line formatting for --itl improves readability and consistency with the --ttft argument style.

Also applies to: 96-99

tests/planner/README.md (1)

37-38: LGTM! Documentation consistently reflects millisecond units.

All examples, CLI flags, and output messages have been updated to use millisecond units. The documentation is now consistent with the code changes across the PR.

Also applies to: 43-43, 50-51, 55-55, 59-59, 64-64, 114-115

components/src/dynamo/planner/utils/perf_interpolation.py (2)

231-232: LGTM! CLI and output formatting correctly use milliseconds.

The CLI argument defaults and help text clearly indicate milliseconds, and all print statements consistently display values with the "ms" suffix. The formatting is appropriate for the new unit system.

Also applies to: 236-236, 249-249, 253-253, 275-275, 282-282


151-151: LGTM! Helpful inline documentation.

The inline comment clarifies the unit assumption at a critical point in the decode interpolation logic.

Comment thread components/src/dynamo/planner/utils/perf_interpolation.py
@Aphoh Aphoh force-pushed the warnold/standardize-planner-units branch from b4c4dd8 to 065c0fa Compare October 17, 2025 16:27
@Aphoh Aphoh enabled auto-merge (squash) October 17, 2025 16:29
@Aphoh Aphoh force-pushed the warnold/standardize-planner-units branch from 065c0fa to 9593188 Compare October 17, 2025 16:45
Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
@Aphoh Aphoh force-pushed the warnold/standardize-planner-units branch from 9593188 to 09e0823 Compare October 17, 2025 17:05
@ai-dynamo ai-dynamo deleted a comment from copy-pr-bot Bot Oct 17, 2025
@Aphoh Aphoh merged commit ee56782 into main Oct 17, 2025
32 of 35 checks passed
@Aphoh Aphoh deleted the warnold/standardize-planner-units branch October 17, 2025 18:33
Aphoh added a commit that referenced this pull request Oct 17, 2025
…3673)

Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
ziqifan617 pushed a commit that referenced this pull request Oct 20, 2025
…3673)

Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
nv-kmcgill53 pushed a commit that referenced this pull request Oct 23, 2025
…3673)

Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026
…i-dynamo#3673)

Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>

2 participants