fix: standardize all planner ttft/itl units to float ms and fix docs#3673
Conversation
WalkthroughThis pull request converts timing metrics TTFT (Time to First Token) and ITL (Inter-Token Latency) from seconds to milliseconds across the profiler and planner codebase. Changes include updating argument parsers, default values, configuration files, core logic conversions, and documentation to consistently express these metrics in milliseconds. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The changes involve systematic unit conversion across 11 files with primarily repetitive numeric updates and documentation adjustments. However, the logic modifications in Poem
Pre-merge checks❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
benchmarks/profiler/utils/profiler_argparse.py(2 hunks)components/src/dynamo/planner/defaults.py(1 hunks)components/src/dynamo/planner/utils/perf_interpolation.py(6 hunks)components/src/dynamo/planner/utils/planner_argparse.py(1 hunks)components/src/dynamo/planner/utils/planner_core.py(2 hunks)docs/benchmarks/pre_deployment_profiling.md(2 hunks)docs/kubernetes/sla_planner_quickstart.md(1 hunks)tests/planner/README.md(2 hunks)tests/planner/perf_test_configs/disagg_8b_planner.yaml(1 hunks)tests/planner/scaling/disagg_planner.yaml(1 hunks)tests/planner/test_replica_calculation.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
components/src/dynamo/planner/utils/planner_argparse.py (1)
components/src/dynamo/planner/defaults.py (1)
SLAPlannerDefaults(85-96)
components/src/dynamo/planner/utils/planner_core.py (1)
components/src/dynamo/planner/utils/prometheus.py (2)
get_avg_time_to_first_token(103-109)get_avg_inter_token_latency(95-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: trtllm (arm64)
- GitHub Check: vllm (arm64)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (13)
components/src/dynamo/planner/utils/planner_core.py (2)
252-266: LGTM! Conversion from seconds to milliseconds is correct.The conversion logic properly handles Prometheus metrics (which return seconds) by multiplying by 1000 to convert to milliseconds. The clarifying comment is helpful for future maintainers.
294-294: LGTM! Log message correctly reflects millisecond units.The log output now properly displays TTFT and ITL values with "ms" units, consistent with the conversion applied above.
components/src/dynamo/planner/defaults.py (1)
92-93: LGTM! Default values correctly converted to milliseconds.The conversions are mathematically correct:
- TTFT: 0.5s → 500.0ms
- ITL: 0.05s → 50.0ms
The float type provides appropriate precision for millisecond values.
tests/planner/test_replica_calculation.py (1)
52-53: LGTM! Test fixture values correctly use float milliseconds.The conversion from integer to float (80 → 80.0, 10 → 10.0) aligns with the standardization to float millisecond units across the codebase while maintaining the same test semantics.
tests/planner/scaling/disagg_planner.yaml (1)
60-61: LGTM! Configuration values correctly converted to milliseconds.The conversions are accurate:
- TTFT: 0.1s → 100ms
- ITL: 0.01s → 10ms
tests/planner/perf_test_configs/disagg_8b_planner.yaml (1)
90-91: LGTM! Test configuration values correctly converted to milliseconds.The conversions are accurate:
- TTFT: 0.2s → 200ms
- ITL: 0.01s → 10ms
docs/kubernetes/sla_planner_quickstart.md (1)
209-209: LGTM! Documentation correctly reflects millisecond units.The example log output now accurately shows the "ms" units that match the actual planner output after the conversion changes.
benchmarks/profiler/utils/profiler_argparse.py (1)
182-190: LGTM! Argument types and defaults correctly updated to float milliseconds.The changes properly standardize TTFT and ITL arguments:
- Type change from
inttofloatenables millisecond precision- Defaults updated to float values (50.0ms, 10.0ms)
- Help text clearly documents the expected unit and type
docs/benchmarks/pre_deployment_profiling.md (1)
122-124: LGTM! Documentation correctly updated to reflect float millisecond units.The comments and examples now accurately document that TTFT and ITL values should be provided as floats in milliseconds, consistent with the profiler argument parser changes.
Also applies to: 293-294
components/src/dynamo/planner/utils/planner_argparse.py (1)
93-93: LGTM! Clear documentation of millisecond units.The help text updates correctly document that TTFT and ITL are now in milliseconds. The multi-line formatting for
--itlimproves readability and consistency with the--ttftargument style.Also applies to: 96-99
tests/planner/README.md (1)
37-38: LGTM! Documentation consistently reflects millisecond units.All examples, CLI flags, and output messages have been updated to use millisecond units. The documentation is now consistent with the code changes across the PR.
Also applies to: 43-43, 50-51, 55-55, 59-59, 64-64, 114-115
components/src/dynamo/planner/utils/perf_interpolation.py (2)
231-232: LGTM! CLI and output formatting correctly use milliseconds.The CLI argument defaults and help text clearly indicate milliseconds, and all print statements consistently display values with the "ms" suffix. The formatting is appropriate for the new unit system.
Also applies to: 236-236, 249-249, 253-253, 275-275, 282-282
151-151: LGTM! Helpful inline documentation.The inline comment clarifies the unit assumption at a critical point in the decode interpolation logic.
b4c4dd8 to
065c0fa
Compare
065c0fa to
9593188
Compare
Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
9593188 to
09e0823
Compare
…3673) Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
…3673) Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
…3673) Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
…i-dynamo#3673) Signed-off-by: William Arnold <7565007+Aphoh@users.noreply.github.com>
Overview:
Convert planner units to all use ms. Before some tools took float seconds, float milliseconds, or integer milliseconds. Let's just use float ms.
Summary by CodeRabbit
Chores
Documentation