feat: add --fixed-schedule-speedup parameter for scaling loadgen #309
# Conflicts: # aiperf/timing/config.py
Walkthrough
Adds a new fixed_schedule_speedup option threaded from InputDefaults through InputConfig and TimingManagerConfig into FixedScheduleStrategy, which scales fixed-schedule timestamp offsets during wait calculations. The test suite is expanded to cover speedup behavior across default, auto/manual offsets, and edge cases.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant CLI as CLI / InputConfig
participant TMC as TimingManagerConfig
participant FSS as FixedScheduleStrategy
participant CLK as Clock
participant CM as CreditManager
User->>CLI: Set --fixed-schedule-speedup (float | None)
CLI->>TMC: from_user_config(input.fixed_schedule_speedup)
TMC->>FSS: Initialize with fixed_schedule_speedup
Note over FSS: _time_scale = speedup or 1.0
loop For each schedule timestamp
FSS->>CLK: Now()
CLK-->>FSS: t_now
Note right of FSS: scaled_offset = (ts - schedule_zero_ms) / _time_scale
FSS->>FSS: wait_ms = scaled_offset - (t_now - start_ms)
alt wait_ms > 0
FSS->>CLK: Sleep(wait_ms)
CLK-->>FSS: Wakes
else wait_ms <= 0
Note over FSS: Execute immediately
end
FSS->>CM: Request credit
CM-->>FSS: Grant/deny
FSS->>FSS: Proceed / drop based on credit
end
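The wait calculation in the diagram can be sketched in a few lines of Python. This is only an illustration of the arithmetic shown above, not the code in FixedScheduleStrategy; the function name `compute_wait_ms` and its parameter names are hypothetical.

```python
def compute_wait_ms(timestamp_ms, schedule_zero_ms, start_ms, now_ms, speedup=None):
    """Return the wait before issuing the request scheduled at timestamp_ms.

    A result <= 0 means the request is already due and should run immediately.
    """
    # As in the diagram: no speedup (None) means real-time playback.
    time_scale = speedup or 1.0
    # Offset of this event from the start of the schedule, compressed or
    # stretched by the speedup factor.
    scaled_offset = (timestamp_ms - schedule_zero_ms) / time_scale
    # Wall-clock time already spent since the strategy started.
    elapsed_ms = now_ms - start_ms
    return scaled_offset - elapsed_ms


# An event 1000 ms into the schedule, checked right at start:
print(compute_wait_ms(1000, 0, 5000, 5000, speedup=2.0))  # half the real-time wait
print(compute_wait_ms(1000, 0, 5000, 5000))               # real-time wait
```

With a speedup of 2.0 the event that sits 1000 ms into the schedule becomes due 500 ms after start, which matches the "twice as fast" description of the option.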
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🧬 Code graph analysis (3)
aiperf/common/config/input_config.py (2)
aiperf/timing/config.py (1)
tests/timing_manager/test_fixed_schedule_strategy.py (4)
Codecov Report
✅ All modified and coverable lines are covered by tests.
    ),
] = InputDefaults.FIXED_SCHEDULE_END_OFFSET

# NEW AIPerf Option
Nit: Instead of calling it 'New AIPerf Option,' maybe we could add a descriptive comment about the option itself? Since this option won’t stay 'new' for long. :D
The description itself is in the data right below it. I was mostly using these comments to track features that do not have GenAI-Perf equivalents, for feature parity. I think it is probably safe to remove them and just use the ones inside the alias portions.
Removing also sounds good to me.
Thank you @ajcasagrande
@pytest.mark.parametrize(
    "speedup,schedule",
    [
        # 2x faster - should take half the time
Some additional tests I can think of:
- speedup = 0.0
  Looks like there’s no test for this. That case would hit base_duration_sec / speedup and cause a div-by-zero.
- speedup < 0
  Negative values aren’t tested. That produces a negative time_scale (effectively time going backwards). It’s unclear whether the intended behavior is to error out or actually “rewind” the schedule.
- Extreme floating-point cases
  You’ve got tests at 0.001 and 1000.0, but nothing near the edges of floating-point behavior. Something like 1e-9 (super slow) or 1e9 (super fast) could trigger precision issues.
- Uneven schedules
  All the current tests use evenly spaced timestamps. A case like (0, "a"), (1, "b"), (1000, "c") would check whether scaling compresses or stretches things as expected.
- Overlap after scaling
  With a large speedup (say 1000x), two nearby events could collapse into the same timestamp after rounding. Right now the tests assume spacing survives scaling.
  Do we want to add some checks around this and warn the user about it?
- Very large timestamps
  Something like (10**9, "conv1") could overflow ints or lose precision once scaled. No test covers that edge case.
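To make the "uneven schedules" and "overlap after scaling" cases concrete, here is a rough sketch of what such a check could look like. It assumes a 1 ms scheduling resolution, which this thread does not establish for the real implementation; `scale_schedule` and `find_collisions` are hypothetical helpers, not AIPerf functions.

```python
def scale_schedule(schedule, speedup):
    """Rebase timestamps (ms) to the first event and divide offsets by speedup."""
    zero = schedule[0][0]
    return [((ts - zero) / speedup, conv_id) for ts, conv_id in schedule]


def find_collisions(scaled, resolution_ms=1):
    """Return adjacent event pairs that fall into the same resolution bucket.

    Assumption: events are effectively quantized to resolution_ms.
    """
    collisions = []
    for (t1, a), (t2, b) in zip(scaled, scaled[1:]):
        if int(t1 // resolution_ms) == int(t2 // resolution_ms):
            collisions.append((a, b))
    return collisions


# The uneven schedule suggested above, replayed at 1000x.
schedule = [(0, "a"), (1, "b"), (1000, "c")]
scaled = scale_schedule(schedule, 1000.0)
print(scaled)
print(find_collisions(scaled))
```

At 1000x the first two events end up 0.001 ms apart and share a 1 ms bucket, while the third stays distinct. A check like this could back a user-facing warning if that is the route we want to take.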
@ganeshku1 I can look into adding these, but some of them would be more like config tests:
- speedup = 0.0
  > Looks like there’s no test for this. That case would hit base_duration_sec / speedup and cause a div-by-zero.
  Not possible, as the config option is specified with gt=0. Also, the speedup calculation takes that into account: time_scale = 1 / (speedup or 1). But I can look at adding tests for the config.
- speedup < 0
  > Negative values aren’t tested. That produces a negative time_scale (effectively time going backwards). It’s unclear whether the intended behavior is to error out or actually “rewind” the schedule.
  Again, not possible due to gt=0, but I can look at adding tests for the config.
- Extreme floating-point cases
  > You’ve got tests at 0.001 and 1000.0, but nothing near the edges of floating-point behavior. Something like 1e-9 (super slow) or 1e9 (super fast) could trigger precision issues.
- Uneven schedules
  > All the current tests use evenly spaced timestamps. A case like (0, "a"), (1, "b"), (1000, "c") would check whether scaling compresses or stretches things as expected.
  👍
- Overlap after scaling
  > With a large speedup (say 1000x), two nearby events could collapse into the same timestamp after rounding. Right now tests assume spacing survives scaling.
  > Do we want to add some checks around this and warn user about this?
  Hmm, this might be a good point, as I do the scaling afterwards, during the sleep. Will need to check.
- Very large timestamps
  > Something like (10**9, "conv1") could overflow ints or lose precision once scaled. No test covers that edge case.
  Worth checking.
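For reference, the effect of the gt=0 constraint mentioned above can be illustrated with a plain-Python stand-in. This is not the actual config field definition; `validate_speedup` is a hypothetical helper that only mimics the "strictly greater than zero, or unset" rule.

```python
def validate_speedup(value):
    """Mimic a gt=0 constraint on an optional float (hypothetical stand-in)."""
    if value is None:
        # Option not provided: the strategy falls back to real-time playback.
        return None
    # `not value > 0` also rejects NaN, for which every comparison is False.
    if not value > 0:
        raise ValueError(f"fixed_schedule_speedup must be > 0, got {value!r}")
    return float(value)


# Config-level tests would cover the two rejected cases from the review.
for bad in (0.0, -1.0):
    try:
        validate_speedup(bad)
    except ValueError as exc:
        print(exc)
```

With validation at the config layer, the zero and negative cases never reach the division in the strategy, so the corresponding tests belong with the config tests.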
Maybe we can reuse Note that that option was configured to 2x if the ratio was 2, .5x if the ratio was .5. If this does the inverse, then it should be made the same for both options.
@the-david-oy that flag is not hooked up in
Edit: Looked at the GAP implementation; it looks like it affects the delays between turns in input files as well. I'm not sure of the best way to combine these fields: this field is for fixed timestamps, but it would likely want to be applied to delays as well. So maybe a new CLI option that replaces both?
Edit: Also, I am in favor of using
That's fine. I think this feature wasn't heavily used, so we can probably update any users who were using this.
Yeah, I can create a PR to get it merged to main sooner if it'd be useful. The flag already exists on main but isn't hooked up to do anything.
Fixed timestamps and delays should be mutually exclusive. The only place they wouldn't be is if we wanted to allow the first request to have a fixed timestamp + subsequent requests to have delays, which is a valid albeit niche use case that we do not support today.
The previous feedback we had was that we had too many CLI flags, so I wanted to mention this here since it seems like two flags that have significant overlap for most use cases.
Yeah, that's what I had in mind with regard to them working together.
agreed
Tip
--fixed-schedule-speedup <float>
Specifies a scaling factor to apply to the timestamps in the fixed schedule. For example, a value of 2.0 will make the schedule run twice as fast, while a value of 0.5 will make it run half as fast.