Use system time for scheduling. #549
A refactor that makes the client rely on system time instead of monotonic time for scheduling the start of an execution. This is a prelude to horizontal scalability: system clocks can be synchronized across machines.
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
dubious90
left a comment
Looks good. Just one question.
  TerminationPredicate::Status DurationTerminationPredicateImpl::evaluate() {
-   return time_source_.monotonicTime() - start_ > duration_ ? TerminationPredicate::Status::TERMINATE
-                                                             : TerminationPredicate::Status::PROCEED;
+   return time_source_.systemTime() - start_ > duration_ ? TerminationPredicate::Status::TERMINATE
This change looks good. In fact, I think it's possible this is more accurate now. Is there any chance this represents a change in behavior that we should document in the description?
Well, SystemTime isn't guaranteed to always move forward between successive readings, the way MonotonicTime is, and it may be adjusted while we are polling it. But there's only a very small window in which this can affect operation here: the delay that the main thread asks workers to wait before starting execution. That delay is computed here:
nighthawk/source/client/process_impl.cc
Line 167 in 6aa0331
Reasoning through clock updates that get applied right between our scheduling and starting of operations:
- with small updates, load generation may start a little earlier or later, no problem. Any durations that get measured for latency or execution are based on monotonic time and will not be affected.
- when the clock jumps forward a lot, worst case workers won't have sufficient time to get ready to start because the clock moved forward in time significantly, but they will observe that and complain in the logs about it. (Execution results may be noisy because workers missed their scheduled start.)
- when the clock jumps backwards a lot, workers will wait longer before starting execution. This isn't a problem, unless it's a huge leap backwards in time, in which case the wait might take a long time as well.
Also, suspend/sleep might work a little differently: I suspect that MonotonicTime may not track time spent suspended/sleeping. The second point above more or less applies here as well.
All in all, I think chances are pretty small of anyone running into trouble because of this?
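The three cases above can be illustrated with a small sketch. This is not Nighthawk's code: `StartDecision`, `classifyStart`, and `demonstrateClockJumps` are hypothetical names, and the sketch assumes a worker simply compares a scheduled wall-clock start time against its current system time reading.

```cpp
#include <cassert>
#include <chrono>

using SystemClock = std::chrono::system_clock;
using std::chrono::milliseconds;

// Hypothetical result type: how long a worker should still wait, and whether
// the scheduled start had already passed when the worker read the clock.
struct StartDecision {
  milliseconds wait;
  bool missed_schedule;
};

// Hypothetical helper: compares a scheduled wall-clock start with "now".
StartDecision classifyStart(SystemClock::time_point scheduled_start,
                            SystemClock::time_point now) {
  const auto remaining = std::chrono::duration_cast<milliseconds>(scheduled_start - now);
  if (remaining < milliseconds::zero()) {
    // The start time is already in the past: start right away and flag it.
    return {milliseconds::zero(), true};
  }
  return {remaining, false};
}

// Walks through the cases from the discussion, using a fixed instant instead
// of the real clock so the outcome is deterministic.
void demonstrateClockJumps() {
  using namespace std::chrono_literals;
  const SystemClock::time_point now{
      std::chrono::duration_cast<SystemClock::duration>(1000s)};
  const auto scheduled = now + 500ms;

  // No adjustment: the worker waits the full requested delay.
  assert(classifyStart(scheduled, now).wait == 500ms);
  // Small forward adjustment: the start happens slightly earlier.
  assert(classifyStart(scheduled, now + 10ms).wait == 490ms);
  // Large forward jump: the schedule is missed.
  assert(classifyStart(scheduled, now + 2s).missed_schedule);
  // Large backward jump: the wait grows by the size of the jump.
  assert(classifyStart(scheduled, now - 1h).wait == 1h + 500ms);
}
```

Only the start of execution depends on this comparison; durations measured on the monotonic clock are unaffected by any of these adjustments.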
Small refactor that makes the client rely on system time instead of monotonic time
for scheduling the start of worker executions. System clocks can be synchronized
across machines, and this may come in handy when we start facilitating horizontal
scaling.
Note:
SequencerImpl gets modified to re-use the execution duration that the RateLimiter it uses already tracks, instead of doing its own tracking. This is a small clean-up.
Apart from the actual switching from monotonic time to wall clock time, this should be a
mechanical change.
This change will make things easier if we would like to add an option to schedule the time at
which an execution will start.
This in turn could be useful when directing clients running on multiple machines to start, as a
means to have them start at approximately the same time.
(the approximation would mostly depend on how well the wall clock time is synchronised across
machines that are involved).
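As a rough sketch of how such a shared start time could be consumed (this is not part of the change, and `toMonotonicDeadline` is a hypothetical helper), each machine could take the agreed wall-clock start time, compute the remaining delay from a single reading of its own system clock, and then wait on its monotonic clock:

```cpp
#include <chrono>

using SystemClock = std::chrono::system_clock;
using MonotonicClock = std::chrono::steady_clock;

// Hypothetical helper: maps a wall-clock start time onto the local monotonic
// clock, given one paired reading of both clocks. A start time that is
// already in the past maps to "start immediately".
MonotonicClock::time_point toMonotonicDeadline(SystemClock::time_point scheduled_start,
                                               SystemClock::time_point system_now,
                                               MonotonicClock::time_point monotonic_now) {
  const auto delay = scheduled_start - system_now;
  if (delay <= SystemClock::duration::zero()) {
    return monotonic_now;
  }
  return monotonic_now + std::chrono::duration_cast<MonotonicClock::duration>(delay);
}
```

A worker could then call `std::this_thread::sleep_until(toMonotonicDeadline(start, SystemClock::now(), MonotonicClock::now()))`. With this approach a wall-clock adjustment only matters if it lands before the paired reading is taken.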
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>