[prism] Smarter "globally" aware dynamic splits. #32538

Open
Tracked by #29650
lostluck opened this issue Sep 23, 2024 · 0 comments

User issue #32498 once again showed Prism's fairly aggressive splitting strategy being negatively affected by user latency. In particular, Prism starts splitting aggressively when there hasn't been any measurable progress, and continues to do so.

That was resolved by #32526, where the split approach adds per-stage progress interval tracking, to at least consolidate the split strategy within a given stage (e.g. to avoid making the same mistake twice). It applies a multiplicative increase to the interval if a split needs to happen (up to a maximum), and an additive decrease if no progress ticks resolve during a bundle (down to a minimum).
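
The interval adjustment described above can be sketched roughly as follows. This is a minimal illustration, not the code from #32526: the type name `stageInterval`, the method names, the doubling factor, the step size, and both bounds are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical bounds and step size; the values Prism uses may differ.
const (
	minInterval  = 100 * time.Millisecond
	maxInterval  = 30 * time.Second
	decreaseStep = 100 * time.Millisecond
)

// stageInterval tracks the progress interval for a single stage.
type stageInterval struct {
	cur time.Duration
}

// onSplit grows the interval multiplicatively, capped at maxInterval.
func (s *stageInterval) onSplit() {
	s.cur *= 2
	if s.cur > maxInterval {
		s.cur = maxInterval
	}
}

// onNoTicks shrinks the interval additively, floored at minInterval.
func (s *stageInterval) onNoTicks() {
	s.cur -= decreaseStep
	if s.cur < minInterval {
		s.cur = minInterval
	}
}

func main() {
	s := &stageInterval{cur: minInterval}
	s.onSplit()
	s.onSplit()
	fmt.Println(s.cur) // 400ms
	s.onNoTicks()
	fmt.Println(s.cur) // 300ms
}
```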

Four problems remain with this approach:

  1. Because of that maximum, there's still a possibility that a bundle that takes longer than that maximum to produce a first element will oversplit, exacerbating the problem.
  2. Since the progress and split intervals are aligned, we reduce the opportunity to receive any signal.
  3. We may oversplit even though we've reached our parallelism cap for execution.
  4. We only split whenever there's been no progress, not when there's slow progress.

An ideal algorithm would not experience these problems.

The solution for 1 and 3 is the same: plumb in an awareness of the "parallel capacity" for the job, and only split when there's spare capacity. This is closer to how Dataflow approaches splitting. It requires exposing the job execution state in a form that the progress loop can examine on each iteration, e.g. a channel or some atomic object, so the loop knows whether it is even permitted to split.
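
One possible shape for the "atomic object" option is a shared counter of free bundle slots that each progress loop tries to claim before splitting. The sketch below is an assumption about how this could look, not existing Prism code; `capacityGate`, `tryReserve`, and `release` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// capacityGate is a hypothetical job-level object shared by all progress loops.
type capacityGate struct {
	free atomic.Int64 // bundle slots not currently claimed
}

func newCapacityGate(slots int64) *capacityGate {
	g := &capacityGate{}
	g.free.Store(slots)
	return g
}

// tryReserve claims one slot for a split residual and reports whether it
// succeeded. The compare-and-swap loop keeps two stages from claiming the
// same slot.
func (g *capacityGate) tryReserve() bool {
	for {
		cur := g.free.Load()
		if cur <= 0 {
			return false
		}
		if g.free.CompareAndSwap(cur, cur-1) {
			return true
		}
	}
}

// release returns a slot once the bundle that used it finishes.
func (g *capacityGate) release() {
	g.free.Add(1)
}

func main() {
	g := newCapacityGate(1)
	fmt.Println(g.tryReserve()) // true: the only slot is claimed
	fmt.Println(g.tryReserve()) // false: no spare capacity, so do not split
	g.release()
	fmt.Println(g.tryReserve()) // true: a slot is free again
}
```

Reserving at decision time, rather than only reading the counter, is what prevents several bundles from splitting against the same spare slot.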

2 is solved by separating the progress and split intervals, such as by having independent tickers that fire independently. This would look about the same as it currently does, but with the progress interval ticker for the stage hitting a much smaller maximum value, and the split interval ticker reaching a higher one. The split interval must be >= the progress interval.
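
As a rough sketch of the two-ticker idea, the loop below runs a fast progress ticker alongside a slower split ticker and clamps the split interval so it is never shorter than the progress interval. The function names and the demo durations are assumptions made for illustration; real intervals would be much longer and adjusted per stage.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical demo values, far shorter than anything a runner would use.
const (
	demoProgress = 5 * time.Millisecond
	demoSplit    = 20 * time.Millisecond
	demoRun      = 100 * time.Millisecond
)

// clampSplit enforces the invariant that the split interval is never
// shorter than the progress interval.
func clampSplit(progress, split time.Duration) time.Duration {
	if split < progress {
		return progress
	}
	return split
}

// countTicks runs both tickers for the given duration and reports how
// often each one fired.
func countTicks(progress, split, run time.Duration) (progressTicks, splitTicks int) {
	pt := time.NewTicker(progress)
	defer pt.Stop()
	st := time.NewTicker(clampSplit(progress, split))
	defer st.Stop()
	done := time.After(run)
	for {
		select {
		case <-pt.C:
			progressTicks++ // a runner would fetch SDK metrics here
		case <-st.C:
			splitTicks++ // a runner would consider a split here
		case <-done:
			return progressTicks, splitTicks
		}
	}
}

func main() {
	p, s := countTicks(demoProgress, demoSplit, demoRun)
	fmt.Printf("progress ticks: %d, split ticks: %d\n", p, s)
}
```

The exact counts depend on scheduling, but the progress ticker should fire several times for each split tick, which is the extra signal the aligned intervals were missing.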

4 is solved by making the "slowness" detection more opinionated, and taking advantage of the "stage level state" that we're beginning to accumulate. We can calculate the average rate of progress (input elements, and total elements) for the stage, determine the approximate time to first element, and use that information to make better choices. If we have an estimate of time to first element, then we don't aggressively split until some margin beyond that estimate has passed. We only split if the determined rate is slower than the bounds in question.
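
A minimal sketch of such a slowness check, assuming the stage keeps running totals from completed bundles. Everything here is hypothetical: the `stageStats` type, the `slack` factor of 2, and the fallback to the existing "no progress" rule when there is no history yet.

```go
package main

import (
	"fmt"
	"time"
)

// slack is a hypothetical tolerance factor: a bundle must be this many
// times worse than the stage average before it counts as slow.
const slack = 2

// stageStats accumulates history from completed bundles of one stage.
type stageStats struct {
	bundles   int
	firstElem time.Duration // summed time to first element
	elapsed   time.Duration // summed bundle durations
	elems     int64         // summed elements processed
}

func (s *stageStats) record(firstElem, elapsed time.Duration, elems int64) {
	s.bundles++
	s.firstElem += firstElem
	s.elapsed += elapsed
	s.elems += elems
}

// avgFirstElem is the mean time to first element across recorded bundles.
func (s *stageStats) avgFirstElem() time.Duration {
	if s.bundles == 0 {
		return 0
	}
	return s.firstElem / time.Duration(s.bundles)
}

// avgRate is the mean throughput in elements per second.
func (s *stageStats) avgRate() float64 {
	if s.elapsed <= 0 {
		return 0
	}
	return float64(s.elems) / s.elapsed.Seconds()
}

// looksSlow reports whether a running bundle is behind the stage's history.
func (s *stageStats) looksSlow(elapsed time.Duration, elems int64) bool {
	if s.bundles == 0 {
		return elems == 0 // no history yet: fall back to "no progress"
	}
	if elems == 0 {
		return elapsed > slack*s.avgFirstElem()
	}
	if elapsed <= 0 {
		return false
	}
	return float64(elems)/elapsed.Seconds() < s.avgRate()/slack
}

// demoStats returns history from one made-up bundle: first element after
// 2s, then 100 elements in 10s overall.
func demoStats() *stageStats {
	s := &stageStats{}
	s.record(2*time.Second, 10*time.Second, 100)
	return s
}

func main() {
	s := demoStats()
	fmt.Println(s.looksSlow(3*time.Second, 0))  // false: within 2x the usual 2s wait
	fmt.Println(s.looksSlow(5*time.Second, 0))  // true: well past the usual wait
	fmt.Println(s.looksSlow(4*time.Second, 10)) // true: 2.5 elem/s against a 10 elem/s average
}
```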

The goal should also still be to allow for eager splits relatively quickly if there's capacity, so Separation Harness style tests (where splitting is forced) are able to execute.

Loosely, this would look like:

  • At each Progress Interval
    • Get tentative metrics from the SDK.
    • Contribute metrics to store.
    • Update stage state estimated rates, and projected completion of bundle.
    • Split if and only if
      • The split interval has passed (clearing the ticker tick) AND
      • There is a reason to split (slowness, lack of progress) AND
      • There is available global capacity to process the split bundle (likely marking somehow that this capacity will be used to avoid multiple bundles splitting based on the same capacity).
    • Adjust split and progress intervals accordingly.
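
The "split if and only if" step in the outline above could be sketched as a small decision function. This is an illustration under assumptions: `decide`, the `splitDecision` values, and the `tryReserve` callback are hypothetical names, and an empty `reason` stands in for "no reason to split".

```go
package main

import "fmt"

// splitDecision is a hypothetical result type for one progress tick.
type splitDecision int

const (
	noSplit       splitDecision = iota // nothing to do this tick
	splitDeferred                      // wanted to split, but no capacity
	doSplit                            // all three conditions hold
)

// decide applies the three conditions in order. Capacity is reserved last,
// so a slot is only claimed when a split will actually be attempted.
func decide(splitDue bool, reason string, tryReserve func() bool) splitDecision {
	if !splitDue || reason == "" {
		return noSplit
	}
	if !tryReserve() {
		return splitDeferred
	}
	return doSplit
}

func main() {
	slots := 1 // stand-in for the job's spare capacity
	reserve := func() bool {
		if slots == 0 {
			return false
		}
		slots--
		return true
	}
	fmt.Println(decide(true, "no progress", reserve) == doSplit)         // true
	fmt.Println(decide(true, "slow progress", reserve) == splitDeferred) // true: the slot is already used
	fmt.Println(decide(false, "no progress", reserve) == noSplit)        // true: split interval not reached
}
```

The `splitDeferred` case is the natural place to emit a debug log saying a split was wanted but no capacity was available.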

Note that due to how we execute bundles, for a given parallelism of N we have N goroutines executing bundles, plus 1 bundle blocked in the execution time thread, plus 1 bundle in the unbuffered channel ready to be executed. The global signal to allow splits would need to take this into account.

We would probably also add some debug split logging to Prism along with this work (e.g. "Desired to split for [reason] but no capacity to process new bundles at the moment.").

lostluck changed the title from [prism] Smarter "globally" aware splits. to [prism] Smarter "globally" aware dynamic splits. on Sep 23, 2024