-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do correct resource accounting if running multiple dataframe collections in the pyrunner #2493
Comments
I think #2502 should fix most of this. However, there is still a potential problem though with multiple dataframe iterators are running in parallel. Our current implementation doesn't place any hard limits on how many inflight partitions each generator can have running. That is to say, if we run To fix this, we need something like #2279 to help us limit the number of in-flight tasks per iterator. |
…pyrunner (#2502) 1. Fixes the thread pool aspect of #2493 by lifting the `thread_pool` from a local per-generator variable to a shared `self._thread_pool` in the singleton runner 2. Fixes the shared resources aspect of #2493 by lifting the `inflight_tasks_resources` and `inflight_tasks` to shared variables in the singleton runner The second part of this is a little dangerous because of the way these variables are used that can introduce a potential race condition: 1. `self._can_admit_task` checks against the `inflight_tasks_resources` to see if a task can be admitted 2. If so, we then proceed with submitting a task to the thread pool and updating `inflight_tasks_resources` If another iterator is somehow able to update `inflight_tasks_resources` between steps 1 and 2, then our resource accounting would fail. However, I think this should never happen because our generators are not pre-emptable, and I don't think this race condition is possible. --------- Co-authored-by: Jay Chia <[email protected]@users.noreply.github.com>
failure case
The text was updated successfully, but these errors were encountered: