Replies: 1 comment
-
I'm making some progress: https://dask.discourse.group/t/understanding-work-stealing/335/2
-
Cross-posted from "Understanding Work Stealing"; sorry, I don't know the best place to ask.
I’m trying to understand work stealing. The plan is to run workers in different datacenters while preventing all but essential transfers between them.
I expected to be able to prevent stealing on a case-by-case basis by overriding _can_steal in stealing.py.
I set up an experiment in which a subset of workers load data and then run some calculations on it. With DASK_DISTRIBUTED__SCHEDULER__WORK_STEALING="False", the calculations run only on the workers that loaded the data. But with _can_steal overridden to always return False, tasks are still dispatched to other workers.
Am I misunderstanding data locality and work stealing?
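For readers reproducing the first half of the experiment, here is a minimal sketch of how the environment variable above maps onto a Dask config key. It is an illustration of the documented naming convention (prefix dropped, lowercased, double underscore as nesting, value parsed as a Python literal), not Dask's own code; the helper name env_to_config is hypothetical. The same setting can presumably be applied from Python with dask.config.set({"distributed.scheduler.work-stealing": False}) before the scheduler starts.

```python
import ast


def env_to_config(name, value):
    """Hypothetical helper: mimic how a DASK_* variable maps to a config key."""
    if not name.startswith("DASK_"):
        raise ValueError("not a Dask environment variable")
    # Drop the prefix, lowercase, and treat double underscores as nesting.
    parts = name[len("DASK_"):].lower().split("__")
    # "_" and "-" are interchangeable in Dask keys; use "-" as the YAML files do.
    key = ".".join(p.replace("_", "-") for p in parts)
    # Values are read as Python literals where possible, else kept as strings.
    try:
        parsed = ast.literal_eval(value)
    except (SyntaxError, ValueError):
        parsed = value
    return key, parsed


key, parsed = env_to_config("DASK_DISTRIBUTED__SCHEDULER__WORK_STEALING", "False")
print(key, parsed)
```

Note that this only covers the config switch; overriding _can_steal is a code change inside stealing.py and has no equivalent setting in this sketch.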