[Shuffle] isolate mappers in different subtasks for fetch_by_index mode #3239
LGTM. We can find a better way to avoid multiple graph iterations in the future, e.g. normalize shuffle operands to remove many special cases.
LGTM. This may cause more subtasks. Maybe we can optimize graph later to reduce subtasks.
…de (mars-project#3239) * ensure every subtask contains only at most one mapper * enable test_align_execution * fix isolate mappers for `DataFrameIndexAlign` which are not mappers (cherry picked from commit ee152b6)
What do these changes do?
Mars's coloring-based fusion algorithm may fuse mappers of different shuffles into one subtask, which conflicts with the `ShuffleFetchType.FETCH_BY_INDEX` shuffle block resolution mode. In this mode, shuffle data is resolved by index, so if one subtask contains mappers of different shuffles, the resolution does not work.

One solution is to add a second-level index to `ShuffleManager` to work around this. However, that is complicated and not compatible with push-based shuffle or with other shuffle systems such as CloudShuffleService, alibaba/RemoteShuffleService, and Firestorm.

Considering that very few shuffle operands are fused into the same subtask (currently only `DataFrameIndexAlign` mappers of different shuffles are fused into one subtask), we can just change the fusion algorithm so that those mappers are isolated from each other in different subtasks. For other shuffle operands, there won't be any change in the generated subtask graph.

![image](https://user-images.githubusercontent.com/12445254/188384951-2d2b2e11-133f-4140-8845-3e6fafd2a2fa.png)
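
To make the idea concrete, here is a minimal sketch of the isolation step on a toy coloring. It is not the code from this PR: `Chunk`, `isolate_mappers`, and `group_by_color` are hypothetical names, and the sketch assumes the fusion result can be modeled as a plain mapping from chunk to color, where chunks sharing a color end up in the same subtask.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a chunk in the graph (not a Mars class)."""

    key: str
    shuffle_id: Optional[str] = None  # set only for shuffle mappers

    @property
    def is_mapper(self) -> bool:
        return self.shuffle_id is not None


def isolate_mappers(colors: Dict[Chunk, int]) -> Dict[Chunk, int]:
    """Return a new coloring in which each color holds at most one mapper.

    The first mapper seen for a color keeps that color; every later mapper
    with the same color is moved to a fresh, unused color. Non-mapper chunks
    are left untouched.
    """
    fresh_colors = count(max(colors.values(), default=-1) + 1)
    colors_with_mapper: Set[int] = set()
    result: Dict[Chunk, int] = {}
    for chunk, color in colors.items():
        if chunk.is_mapper:
            if color in colors_with_mapper:
                color = next(fresh_colors)
            else:
                colors_with_mapper.add(color)
        result[chunk] = color
    return result


def group_by_color(colors: Dict[Chunk, int]) -> Dict[int, List[str]]:
    """Group chunk keys by color, i.e. by the subtask they would land in."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for chunk, color in colors.items():
        groups[color].append(chunk.key)
    return dict(groups)


# Two align mappers that belong to different shuffles were fused together.
source = Chunk("source")
align_a = Chunk("align_a", shuffle_id="shuffle_1")
align_b = Chunk("align_b", shuffle_id="shuffle_2")

fused = {source: 0, align_a: 0, align_b: 0}
isolated = isolate_mappers(fused)
subtasks = group_by_color(isolated)
```

In this toy case `align_b` is moved to its own color, so each resulting group contains at most one mapper, while a coloring that has no colliding mappers is returned unchanged.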
Related issue number
Fixes #3226
#2719
#2916
Check code requirements