-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-2125] Add sort flag and move sort into shuffle implementations #1210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Merged build triggered. |
|
Merged build started. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not take up a lot of memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's true. But I will not change the original implementation, since PR931 will solve this issue.
|
Merged build finished. |
|
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16113/ |
|
Hi @mateiz, mind taking a look at this PR, thanks a lot. |
|
Sorry for the delay, Saisai -- will take a look soon. I've been away traveling this week. |
|
I'm not sure is that what you want, so I hope you can review it and give me come comments. Thanks a lot. |
|
This looks pretty good to me API-wise, but an Option[Boolean] is kind of confusing. Maybe we should have an enumeration called SortOrder and pass in an Option[SortOrder]. Also, when this value is set, please add a check that keyOrdering is also set. Otherwise the user made an error in configuring the ShuffleDependency but we'll just return unsorted data. We should also wait on #931 to be merged and then base this on that. |
|
Hi Matei, thanks a lot for your review, I will change the code according to your comments. |
|
@jerryshao after looking more at #931, I'd actually like to hold off on merging that the way it's set up, so would you mind updating this now? I can merge this as is (without external sort) and then we can add external sort later. |
|
Hi Matei, I've updated the code according to your comments, would you please review this change? Thanks a lot. |
|
QA tests have started for PR 1210. This patch merges cleanly. |
|
QA results for PR 1210: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be private[spark] or at least @DeveloperApi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually probably private[spark] works for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I will add it.
|
@ScrapCodes @pwendell there's a MIMA error on this that seems spurious: it complains that |
|
Shall we add this to the MimaExcludes? |
|
QA tests have started for PR 1210. This patch merges cleanly. |
|
QA results for PR 1210: |
|
Sorry for the delay -- yeah, try adding that for now. |
|
Hi Matei, should I wait until your sort-based shuffle is merged into master branch, so I can change the current in memory sort to external sort? |
|
QA tests have started for PR 1210. This patch DID NOT merge cleanly! |
|
QA results for PR 1210: |
|
QA tests have started for PR 1210. This patch merges cleanly. |
|
QA results for PR 1210: |
|
@jerryshao I'll just merge this as is for now, seems simpler. |
This patch adds a sort flag into ShuffleDependecy and moves sort into hash shuffle implementation. Moving sort into shuffle implementation can give space for other shuffle implementations (like sort-based shuffle) to better optimize sort through shuffle. Author: jerryshao <[email protected]> Closes apache#1210 from jerryshao/SPARK-2125 and squashes the following commits: 2feaf7b [jerryshao] revert MimaExcludes ceddf75 [jerryshao] add MimaExeclude f674ff4 [jerryshao] Add missing Scope restriction b9fe0dd [jerryshao] Fix some style issues according to comments ef6b729 [jerryshao] Change sort flag into Option 3f6eeed [jerryshao] Fix issues related to unit test 2f552a5 [jerryshao] Minor changes about naming and order c92a281 [jerryshao] Move sort into shuffle implementations
This patch adds a sort flag into ShuffleDependecy and moves sort into hash shuffle implementation.
Moving sort into shuffle implementation can give space for other shuffle implementations (like sort-based shuffle) to better optimize sort through shuffle.