[SPARK-21319][SQL] Fix memory leak in sorter #18679
@Override?
Since SortComparator does not extend RecordComparator, @Override may not be necessary.
LGTM
I'm not quite convinced this works in the way we want if spilling occurs.
I would propose copying the comparator before giving it to anything that might use this, so that we never hold onto a comparator that someone might have left dirty.
Test build #79746 has finished for PR 18679 at commit
Since this PR definitely fixes the non-spilling case, I'm going to merge it and @j-baker can continue his work to fix the spilling case (if it's problematic).
But the spilling case is the issue?
I think that #18543 (post rewrite) is a better approach than this - instead of retaining a memory leak, we remove the opportunity for one to appear.
I'm not convinced the spilling case is the issue. Can you verify that your workload still OOMs after this PR?
So in my build, I definitely have spilling (quite a lot of it). If I have spilling, then the last thing that uses the comparator will definitely be the merge sorter. The last thing that uses the comparator is the thing that causes the memory leak.
Oh, or are you saying that because in the external case most of the buffers stay on disk, you wouldn't see this issue because there's not enough stuff in memory? That'd make sense. This seems fine to merge - not sure if you prefer the approach in my PR (make sure the comparator exists only in ephemeral objects).
I left a comment in #18543 (comment) your concern is valid, in the case of There are 3 possible fixes:
cc @hvanhovell
Option 3 is not doable, so this PR goes with option 2.
cc @ueshin
Test build #79971 has finished for PR 18679 at commit
ueshin left a comment
LGTM except for 1 comment.
    this, taskMemoryManager, recordComparator, prefixComparator, initialSize, canUseRadixSort);
    this,
    taskMemoryManager,
    recordComparatorSupplier.get(),
Seems like we need to null check for recordComparatorSupplier.
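One way the suggested null check could look is sketched below. This is a hedged illustration, not the code that was merged: the class `NullSafeSupplierSketch` and the helper `comparatorOrNull` are hypothetical names, and `java.util.Comparator` stands in for Spark's `RecordComparator`. The only assumption taken from the thread is that `recordComparatorSupplier` may be null and that `get()` must not be called on it in that case.

```java
import java.util.Comparator;
import java.util.function.Supplier;

public class NullSafeSupplierSketch {

  /**
   * Hypothetical helper: returns a fresh comparator from the supplier,
   * or null when no supplier was configured.
   */
  static <T> Comparator<T> comparatorOrNull(Supplier<Comparator<T>> recordComparatorSupplier) {
    Comparator<T> comparator = null;
    if (recordComparatorSupplier != null) {
      // Only dereference the supplier when it is actually present.
      comparator = recordComparatorSupplier.get();
    }
    return comparator;
  }

  public static void main(String[] args) {
    Comparator<Integer> none = comparatorOrNull(null);
    Comparator<Integer> natural = comparatorOrNull(() -> Comparator.<Integer>naturalOrder());
    System.out.println(none == null);               // prints "true"
    System.out.println(natural.compare(1, 2) < 0);  // prints "true"
  }
}
```

The result of such a helper would then be passed where the diff above calls `recordComparatorSupplier.get()` directly.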
Test build #79994 has finished for PR 18679 at commit
retest this please
LGTM - pending jenkins
Test build #80000 has finished for PR 18679 at commit
thanks for the review, merging to master!
What changes were proposed in this pull request?
UnsafeExternalSorter.recordComparator can be either KVComparator or RowComparator, and both of them keep a reference to the input rows they compared last.
After sorting, we return the sorted iterator to upstream operators. However, the upstream operators may take a while to consume the sorted iterator, and UnsafeExternalSorter is registered to TaskContext, which means we keep the UnsafeExternalSorter instance, and with it the last compared input rows, in memory until the sorted iterator is fully consumed.
Things get worse if we sort within partitions of a dataset and coalesce all partitions into one, as we then keep a lot of input rows in memory and it takes a long time to consume all the sorted iterators.
This PR takes over #18543. The idea is that we do not keep the record comparator instance in UnsafeExternalSorter, but a generator of record comparators.
close #18543
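The idea described above can be illustrated with a minimal, self-contained sketch. It is not Spark's code: `ComparatorLeakSketch`, `StatefulComparator`, and `Sorter` are hypothetical stand-ins, `long[]` plays the role of a row, and `java.util.function.Supplier` is assumed as the "generator" type. The sketch only shows why storing a supplier, rather than a comparator instance, stops a long-lived sorter from pinning the rows that were compared last.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.Supplier;

public class ComparatorLeakSketch {

  /** Hypothetical stand-in for RowComparator: it remembers the last rows it compared. */
  static final class StatefulComparator implements Comparator<long[]> {
    long[] lastLeft;   // these fields keep the last compared rows reachable
    long[] lastRight;

    @Override
    public int compare(long[] a, long[] b) {
      lastLeft = a;
      lastRight = b;
      return Long.compare(a[0], b[0]);
    }
  }

  /** Hypothetical sorter that stores a comparator factory, never a comparator instance. */
  static final class Sorter {
    private final Supplier<StatefulComparator> comparatorSupplier;

    Sorter(Supplier<StatefulComparator> comparatorSupplier) {
      this.comparatorSupplier = comparatorSupplier;
    }

    long[][] sort(long[][] rows) {
      // The comparator is a local variable: once sort() returns, the sorter
      // holds no reference to it or to the rows it compared last.
      StatefulComparator comparator = comparatorSupplier.get();
      long[][] copy = rows.clone();
      Arrays.sort(copy, comparator);
      return copy;
    }
  }

  public static void main(String[] args) {
    Sorter sorter = new Sorter(StatefulComparator::new);
    long[][] sorted = sorter.sort(new long[][] {{3L}, {1L}, {2L}});
    System.out.println(sorted[0][0] + " " + sorted[1][0] + " " + sorted[2][0]); // prints "1 2 3"
  }
}
```

In this sketch the sorter can stay registered somewhere long-lived without retaining any row, because the only field it owns is the supplier.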
How was this patch tested?
N/A