[SPARK-19839][Core]release longArray in BytesToBytesMap#17180
[SPARK-19839][Core]release longArray in BytesToBytesMap#17180zhzhan wants to merge 2 commits intoapache:masterfrom
Conversation
|
Probably one for @JoshRosen |
|
Test build #74036 has finished for PR 17180 at commit
|
There was a problem hiding this comment.
nit: i would do this after the memory is cleaned up
There was a problem hiding this comment.
This line seems to explain the behavior of current code (before applying this PR). It would be good to explain the behavior of the new code.
There was a problem hiding this comment.
Out of curiosity, what does this short-circuiting achieve?
There was a problem hiding this comment.
AFAIK, spill is called when other consumers want to allocate in-memory pages but can't get any memory. If we are able to release ample memory by free'ing up the longArray, then the purpose of spill is achieved. Hence the short circuit.
There was a problem hiding this comment.
IIUC we can only free this pointer array if we spill its related data pages. Is this short circuit safe to do?
There was a problem hiding this comment.
@cloud-fan, I was wondering this, too: I think that this is okay because we're in a destructive iterator over the map, so we can just walk through the data pages without needing to worry about hash-style lookups. If this is the case, though, then we should just destroy the pointer array earlier (as I commented downthread on the PR).
|
If we truly don't need the |
|
Hi @zhzhan is this PR active? if so, would you answer or address the review comment? |
|
per review comments, release the longArray on destructive iterator creation. |
|
Test build #79918 has finished for PR 17180 at commit
|
|
The test failure us caused by call method on the map after We should remove this test case as it does not follow the restriction. Please let me know the feedback. Stack trace of the failure. |
|
Is it better to fix this test instead of remove it? |
|
Will fix the unit test. |
|
Test build #79989 has finished for PR 17180 at commit
|
|
retest it please. |
|
retest this please |
|
Test build #80048 has finished for PR 17180 at commit
|
|
LGTM Wait for @JoshRosen for final sign off. |
|
This seems fine to me. That said, the updated test case is a bit confusing,
but I don't think the original test was too clear to begin with. The
original test was using the `iterator()` method to make an assertion about
the internal state of the map, then was checking whether the pattern of
getting a buffer from the map, updating it, then getting it again from the
map would reflect the update. After your update to the test, the comment
doesn't quite align with the test anymore. The right way to fix this
involves splitting up the affected test into two separate tests. We can do
that in a followup, though, since I think that we still have sufficient
code coverage via other tests and this PR has already been under review for
months now so it'd be better to get it merged and move on.
…On Sat, Jul 29, 2017 at 6:19 PM Xiao Li ***@***.***> wrote:
LGTM Wait for @JoshRosen <https://github.com/joshrosen> for final sign
off.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#17180 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AADGPLGZbKkiLGK6qsrunDEpRj_Vhs0mks5sS9ojgaJpZM4MUoYt>
.
|
|
Thanks! Merging to master. @zhzhan Could you address the comments about the test case in the follow-up PR? Thanks! |
What changes were proposed in this pull request?
When BytesToBytesMap spills, its longArray should be released. Otherwise, it may not released until the task complete. This array may take a significant amount of memory, which cannot be used by later operator, such as UnsafeShuffleExternalSorter, resulting in more frequent spill in sorter. This patch release the array as destructive iterator will not use this array anymore.
How was this patch tested?
Manual test in production