[SPARK-31226][CORE][TESTS] SizeBasedCoalesce logic will lose partition #27988
ok to test
Sorry for visiting here late, @AngersZhuuuu.
Test build #120689 has finished for PR 27988 at commit
    index += 1
    if (index == partitions.length) {
      updateGroups()
    }
BTW, the previous code was added at 2.0.0. Did you hit a test flakiness due to this?
Yes. I have been testing a way to fix Spark's small output file problem that uses this logic, and during testing I found that the logic is wrong.
#27248
Retest this please.
Test build #120690 has finished for PR 27988 at commit
Changed
@dongjoon-hyun Is there anything more that needs to be added for this?
      addPartition(partition, splitSize)
      index += 1
    } else {
      updateGroups
can you point out which line/lines cause the bug?
The PR description shows the case in which this bug happens. The error is not caused by one particular line or a few lines.
I changed the description to make it clearer.
    }
    index += 1
    if (index == partitions.length) {
      updateGroups()
nit: can we do this after the loop exit? There is no early stop in this loop.
Sure, updated. The last action before the loop exits is always addPartition, so currentGroup is not empty at that point. We don't need to check whether currentGroup is empty and can just call updateGroups.
Test build #125580 has finished for PR 27988 at commit
retest this please
Test build #125615 has finished for PR 27988 at commit
@cloud-fan The failed test is not related since only test
retest this please
    updateGroups
    updateGroups()
    addPartition(partition, splitSize)
    }
Instead of the following,

    if (currentGroup.partitions.isEmpty) {
      addPartition(partition, splitSize)
    } else {
      updateGroups()
      addPartition(partition, splitSize)
    }

the following will be better.

    if (currentGroup.partitions.nonEmpty) {
      updateGroups()
    }
    addPartition(partition, splitSize)
Updated, thanks
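For readers following the review, here is a minimal sketch of the loop shape the comments above converge on: flush the current group only when it is non-empty, always add the partition, and flush once more after the loop. It is not the code in the PR. `FixedCoalesceModel`, the use of plain `Long` split sizes in place of Spark partitions, and the empty-input guard on the final flush are all assumptions made for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model: each partition is represented only by its split size.
object FixedCoalesceModel {
  def coalesce(sizes: Seq[Long], maxSize: Long): List[List[Long]] = {
    val groups = ArrayBuffer[List[Long]]()
    var current = ArrayBuffer[Long]()
    var currentSum = 0L

    // Close the current group and start a fresh one.
    def updateGroups(): Unit = {
      groups += current.toList
      current = ArrayBuffer[Long]()
      currentSum = 0L
    }

    def addPartition(size: Long): Unit = {
      current += size
      currentSum += size
    }

    var index = 0
    while (index < sizes.length) {
      val size = sizes(index)
      if (currentSum + size < maxSize) {
        addPartition(size)
      } else {
        // Flush only if there is something to flush, then always add.
        if (current.nonEmpty) {
          updateGroups()
        }
        addPartition(size)
      }
      index += 1
    }
    // Final flush after the loop. The nonEmpty guard is an addition in this
    // sketch so that empty input produces no groups.
    if (current.nonEmpty) {
      updateGroups()
    }
    groups.toList
  }
}
```

With sizes `10, 20, 200` and `maxSize = 100`, this sketch keeps the oversized last split in a group of its own instead of dropping it.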
dongjoon-hyun
left a comment
+1, LGTM (except one minor refactoring comment)
Test build #125650 has finished for PR 27988 at commit
Test build #125653 has finished for PR 27988 at commit
retest this please
Test build #125665 has finished for PR 27988 at commit
retest this please
Test build #125670 has finished for PR 27988 at commit
dongjoon-hyun
left a comment
+1, LGTM. Merged to master. Thanks, @AngersZhuuuu .
What changes were proposed in this pull request?
When the split size of the last partition's file split is larger than maxSize, that partition is lost.
The error in the original logic is shown below as steps 1, 2, 3, 4, 5
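The failure mode can be reproduced with a simplified model. The sketch below is reconstructed from the diff fragments quoted in the review and is not the original Spark test code. `OldCoalesceModel` and the use of plain `Long` split sizes in place of partitions are assumptions for illustration. The point it shows is that the only flush of the final group sits in the "fits under maxSize" branch, so an oversized last split is added to a group that is never emitted.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model of the pre-fix loop: partitions are just split sizes.
object OldCoalesceModel {
  def coalesce(sizes: Seq[Long], maxSize: Long): List[List[Long]] = {
    val groups = ArrayBuffer[List[Long]]()
    var current = ArrayBuffer[Long]()
    var currentSum = 0L
    var index = 0

    // Close the current group and start a fresh one.
    def updateGroups(): Unit = {
      groups += current.toList
      current = ArrayBuffer[Long]()
      currentSum = 0L
    }

    def addPartition(size: Long): Unit = {
      current += size
      currentSum += size
    }

    while (index < sizes.length) {
      val size = sizes(index)
      if (currentSum + size < maxSize) {
        addPartition(size)
        index += 1
        // The final group is flushed only on this branch.
        if (index == sizes.length) {
          updateGroups()
        }
      } else if (current.isEmpty) {
        // An oversized split starts its own group, but if it is the last
        // partition the loop exits and the group is never flushed.
        addPartition(size)
        index += 1
      } else {
        // Flush and retry the same partition on the next iteration.
        updateGroups()
      }
    }
    groups.toList
  }
}
```

With sizes `10, 20, 200` and `maxSize = 100`, the model emits a single group containing `10` and `20`; the `200` split is added to a new group and then dropped when the loop ends.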
Why are the changes needed?
To fix the bug described above.
Does this PR introduce any user-facing change?
NO
How was this patch tested?
Manual code review.