SPARK-22345: Fix sort-merge joins with conditions and codegen. #19568
Hi, @rdblue .
Since this is a correctness issue, can we have a test case that includes the incorrect result you encountered?
Test build #83023 has finished for PR 19568 at commit
@dongjoon-hyun, yes, I'm currently working on it. I just wanted to get the rest up.
Since we generate a Java program, is it necessary to declare the JoinedRow type for $joinedRow?
Yeah, I think we need it.
Yeah, that's causing the test failures. This is a typo from some restructuring I did to get this upstream.
Fixed.
Could you please change title from
I think we only need to do this when there is CodegenFallback in the condition expressions.
The joined row should always be used for correctness. We don't know what code the expression will generate, so we should plan on always passing the correct input row. Setting left and right on a joined row is a cheap operation, so I'd rather do it correctly than rely on something brittle like isInstanceOf[CodegenFallback].
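To illustrate the point about always passing the correct input row, here is a minimal, self-contained Java sketch. It does not use Spark's classes: `Row`, `ArrayRow`, `JoinedRowSketch`, and `condition` are hypothetical stand-ins, and the condition `left.value > 5` and the two-column schemas are assumptions chosen for the example. It only shows why a condition bound against the joined schema reads the wrong column when it is handed the right row alone.

```java
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
class JoinedRowDemo {

    /** Minimal row abstraction: fields are read by ordinal. */
    interface Row {
        int numFields();
        int getInt(int ordinal);
    }

    /** A row backed by an int array. */
    static final class ArrayRow implements Row {
        private final int[] values;
        ArrayRow(int... values) { this.values = values; }
        public int numFields() { return values.length; }
        public int getInt(int ordinal) { return values[ordinal]; }
    }

    /** Presents a left row followed by a right row as one wider row. */
    static final class JoinedRowSketch implements Row {
        private Row left;
        private Row right;

        // Re-pointing two references is cheap, so one instance can be reused per pair.
        JoinedRowSketch withLeftRight(Row left, Row right) {
            this.left = left;
            this.right = right;
            return this;
        }

        public int numFields() { return left.numFields() + right.numFields(); }

        public int getInt(int ordinal) {
            int n = left.numFields();
            return ordinal < n ? left.getInt(ordinal) : right.getInt(ordinal - n);
        }
    }

    // Assumed join condition "left.value > 5", bound against the joined schema
    // (left.key, left.value, right.key, right.value), so it reads ordinal 1.
    static boolean condition(Row input) {
        return input.getInt(1) > 5;
    }

    public static void main(String[] args) {
        Row left = new ArrayRow(1, 10);
        Row right = new ArrayRow(1, 2);
        Row joined = new JoinedRowSketch().withLeftRight(left, right);

        System.out.println(condition(joined)); // ordinal 1 is left.value (10)
        System.out.println(condition(right));  // ordinal 1 is right.value (2)
    }
}
```

In this sketch the first call prints `true` and the second prints `false`: the same condition silently gives a different answer when it is evaluated against the right row instead of the joined row, and no exception is raised.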
It ended up being a bit more complicated. There are two problems (in 2.0.0 and 2.1.1 at least). The first is what this fixes, which is that the INPUT_ROW in the codegen context points to the wrong row. This is fixed and now has a test that fails if you uncomment the line that sets INPUT_ROW.
The second problem is that the check for CodegenFallback fails to detect whether the condition supports codegen in some plans. To get the test to fail, I had to add a projection to exercise the path where this happens. I'll add a second commit for this problem.
The second problem was fixed in this commit: 6b6dd68
I still think that the codegen problem should be fixed. Detecting CodegenFallback is imperfect, and when detection misses, Spark will still generate code and run it. I think we should either remove codegen from CodegenFallback or add this fix to ensure that the code works, even if we don't expect to run it.
We should also leave a comment explaining why we add a JoinedRow as INPUT_ROW.
Fixed.
Code for the condition was generated to depend on the right row instead of the joined row.
4afb088 to 3431778
Test build #83056 has finished for PR 19568 at commit
Test build #83090 has finished for PR 19568 at commit
)
)

testInnerJoin(
It can still pass without the changes in this PR. What is the purpose of this test case?
This test fails in 2.1.1 and versions before 6b6dd68. I'm not sure how to exercise the code generated by CodegenFallback with that fix, but this test is valid for the 2.1.1 branch.
  leftPlan, rightPlan)
- EnsureRequirements(spark.sessionState.conf).apply(sortMergeJoin)
+ EnsureRequirements(spark.sessionState.conf)
+   .apply(ProjectExec(sortMergeJoin.output, sortMergeJoin))
Why do we need to change this?
In 2.1.1, an extra project causes WholeStageCodegenExec to not detect that the expression contains CodegenFallback. This is no longer the case. Like I said, there is no longer a good way to test what happens when CodegenFallback generates code. If there were, I'd use that here to test the case.
I guess I could add a testing case to WholeStageCodegenExec to make sure the code is generated correctly.
This PR is similar to the initial commit from when I tried to fix SPARK-21441 in #18656 (92dc106).
@DonnyZone, I don't know of any cases that use codegen after the fix for If Spark is going to generate code, it should generate correct code. That means either we remove the codegen implementation from
@rdblue Yes, the current implementation implicitly assumes the rule
@gatorsmile, I think it would be better to fix codegen than to prevent it from happening with an assertion. If
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
This adds a joined row to sort-merge join codegen. That joined row is used to generate code for filter expressions, which may fall back to using the result row. Previously, the right side of the join was used, which is incorrect (the non-codegen implementations use a joined row).
How was this patch tested?
Current tests.