-
Notifications
You must be signed in to change notification settings - Fork 1.4k
fix: Fix smj result mismatch issue in semi, anit and full outer join #11771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for meta-velox canceled.
|
|
@pedroerp @xiaoxmeng Can you help to review this PR? Thanks. |
c598eef to
77d2d90
Compare
4e3de7f to
8032743
Compare
velox/exec/MergeJoin.cpp
Outdated
|
|
||
| if (leftMatch_ && !previousLeftMatch_) { | ||
| joinTracker_->noMoreFilterResults(onMiss); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you comment this?
we found a bug due to this code @JkSelf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fzhedu It aims to fix the MergeJoinTest#antiJoinWithFilterWithMultiMatchedRowsInDifferentBatches test. Although leftMatch is not null, it pertains to the next batch rather than the current one. Therefore, we also need to call noMoreFilterResults to update the status of the current batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JkSelf here exists a bug: when leftMatch is set a new one, but the output is only contains results from previous leftMatch when compare results 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fzhedu Sorry for delay response. Can you help to provide the unit tests that can be reproduced? Thanks.
5b1a17b to
6697bdc
Compare
03dafb6 to
ba7046e
Compare
|
@JkSelf the unit tests for merge join are not compiling, could you please check? |
|
@zhouyuan I have fixed the failed unit tests. |
291396e to
6bdbe13
Compare
|
@xiaoxmeng gentle ping |
|
@xiaoxmeng gentle ping |
|
@xiaoxmeng would you please help to take a look? |
cf4c2b1 to
a066a7f
Compare
|
@pedroerp can you find someone to review the PR? The PR is essential to Gluten project. It solved a bug Gluten customer observed. |
a066a7f to
cb21c6a
Compare
semi and anti
When we were validating the results of SMJ on TPCH, we discovered an issue where semi join and anti join produce inconsistent results when the left join key has multiple right matched rows.
For semi join: Suppose the left input is 2, and it matches with two right inputs, right input1: 2 and right input2: 2. The current code produces an output of
, resulting in two records that meet the condition, which is inconsistent with the semantics of left semi join. In this PR, we leverage the features of JoinTracker, using the firstMatch variable to ensure that in a group of left key matches, the output only records the first matching record, and other matching records are not recorded.
For anti join: In the case of anti join with a filter, we encountered a similar issue. The solution is similar to that of semi join, utilizing the features of JoinTracker to retain only the rows in the same left key match group that have no matches on the right side.
full outer join fix
Assume the left table has columns a and b:
The right table has columns c and d:
The two tables are joined using a full outer join on the condition a == c and b < d. During the doGetOutput phase, the result is matched using a left join, resulting in 3 * 4 = 12 records:
Then, in the filter method, the records are filtered based on the condition b < d, resulting in the following:
Finally, records from the left table that do not have a match are filled with nulls, resulting in the following final output:
The above result is incorrect because it is missing rows from the right table that do not have a match. Among the 12 rows above, rows 0, 4, and 8 correspond to the first record (2, 3) from the right table, rows 1, 5, and 9 correspond to the second record (2, -1) from the right table, rows 2, 6, and 10 correspond to the third record (2, -1) from the right table, and rows 3, 7, and 11 correspond to the fourth record (2, 3) from the right table. From the matching results above, rows 1, 5, and 9, as well as rows 2, 6, and 10, are all false, meaning that the third and fourth records from the right table do not have matching rows. Therefore, the final result is missing rows from the right table that do not have matches. The correct final result should be:
This PR calls the filter function when the keys are the same to filter out rows from the right table that do not have matches. If a row from the right table does not have a match, a new record is inserted with the corresponding columns from the left table set to null.