Commit 6ef57d3
[SPARK-34514][SQL] Push down limit for LEFT SEMI and LEFT ANTI join
### What changes were proposed in this pull request?
I found out during code review of #31567 (comment), where we can push down limit to the left side of LEFT SEMI and LEFT ANTI join, if the join condition is empty.
Why it's safe to push down limit:
The semantics of LEFT SEMI join without condition:
(1). if right side is non-empty, output all rows from left side.
(2). if right side is empty, output nothing.
The semantics of LEFT ANTI join without condition:
(1). if right side is non-empty, output nothing.
(2). if right side is empty, output all rows from left side.
With the semantics of output all rows from left side or nothing (all or nothing), it's safe to push down limit to left side.
NOTE: LEFT SEMI / LEFT ANTI join with non-empty condition is not safe for limit push down, because output can be a portion of left side rows.
Reference: physical operator implementation for LEFT SEMI / LEFT ANTI join without condition - https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala#L200-L204 .
### Why are the changes needed?
Better performance. Save CPU and IO for these joins, as limit being pushed down before join.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added unit test in `LimitPushdownSuite.scala` and `SQLQuerySuite.scala`.
Closes #31630 from c21/limit-pushdown.
Authored-by: Cheng Su <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>1 parent 14934f4 commit 6ef57d3
File tree
3 files changed
+62
-8
lines changed- sql
- catalyst/src
- main/scala/org/apache/spark/sql/catalyst/optimizer
- test/scala/org/apache/spark/sql/catalyst/optimizer
- core/src/test/scala/org/apache/spark/sql
3 files changed
+62
-8
lines changedLines changed: 13 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
502 | 502 | | |
503 | 503 | | |
504 | 504 | | |
505 | | - | |
| 505 | + | |
506 | 506 | | |
507 | 507 | | |
508 | 508 | | |
| |||
539 | 539 | | |
540 | 540 | | |
541 | 541 | | |
542 | | - | |
543 | | - | |
544 | | - | |
545 | | - | |
546 | | - | |
547 | | - | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
548 | 552 | | |
549 | 553 | | |
550 | 554 | | |
| |||
555 | 559 | | |
556 | 560 | | |
557 | 561 | | |
| 562 | + | |
| 563 | + | |
558 | 564 | | |
559 | 565 | | |
560 | 566 | | |
| |||
Lines changed: 19 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
25 | | - | |
| 25 | + | |
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| |||
212 | 212 | | |
213 | 213 | | |
214 | 214 | | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
215 | 233 | | |
Lines changed: 30 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4034 | 4034 | | |
4035 | 4035 | | |
4036 | 4036 | | |
| 4037 | + | |
| 4038 | + | |
| 4039 | + | |
| 4040 | + | |
| 4041 | + | |
| 4042 | + | |
| 4043 | + | |
| 4044 | + | |
| 4045 | + | |
| 4046 | + | |
| 4047 | + | |
| 4048 | + | |
| 4049 | + | |
| 4050 | + | |
| 4051 | + | |
| 4052 | + | |
| 4053 | + | |
| 4054 | + | |
| 4055 | + | |
| 4056 | + | |
| 4057 | + | |
| 4058 | + | |
| 4059 | + | |
| 4060 | + | |
| 4061 | + | |
| 4062 | + | |
| 4063 | + | |
| 4064 | + | |
| 4065 | + | |
| 4066 | + | |
4037 | 4067 | | |
4038 | 4068 | | |
4039 | 4069 | | |
0 commit comments