Commit 6e63201
[SPARK-14830][SQL] Add RemoveRepetitionFromGroupExpressions optimizer.
## What changes were proposed in this pull request?
This PR aims to optimize GroupExpressions by removing repeating expressions. `RemoveRepetitionFromGroupExpressions` is added.
**Before**
```scala
scala> sql("select a+1 from values 1,2 T(a) group by a+1, 1+a, A+1, 1+A").explain()
== Physical Plan ==
WholeStageCodegen
: +- TungstenAggregate(key=[(a#0 + 1)#6,(1 + a#0)#7,(A#0 + 1)#8,(1 + A#0)#9], functions=[], output=[(a + 1)#5])
: +- INPUT
+- Exchange hashpartitioning((a#0 + 1)#6, (1 + a#0)#7, (A#0 + 1)#8, (1 + A#0)#9, 200), None
+- WholeStageCodegen
: +- TungstenAggregate(key=[(a#0 + 1) AS (a#0 + 1)#6,(1 + a#0) AS (1 + a#0)#7,(A#0 + 1) AS (A#0 + 1)#8,(1 + A#0) AS (1 + A#0)#9], functions=[], output=[(a#0 + 1)#6,(1 + a#0)#7,(A#0 + 1)#8,(1 + A#0)#9])
: +- INPUT
+- LocalTableScan [a#0], [[1],[2]]
```
**After**
```scala
scala> sql("select a+1 from values 1,2 T(a) group by a+1, 1+a, A+1, 1+A").explain()
== Physical Plan ==
WholeStageCodegen
: +- TungstenAggregate(key=[(a#0 + 1)#6], functions=[], output=[(a + 1)#5])
: +- INPUT
+- Exchange hashpartitioning((a#0 + 1)#6, 200), None
+- WholeStageCodegen
: +- TungstenAggregate(key=[(a#0 + 1) AS (a#0 + 1)#6], functions=[], output=[(a#0 + 1)#6])
: +- INPUT
+- LocalTableScan [a#0], [[1],[2]]
```
## How was this patch tested?
Pass the Jenkins tests (with a new testcase)
Author: Dongjoon Hyun <[email protected]>
Closes #12590 from dongjoon-hyun/SPARK-14830.1 parent a35a67a commit 6e63201
File tree
2 files changed
+33
-2
lines changed- sql/catalyst/src
- main/scala/org/apache/spark/sql/catalyst/optimizer
- test/scala/org/apache/spark/sql/catalyst/optimizer
2 files changed
+33
-2
lines changedLines changed: 14 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
68 | 68 | | |
69 | 69 | | |
70 | 70 | | |
71 | | - | |
| 71 | + | |
| 72 | + | |
72 | 73 | | |
73 | 74 | | |
74 | 75 | | |
| |||
1439 | 1440 | | |
1440 | 1441 | | |
1441 | 1442 | | |
| 1443 | + | |
| 1444 | + | |
| 1445 | + | |
| 1446 | + | |
| 1447 | + | |
| 1448 | + | |
| 1449 | + | |
| 1450 | + | |
| 1451 | + | |
| 1452 | + | |
| 1453 | + | |
| 1454 | + | |
1442 | 1455 | | |
1443 | 1456 | | |
1444 | 1457 | | |
| |||
Lines changed: 19 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
20 | 23 | | |
21 | 24 | | |
22 | 25 | | |
| |||
25 | 28 | | |
26 | 29 | | |
27 | 30 | | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
28 | 34 | | |
29 | 35 | | |
30 | 36 | | |
31 | | - | |
| 37 | + | |
| 38 | + | |
32 | 39 | | |
33 | 40 | | |
34 | 41 | | |
| |||
42 | 49 | | |
43 | 50 | | |
44 | 51 | | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
45 | 63 | | |
0 commit comments