Commit 4e06475
[SPARK-31078][SQL] Respect aliases in output ordering
### What changes were proposed in this pull request?
Currently, in the following scenario, an unnecessary `Sort` node is introduced:
```scala
withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "0") {
val df = (0 until 20).toDF("i").as("df")
df.repartition(8, df("i")).write.format("parquet")
.bucketBy(8, "i").sortBy("i").saveAsTable("t")
val t1 = spark.table("t")
val t2 = t1.selectExpr("i as ii")
t1.join(t2, t1("i") === t2("ii")).explain
}
```
```
== Physical Plan ==
*(3) SortMergeJoin [i#8], [ii#10], Inner
:- *(1) Project [i#8]
: +- *(1) Filter isnotnull(i#8)
: +- *(1) ColumnarToRow
: +- FileScan parquet default.t[i#8] Batched: true, DataFilters: [isnotnull(i#8)], Format: Parquet, Location: InMemoryFileIndex[file:/..., PartitionFilters: [], PushedFilters: [IsNotNull(i)], ReadSchema: struct<i:int>, SelectedBucketsCount: 8 out of 8
+- *(2) Sort [ii#10 ASC NULLS FIRST], false, 0 <==== UNNECESSARY
+- *(2) Project [i#8 AS ii#10]
+- *(2) Filter isnotnull(i#8)
+- *(2) ColumnarToRow
+- FileScan parquet default.t[i#8] Batched: true, DataFilters: [isnotnull(i#8)], Format: Parquet, Location: InMemoryFileIndex[file:/..., PartitionFilters: [], PushedFilters: [IsNotNull(i)], ReadSchema: struct<i:int>, SelectedBucketsCount: 8 out of 8
```
Notice that `Sort [ii#10 ASC NULLS FIRST], false, 0` is introduced even though the underlying data is already sorted. This is because `outputOrdering` doesn't handle aliases correctly. This PR proposes to fix this issue.
### Why are the changes needed?
To better handle aliases in `outputOrdering`.
### Does this PR introduce any user-facing change?
Yes, now with the fix, the `explain` prints out the following:
```
== Physical Plan ==
*(3) SortMergeJoin [i#8], [ii#10], Inner
:- *(1) Project [i#8]
: +- *(1) Filter isnotnull(i#8)
: +- *(1) ColumnarToRow
: +- FileScan parquet default.t[i#8] Batched: true, DataFilters: [isnotnull(i#8)], Format: Parquet, Location: InMemoryFileIndex[file:/..., PartitionFilters: [], PushedFilters: [IsNotNull(i)], ReadSchema: struct<i:int>, SelectedBucketsCount: 8 out of 8
+- *(2) Project [i#8 AS ii#10]
+- *(2) Filter isnotnull(i#8)
+- *(2) ColumnarToRow
+- FileScan parquet default.t[i#8] Batched: true, DataFilters: [isnotnull(i#8)], Format: Parquet, Location: InMemoryFileIndex[file:/..., PartitionFilters: [], PushedFilters: [IsNotNull(i)], ReadSchema: struct<i:int>, SelectedBucketsCount: 8 out of 8
```
### How was this patch tested?
Tests added.
Closes apache#27842 from imback82/alias_aware_sort_order.
Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 294f605)1 parent 48fc6e3 commit 4e06475
File tree
6 files changed
+87
-23
lines changed- sql/core/src
- main/scala/org/apache/spark/sql/execution
- aggregate
- test/scala/org/apache/spark/sql
- execution
- sources
6 files changed
+87
-23
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
23 | | - | |
24 | | - | |
| 23 | + | |
25 | 24 | | |
26 | | - | |
| 25 | + | |
27 | 26 | | |
28 | 27 | | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
29 | 50 | | |
30 | 51 | | |
31 | 52 | | |
| |||
36 | 57 | | |
37 | 58 | | |
38 | 59 | | |
| 60 | + | |
39 | 61 | | |
40 | | - | |
41 | | - | |
42 | | - | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
48 | 68 | | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
53 | 79 | | |
54 | 80 | | |
55 | 81 | | |
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
53 | 53 | | |
54 | 54 | | |
55 | 55 | | |
56 | | - | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
57 | 59 | | |
58 | 60 | | |
59 | 61 | | |
| |||
Lines changed: 5 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
27 | | - | |
| 27 | + | |
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
| |||
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
41 | | - | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
42 | 44 | | |
43 | 45 | | |
44 | 46 | | |
| |||
68 | 70 | | |
69 | 71 | | |
70 | 72 | | |
71 | | - | |
| 73 | + | |
72 | 74 | | |
73 | 75 | | |
74 | 76 | | |
| |||
Lines changed: 6 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
43 | 46 | | |
44 | 47 | | |
45 | 48 | | |
| |||
80 | 83 | | |
81 | 84 | | |
82 | 85 | | |
83 | | - | |
84 | | - | |
85 | 86 | | |
86 | 87 | | |
| 88 | + | |
| 89 | + | |
87 | 90 | | |
88 | 91 | | |
89 | 92 | | |
| |||
Lines changed: 19 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
962 | 962 | | |
963 | 963 | | |
964 | 964 | | |
| 965 | + | |
| 966 | + | |
| 967 | + | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
| 983 | + | |
965 | 984 | | |
966 | 985 | | |
967 | 986 | | |
| |||
Lines changed: 12 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
639 | 639 | | |
640 | 640 | | |
641 | 641 | | |
| 642 | + | |
| 643 | + | |
| 644 | + | |
| 645 | + | |
| 646 | + | |
| 647 | + | |
| 648 | + | |
| 649 | + | |
| 650 | + | |
| 651 | + | |
| 652 | + | |
| 653 | + | |
642 | 654 | | |
643 | 655 | | |
644 | 656 | | |
| |||
0 commit comments