Commit bc11146
[SPARK-23778][CORE] Avoid unneeded shuffle when union gets an empty RDD
## What changes were proposed in this pull request?
When `union` is invoked on several RDDs and one of them is an empty RDD, the result of the operation is a `UnionRDD`. This causes an unneeded extra shuffle when all the other RDDs have the same partitioning.
The PR ignores incoming empty RDDs in the union method.
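The selection logic described above can be illustrated with a small, Spark-free model. This is only a sketch under stated assumptions, not the code changed by this commit: `ToyRdd`, `UnionPlan`, `planBefore` and `planAfter` are hypothetical names, and a partitioner is reduced to a plain string label.

```scala
// Simplified stand-in for an RDD: only the partition count and an optional
// partitioner label matter for this illustration. All names are hypothetical.
final case class ToyRdd(numPartitions: Int, partitioner: Option[String])

sealed trait UnionPlan
// Partitioning is preserved, so a later keyed operation needs no extra shuffle.
case object PartitionerAwareUnion extends UnionPlan
// Partitioning is lost, so a later keyed operation has to shuffle again.
case object PlainUnion extends UnionPlan

object UnionSketch {
  // Models the behaviour described as the problem: empty inputs are considered too.
  def planBefore(rdds: Seq[ToyRdd]): UnionPlan = choose(rdds)

  // Models the behaviour described as the fix: inputs without partitions are ignored.
  def planAfter(rdds: Seq[ToyRdd]): UnionPlan =
    choose(rdds.filter(_.numPartitions > 0))

  // Partitioning can be kept only if every input has a partitioner and they all agree.
  private def choose(rdds: Seq[ToyRdd]): UnionPlan = {
    val partitioners = rdds.flatMap(_.partitioner).toSet
    if (rdds.forall(_.partitioner.isDefined) && partitioners.size == 1) PartitionerAwareUnion
    else PlainUnion
  }
}
```

In this model, two inputs sharing the label `"hash-4"` plus one empty input without a partitioner yield `PlainUnion` under `planBefore` and `PartitionerAwareUnion` under `planAfter`, which mirrors the extra shuffle the description says is avoided.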
## How was this patch tested?
Added a unit test.
Author: Marco Gaido <[email protected]>
Closes #21333 from mgaido91/SPARK-23778.
1 parent bc0498d commit bc11146
2 files changed, +18 -5 lines changed
- core/src
  - main/scala/org/apache/spark
  - test/scala/org/apache/spark/rdd
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1306-1308 | 1306-1308 | unchanged context |
| 1309-1311 | | - (3 lines removed) |
| | 1309-1312 | + (4 lines added) |
| 1312 | 1313 | unchanged context |
| 1313 | | - (1 line removed) |
| | 1314 | + (1 line added) |
| 1314-1316 | 1315-1317 | unchanged context |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 154-156 | 154-156 | unchanged context |
| | 157-166 | + (10 lines added) |
| 157-159 | 167-169 | unchanged context |
| 1047-1049 | 1057-1059 | unchanged context |
| 1050 | | - (1 line removed) |
| | 1060-1062 | + (3 lines added) |
| 1051-1053 | 1063-1065 | unchanged context |