Commit 51ef443
[SPARK-33822][SQL] Use the `CastSupport.cast` method in HashJoin
### What changes were proposed in this pull request?
This PR intends to fix a bug that throws an `UnsupportedOperationException` when running [TPCDS q5](https://github.com/apache/spark/blob/master/sql/core/src/test/resources/tpcds/q5.sql) with AQE enabled ([this option is now enabled by default via SPARK-33679](031c5ef)):
```
java.lang.UnsupportedOperationException: BroadcastExchange does not support the execute() code path.
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecute(BroadcastExchangeExec.scala:189)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
at org.apache.spark.sql.execution.exchange.ReusedExchangeExec.doExecute(Exchange.scala:60)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
at org.apache.spark.sql.execution.adaptive.QueryStageExec.doExecute(QueryStageExec.scala:115)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:321)
at org.apache.spark.sql.execution.SparkPlan.executeCollectIterator(SparkPlan.scala:397)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.$anonfun$relationFuture$1(BroadcastExchangeExec.scala:118)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withThreadLocalCaptured$1(SQLExecution.scala:185)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
...
```
I've checked the AQE code and found that `EnsureRequirements` wrongly puts a `BroadcastExchange` on top of a `BroadcastQueryStage` in the `reOptimize` phase, as follows:
```
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#2183]
   +- BroadcastQueryStage 2
      +- ReusedExchange [d_date_sk#1086], BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#1963]
```
The root cause is that the `Cast` expression in the required child distribution has no `timeZoneId` set (`timeZoneId=None`), while the `Cast` expression in `child.outputPartitioning` does. The two expressions therefore do not compare as equal, which can make the distribution requirement check fail in `EnsureRequirements`:
https://github.com/apache/spark/blob/1e85707738a830d33598ca267a6740b3f06b1861/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala#L47-L50
The `Cast` expression without a `timeZoneId` is generated in the `HashJoin` object. To fix this issue, this PR proposes to use the `CastSupport.cast` method there.
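The mismatch described above can be illustrated with a minimal, self-contained sketch. The classes below are simplified stand-ins, not the real Spark `Cast`, `BoundReference`, or `EnsureRequirements` implementations, and the names `castWithZone` and `needsExchange` are hypothetical helpers introduced only for illustration. The sketch assumes that `CastSupport.cast` attaches the session time zone to the `Cast` it builds, and that the requirement check boils down to structural equality of the key expressions.

```scala
// Simplified stand-ins for Catalyst expressions (NOT the real Spark classes).
sealed trait Expr
case class BoundRef(ordinal: Int) extends Expr
case class Cast(child: Expr, dataType: String, timeZoneId: Option[String] = None) extends Expr

object CastEqualityDemo {
  // Hypothetical helper: mimics a cast that always carries the session time zone,
  // which is what this sketch assumes CastSupport.cast does.
  def castWithZone(child: Expr, dataType: String, sessionZone: String): Cast =
    Cast(child, dataType, Some(sessionZone))

  // Hypothetical, heavily simplified requirement check: a new exchange is
  // needed whenever the required keys differ from the keys the child provides.
  def needsExchange(required: Seq[Expr], provided: Seq[Expr]): Boolean =
    required != provided

  // Key built without a time zone (the situation before the fix).
  val required: Expr = Cast(BoundRef(0), "bigint")
  // Key carried by the child's output, with a time zone attached.
  val produced: Expr = castWithZone(BoundRef(0), "bigint", "UTC")
  // Key built with the time zone attached (the situation after the fix).
  val fixed: Expr = castWithZone(BoundRef(0), "bigint", "UTC")

  def main(args: Array[String]): Unit = {
    // Case-class equality includes timeZoneId, so the first pair differs.
    println(needsExchange(Seq(required), Seq(produced)))
    println(needsExchange(Seq(fixed), Seq(produced)))
  }
}
```

In this model, the first check reports that an extra exchange is needed even though both casts are semantically identical for an `int` to `bigint` conversion, while the second check passes because both sides carry the same `timeZoneId`.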
### Why are the changes needed?
Bugfix.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manually checked that q5 passed.
Closes #30818 from maropu/BugfixInAQE.
Authored-by: Takeshi Yamamuro <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 15616f4 commit 51ef443
File tree
2 files changed, +27 -19 lines changed
- sql/core/src
  - main/scala/org/apache/spark/sql/execution/joins
  - test/scala/org/apache/spark/sql/execution/joins
Lines changed: 7 additions & 6 deletions
Lines changed: 20 additions & 13 deletions