Commit 55e9dd6
[SPARK-3084] [SQL] Collect broadcasted tables in parallel in joins
BroadcastHashJoin has a broadcastFuture variable that tries to collect
the broadcasted table in a separate thread, but this doesn't help
because it's a lazy val that only gets initialized when you attempt to
build the RDD. Thus queries that broadcast multiple tables would collect
and broadcast them sequentially. I changed this to a val to let it start
collecting right when the operator is created.
Author: Matei Zaharia <[email protected]>
Closes apache#1990 from mateiz/spark-3084 and squashes the following commits:
f468766 [Matei Zaharia] [SPARK-3084] Collect broadcasted tables in parallel in joins
(cherry picked from commit 6a13dca)
Signed-off-by: Michael Armbrust <[email protected]>1 parent ec0b91e commit 55e9dd6
File tree
1 file changed
+1
-1
lines changed- sql/core/src/main/scala/org/apache/spark/sql/execution
1 file changed
+1
-1
lines changedLines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
424 | 424 | | |
425 | 425 | | |
426 | 426 | | |
427 | | - | |
| 427 | + | |
428 | 428 | | |
429 | 429 | | |
430 | 430 | | |
| |||
0 commit comments