[SPARK-24087][SQL] Avoid shuffle when join keys are a super-set of bucket keys #21156
Conversation
Test build #89847 has finished for PR 21156 at commit
Test build #89867 has finished for PR 21156 at commit
Test build #89869 has finished for PR 21156 at commit
retest this please
Test build #89875 has finished for PR 21156 at commit
Test build #91434 has finished for PR 21156 at commit
Test build #91435 has finished for PR 21156 at commit
Test build #91436 has finished for PR 21156 at commit
Test build #92669 has finished for PR 21156 at commit
I ran the failed commands locally with no issues; retesting again.
retest this please
@cloud-fan @gatorsmile @mgaido91 @viirya Could you help review this feature?
Test build #92674 has finished for PR 21156 at commit
retest this please
Test build #92682 has finished for PR 21156 at commit
    if leftPartitioning.satisfies(ClusteredDistribution(leftKeys)) =>
      avoidShuffleIfPossible(leftKeys, leftExpressions)
    case _ => rightPartitioning match {
IIUC if either left or right are not HashPartitioning we are sure we won't meet the required distribution, so I guess this is useless, isn't it?
Yes, you are right. The main purpose of this feature is bucketed tables, so HashPartitioning is enough.
Actually, in a similar way we could also skip the shuffle for one side when it is RangePartitioning, but I am not sure whether that is really useful.
But that case would not be covered here anyway: since we return a HashClusteredDistribution requirement, a RangePartitioning would never match, would it?
In that case, we can return OrderedDistribution :: OrderedDistribution :: Nil to avoid shuffle for the RangePartitioning side.
yes, we can do that, but anyway this case is useless...
    // Enable it after fix https://issues.apache.org/jira/browse/SPARK-12704
    ignore("avoid shuffle when join keys are a super-set of bucket keys") {
    test("avoid shuffle when join keys are a super-set of bucket keys") {
can we add more tests with different BucketSpec on the two sides?
    private def avoidShuffleIfPossible(
        joinKeys: Seq[Expression],
        expressions: Seq[Expression]): Seq[Distribution] = {
      val indices = expressions.map(x => joinKeys.indexWhere(_.semanticEquals(x)))
What if we don't find an expression here? I think indexWhere would return -1, causing an error when the index is used later. Can we also add a test case for this situation?
case HashPartitioning(leftExpressions, _)
if leftPartitioning.satisfies(ClusteredDistribution(leftKeys)) =>
avoidShuffleIfPossible(leftKeys, leftExpressions)
The guard if leftPartitioning.satisfies(ClusteredDistribution(leftKeys)) has already ensured that expressions is a subset of joinKeys, so it would not return -1, right?
yes, you're right, thanks.
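The guard discussed above can be illustrated with a small, Spark-free sketch (plain Scala; semanticEquals is replaced by plain string equality, and all names here are illustrative, not from the PR):

```scala
object IndexLookupSketch {
  // Maps each partitioning expression to its position among the join keys,
  // mirroring the indexWhere call in avoidShuffleIfPossible. Without the
  // satisfies(ClusteredDistribution(...)) guard, an expression absent from
  // the join keys would map to -1 and break any later positional lookup.
  def indicesOf(joinKeys: Seq[String], expressions: Seq[String]): Seq[Int] =
    expressions.map(x => joinKeys.indexWhere(_ == x))

  def main(args: Array[String]): Unit = {
    // Bucket key a1 is found among join keys (a1, a2) at index 0.
    println(indicesOf(Seq("a1", "a2"), Seq("a1")))
    // An expression missing from the join keys yields -1.
    println(indicesOf(Seq("a1", "a2"), Seq("c1")))
  }
}
```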
@mgaido91 With this approach, it seems we don't need
Test build #92754 has finished for PR 21156 at commit
@cloud-fan For a bucketed table, users typically bucket on the primary key, so in this case they will not hit the parallelism and data-skew issues, and we can see a good benefit from avoiding the shuffle.
A classic scenario could be like below:
Test build #92810 has finished for PR 21156 at commit
Test build #92809 has finished for PR 21156 at commit
IMHO, this ShuffledJoin is basically a join plus known distribution info. So instead of adding another join node (which doesn't map to any specific join algorithm), can we try to return the right distribution for bucketed tables?
Test build #92822 has finished for PR 21156 at commit
@maryannxue how about this way? Any better idea?
Test build #92909 has finished for PR 21156 at commit
Closed by mistake; reopening it.
Test build #93601 has finished for PR 21156 at commit
What is the status now? I think this is of great value: it gives users more opportunities to leverage bucket joins, since all joins that take the bucket key as a prefix of the join keys will benefit.
In this case, only table B needs an extra shuffle; the shuffle keys are (b1, b2), and the shuffle partition number is table A's bucket number.
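As a hypothetical illustration of this scenario (table names, columns, and the bucket count are assumptions, not from the PR):

```sql
-- Table A is bucketed by a1; table B is not bucketed.
CREATE TABLE a (a1 INT, a2 INT) USING parquet CLUSTERED BY (a1) INTO 8 BUCKETS;
CREATE TABLE b (b1 INT, b2 INT) USING parquet;

-- The join keys (a1, a2) are a super-set of A's bucket key (a1), so with this
-- change A's existing bucketing is reused and only B is shuffled, on (b1, b2),
-- into A's bucket number of partitions.
SELECT * FROM a JOIN b ON a.a1 = b.b1 AND a.a2 = b.b2;
```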
Test build #97825 has started for PR 21156 at commit
Test build #97855 has started for PR 21156 at commit
Test build #97872 has started for PR 21156 at commit
Sorry for the delay. I’ll take another look today.
The idea is good. Is it possible to make it an optimization rule? Another suggestion: we need more test cases.
    }
    val leftPartitioning = left.outputPartitioning
    val rightPartitioning = right.outputPartitioning
This is my biggest concern. Currently Spark adds shuffle with a rule, so we can't always get the children partitioning precisely. We implemented a similar feature in EnsureRequirements.reorderJoinPredicates, which is hacky and we should improve the framework before adding more features like this.
@cloud-fan In this PR, requiredChildDistribution is re-calculated each time it is invoked; could it be more precise than EnsureRequirements.reorderJoinPredicates?
This kind of bucket join is common; do we have a plan to improve the framework in 3.0?
Can one of the admins verify this patch?
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
To improve bucket joins, avoid the shuffle when the join keys are a super-set of the bucket keys.
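A minimal, Spark-free model of the proposed behavior (plain Scala; the class names mimic Spark's but are simplified stand-ins, and the helper below is an assumption for illustration, not the PR's actual code):

```scala
// Simplified stand-ins for Spark's partitioning/distribution classes.
case class HashPartitioning(expressions: Seq[String], numPartitions: Int)
sealed trait Distribution
case class ClusteredDistribution(keys: Seq[String]) extends Distribution

object BucketJoinModel {
  // If the left side is already hash-partitioned on a subset of the join keys
  // (e.g. a bucketed table), require only the corresponding subset of keys on
  // both sides: the left side already satisfies its requirement, so only the
  // right side needs a shuffle. Otherwise require all join keys on both sides.
  def requiredDistributions(
      leftKeys: Seq[String],
      rightKeys: Seq[String],
      leftPartitioning: Option[HashPartitioning]): Seq[Distribution] =
    leftPartitioning match {
      case Some(HashPartitioning(exprs, _))
          if exprs.nonEmpty && exprs.forall(leftKeys.contains) =>
        // Map each partitioning expression to its join-key position.
        val indices = exprs.map(e => leftKeys.indexWhere(_ == e))
        Seq(ClusteredDistribution(indices.map(leftKeys)),
            ClusteredDistribution(indices.map(rightKeys)))
      case _ =>
        Seq(ClusteredDistribution(leftKeys), ClusteredDistribution(rightKeys))
    }

  def main(args: Array[String]): Unit = {
    // Left is bucketed by a1; join keys are (a1, a2) vs (b1, b2). Only a
    // clustering on b1 is required of the right side.
    println(requiredDistributions(
      Seq("a1", "a2"), Seq("b1", "b2"), Some(HashPartitioning(Seq("a1"), 8))))
  }
}
```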
How was this patch tested?
Enable ignored test.