@cloud-fan
Contributor

What changes were proposed in this pull request?

This PR adds abstract classes for shuffle and broadcast, so that users can provide their columnar implementations.

This PR updates several places to use the abstract exchange classes, and also updates AdaptiveSparkPlanExec so that the columnar rules can see exchange nodes.
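
As a rough illustration of the idea (not the code in this PR), here is a minimal, self-contained sketch. Every name in it (`Plan`, `Scan`, `ShuffleLike`, `RowShuffle`, `ColumnarShuffle`, `totalShufflePartitions`) is a hypothetical stand-in rather than a Spark class. The point is only that planner-side code matches on an abstract exchange type, so a columnar implementation can be plugged in without changing that code.

```scala
object ExchangeLikeSketch {
  // Hypothetical toy plan model; these are NOT Spark's classes.
  sealed trait Plan { def children: Seq[Plan] }

  case class Scan(table: String) extends Plan {
    val children: Seq[Plan] = Nil
  }

  // Toy counterpart of an abstract shuffle exchange.
  // Planner-side code depends only on this trait.
  trait ShuffleLike extends Plan {
    def child: Plan
    def numPartitions: Int
    def children: Seq[Plan] = Seq(child)
  }

  // Row-based implementation (assumed for illustration).
  case class RowShuffle(child: Plan, numPartitions: Int) extends ShuffleLike

  // A user-provided columnar implementation (assumed for illustration).
  case class ColumnarShuffle(child: Plan, numPartitions: Int, batchSize: Int)
      extends ShuffleLike

  // Matches on the abstraction, so it handles both implementations unchanged.
  def totalShufflePartitions(plan: Plan): Int = plan match {
    case s: ShuffleLike => s.numPartitions + totalShufflePartitions(s.child)
    case other          => other.children.map(totalShufflePartitions).sum
  }
}
```

In this toy model, `totalShufflePartitions` never names `RowShuffle` or `ColumnarShuffle`, which is the property the abstract exchange classes are meant to give the real planner code.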

This is an alternative to #29134.
Closes #29134

Why are the changes needed?

To allow columnar exchanges.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New tests.

@cloud-fan
Contributor Author

cc @andygrove @tgravescs @maryannxue

}

private def newQueryStage(e: Exchange): QueryStageExec = {
  val optimizedPlan = applyPhysicalRules(e.child, queryStageOptimizerRules)
Contributor Author

It's important to hide the exchange nodes from the stage optimizer rules, as these rules assume they never see exchange nodes.

Then we don't need https://github.com/apache/spark/pull/29134/files#diff-a30c7a6fcdcdd13e57135fd04d05f3b7R115
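
To make the point concrete, here is a toy sketch under stated assumptions. The types `Plan`, `Leaf`, `Filter`, and `ToyExchange`, and the helpers `removeFilters`, `applyRules`, and `newStagePlan`, are all hypothetical and are not Spark APIs. The sketch only mirrors the shape of the line above: the rules run on the exchange's child, and the exchange is re-attached afterwards, so a rule that assumes it never sees an exchange stays safe.

```scala
object HideExchangeSketch {
  // Hypothetical toy plan model; these are NOT Spark's classes.
  sealed trait Plan
  case class Leaf(name: String) extends Plan
  case class Filter(child: Plan) extends Plan
  case class ToyExchange(child: Plan) extends Plan

  // A toy optimizer rule that assumes it never encounters an exchange.
  val removeFilters: Plan => Plan = {
    case Filter(child)  => removeFilters(child)
    case ToyExchange(_) =>
      throw new IllegalStateException("optimizer rule must not see an exchange")
    case leaf: Leaf     => leaf
  }

  // Applies the rules one after another, in order.
  def applyRules(plan: Plan, rules: Seq[Plan => Plan]): Plan =
    rules.foldLeft(plan)((p, rule) => rule(p))

  // Optimizes only the subtree below the exchange, then re-attaches it.
  def newStagePlan(e: ToyExchange, optimizerRules: Seq[Plan => Plan]): ToyExchange = {
    val optimizedChild = applyRules(e.child, optimizerRules)
    e.copy(child = optimizedChild)
  }
}
```

Calling `removeFilters` directly on a `ToyExchange` throws, whereas `newStagePlan` never passes the exchange itself to the rules.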

@andygrove
Member

Thanks @cloud-fan, I will test our AQE POC with these changes.

@SparkQA

SparkQA commented Jul 27, 2020

Test build #126660 has finished for PR 29262 at commit c76bf9c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait BroadcastExchangeLike extends Exchange
    • trait ShuffleExchangeLike extends Exchange

@andygrove
Member

andygrove commented Jul 28, 2020

Thanks @cloud-fan. I have tested these changes both with Spark 3.1 and backported to the 3.0 branch, and everything is working well, so LGTM.

I wish I had thought to separate the rules out into stage creation and post-stage creation. That made things much simpler.

cc @tgravescs

@SparkQA

SparkQA commented Jul 28, 2020

Test build #126728 has finished for PR 29262 at commit b90ea90.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 29, 2020

Test build #126750 has finished for PR 29262 at commit 8aff7a8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

test this please

Contributor

@tgravescs tgravescs left a comment

Looks good, pending Jenkins.

@SparkQA

SparkQA commented Jul 29, 2020

Test build #126775 has finished for PR 29262 at commit 8aff7a8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

test this please

@SparkQA

SparkQA commented Jul 29, 2020

Test build #126779 has finished for PR 29262 at commit 50175d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in a025a89 on Jul 29, 2020

@tgravescs
Contributor

I merged this to master; unfortunately, it wouldn't cherry-pick cleanly to branch-3.0.

@cloud-fan would you like to put up a PR for branch-3.0? Otherwise Andy or I can.

@SparkQA

SparkQA commented Jul 29, 2020

Test build #126785 has finished for PR 29262 at commit 50175d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

Does it qualify for a backport? It's kind of a new feature.

@tgravescs
Contributor

I went back and forth on that, as I can see it both ways. I ended up filing it as a bug, since it prevents us from properly doing columnar processing (introduced in 3.0.0) with AQE. I also thought it was all internal and a fairly isolated path, and was hoping to get it into 3.0.1 if others didn't disagree, since users are starting to use AQE in 3.0. What do you think?

@cloud-fan
Contributor Author

I agree it's not that risky (the code is the same as before, just with one more abstraction layer: exchange-like). I don't have a strong opinion here. @andygrove, can you help create a backport PR?

@andygrove
Member

Sure, I'll create a PR this morning. Thanks @cloud-fan and @tgravescs.

@cloud-fan cloud-fan changed the title from "[SPARK-32332][SQL] Support columnar exchanges" to "[SPARK-32332][SQL] Make it possible to implement columnar exchanges" on Aug 5, 2020