[SPARK-19650] Commands should not trigger a Spark job #17027

hvanhovell · 2017-02-22T14:06:10Z

What changes were proposed in this pull request?

Spark executes SQL commands eagerly. It does this by creating an RDD that contains the command's results. The downside is that any action on this RDD triggers a Spark job, which is expensive and unnecessary.

This PR fixes this by avoiding the materialization of an RDD for Commands; it just materializes the results and puts them in a LocalRelation.
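To make the difference concrete, here is a toy sketch of the idea, not Spark's actual code. Every name in it (`ToyCommand`, `RddBacked`, `ToyLocalRelation`, `jobsLaunched`, `planOld`, `planNew`) is a hypothetical stand-in, and the "job" is just a counter. It assumes only that reading an RDD-backed result costs a job while reading a driver-local relation does not.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark classes.
object EagerCommandSketch {
  type Row = Seq[Any]

  // Counts how many simulated "jobs" have been launched.
  var jobsLaunched = 0

  sealed trait Plan { def collect(): Seq[Row] }

  // Stand-in for a command such as SHOW TABLES; running it yields rows on the driver.
  final case class ToyCommand(run: () => Seq[Row])

  // Old behaviour (modelled): the results are wrapped in a distributed dataset,
  // so every action on it launches a job.
  final case class RddBacked(rows: Seq[Row]) extends Plan {
    def collect(): Seq[Row] = { jobsLaunched += 1; rows }
  }

  // New behaviour (modelled): the results sit in a local relation on the driver,
  // so reading them launches nothing.
  final case class ToyLocalRelation(rows: Seq[Row]) extends Plan {
    def collect(): Seq[Row] = rows
  }

  // Both variants run the command eagerly; they differ only in how the rows are held.
  def planOld(cmd: ToyCommand): Plan = RddBacked(cmd.run())
  def planNew(cmd: ToyCommand): Plan = ToyLocalRelation(cmd.run())
}
```

In this model, calling `collect()` on the result of `planOld` increments the job counter, while calling it on the result of `planNew` leaves the counter unchanged. Both return the same rows.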

How was this patch tested?

Added a regression test to SQLQuerySuite.

SparkQA · 2017-02-22T14:25:25Z

Test build #73279 has finished for PR 17027 at commit bd37934.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class MaterializedPlan(plan: SparkPlan) extends LeafNode

SparkQA · 2017-02-23T13:11:46Z

Test build #73345 has finished for PR 17027 at commit fdfe7fe.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-02-23T16:10:22Z

Test build #73350 has finished for PR 17027 at commit e8acd98.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-02-24T13:51:07Z

Test build #73425 has finished for PR 17027 at commit dad6b13.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-02-24T22:51:17Z

sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala

+    // For various commands (like DDL) and queries with side effects, we force query execution
+    // to happen right away to let these side effects take place eagerly.
    queryExecution.analyzed match {
      // For various commands (like DDL) and queries with side effects, we force query execution


remove this line

actually let me remove it while merging

cloud-fan · 2017-02-24T22:54:19Z

LGTM

gatorsmile

LGTM

cloud-fan · 2017-02-25T07:06:41Z

merging to master!

Spark executes SQL commands eagerly. It does this by creating an RDD that contains the command's results. The downside is that any action on this RDD triggers a Spark job, which is expensive and unnecessary.

This PR fixes this by avoiding the materialization of an `RDD` for `Command`s; it just materializes the results and puts them in a `LocalRelation`.

Added a regression test to `SQLQuerySuite`.

Author: Herman van Hovell <[email protected]>

Closes apache#17027 from hvanhovell/no-job-command.

hvanhovell added 3 commits February 17, 2017 14:09

Do not trigger a job for runnable commands unless we have to.

4eea40b

Introduce materialized plan

bd37934

Don't materialize the RDD

c17fe2c

Do not special case Commands in QueryExecution.hiveResultString.

fdfe7fe

Fix output

e8acd98

Use LocalRelation instead of MaterializedPlan. Add test.

dad6b13

hvanhovell changed the title ~~[SPARK-19650] Runnable commands should not trigger a Spark job [WIP]~~ [SPARK-19650] Commands should not trigger a Spark job Feb 24, 2017

cloud-fan reviewed Feb 24, 2017

gatorsmile approved these changes Feb 25, 2017

asfgit closed this in 8f0511e Feb 25, 2017

imback82 mentioned this pull request Oct 5, 2019

[SPARK-29279][SQL] Merge SHOW NAMESPACES and SHOW DATABASES code path #26006

Closed
