Skip to content

[SPARK-12018][SQL] Refactor common subexpression elimination code#10009

Closed
viirya wants to merge 4 commits intoapache:masterfrom
viirya:refactor-subexpr-eliminate
Closed

[SPARK-12018][SQL] Refactor common subexpression elimination code#10009
viirya wants to merge 4 commits intoapache:masterfrom
viirya:refactor-subexpr-eliminate

Conversation

@viirya
Copy link
Member

@viirya viirya commented Nov 26, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-12018

The code of common subexpression elimination can be factored and simplified. Some unnecessary variables can be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the above variables are not unnecessary. I will remove them later if the test is passed.

@SparkQA
Copy link

SparkQA commented Nov 27, 2015

Test build #46783 has finished for PR 10009 at commit 4876ce0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class SubExprEliminationState(isNull: String, value: String)\n

@rxin
Copy link
Contributor

rxin commented Nov 27, 2015

cc @nongli

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this method return void? we can just assign values to isNull and value as they are both member variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about The collection of sub-exression result reset method that need to be called on each row.?

@cloud-fan
Copy link
Contributor

LGTM overall, cc @nongli to take another look.

@SparkQA
Copy link

SparkQA commented Nov 27, 2015

Test build #46795 has finished for PR 10009 at commit b3cf6a8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class SubExprEliminationState(isNull: String, value: String)\n

@SparkQA
Copy link

SparkQA commented Nov 27, 2015

Test build #46799 has finished for PR 10009 at commit 8e937d3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class SubExprEliminationState(isNull: String, value: String)\n

@SparkQA
Copy link

SparkQA commented Nov 27, 2015

Test build #46803 has finished for PR 10009 at commit b162ba5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class SubExprEliminationState(isNull: String, value: String)\n

@viirya
Copy link
Member Author

viirya commented Nov 30, 2015

ping @rxin @cloud-fan @nongli

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in saying you've picked a different approach to do this? You removed isLoaded and just load it every time. I think this is scary because it does not allow shortcircuiting. This can be bad if it is expr1 AND expr2 and expr1 is selective or expr2 is expensive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your point. $fnName will be called only once for each row. This patch doesn't change that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm, what I said doesn't apply for this operator.

I was thinking for Filter you wouldn't want to call fnName once for each row. It can be called less than once.

LGTM

@marmbrus
Copy link
Contributor

marmbrus commented Dec 1, 2015

Thanks, merging to master and 1.6

asfgit pushed a commit that referenced this pull request Dec 1, 2015
JIRA: https://issues.apache.org/jira/browse/SPARK-12018

The code of common subexpression elimination can be factored and simplified. Some unnecessary variables can be removed.

Author: Liang-Chi Hsieh <viirya@appier.com>

Closes #10009 from viirya/refactor-subexpr-eliminate.

(cherry picked from commit 9693b0d)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 9693b0d Dec 1, 2015
@viirya viirya deleted the refactor-subexpr-eliminate branch December 27, 2023 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants