[SPARK-20246][SQL] should not push predicate down through aggregate with non-deterministic expressions #17562

Closed
cloud-fan wants to merge 2 commits into apache:master from cloud-fan:filter

@cloud-fan
Contributor

What changes were proposed in this pull request?

Similar to Project, when Aggregate has non-deterministic expressions, we should not push a predicate down through it, because doing so changes the number of input rows and thus the evaluation result of the non-deterministic expressions in Aggregate.

How was this patch tested?

new regression test
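
To illustrate the reasoning in the description, here is a minimal sketch in plain Scala collections, not Catalyst code. The object name, `aggregateWithCounter`, and the sample rows are all hypothetical; a stateful counter stands in for a non-deterministic expression whose value depends on how many rows were evaluated before it.

```scala
object AggregatePushdownDemo {
  // Toy stand-in for an Aggregate whose output list contains a stateful
  // (hence non-deterministic) expression: the counter advances once per
  // output group, so its value depends on which groups exist.
  def aggregateWithCounter(rows: Seq[Int]): Seq[(Int, Int)] = {
    var counter = -1
    rows.distinct.sorted.map { key =>
      counter += 1
      (key, counter)
    }
  }

  val rows: Seq[Int] = Seq(1, 1, 2, 3, 3, 4)
  val keep: Int => Boolean = _ > 2

  // Original plan shape: Filter on top of Aggregate.
  val filterAbove: Seq[(Int, Int)] =
    aggregateWithCounter(rows).filter { case (key, _) => keep(key) }

  // Rewritten plan shape: Filter pushed below Aggregate.
  val filterBelow: Seq[(Int, Int)] =
    aggregateWithCounter(rows.filter(keep))

  def main(args: Array[String]): Unit = {
    println(s"filter above aggregate: $filterAbove")
    println(s"filter below aggregate: $filterBelow")
  }
}
```

In this toy model the two plan shapes keep the same keys but attach different counter values to them, which is the kind of result change the patch is meant to avoid.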

Contributor Author

This test was wrong. We can actually push a nondeterministic filter down through a project, as long as the project list is entirely deterministic.
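
The point above can be sketched with plain Scala collections. This is a toy model under stated assumptions, not Catalyst code: `everyOtherRow` is a hypothetical stateful predicate standing in for a nondeterministic filter condition, and `project` is a hypothetical deterministic projection. Because a deterministic project keeps the row count and order unchanged, the filter sees the same sequence of evaluations in either position.

```scala
object ProjectPushdownDemo {
  // Fresh stateful predicate: accepts the 1st, 3rd, 5th, ... row it is
  // evaluated on, regardless of the row's value.
  def everyOtherRow(): Int => Boolean = {
    var seen = -1
    _ => {
      seen += 1
      seen % 2 == 0
    }
  }

  // Deterministic projection: same input always gives the same output.
  val project: Int => Int = _ * 10

  val rows: Seq[Int] = Seq(1, 2, 3, 4, 5)

  // Filter on top of Project.
  val filterAbove: Seq[Int] = rows.map(project).filter(everyOtherRow())

  // Filter pushed below Project.
  val filterBelow: Seq[Int] = rows.filter(everyOtherRow()).map(project)

  def main(args: Array[String]): Unit = {
    println(s"filter above project: $filterAbove")
    println(s"filter below project: $filterBelow")
  }
}
```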

@cloud-fan
Contributor Author

cc @liancheng @gatorsmile

Member

nit: through project with all deterministic fields.

@viirya
Member

viirya commented Apr 7, 2017

LGTM except a minor comment.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75597 has finished for PR 17562 at commit a2599be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75598 has finished for PR 17562 at commit e6a1bfe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75600 has finished for PR 17562 at commit e6546be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


-    case filter @ Filter(condition, aggregate: Aggregate) =>
+    case filter @ Filter(condition, aggregate: Aggregate)
+      if aggregate.aggregateExpressions.forall(_.deterministic) =>
Member

Could you move this case above `case filter @ Filter(condition, w: Window)`?

Given the comment you added above, that ordering would be easier for readers to follow.
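
For readers unfamiliar with the rule, here is a rough sketch of how a guarded case like the one under review behaves, with the `Aggregate` case placed before the `Window` case as suggested. Everything here is a hypothetical toy model: the case classes only mimic the names used in the diff, conditions are plain strings, and the sketch ignores the extra checks a real optimizer rule would need (for example, which columns the predicate references).

```scala
object ToyOptimizer {
  // Hypothetical, heavily simplified plan nodes; not Spark's classes.
  case class Expr(sql: String, deterministic: Boolean)

  sealed trait Plan
  case class Relation(name: String) extends Plan
  case class Aggregate(aggregateExpressions: Seq[Expr], child: Plan) extends Plan
  case class Window(child: Plan) extends Plan
  case class Filter(condition: String, child: Plan) extends Plan

  def pushDown(plan: Plan): Plan = plan match {
    // Only push through an Aggregate whose output expressions are all
    // deterministic; otherwise the guard fails and the plan is left alone.
    case Filter(condition, aggregate: Aggregate)
        if aggregate.aggregateExpressions.forall(_.deterministic) =>
      aggregate.copy(child = Filter(condition, aggregate.child))

    // Toy Window case, kept after the Aggregate case.
    case Filter(condition, w: Window) =>
      w.copy(child = Filter(condition, w.child))

    case other => other
  }
}
```

With the guard in place, a filter over an aggregate containing a non-deterministic expression falls through to the final case and is returned unchanged.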

@gatorsmile
Member

LGTM except a minor comment

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75604 has finished for PR 17562 at commit f254d5f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Apr 8, 2017
…ith non-deterministic expressions

## What changes were proposed in this pull request?

Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`.

## How was this patch tested?

new regression test

Author: Wenchen Fan <wenchen@databricks.com>

Closes #17562 from cloud-fan/filter.

(cherry picked from commit 7577e9c)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
asfgit pushed a commit that referenced this pull request Apr 8, 2017
…ith non-deterministic expressions

@gatorsmile
Member

Thanks! Merging to master/2.1/2.0

@asfgit asfgit closed this in 7577e9c Apr 8, 2017