[SPARK-20246][SQL] should not push predicate down through aggregate with non-deterministic expressions#17562
cloud-fan wants to merge 2 commits into apache:master
Conversation
This test was wrong. We can actually push a non-deterministic filter down through a project, as long as every expression in the project list is deterministic.
nit: through project with all deterministic fields.
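To illustrate the point above, here is a minimal sketch in plain Scala collections, not Catalyst code. The object `ProjectPushdownSketch` and the helper `randomKeep` are hypothetical names, and the predicate is an assumed stand-in for something like `rand() < 0.5` that references no columns. Because a deterministic project neither adds nor drops rows, the predicate is evaluated the same number of times, in the same order, whether it runs above or below the project.

```scala
import scala.util.Random

object ProjectPushdownSketch {
  // Hypothetical stand-in for a non-deterministic predicate such as rand() < 0.5:
  // every call consumes one draw from a seeded generator and ignores the row.
  def randomKeep(seed: Long): Int => Boolean = {
    val rng = new Random(seed)
    _ => rng.nextDouble() < 0.5
  }

  val rows: Seq[Int] = 1 to 10

  // A deterministic project list: one output row per input row.
  val project: Int => Int = _ * 2

  // Original plan: the filter sits above the project.
  def filterAboveProject(seed: Long): Seq[Int] =
    rows.map(project).filter(randomKeep(seed))

  // Optimized plan: the filter is pushed below the project.
  def filterBelowProject(seed: Long): Seq[Int] =
    rows.filter(randomKeep(seed)).map(project)
}
```

With the same seed, both plans keep the same row positions, so `filterAboveProject(seed)` and `filterBelowProject(seed)` return equal sequences. A predicate that references projected columns would additionally need its references rewritten, which this sketch leaves out.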
LGTM except a minor comment.
Test build #75597 has finished for PR 17562 at commit
Test build #75598 has finished for PR 17562 at commit
Test build #75600 has finished for PR 17562 at commit
    - case filter @ Filter(condition, aggregate: Aggregate) =>
    + case filter @ Filter(condition, aggregate: Aggregate)
    +   if aggregate.aggregateExpressions.forall(_.deterministic) =>
Could you move this case above `case filter @ Filter(condition, w: Window)`?
Given the comment you added above, that ordering would be easier for readers to follow.
LGTM except a minor comment
Test build #75604 has finished for PR 17562 at commit
…ith non-deterministic expressions

## What changes were proposed in this pull request?

Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`.

## How was this patch tested?

new regression test

Author: Wenchen Fan <wenchen@databricks.com>

Closes #17562 from cloud-fan/filter.

(cherry picked from commit 7577e9c)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
Thanks! Merging to master/2.1/2.0
What changes were proposed in this pull request?
Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`.

How was this patch tested?
new regression test
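
The reasoning in the description can be sketched with a toy model in plain Scala collections. This is not the regression test from the patch and it does not use Catalyst. The object `AggregatePushdownSketch`, the helper `nextId`, and the sample rows are all hypothetical. A stateful counter is assumed here as a stand-in for a non-deterministic expression, because it makes the difference between the two plans reproducible.

```scala
object AggregatePushdownSketch {
  // Toy rows: (grouping key, value).
  val rows: Seq[(String, Int)] = Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4))

  // Hypothetical non-deterministic aggregate: SUM(value * nextId()) per key.
  // nextId() is a stateful counter, so its result depends on how many
  // input rows were evaluated before the current one.
  def aggregate(input: Seq[(String, Int)]): Map[String, Int] = {
    var counter = 0
    def nextId(): Int = { counter += 1; counter }
    // Evaluate the stateful expression once per input row, in input order.
    val evaluated = input.map { case (key, value) => (key, value * nextId()) }
    evaluated.groupBy(_._1).map { case (key, group) => key -> group.map(_._2).sum }
  }

  val keepB: String => Boolean = _ == "b"

  // Original plan: Filter(key = 'b') on top of the Aggregate.
  val filterAbove: Map[String, Int] =
    aggregate(rows).filter { case (key, _) => keepB(key) }

  // Incorrect plan: the filter is pushed below the Aggregate, so the
  // aggregate sees two input rows instead of four.
  val filterBelow: Map[String, Int] =
    aggregate(rows.filter { case (key, _) => keepB(key) })
}
```

In the original plan the `b` rows are the second and fourth inputs, giving `2*2 + 4*4 = 20`. After the pushdown they are the first and second inputs, giving `2*1 + 4*2 = 10`. The filter only touches the grouping key, yet the result changes, which is the situation the `forall(_.deterministic)` guard is meant to rule out.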