Skip to content

Conversation

@tanelk
Copy link
Contributor

@tanelk tanelk commented Jul 13, 2020

What changes were proposed in this pull request?

Add And(IsNotNull(e), GreaterThan(Size(e), Literal(0))) filter before Explode, PosExplode and Inline, when outer = false.
Removed unused InferFiltersFromConstraints from operatorOptimizationRuleSet to avoid confusion that happened during the review process.

Why are the changes needed?

Predicate pushdown will be able to move this new filter down through joins and into data sources for performance improvement.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit test

@tanelk tanelk reopened this Sep 16, 2020
@tanelk
Copy link
Contributor Author

tanelk commented Sep 16, 2020

cc @maropu @dongjoon-hyun

Also, I messed up one commit, that made the bot add wrong labels.

@maropu
Copy link
Member

maropu commented Sep 17, 2020

ok to test

@SparkQA
Copy link

SparkQA commented Sep 17, 2020

Test build #128783 has finished for PR 29092 at commit ffe43d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tanelk
Copy link
Contributor Author

tanelk commented Sep 29, 2020

Pinging @viirya and @cloud-fan for possible second review

@SparkQA
Copy link

SparkQA commented Sep 29, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33866/

@SparkQA
Copy link

SparkQA commented Sep 29, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33866/

@SparkQA
Copy link

SparkQA commented Oct 7, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34119/

@SparkQA
Copy link

SparkQA commented Oct 7, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34119/

@SparkQA
Copy link

SparkQA commented Oct 7, 2020

Test build #129514 has finished for PR 29092 at commit a25bd6f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 8, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34144/

@SparkQA
Copy link

SparkQA commented Oct 8, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34144/

@SparkQA
Copy link

SparkQA commented Oct 8, 2020

Test build #129538 has finished for PR 29092 at commit d4089ca.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 12, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34301/

@SparkQA
Copy link

SparkQA commented Oct 12, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34301/

@SparkQA
Copy link

SparkQA commented Oct 12, 2020

Test build #129695 has finished for PR 29092 at commit e3a0205.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

InferFiltersFromConstraints) ::
Batch("Operator Optimization after Inferring Filters", fixedPoint,
rulesWithoutInferFiltersFromConstraints: _*) ::
operatorOptimizationRuleSet: _*) ::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: operatorOptimizationRuleSet -> rulesWithoutInferFilters?

@maropu maropu changed the title [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode to benefit from predicate pushdown [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode/inline to benefit from predicate pushdown Oct 12, 2020
@maropu
Copy link
Member

maropu commented Oct 12, 2020

Looks okay otherwise.

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34326/

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34326/

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Test build #129720 has finished for PR 29092 at commit b7a7c49.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34344/

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34344/

@maropu maropu closed this in 17eebd7 Oct 13, 2020
@maropu
Copy link
Member

maropu commented Oct 13, 2020

GA passed. Thanks! Merged to master.

@SparkQA
Copy link

SparkQA commented Oct 13, 2020

Test build #129738 has finished for PR 29092 at commit 857c007.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Exclude child's constraints to guarantee idempotency
val inferredFilters = ExpressionSet(
Seq(
GreaterThan(Size(g.children.head), Literal(0)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be useful to filter rows before join, but can we pushdown Size expression to datasource?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants