Eliminate filter when pushdown_filters is enabled #7688

Closed
Dandandan opened this issue Sep 28, 2023 · 1 comment
Labels
enhancement New feature or request performance Make DataFusion faster

Comments

@Dandandan
Contributor

Is your feature request related to a problem or challenge?

When pushdown_filters is enabled, DataFusion should be able to eliminate the FilterExec that follows the scan.
With the option enabled for the TPC-H benchmark, the FilterExec and the l_shipdate projection are still present in the plans.

For example, in query 3 the filter is still present above the scan:

FilterExec: l_shipdate@3 > 9204
  ParquetExec: file_groups={2 groups: [[...]]},
    projection=[l_orderkey, l_extendedprice, l_discount, l_shipdate], predicate=l_shipdate@10 > 9204, pruning_predicate=l_shipdate_max@0 > 9204

Describe the solution you'd like

Remove the filter when pushdown_filters is enabled and the scan already applies the predicate.

We probably need to make some changes to TableProvider / FileFormat so that the filter can be removed depending on the file format.
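A minimal sketch of the idea, not DataFusion code: the planner asks the scan how well it can apply a predicate and keeps a residual filter only when the answer is not exact. The names `PushdownSupport`, `ScanConfig`, `classify`, and `residual_filters` are hypothetical stand-ins, loosely modeled on DataFusion's `TableProviderFilterPushDown` (`Unsupported`, `Inexact`, `Exact`). Treating Parquet with pushdown_filters as exact, and every other format as unsupported, is an assumption made for illustration.

```rust
// Stand-in types for illustration only; these are NOT DataFusion's real API.

/// How well a scan can apply a predicate (hypothetical, modeled on
/// DataFusion's `TableProviderFilterPushDown`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushdownSupport {
    /// The scan cannot evaluate the predicate at all.
    Unsupported,
    /// The scan may use the predicate for pruning but can still return
    /// rows that do not match.
    Inexact,
    /// The scan guarantees that every returned row satisfies the predicate.
    Exact,
}

/// Hypothetical description of a scan: the file format it reads and
/// whether row-level filter pushdown is switched on.
#[derive(Debug, Clone, Copy)]
struct ScanConfig {
    is_parquet: bool,
    pushdown_filters: bool,
}

/// Decide the pushdown level based on the file format and the option.
/// Assumption: only Parquet with `pushdown_filters` filters rows exactly;
/// Parquet without it only prunes; other formats are simplified to
/// `Unsupported`.
fn classify(scan: ScanConfig) -> PushdownSupport {
    if !scan.is_parquet {
        PushdownSupport::Unsupported
    } else if scan.pushdown_filters {
        PushdownSupport::Exact
    } else {
        PushdownSupport::Inexact
    }
}

/// Return the predicates that must still be evaluated in a filter above
/// the scan. With exact pushdown nothing is left, so the filter (and any
/// column projected only for it) can be dropped from the plan.
fn residual_filters<'a>(scan: ScanConfig, predicates: &[&'a str]) -> Vec<&'a str> {
    match classify(scan) {
        PushdownSupport::Exact => Vec::new(),
        PushdownSupport::Inexact | PushdownSupport::Unsupported => predicates.to_vec(),
    }
}
```

Under these assumptions, calling `residual_filters` with `pushdown_filters: true` on a Parquet scan and the predicate `"l_shipdate > 9204"` returns an empty list, which corresponds to dropping the FilterExec from the query 3 plan above; with the option off, the predicate is returned unchanged and the filter stays.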

Describe alternatives you've considered

No response

Additional context

No response

@Dandandan Dandandan added enhancement New feature or request performance Make DataFusion faster labels Sep 28, 2023
@alamb
Contributor

alamb commented Sep 18, 2024

I believe this is a duplicate of #4028, which @itsjunetime completed in #12135

Tests are here https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt

Let me know if I got that incorrect

@alamb alamb closed this as completed Sep 18, 2024