Skip to content

Parquet Dictionary Predicate Pushdown Fixes#13594

Merged
zhenxiao merged 2 commits intoprestodb:masterfrom
pettyjamesm:parquet-predicate-fix
Oct 24, 2019
Merged

Parquet Dictionary Predicate Pushdown Fixes#13594
zhenxiao merged 2 commits intoprestodb:masterfrom
pettyjamesm:parquet-predicate-fix

Conversation

@pettyjamesm
Copy link
Contributor

Port of trinodb/trino#1846.

Parquet dictionary pushdown was refactored in #6892 to remove a nested loop iteration but accidentally left the inner loop break statement behind. This meant that dictionary predicate pushdown would read at most 1 dictionary.

In addition to fixing the pushdown behavior, this PR adds support for checking the dictionary pushdown on each column skipping additional dictionary reads once the block can already be filtered.

== RELEASE NOTES ==
Hive Changes
* Fix parquet predicate pushdown on dictionaries to consider more than just the first predicate column
* Improve parquet predicate pushdown on dictionaries to avoid reading additional data after successfully eliminating a block 

Commit 0f7982b refactored ParquetPredicateUtils.getDictionaries from
getDictionariesByColumnOrdinal, removing a nested loop iteration but
accidentally leaving in a break statement. The effect has been that
at most 1 dictionary was returned from getDictionaries, limiting the
effectiveness of predicate pushdown on dictionaries.
No more parquet dictionaries need to be read once a dictionary
predicate pushdown check succeeds.
@zhenxiao zhenxiao self-requested a review October 23, 2019 21:15
@zhenxiao zhenxiao self-assigned this Oct 23, 2019
@tooptoop4
Copy link

@pettyjamesm does this fix #13457 ?

@pettyjamesm
Copy link
Contributor Author

@pettyjamesm does this fix #13457 ?

Unlikely, the query referenced in that issue has no predicates and the bug it does fix was introduced in 0.174

Copy link
Collaborator

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch, @pettyjamesm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants