Pushdown dereferences in ORC Reader #3445
Merged
martint merged 3 commits into trinodb:master on Apr 23, 2020
martint reviewed on Apr 21, 2020
presto-orc/src/main/java/io/prestosql/orc/reader/StructColumnReader.java (outdated, resolved)
This PR adds projection and predicate pushdown of dereference expressions in the ORC Reader.

(1) OrcPageSource commits to providing base columns for the required projected columns and expects HivePageSource to adapt. Information about the projected columns is still propagated to the OrcReader using "ProjectedLayout". StructColumnReader returns null blocks for fields that are not in the layout. The pruning works only for a chain of dereferences; Map and List readers don't use this information right now.
(2) Predicate pushdown is implemented by using domains on all projected columns instead of just the base column. No other changes are required, because the stats used for predicate-based filtering are always read for all columns anyway.
(3) Added a test for reading dereferenced fields in the case of null rows, which was missing from the first pass of #1720.
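
To make points (1) and (2) concrete, here is a minimal, self-contained sketch of the two ideas. It is not the code from this PR: `Layout`, `Range`, `read`, and `mayMatch` are hypothetical names invented for illustration, structs are modeled as plain maps rather than blocks, and domains are simplified to closed integer ranges. The first half shows a struct reader returning null for fields outside the projected layout; the second shows statistics-based filtering keyed on the dereferenced (leaf) column instead of the base column.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are the real Presto classes.
class DereferencePushdownSketch {
    // Tree of the struct fields a query needs, built from dereference chains.
    static final class Layout {
        boolean readAll; // true when this node itself is projected in full
        final Map<String, Layout> fields = new LinkedHashMap<>();

        // Registers a chain such as ["b", "c"] for the expression base.b.c.
        void add(List<String> path) {
            Layout node = this;
            for (String name : path) {
                node = node.fields.computeIfAbsent(name, key -> new Layout());
            }
            node.readAll = true;
        }
    }

    // Reads a value, replacing struct fields that are not in the layout with null.
    // Null rows and non-struct values are returned unchanged.
    static Object read(Object value, Layout layout) {
        if (layout.readAll || !(value instanceof Map)) {
            return value;
        }
        Map<String, Object> pruned = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
            Layout child = layout.fields.get(entry.getKey());
            pruned.put((String) entry.getKey(), child == null ? null : read(entry.getValue(), child));
        }
        return pruned;
    }

    // Closed interval, used both for column statistics and for predicate domains.
    static final class Range {
        final long min;
        final long max;

        Range(long min, long max) {
            this.min = min;
            this.max = max;
        }

        boolean overlaps(Range other) {
            return min <= other.max && other.min <= max;
        }
    }

    // A row group can be skipped when any constrained column's statistics
    // fall entirely outside its domain. Columns without statistics never prune.
    static boolean mayMatch(Map<String, Range> stats, Map<String, Range> domains) {
        for (Map.Entry<String, Range> domain : domains.entrySet()) {
            Range columnStats = stats.get(domain.getKey());
            if (columnStats != null && !columnStats.overlaps(domain.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("c", 2L);
        inner.put("d", 3L);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1L);
        row.put("b", inner);

        Layout layout = new Layout();
        layout.add(Arrays.asList("b", "c")); // the query only touches base.b.c
        System.out.println(read(row, layout));

        Map<String, Range> stats = new LinkedHashMap<>();
        stats.put("b.c", new Range(10, 20)); // statistics exist per nested column
        Map<String, Range> domains = new LinkedHashMap<>();
        domains.put("b.c", new Range(30, 40)); // e.g. base.b.c BETWEEN 30 AND 40
        System.out.println(mayMatch(stats, domains));
    }
}
```

In this sketch the sibling field `d` and the unrelated field `a` come back as null, while a null row passes through untouched, which is the case the test in (3) covers. Keying the domain on `b.c` rather than on the base column is what lets the statistics check reject the row group.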