Nested predicate push down to Parquet Reader #7045

Closed
zhenxiao wants to merge 9 commits into prestodb:master from zhenxiao:parquet-nested-predicate

@zhenxiao
Collaborator

Built on top of #6892.

Currently the Parquet TupleDomain is constructed from HiveColumnHandle. This does not work when nested predicates are pushed down, e.g.
select s.a from t where s.b > 10

In this implementation:
Analyze s.b and put s.b > 10 into ExtractionResult as an optional nested predicate
Add the nested predicate to TableLayout
Pass the nested predicate to the file scan, the same way as the flat predicate
Skip reading row groups when the nested predicate does not match the Parquet statistics
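
The last step can be illustrated with a minimal sketch. It is not the code from this PR: `NestedPredicateSketch`, `LongRange`, `GreaterThan`, and `mightMatch` are hypothetical names invented for illustration, and the statistics are modeled as a plain map keyed by dotted path (such as `s.b`) rather than by top-level column handle. The idea is that a row group can be skipped when its min/max statistics for the nested field prove that no row can satisfy the predicate.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: these types are hypothetical stand-ins,
// not Presto or Parquet classes.
final class NestedPredicateSketch {
    // Min/max statistics for one leaf field within a single row group.
    static final class LongRange {
        final long min;
        final long max;

        LongRange(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    // A "path > value" predicate addressed by dotted path (e.g. "s.b")
    // instead of by top-level column.
    static final class GreaterThan {
        final String path;
        final long value;

        GreaterThan(String path, long value) {
            this.path = path;
            this.value = value;
        }
    }

    // Returns false only when the statistics prove that no row in the
    // row group can satisfy the predicate; otherwise the group is read.
    static boolean mightMatch(GreaterThan predicate, Map<String, LongRange> rowGroupStats) {
        LongRange range = rowGroupStats.get(predicate.path);
        if (range == null) {
            // No statistics for this field: be conservative and read the group.
            return true;
        }
        return range.max > predicate.value;
    }

    public static void main(String[] args) {
        GreaterThan predicate = new GreaterThan("s.b", 10);

        Map<String, LongRange> lowGroup = new HashMap<>();
        lowGroup.put("s.b", new LongRange(1, 10));

        Map<String, LongRange> highGroup = new HashMap<>();
        highGroup.put("s.b", new LongRange(5, 42));

        System.out.println("read low group: " + mightMatch(predicate, lowGroup));
        System.out.println("read high group: " + mightMatch(predicate, highGroup));
    }
}
```

In this sketch the first row group is skipped because its maximum for s.b is 10, which cannot satisfy s.b > 10, while the second is read. A row group with no statistics for the field is always read.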

@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch 3 times, most recently from 6afa71e to c073b6b Compare January 18, 2017 01:43
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch 2 times, most recently from ffe3335 to 47dfe1c Compare January 25, 2017 14:04
@zhenxiao
Collaborator Author

@dain @nezihyigitbasi @martint any comments or suggestions?

@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from a80cd4f to 6334ef9 Compare March 1, 2017 00:04
@zhenxiao
Collaborator Author

zhenxiao commented Mar 1, 2017

@martint @dain @nezihyigitbasi comments or suggestions?

@dain
Contributor

dain commented Mar 10, 2017

In an offline discussion we decided that @zhenxiao would investigate using synthetic virtual columns in the connector to enable this push-down feature.

@dain dain removed their request for review March 10, 2017 18:42
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 6334ef9 to 6422c0a Compare July 7, 2017 07:31
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 6422c0a to 3aef172 Compare March 11, 2019 21:14
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 3aef172 to 0f34d07 Compare March 11, 2019 22:52
@mbasmanova
Contributor

I assume #13271 supersedes this one, hence closing.

@mbasmanova mbasmanova closed this Aug 27, 2019
@zhenxiao zhenxiao deleted the parquet-nested-predicate branch January 22, 2022 15:33

5 participants