Parquet predicate improvements #17217

Merged
zhenxiao merged 2 commits into prestodb:master from zhenxiao:parquet-predicate
Jan 25, 2022
@zhenxiao zhenxiao commented Jan 24, 2022

Test plan - TestTupleDomainParquetPredicate

== RELEASE NOTES ==

Hive Changes
* Defunct config: hive.parquet.fail-on-corrupted-statistics. Queries now always fail on corrupted Parquet statistics.

@zhenxiao zhenxiao changed the title Parquet predicate Parquet predicate improvements Jan 24, 2022
@zhenxiao zhenxiao requested review from beinan and vkorukanti January 24, 2022 13:00
@vkorukanti vkorukanti left a comment

Minor comments. LGTM.

checkArgument(!rangeList.isEmpty(), "cannot use empty rangeList")
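
For readers unfamiliar with the pattern, the guard suggested above could be sketched roughly as follows. This is only an illustration, not the code from this PR: the class `RangeListGuard`, the `span` helper, and the `long[]` pair representation of a range are all hypothetical, and `checkArgument` here is a local stand-in that mimics Guava's `Preconditions.checkArgument` so the example has no dependencies.

```java
import java.util.List;

// Hypothetical illustration of the suggested precondition; not the PR's code.
final class RangeListGuard
{
    private RangeListGuard() {}

    // Local stand-in for Guava's Preconditions.checkArgument(boolean, Object).
    static void checkArgument(boolean expression, String message)
    {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    // Returns the overall {min, max} covered by a list of {low, high} ranges.
    // Assumes each array has exactly two elements with low <= high.
    static long[] span(List<long[]> rangeList)
    {
        // Fail fast: an empty list has no meaningful min or max.
        checkArgument(!rangeList.isEmpty(), "cannot use empty rangeList");

        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long[] range : rangeList) {
            min = Math.min(min, range[0]);
            max = Math.max(max, range[1]);
        }
        return new long[] {min, max};
    }
}
```

Without the guard, an empty list in this sketch would silently return an inverted span (`Long.MAX_VALUE`, `Long.MIN_VALUE`); failing fast with a clear message is presumably the motivation behind the suggestion.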

A doc comment would help clarify what is expected of the arguments.

Are there any existing tests that exercise the TupleDomainParquetPredicate.getDomain(..) code path and pass multiple ranges to it?

zhenxiao and others added 2 commits January 25, 2022 13:24
Cherry-pick of trinodb/trino@fce0521

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
@zhenxiao zhenxiao merged commit 7c3db27 into prestodb:master Jan 25, 2022
@neeradsomanchi neeradsomanchi mentioned this pull request Feb 8, 2022
4 tasks
@harryson497

Why remove this corrupted check?

fgwang7w commented Sep 27, 2022

So what's the workaround when a SQL query encounters corrupted Parquet statistics, now that hive.parquet.fail-on-corrupted-statistics is no longer available to bypass the failure?
