Parquet: Fix row group filters with promoted types #2232

rdblue · 2021-02-09T23:58:44Z

This fixes Parquet row group filters when types have been promoted from int to long or from float to double.

The filters are passed the file schema after ids are added, which is used to convert dictionary values or lower/upper bounds. That conversion currently uses the file's types to deserialize, but the filter expression is bound to the table types. If the types differ, then comparison in the evaluator fails.
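To illustrate the kind of failure described above, here is a minimal sketch in plain Java. It is not Iceberg code: `MismatchedComparison`, `compareWithBoundType`, and `failsOnMismatch` are hypothetical names, and a `ClassCastException` is assumed to be only one plausible way the mismatch could surface. The comparator is built for the table type (`Long`), while the value deserialized from the file is still an `Integer`.

```java
import java.util.Comparator;

public class MismatchedComparison {
  // Hypothetical helper: compares a value read from the file against a filter
  // literal, using a comparator chosen for the bound (table) type.
  @SuppressWarnings("unchecked")
  static <T> int compareWithBoundType(Comparator<T> cmp, Object fileValue, T literal) {
    // The unchecked cast is erased at runtime, so a type mismatch is only
    // detected inside the comparator itself.
    return cmp.compare((T) fileValue, literal);
  }

  // Returns true if comparing an int-typed file value with a long literal fails.
  static boolean failsOnMismatch() {
    Comparator<Long> longComparator = Comparator.naturalOrder();
    Object lowerBoundFromFile = Integer.valueOf(5); // file column is still int
    try {
      compareWithBoundType(longComparator, lowerBoundFromFile, 10L);
      return false;
    } catch (ClassCastException e) {
      // Integer.compareTo cannot accept a Long argument.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("comparison failed: " + failsOnMismatch());
  }
}
```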

This updates the conversion to first deserialize the Parquet value and then promote it if the table's type has changed. Only int to long and float to double are needed because those are the only type promotions that use a different representation.
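The promote-after-deserialize step can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `PromotionSketch`, `TypeId`, and `converter` are hypothetical names, and the input to the returned function is assumed to be a value already deserialized with the file's type.

```java
import java.util.function.Function;

public class PromotionSketch {
  // Hypothetical stand-in for primitive type identifiers; not Iceberg's Type API.
  enum TypeId { INTEGER, LONG, FLOAT, DOUBLE }

  // Returns a function that takes a value deserialized with the file type and
  // promotes it when the table type uses a different representation.
  static Function<Object, Object> converter(TypeId fileType, TypeId tableType) {
    if (fileType == TypeId.INTEGER && tableType == TypeId.LONG) {
      return value -> ((Integer) value).longValue();
    }
    if (fileType == TypeId.FLOAT && tableType == TypeId.DOUBLE) {
      return value -> ((Float) value).doubleValue();
    }
    // Types match, or the promotion does not change the representation.
    return value -> value;
  }

  public static void main(String[] args) {
    Object promoted = converter(TypeId.INTEGER, TypeId.LONG).apply(5);
    System.out.println(promoted.getClass().getSimpleName() + " " + promoted);
  }
}
```

With the value promoted to the table type before it reaches the evaluator, both sides of the comparison have the same Java class.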

aokolnychyi

LGTM

aokolnychyi · 2021-02-13T04:39:29Z

This looked correct to me so I merged it. @danielcweeks, let us know if you have any comments.

Thanks, @rdblue!

danielcweeks · 2021-02-15T18:52:09Z

Thanks for reviewing, @aokolnychyi.

* Add 0.12.0 release notes pt 2
* Add more blurbs and fix formatting.
  - Add blurbs for #2565, #2583, and #2547.
  - Make formatting consistent.
* Add blurb for #2613 Hive Vectorized Reader
* Reword blurbs for #2565 and #2365
* More changes based on review comments
* More updates to the 0.12.0 release notes
* Add blurb for #2232 fix parquet row group filters
* Add blurb for #2308

Fix Parquet filters with promoted types.

3d5630c

github-actions bot added data parquet labels Feb 9, 2021

Fix compile error.

f642d5f

rdblue requested a review from danielcweeks February 10, 2021 18:51

aokolnychyi approved these changes Feb 12, 2021

aokolnychyi merged commit 5218f43 into apache:master Feb 13, 2021

rdblue added this to the Java 0.11.1 Release milestone Mar 4, 2021

rdblue mentioned this pull request Aug 17, 2021

Add 0.12.0 release notes pt 2 #2986

Merged

cwsteinbach added a commit to cwsteinbach/apache-iceberg that referenced this pull request Aug 17, 2021

Add blurb for apache#2232 fix parquet row group filters

6e0625d

lrvingzhou-tx mentioned this pull request Sep 7, 2021

Parquet: Fix row group filters with promoted types #2232 BKBASE-Plugin/iceberg#1

Closed

lrvingzhou-tx pushed a commit to BKBASE-Plugin/iceberg that referenced this pull request Sep 7, 2021

Parquet: Fix row group filters with promoted types apache#2232

47ce741
