@gustavoatt
Contributor

Summary

Fixes #1138 (comment) by making sure that we can convert from org.apache.parquet.schema.MessageType to org.apache.iceberg.types.Type.

The issue occurred because Iceberg could not map the Parquet int96 type to an Iceberg timestamp. This PR also adds an end-to-end test to verify that the imported Parquet file can be read through the Iceberg API.

While doing this, I noticed that the Iceberg generic Parquet reader was reading the timestamp as a timestamp without time zone instead of a timestamp with time zone.
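
For context, here is a minimal sketch of how an int96 timestamp value can be decoded, assuming the common legacy layout used by writers such as Spark and Impala: 12 little-endian bytes, with 8 bytes of nanoseconds-of-day followed by a 4-byte Julian day number. The class and method names (`Int96Timestamps`, `int96ToMicros`) are hypothetical and chosen for illustration; this is not the code added in this PR.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of the Iceberg API.
class Int96Timestamps {
  // Julian day number of 1970-01-01, the Unix epoch.
  private static final long UNIX_EPOCH_JULIAN_DAY = 2_440_588L;
  private static final long MICROS_PER_DAY = 86_400_000_000L;

  // Converts a 12-byte int96 value to microseconds since the Unix epoch.
  // Assumes the legacy layout: nanos-of-day (8 bytes), then Julian day (4 bytes),
  // both little-endian. Sub-microsecond precision is truncated.
  static long int96ToMicros(byte[] bytes) {
    if (bytes.length != 12) {
      throw new IllegalArgumentException("int96 value must be exactly 12 bytes");
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    int julianDay = buf.getInt();
    return (julianDay - UNIX_EPOCH_JULIAN_DAY) * MICROS_PER_DAY + nanosOfDay / 1_000L;
  }

  public static void main(String[] args) {
    // Encode 1970-01-02 plus one millisecond in the assumed int96 layout.
    ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
    buf.putLong(1_000_000L); // nanoseconds into the day
    buf.putInt(2_440_589);   // Julian day for 1970-01-02
    long micros = int96ToMicros(buf.array());
    System.out.println(Instant.EPOCH.plus(micros, ChronoUnit.MICROS));
  }
}
```

The value carries no zone information of its own. Writers that use this layout typically store UTC-normalized instants, which is why reading it as a timestamp with time zone is the closer match.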

Testing

Improved our current unit test to cover this case.

Reviewers

to: @rdblue
cc: @thesquelched @aokolnychyi

@rdblue rdblue self-requested a review August 12, 2020 00:48
@rdblue
Contributor

rdblue commented Aug 12, 2020

+1 when tests are passing.

@rdblue rdblue merged commit 17b5ca5 into apache:master Aug 12, 2020
@rdblue
Contributor

rdblue commented Aug 12, 2020

Merged! Thanks @gustavoatt!

@gustavoatt
Contributor Author

Thanks for the fast review Ryan!

@thesquelched

Thanks!

@gustavoatt gustavoatt deleted the gustavoatt--int96-timestamps-generic-read branch August 12, 2020 17:04
aokolnychyi pushed a commit to aokolnychyi/iceberg that referenced this pull request Aug 18, 2020
cmathiesen pushed a commit to ExpediaGroup/iceberg that referenced this pull request Aug 19, 2020

Development

Successfully merging this pull request may close these issues.

int96 support in parquet

3 participants