-
Notifications
You must be signed in to change notification settings - Fork 3k
Parquet: Fix map projection after map to key_value rename #3309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
RussellSpitzer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Is there a non painful way we can add a test for this? Seems like manually testing it requires loading up a Parquet Version which defines map types with slightly different names?
|
We can add a test for |
|
Added the missing tests. I'll merge this when tests are passing. |
| .addField(Types.primitive(PrimitiveTypeName.DOUBLE, Type.Repetition.REQUIRED).id(5).named("y")) | ||
| .addField(Types.primitive(PrimitiveTypeName.DOUBLE, Type.Repetition.REQUIRED).id(6).named("z")) | ||
| .id(3) | ||
| .named("value")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite the type declaration. Love it.
kbendick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @rdblue.
| .addField(Types.primitive(PrimitiveTypeName.DOUBLE, Type.Repetition.REQUIRED).id(5).named("y")) | ||
| .addField(Types.primitive(PrimitiveTypeName.DOUBLE, Type.Repetition.REQUIRED).id(6).named("z")) | ||
| .id(3) | ||
| .named("value")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite the type declaration. Love it.
| } | ||
|
|
||
| @Test | ||
| public void testListElementName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe testListElementDoesNotAssumeName?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I were making another change I'd probably update this, but I think it's pretty minor and I'd like to get this in to make it possible to release 0.12.1.
These are great tests and this is great work. Thank you! |
This fixes the Parquet map projection bug introduced by apache/parquet-java#798
The projection code in Iceberg would create map projections by using the Parquet
Types.mapbuilder. But, the type created by this builder changed by renaming the key-value pair,maptokey_value, so the projection was no longer valid for Parquet files. As a result, Parquet would not project themapcolumn and loading it would fail with an error like this:The solution is to copy the map structure and ensure that the names are preserved rather than generated.
Closes #2962.