Use Id based mapping for nested fields in Iceberg Connector#6520
Merged
phd3 merged 4 commits into trinodb:master from Feb 8, 2021
2678018 to
a069118
Compare
lxynov
approved these changes
Jan 16, 2021
Member
lxynov
left a comment
Looks great modulo small style comments. Clean solution by introducing FieldMappers! Thanks for figuring it out.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ColumnIdentity.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergColumnHandle.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ColumnIdentity.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java
a069118 to
8f55f78
Compare
8f55f78 to
b652fe7
Compare
Member
Author
@lxynov AC
electrum
approved these changes
Feb 8, 2021
So far we've been using nested field names to map nested fields to ORC nested columns. This PR uses Iceberg IDs instead, when they are available in the ORC file.
The name-based mapping logic for nested fields was using names from the fields in RowType. We could manipulate the RowType before sending it to the ORC reader, but it'd be a bit hacky since we'd end up constructing dummy column names to avoid matching. This PR instead uses a "field mapper" that the Hive or Iceberg connector can provide to the ORC reader.
Currently the PR implements this support only for ORC.
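To illustrate why ID-based mapping matters for nested fields, here is a minimal, self-contained sketch. It is not the code from this PR and does not use the Trino or ORC reader APIs: `FieldMapperSketch`, `FileColumn`, `RequestedField`, `FieldMapper`, `BY_NAME`, `BY_ID`, and `chooseMapper` are all hypothetical names made up for the example. It assumes a struct written with fields `a` (ID 1) and `b` (ID 2), after which the table renamed `a` to `x` and added a new field `a` with ID 3.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class FieldMapperSketch {
    // Hypothetical stand-in for a nested column found in a data file.
    static final class FileColumn {
        final String name;
        final Optional<Integer> icebergId; // empty when the file carries no ID

        FileColumn(String name, Optional<Integer> icebergId) {
            this.name = name;
            this.icebergId = icebergId;
        }
    }

    // Hypothetical stand-in for a nested field the engine wants to read.
    static final class RequestedField {
        final String name;
        final int id;

        RequestedField(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // The pluggable piece: decides which file column backs a requested field.
    interface FieldMapper {
        Optional<FileColumn> find(List<FileColumn> fileColumns, RequestedField field);
    }

    // Name-based mapping: match on the field name only.
    static final FieldMapper BY_NAME = (columns, field) ->
            columns.stream().filter(c -> c.name.equals(field.name)).findFirst();

    // ID-based mapping: match on the Iceberg field ID stored with the file column.
    static final FieldMapper BY_ID = (columns, field) ->
            columns.stream()
                    .filter(c -> c.icebergId.isPresent() && c.icebergId.get() == field.id)
                    .findFirst();

    // Assumption for this sketch: use IDs only if every file column has one.
    static FieldMapper chooseMapper(List<FileColumn> columns) {
        return columns.stream().allMatch(c -> c.icebergId.isPresent()) ? BY_ID : BY_NAME;
    }

    // Nested columns as they were written, before the rename and the added field.
    static List<FileColumn> sampleFileColumns() {
        return Arrays.asList(
                new FileColumn("a", Optional.of(1)),
                new FileColumn("b", Optional.of(2)));
    }

    static String describe(Optional<FileColumn> match) {
        return match.map(c -> "file column '" + c.name + "'").orElse("no match");
    }

    public static void main(String[] args) {
        List<FileColumn> columns = sampleFileColumns();
        RequestedField renamed = new RequestedField("x", 1); // was 'a' when written
        RequestedField added = new RequestedField("a", 3);   // did not exist when written

        System.out.println("x by name: " + describe(BY_NAME.find(columns, renamed)));
        System.out.println("x by id:   " + describe(BY_ID.find(columns, renamed)));
        System.out.println("a by name: " + describe(BY_NAME.find(columns, added)));
        System.out.println("a by id:   " + describe(BY_ID.find(columns, added)));
    }
}
```

In this sketch, name matching misses the renamed field and binds the newly added `a` to the old column that happens to share its name, while ID matching resolves the renamed field and correctly reports that the new field has no data in the file. Passing the mapper in as a parameter also avoids rewriting the requested type with dummy names just to defeat name matching.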