Skip to content

Use Id based mapping for nested fields in Iceberg Connector#6520

Merged
phd3 merged 4 commits intotrinodb:masterfrom
phd3:iceberg-use-id-nested
Feb 8, 2021
Merged

Use Id based mapping for nested fields in Iceberg Connector#6520
phd3 merged 4 commits intotrinodb:masterfrom
phd3:iceberg-use-id-nested

Conversation

@phd3
Copy link
Copy Markdown
Member

@phd3 phd3 commented Jan 6, 2021

So far we've been using nested field names to map nested fields to ORC nested columns. This PR uses Iceberg IDs instead, when available in ORC file.

The name-based mapping logic for nested fields was using names from the fields in RowType. We could manipulate the RowType before sending it to ORC reader, but it'd be a bit hacky since we'd end up constructing dummy column names to avoid matching. this PR instead uses a "field mapper" that can be provided to ORC reader by hive/iceberg connector.

Currently the PR only implements the support for ORC.

@phd3 phd3 added the WIP label Jan 6, 2021
@cla-bot cla-bot bot added the cla-signed label Jan 6, 2021
@phd3 phd3 force-pushed the iceberg-use-id-nested branch from 2678018 to a069118 Compare January 6, 2021 18:52
@phd3 phd3 removed the WIP label Jan 6, 2021
@phd3 phd3 requested review from electrum and lxynov January 6, 2021 23:51
Copy link
Copy Markdown
Member

@lxynov lxynov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great modulo small style comments. Clean solution by introducing FieldMappers! Thanks for figuring it out.

@phd3 phd3 force-pushed the iceberg-use-id-nested branch from a069118 to 8f55f78 Compare January 21, 2021 01:17
@phd3 phd3 force-pushed the iceberg-use-id-nested branch from 8f55f78 to b652fe7 Compare January 21, 2021 01:22
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Jan 28, 2021

@lxynov AC

@phd3 phd3 merged commit d07e221 into trinodb:master Feb 8, 2021
@phd3 phd3 added this to the 352 milestone Feb 8, 2021
@phd3 phd3 mentioned this pull request Feb 8, 2021
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants