Hive: Convert the CREATE TABLE ... PARTITIONED BY to Iceberg identity partitions #1917
rdblue merged 4 commits into apache:master on Dec 18, 2020
marton-bod
reviewed
Dec 12, 2020
mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandler.java
added 2 commits
December 12, 2020 18:01
Collaborator
@rdblue Could you please take a look at this as well, whenever suitable? :) Thank you!
rdblue
reviewed
Dec 17, 2020
      return SchemaParser.fromJson(properties.getProperty(InputFormatConfig.TABLE_SCHEMA));
    } else {
      return HiveSchemaUtil.convert(hmsTable.getSd().getCols());
      if (hmsTable.isSetPartitionKeys() && !hmsTable.getPartitionKeys().isEmpty()) {
Contributor
When would isSetPartitionKeys() be true and getPartitionKeys() empty?
Contributor
Author
Theoretically it could be set to an empty list. Checked this way to keep on the safe side.
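The guard discussed above can be illustrated with a minimal sketch. `FakeTable` is a hypothetical stand-in for the Thrift-generated metastore `Table` class, modeling only the two accessors from the diff, and it assumes that `isSetPartitionKeys()` reports whether the list reference is non-null. Under that assumption a table can have its partition keys "set" to an empty list, which is the case the second half of the condition covers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionKeyGuard {
  // Hypothetical stand-in for the metastore Table class (not the real API).
  // Assumption: "set" means the list reference is non-null.
  static class FakeTable {
    private List<String> partitionKeys; // null means "not set"

    void setPartitionKeys(List<String> keys) {
      this.partitionKeys = keys;
    }

    boolean isSetPartitionKeys() {
      return partitionKeys != null;
    }

    List<String> getPartitionKeys() {
      return partitionKeys;
    }
  }

  // Same shape as the condition in the diff: true only for a set, non-empty list.
  static boolean hasPartitionKeys(FakeTable table) {
    return table.isSetPartitionKeys() && !table.getPartitionKeys().isEmpty();
  }

  public static void main(String[] args) {
    FakeTable table = new FakeTable();
    System.out.println(hasPartitionKeys(table)); // never set

    table.setPartitionKeys(new ArrayList<>());
    System.out.println(hasPartitionKeys(table)); // set, but empty

    table.setPartitionKeys(Arrays.asList("order_date"));
    System.out.println(hasPartitionKeys(table)); // set and non-empty
  }
}
```

Checking `isSetPartitionKeys()` first also avoids calling `isEmpty()` on a null list in this model.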
rdblue
reviewed
Dec 17, 2020
  test {
    // testJoinTables / testScanTable
    maxHeapSize '1500m'
    maxHeapSize '2500m'
Contributor
Why was this needed? Additional tasks because of partitioning?
Contributor
Author
Yeah. The extra tasks eat more memory :(
rdblue
approved these changes
Dec 17, 2020
Contributor
Thanks @marton-bod for reviewing, and @pvary for working on this! It looks good to me. I noted a few nits to fix, but I'm also fine merging this if you don't have time to fix them. I'll wait a day or two and then merge if I don't hear back.
pvary
pushed a commit
to pvary/iceberg
that referenced
this pull request
Jan 5, 2021
rdblue
pushed a commit
that referenced
this pull request
Jan 5, 2021
XuQianJin-Stars
pushed a commit
to XuQianJin-Stars/iceberg
that referenced
this pull request
Mar 22, 2021
After consulting with the field folks, I was convinced that it would be beneficial to have a first version of this conversion in place for creating partitioned Iceberg tables from Hive. They suggested that, even in this limited form, the feature can boost adoption by letting users try out partitioned Iceberg tables without changing their existing SQL commands.
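As a rough illustration of the conversion described above, the following sketch models it with plain Java collections rather than the real Iceberg classes. `Column`, `Converted`, and `convert` are hypothetical names, and the sketch assumes that Hive partition columns are appended after the regular columns and that each one becomes an identity partition field. In Iceberg itself the spec would presumably be built with something like `PartitionSpec.builderFor(schema).identity(name)`, which is not used here so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IdentityPartitionSketch {
  // Hypothetical minimal column: a name plus a Hive type string.
  static final class Column {
    final String name;
    final String type;

    Column(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  // Hypothetical result: the full column list and the identity-partitioned column names.
  static final class Converted {
    final List<Column> schemaColumns;
    final List<String> identityPartitionFields;

    Converted(List<Column> schemaColumns, List<String> identityPartitionFields) {
      this.schemaColumns = schemaColumns;
      this.identityPartitionFields = identityPartitionFields;
    }
  }

  // Assumption: partition keys are appended to the schema in declaration order,
  // and each one is partitioned by its own value (identity).
  static Converted convert(List<Column> dataColumns, List<Column> partitionKeys) {
    List<Column> schemaColumns = new ArrayList<>(dataColumns);
    List<String> identityFields = new ArrayList<>();
    if (partitionKeys != null) {
      for (Column key : partitionKeys) {
        schemaColumns.add(key);
        identityFields.add(key.name);
      }
    }
    return new Converted(schemaColumns, identityFields);
  }

  public static void main(String[] args) {
    // Models: CREATE TABLE orders (id BIGINT, amount DOUBLE) PARTITIONED BY (order_date STRING)
    Converted converted = convert(
        Arrays.asList(new Column("id", "bigint"), new Column("amount", "double")),
        Arrays.asList(new Column("order_date", "string")));
    System.out.println(converted.schemaColumns.size() + " columns, identity partitions on "
        + converted.identityPartitionFields);
  }
}
```

The point of the model is that the Hive DDL keeps its usual shape: the partition columns declared in `PARTITIONED BY` end up both in the table schema and in the partition spec.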