-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Hive: Expose default partition spec and sort order in HMS. #4546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| private void setPartitionSpec(TableMetadata metadata, Map<String, String> parameters) { | ||
| parameters.remove(TableProperties.DEFAULT_PARTITION_SPEC); | ||
| if (metadata.spec() != null && metadata.spec().isPartitioned()) { | ||
| parameters.put(TableProperties.DEFAULT_PARTITION_SPEC, PartitionSpecParser.toJson(metadata.spec())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PartitionSpecParser.toJson() doesn't convert a source column id to its name. Here is an example of a partition spec json string.
{"spec-id":0,"fields":[{"name":"data_bucket","transform":"bucket[16]","source-id":2,"field-id":1000}]}
It only shows the source id(2), the column name is data as the following code shows. It'd be more user-friendly if we show the column name.
Schema schema = new Schema(
required(1, "id", Types.IntegerType.get(), "unique ID"),
required(2, "data", Types.StringType.get())
);
PartitionSpec spec = PartitionSpec.builderFor(schema)
.bucket("data", 16)
.build();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked a bit more. The current solution is fine since the partition builder will automatically add a name based on the source column name. For example, data_bucket is the name for the bucket transform on the column data. Moreover, the string generated by PartitionSpecParser.toJson() is consistent with the one in the metadata.json file. Let's keep it in that sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we need to store schema on HMS side to make sense of the ids? Otherwise today we are depending on the naming convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Schema has been stored in HMS,
iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Line 247 in 116026b
| tbl.setSd(storageDescriptor(metadata, hiveEngineEnabled)); // set to pickup any schema changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added source-name in the spec json string. It will be like this
{"spec-id":1,"fields":[{"name":"data_bucket_16","transform":"bucket[16]","source-id":2,"source-name":"data","field-id":1000}]}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Chat offline)
Im thinking instead of changing it, its cleaner to store schema object in separate field, its more accurate then Hive schema and a bit less hacky than changing the partition-spec format to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new property current-schema to avoid adding column name into partition spec and sort order. It duplicates the schema with Hive format, but we think it's worth to do that. The Hive format loses important column info(id).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea also i think Hive columns may not have exact Iceberg schema in some cases. For instance, the type conversions are not exact 1-to-1 mapping.
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java
Outdated
Show resolved
Hide resolved
|
|
||
| private void setPartitionSpec(TableMetadata metadata, Map<String, String> parameters) { | ||
| parameters.remove(TableProperties.DEFAULT_PARTITION_SPEC); | ||
| if (metadata.spec() != null && metadata.spec().isPartitioned()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] In v1 spec when a partition is dropped it is replaced by VoidTransform, if all the transforms are void we should consider it un-partitioned (This may be beyond the scope of present PR), but presently when we call isPartitioned it will return true in this case. do we want to store partition spec in this scenario ? Your thoughts.
This is based on ticket #3014 @RussellSpitzer filed a while back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @singhpk234 for pointing out. #3059 is trying to fix #3014, and it is almost ready to merge. It should be fine in that case.
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java
Outdated
Show resolved
Hide resolved
| * <p> | ||
| * This reserved property is used to store the default partition spec. | ||
| */ | ||
| public static final String DEFAULT_PARTITION_SPEC = "default-partition-spec"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be helpful to clarify what is meant by default for both of these.
Each javadoc states that it’s for the “default” spec / sort order, but I’m still not entirely sure what that means.
Is it the current sort order / partition spec that would be used if the user doesn’t override it for an individual query ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like “JSON representation of the table’s current configured partition spec, which will be used if not overridden for individual writes”. Kind of wordy but something along those lines would be helpful for me if quickly looking through the JavaDocs etc. Will leave that decision to you though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is for the current partition spec and sort order. Make sense to me. Will make the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you reverted too many? Should be 'current'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to be consistent with the name in metadata.json, which is default-partition-spec. The same for sort order. I changed the comments though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I see, yea I always find it confusing.
In the comment, maybe we can add that they are equivalent, otherwise the comment is even more confusing:
Reserved table property for the JSON representation of current (default) schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is confusing. I like the current more. I keep the original name just for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree , once we change the word default to current, it looks ok to me.
One concern I was thinking in general even from #4456 , whether all this JSON parsing increase the commit times which can be sensitive for Streaming applications (I guess having that flag to control partition length is good in #4456 as user can set it to 0 if so, but it doesn't help these ones). At some point should measure the commit time perf for large tables. Or just have a follow up pr for a flag to disable all these stats persistence altogether if its a concern, as right now there's no way to opt-out, wdyt?
| private void setPartitionSpec(TableMetadata metadata, Map<String, String> parameters) { | ||
| parameters.remove(TableProperties.DEFAULT_PARTITION_SPEC); | ||
| if (metadata.spec() != null && metadata.spec().isPartitioned()) { | ||
| parameters.put(TableProperties.DEFAULT_PARTITION_SPEC, PartitionSpecParser.toJson(metadata.spec())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we need to store schema on HMS side to make sense of the ids? Otherwise today we are depending on the naming convention.
Out of all table properties synced to HMS, schema, snapshot summary, partition spec will more likely impact the commit perf. We may think of a general way to control that. I'd suggest to handle that in another PR. |
Yea I was mentioning, a follow-up PR is fine, or if someone has any other idea about this |
0e857b8 to
9bf2a0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better , thanks @flyrain , few comments on the code side.
As discussed offline and as I understand, these changes are for users of HiveCatalog who are interested in getting a better understanding of Iceberg metadata directly from HIve API. This could be done by RestCatalog functionalities once that has more adoptions.
| * <p> | ||
| * This reserved property is used to store the default partition spec. | ||
| */ | ||
| public static final String DEFAULT_PARTITION_SPEC = "default-partition-spec"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you reverted too many? Should be 'current'.
|
|
||
| private void setSchema(TableMetadata metadata, Map<String, String> parameters) { | ||
| parameters.remove(TableProperties.CURRENT_SCHEMA); | ||
| if (metadata.schema() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets' have common method and re-use to set various fields like
private void setField(TableMetadata metadata, Map<String, String> parameters, String key, String value) {
parameters.remove(key);
if (value.length <= maxHiveTablepropertySize) {
parameters.put(key, value);
} else {
LOG.warn("Not exposing {} in HMS since it exceeds {} characters", maxHiveTablePropertySize);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the change.
szehon-ho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Maybe we can add a test, about schema being too big to serialize, like snapshotSummary test
|
FYI @singhpk234 if you want to take another look ? |
singhpk234
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as well :) !!!
Thanks @flyrain
| private void setField(Map<String, String> parameters, String key, String value) { | ||
| if (value.length() <= maxHiveTablePropertySize) { | ||
| parameters.put(key, value); | ||
| } else { | ||
| LOG.warn("Not exposing {} in HMS since it exceeds {} characters", key, maxHiveTablePropertySize); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] should we make this :
iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Lines 425 to 430 in 449a743
| if (summary.length() <= maxHiveTablePropertySize) { | |
| parameters.put(TableProperties.CURRENT_SNAPSHOT_SUMMARY, summary); | |
| } else { | |
| LOG.warn("Not exposing the current snapshot({}) summary in HMS since it exceeds {} characters", | |
| currentSnapshot.snapshotId(), maxHiveTablePropertySize); | |
| } |
also use this setter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was trying to do that, but the warn message needs snapshot id here. But it requires changes for method setFiled() like the below, and changes for all other callers. It's like removing one duplication, but adding a few complication. I'd suggest to keep it as is.
setField(Map<String, String> parameters, String key, String value, String warnMessage)
Made the change |
szehon-ho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @flyrain i had a few more comments , can you see if they make sense.
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| private void setField(Map<String, String> parameters, String key, String value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One performance suggestion, if the user sets to 0 (disable this feature), we can skip the serialization for performance.
Maybe , easiest, we can we add some boolean function like exposeInHmsProperties() that checks if value is 0, and use it in all the methods? (open to better names)
if (exposeInHmsProperties() && metadata.sortOrder() != null && metadata.sortOrder().isSorted()) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting it to 0 can control some of them(snapshot summary, schema, partition spec, and sort order), but not all of them. I'd suggest to create another PR to make sure all are taken care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, but could we at least take care of the ones in this PR? ( schema, partition spec, sort order)? We can have a follow up for the other ones not touched by this PR.
Just didn't want to leave it in a state where we are wasting CPU cycle (JSON serialization) needlessly if the user turns off this feature. As this is done in the critical commit block, unlike the original serialization which happens before. The other HMS table properties to me are also less CPU intensive as they are just getting a field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, made the change.
| import org.apache.hadoop.hive.metastore.api.SerDeInfo; | ||
| import org.apache.hadoop.hive.metastore.api.StorageDescriptor; | ||
| import org.apache.hadoop.hive.metastore.api.Table; | ||
| import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two other suggestion for this class: can we add in comment of "HIVE_TABLE_PROPERTY_MAX_SIZE" , one more sentence to let user know how to turn off feature?
// set to 0 to not expose Iceberg metadata in HMS Table properties
And also, a precondition in HiveTableOperations constructor to check if value is non-negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #4546 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the comment. Negative is fine, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think probably better to disallow negative as it makes little sense? But to me its ok either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would throwing exception be too much in that case? May just log a warning.
| } | ||
|
|
||
| @Test | ||
| public void testNotExposeTableProperties() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new test added for method exposeInHmsProperties()
| public void testNotExposeTableProperties() { | ||
| Configuration conf = new Configuration(); | ||
| conf.set("iceberg.hive.table-property-max-size", "0"); | ||
| HiveTableOperations spyOps = spy(new HiveTableOperations(conf, null, null, catalog.name(), DB_NAME, "tbl")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Unnecessary spy
szehon-ho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @flyrain looks good to me now, will merge tomorrow unless others have some review comment.
|
Merged, thanks @flyrain for contribution and @singhpk234 for additional review |
|
Thanks @szehon-ho for the detailed review and merge! Thanks @kbendick and @singhpk234 for the review. |
|
Maybe a little late in the game (I was on a longer PTO), and only partially related - since this is a Hive 4 feature -, but we did some work to display the partitioning information in the |
|
Thanks @pvary. We want to support the use case that doesn't need storage access. Does HIVE-25326 need to access the metadata.json file to show partition spec? |
|
@flyrain: Yes, it uses the HiveIcebergStorageHandler.getPartitionTransformSpec which in turn uses IcebergTableUtil.getTable to load the table and read the spec from the table snapshot. When we were working on creating Hive tables, I have started working on the Hive table creation we used the |
|
Got it. It should not. This PR is using a different name than |
(cherry picked from commit ded9a4d)
cc @aokolnychyi @RussellSpitzer @szehon-ho @karuppayya @anuragmantri @kbendick @singhpk234