HIVE-29376:Using partition spec in DESC FORMATTED sql is unsupported … #6259
…for Iceberg table
| return null; | ||
| } | ||
|
|
||
| private Partition getPartition(Table tab, Map<String, String> partitionSpec) throws HiveException { |
The logic here is almost the same as getPartition() in DescTableOperation.java: https://github.com/apache/hive/pull/6259/changes#diff-641c62b42b01bff41c89a3b3661c15d6c08fce0e48740347f36fe32448984147R131
Check whether you can move both into a utility so the logic can be reused.
Thanks for checking. I moved this logic out of DescTableOperation.java and DescTableAnalyzer.java into HiveIcebergStorageHandler.java:
https://github.com/apache/hive/pull/6259/changes#diff-93864ecf035fe51b92185015da842a56837cea89064813de39c278c6f8fed03cR2079
Please take a look.
soumyakanti3578
left a comment
DDLUtils.isIcebergTable() has been used in many places in the compiler. I think we should not use a specific table type here.
| } | ||
|
|
||
| private Partition getPartition(Table tab, Map<String, String> partitionSpec) throws HiveException { | ||
| boolean isIcebergTable = DDLUtils.isIcebergTable(tab); |
Please don't use DDLUtils.isIcebergTable here, as this API is too specific for the compiler. Instead, please use tab.isNonNative(), in conjunction with other APIs if needed.
Thanks for checking. I tried to make the changes generic to non-native tables instead of Iceberg-centric. Please take a look.
@ramitg254, consider combining it with Table.hasNonNativePartitionSupport
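A minimal sketch of the combined check suggested in this thread. TableLike and StubTable are hypothetical stand-ins for Hive's Table so that the snippet compiles on its own; only the method names isNonNative and hasNonNativePartitionSupport come from the comments above, and their exact signatures in Hive are assumed here.

```java
public class NonNativePartitionCheck {

    /** Hypothetical stand-in for Hive's Table; models only the two checks named in the review. */
    public interface TableLike {
        boolean isNonNative();
        boolean hasNonNativePartitionSupport();
    }

    /** Fixed-value implementation used purely for illustration. */
    public static final class StubTable implements TableLike {
        private final boolean nonNative;
        private final boolean partitionSupport;

        public StubTable(boolean nonNative, boolean partitionSupport) {
            this.nonNative = nonNative;
            this.partitionSupport = partitionSupport;
        }

        @Override
        public boolean isNonNative() {
            return nonNative;
        }

        @Override
        public boolean hasNonNativePartitionSupport() {
            return partitionSupport;
        }
    }

    /**
     * Assumed routing decision: delegate partition handling to the storage handler
     * only for non-native tables that also report partition support.
     */
    public static boolean usesStorageHandlerPartitions(TableLike tab) {
        return tab.isNonNative() && tab.hasNonNativePartitionSupport();
    }

    public static void main(String[] args) {
        System.out.println(usesStorageHandlerPartitions(new StubTable(true, true)));
        System.out.println(usesStorageHandlerPartitions(new StubTable(true, false)));
    }
}
```

Keeping the condition in one helper avoids naming a specific table format in the compiler, which is the concern raised about DDLUtils.isIcebergTable.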
| try { | ||
| part = db.getPartition(tab, partitionSpec, false); | ||
| isPartitionPresent = table.isNonNative() ? | ||
| table.getStorageHandler().isPartitionPresent(table, partitionSpec) : |
Can we keep the original code
part = db.getPartition(tab, partitionSpec)
and return null in StorageHandler.getPartition when the partition doesn't exist?
Actually, I intentionally dropped the original code there. getPartition is called in two places, once in the analyzer and once in the operation, and this check uses two lists obtained via getPartitions and getPartitionKeys. If I move the check into getPartition, it will run twice.
In the current state we run it only once, in the analyzer phase, and retrieve the partition directly in the operation phase, which I think is more efficient.
Sorry, I didn't catch the idea. Please see the comment in #6259 (comment).
done, kept the original one
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/formatter/TextDescTableFormatter.java
Outdated
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java
ql/src/java/org/apache/hadoop/hive/ql/metadata/DummyPartition.java
Outdated
| return false; | ||
| } | ||
|
|
||
| default boolean isPartitionPresent(org.apache.hadoop.hive.ql.metadata.Table table, |
drop this
Actually, I added it because of these considerations:
- getPartition in Iceberg returns a dummy partition for the given partSpec without checking whether that partition already exists. This behavior is also relied on elsewhere, for example by insert queries on Iceberg tables, which is why I added this separate method.
- I also thought of adding this method to Hive.java instead of the storage handler, but that would be too specific. If a newer storage handler is added in the future, it may need a different implementation to check whether a partition is present.
"getPartition in iceberg returns dummy partition corresponding to that partSpec without checking it is already present or not"
Exactly. I think we need to change that API and also drop the RewritePolicy policy arg (move Context.RewritePolicy.get(conf) inside the StorageHandler).
getPartition() should return a dummy partition for the provided partSpec only if that partition exists; otherwise it is confusing.
Please check whether, at the moment, this method is only used by Iceberg compaction. In that scenario we could skip the validation (i.e. check for SessionState.get().isCompaction()).
cc @difin
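A minimal sketch of the contract proposed above: return a dummy partition only when the spec matches an existing partition, and skip that validation in the compaction case. All names here (PartitionLookupSketch, DummyPart, the existingSpecs list, the skipValidation flag) are hypothetical and only illustrate the idea; this is not the Hive or Iceberg storage handler API, and the flag merely stands in for a check such as SessionState.get().isCompaction().

```java
import java.util.List;
import java.util.Map;

public class PartitionLookupSketch {

    /** Hypothetical stand-in for a dummy partition; it carries only the partition spec. */
    public static final class DummyPart {
        public final Map<String, String> spec;

        public DummyPart(Map<String, String> spec) {
            this.spec = spec;
        }
    }

    /**
     * Returns a dummy partition for the spec if it matches one of the existing specs,
     * or unconditionally when skipValidation is true (the assumed compaction case).
     * Returns null when the partition does not exist.
     */
    public static DummyPart getPartition(List<Map<String, String>> existingSpecs,
                                         Map<String, String> spec,
                                         boolean skipValidation) {
        if (skipValidation || existingSpecs.contains(spec)) {
            return new DummyPart(spec);
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map<String, String>> existing = List.of(Map.of("c", "1"), Map.of("c", "2"));

        // Existing partition: a dummy partition is returned.
        System.out.println(getPartition(existing, Map.of("c", "1"), false) != null);

        // Missing partition: null, so the caller can report that it does not exist.
        System.out.println(getPartition(existing, Map.of("c", "9"), false) == null);
    }
}
```

With this shape the caller needs no separate isPartitionPresent method, since a null result already signals a missing partition.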
@deniskuzZ I updated the getPartition API implementation as suggested, and the tests pass without affecting any insert or compaction queries. Please have a look.
| @@ -0,0 +1,4 @@ | |||
| drop table if exists ice_t; | |||
please combine it with the desc_ice_tbl_part_spec.q
Can you please explain what you mean by combining?
please keep both test cases in 1 qfile
…for Iceberg table
What changes were proposed in this pull request?
Adds support for using a partition spec with the DESCRIBE statement for Iceberg tables.
Also updates the other test outputs, since partition information is now printed for DESC statements after these changes.
Why are the changes needed?
Currently, using a partition spec with the DESCRIBE statement on an Iceberg table results in an "unsupported" exception.
Does this PR introduce any user-facing change?
Yes, this statement no longer results in an exception.
How was this patch tested?
Built locally, ran the CI tests, and added q tests.