Handle missing StorageDescriptor in Hive Glue tables#17655
Handle missing StorageDescriptor in Hive Glue tables#17655beinan merged 3 commits intoprestodb:masterfrom
Conversation
|
@ChunxuTang could you take a look and review this? |
1 similar comment
|
@ChunxuTang could you take a look and review this? |
|
@prithvip Sure! Glad to help. Will take a look today. |
ChunxuTang
left a comment
There was a problem hiding this comment.
Hi @v-jizhang, thanks a lot for your nice work!
I just left some minor suggestions. Feel free to have a look.
Additionally, as the PR was sent around two weeks ago, when you push new changes, could you also merge the current presto master branch to avoid potential merge conflicts? Thanks!
There was a problem hiding this comment.
Any there any specific reasons that you want to keep these functions in the FileHiveMetastore class? Looks like they are some helper functions. In that case, we can put them into the HiveUtil class.
There was a problem hiding this comment.
HiveUtil is in presto-hive module and if put them there, to avoid circular dependency, they cannot be referenced in presto-hive-metastore.
There was a problem hiding this comment.
I think most changes in this file could be moved to HiveUtil, as the changes are mostly related to some helper functions.
There was a problem hiding this comment.
HiveUtil is in presto-hive and because presto-hive depends on presto-hive-metastore, presto-hive-metastore can't use anything in HiveUtil
There was a problem hiding this comment.
Gotcha. Thanks for your clarification!
In that case, maybe we could move the utility functions to MetastoreUtil? Most added functions here look to be utility functions that are independent of FileHiveMetastore. What do you think?
There was a problem hiding this comment.
Is this Transaction transaction = newTransaction() necessary? Looks like it's not used.
There was a problem hiding this comment.
You're right. It's not used.
|
@ChunxuTang @rohanpednekar: There is fix trinodb/trino#12130. I'd like to port it in this PR also. Thanks |
ChunxuTang
left a comment
There was a problem hiding this comment.
Hi @v-jizhang, thanks a lot for your work!
Sorry for my late reply. Let some minor suggestions~
BTW, I noticed that trinodb/trino#12130 also contains two new functions testIcebergTableNonNullStorageDescriptor and testDeltaTableNonNullStorageDescriptor in the TestGlueToTrinoConverter. Would you mind also cherry-picking them here?
There was a problem hiding this comment.
Gotcha. Thanks for your clarification!
In that case, maybe we could move the utility functions to MetastoreUtil? Most added functions here look to be utility functions that are independent of FileHiveMetastore. What do you think?
478e040 to
f7832f8
Compare
@ChunxuTang Refactored utility functions and added tests. Thanks. |
ChunxuTang
left a comment
There was a problem hiding this comment.
@v-jizhang LGTM! Thanks for your work! BTW, I noticed that there's a testing failure. Could you re-run the tests to pass all of them?
Hi @beinan, would you like to also take a quick look at this PR?
Cherry-pick of trinodb/trino#11092 Fixes prestodb#17643 Co-authored-by: Alex <jo.alex2144@gmail.com>
Cherry-pick of trinodb/trino#12130 Some table loaders (e.g. Iceberg FlinkSink) will update/add Glue storage descriptors even while writing iceberg and delta format tables. Although these storage descriptors are not technically needed by delta/iceberg, it is not incompatible. Instead of assuming that a Glue storage descriptor excludes delta/iceberg usage, we should instead check directly for the table_type metadata flag to determine whether or not a table is delta/iceberg. Co-authored-by: Eric Hwang <eric.hwang@starburstdata.com>
Cherry-pick of trinodb/trino#12164 Delta Lake connector expects that the metastore storage descriptor be fully translated if it exists. It only needs to return a dummy storage value if no storage descriptor has been configured. Co-authored-by: Eric Hwang <eric.hwang@starburstdata.com>
Cherry-pick of trinodb/trino#11092
Fixes #17643
Co-authored-by: Alex jo.alex2144@gmail.com
Test plan - Test added