Improve call performance for sync_partition_metadata utility#18384
Improve call performance for sync_partition_metadata utility#18384yingsu00 merged 3 commits intoprestodb:masterfrom
sync_partition_metadata utility#18384Conversation
agrawalreetika
left a comment
There was a problem hiding this comment.
Thank you for working on this.
The Benchmark number looks promising, this would be a good improvement for the sync-partition procedure.
I have taken the first pass on PR and added my comments.
These attached Benchmark results are from glue or HMS? Should we do it for both?
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
60d9881 to
946716a
Compare
imjalpreet
left a comment
There was a problem hiding this comment.
@fgwang7w Thank you for the optimisation. I did the first pass and I have a question regarding one change added in SyncPartitionMetadataProcedure.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
946716a to
b2177f9
Compare
imjalpreet
left a comment
There was a problem hiding this comment.
I have one more suggestion, please let me know what are your views on the same.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
b2177f9 to
de12106
Compare
sure, will update the release notes. many thanks for help to review. |
|
@tdcmeehan @prestodb/committers Hi Presto committers, this PR is reviewed internally already, can we have a second round review? |
7fe6a7c to
ef1376a
Compare
ef1376a to
3eec66a
Compare
3eec66a to
3fecbdc
Compare
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
d1d4e54 to
12d7d8c
Compare
d1c911f to
9513fd6
Compare
yingsu00
left a comment
There was a problem hiding this comment.
@fgwang7w George, great improvement! In addition to the comments in code, there are several other nits:
- Commit title for 1 and 3 are too long
- Commit title first letter shall be upper case
- Can you use full sentences in the PR message?
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
ceb14b8 to
98ec731
Compare
sync_partition_metadata utility
|
Thank you @yingsu00 for review. All comments are addressed, could you please give another pass? Thanks! |
98ec731 to
2021afc
Compare
|
@prestodb/committers ping! |
a0caa9a to
b9a5dc3
Compare
There was a problem hiding this comment.
@fgwang7w Mostly good, just some nits.
In the PR message and commit messages:
without take into account
-> without taking into account
hms
-> HMS(Hive Meta Store)
Case 2: Sync partition utility lists file status of each partition twice. In case of handling a large of number of partitions for an external table, redundant check file status is a performance bottleneck.
Case 2: Sync partition utility lists file status of each partition twice, causing performance bottleneck when processing an external table with a large of number of partitions.
Commit 1 & 3 titles are too long.
Use upper case for the first letter in each sentence, including all commit titles, commit/PR messages, newly added comments.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
Current implementation for getting partition names takes partition values directly without take into account of the actual path. According to the [hive source code](shorturl.at/evT28), this logic may change in the future. This commit updates the partition values with the relative path acquired from the metastore APIs.
Reference to [HMS Repair Utility](shorturl.at/eHIXY), the size for adding partitions can be in a range between 1 and 2,147,483,647. For [Glue metastore](shorturl.at/pqDH9), it’s bounded by 100 for write access.
This PR fixes following issues:
== RELEASE NOTES ==
Detail troubleshoot and solution design can be found in this reference page here