Manifest file caching support for Iceberg Native Catalogs#21399
Manifest file caching support for Iceberg Native Catalogs#21399yingsu00 merged 2 commits intoprestodb:masterfrom
Conversation
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff a562c59...99cb37f.
|
e8da5db to
e197980
Compare
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergResourceFactory.java
Outdated
Show resolved
Hide resolved
e197980 to
92f7be7
Compare
There was a problem hiding this comment.
@agrawalreetika How can it make sure that a catalog A that is not supposed to be visible to a session remains invisible to that session? If another session has the access to catalog A and the cache caches it, would the first session retrieve it and see things it's not supposed to see?
There was a problem hiding this comment.
Hi @yingsu00,
-
About already used
User:<user-name>,<Principle:>,Role:as part of the Catalog cache key, I don't see this information getting passed to the Iceberg library from Presto and is maintained as a cache key on the Presto side. Also while loading the catalog as part of configs Principle & Role are not passed to Iceberg library. -
As discussed these catalog-level username & password here, These are supposed to be used for basic auth of Nessie but while looking at the Nessie client library it looks like BASIC AUTH is deprecated here
So should we remove the BASIC authentication option for Nessie from the Presto side as well?
There was a problem hiding this comment.
Taking a step back, why did we need to make this change in the first place?
There was a problem hiding this comment.
@tdcmeehan Since current key is doing below -
- On Presto side, Under IcebergResourceFactory there is one more cache present called catalogCache, it was introduced as part of initial iceberg connector shipping to Presto.
But it has key asUser:<user-name>,<Principle:>,Role:and Iceberg Catalog as Values which is causing FileIO Object to get created for every new user & catalog file because of CatalogUtil.loadCatalog() while loading cache. Current catalogCache is controlled by iceberg.catalog.cached-catalog-num config in Presto. - Now Manifest file caching has FILEIO as key for the cache, can store upto 8 FILEIO Object by default here, this will fill that cache if 8 unique user query/ catalog query comes in from Presto.
This is why I think we can change the catalog cache key to catalog name + catalog session properties so that we can reuse the same catalog object and avoid loading the Catalog for every unique user & catalog.
Please let me know if you have any questions or any other suggestions here.
There was a problem hiding this comment.
@tdcmeehan
For the catalog cache, the original key was like "User:username,Principle:principle,Role:role,ExtraCredentialKey1:key1,...", and the value was the catalog object. There were several issues here
- Since the key uniquely identify a <user, principle, role> tuple, it assumes that each tuple maps to ONE catalog. This doesn't seem correct to me. A user can have multiple Iceberg catalogs.
- Multiple users may have the same <principle, role> tuple, so the other user won't be able retrieve the catalog from the cache even though he/she has the same authentication level.
- Adding the "User:", "Principle:" string headers to the key is not good for performance. We want to only hash necessary information. The longer the string, the slower the hashing and probing will be.
Actually, the check of access to catalogs were done in SystemAccessControl, which was done on the metadata level and happened earlier than this loadCatalog(). It seems to me having the principle and role in the catalog cache is unnecessary. If a <user, principle, role> tries to access a catalog he/she is not supposed to see, SystemAccessControl would deny it before getting to here.
cc @ChunxuTang
steveburnett
left a comment
There was a problem hiding this comment.
Thank you for consistently writing documentation! This looks great, I had a few minor nits. As always, please discuss if you feel my suggestions change your intended meaning in a way that is not correct.
dd4b8fa to
e63a72a
Compare
e63a72a to
1841b80
Compare
|
Thank you @steveburnett for your review. I have made the changes. |
steveburnett
left a comment
There was a problem hiding this comment.
LGTM! (docs)
New local build of docs, everything looks good.
1841b80 to
26c6b3a
Compare
There was a problem hiding this comment.
This comment is for line ln 109
According to https://projectnessie.org/tools/client_config/, the ref hash is just a hash of the reference string and it's optional. Adding it to the cache key string is not necessary and would incur perf degradation, since the cache lookup will hash the key anyways. I suggest we remove it from the key string. cc @ChunxuTang
There was a problem hiding this comment.
ref hash is not exposed as Iceberg catalog property but it's not a hidden field from session config - https://github.com/prestodb/presto/blob/master/presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSessionProperties.java#L277
So the user can still make changes to it though as per Nessie doc, it's an optional field. If we take it out from the cache key then it will load the cache value with the default ref hash i.e null so we don't have ref hash then should we also remove it from session properties in the connector?
There was a problem hiding this comment.
Is there a case that the nessie reference name itself is not unique and we have to use the reference hash to differentiate them? Or say, is it feasible to expect that reference names are unique in practical use cases?
Using a hash in a key for hashing sounds a bit uncommon to me...
There was a problem hiding this comment.
Not really sure if we really have to expose ref.hash property from Presto side if we have reference names there. But if we think we don't really need ref.hash then I think we should take it out as option from Presto session properties as well?
There was a problem hiding this comment.
@ChunxuTang The ref hash is the hash of the reference name. If the name is not unique, the hash won't be either. Also the hash is optional according to https://projectnessie.org/tools/client_config/.
I actually don't understand why the "nessie_reference_hash" Iceberg session property was added at the first place. It does not have a corresponding config property in NessieConfig, and it is not settable and is defaulted to null. There is no hashing applied to the reference name to set the "nessie_reference_hash" property, nor there was any verification it's a valid hash value. It seems it would always be null. @nastra @ChunxuTang was there any reason it was done this way? I think we should remove it from IcebergSessionProperties.
There was a problem hiding this comment.
do we absolutely need to remove the ref hash as part of this PR? To me it seems that introducing manifest file caching and nessie changes are intertwined.
I believe the ref hash was needed in the cache, although I would have to check again why exactly (I haven't been working on Nessie lately anymore and @ajantha-bhat took that part over)
There was a problem hiding this comment.
Hi, nessie_reference_hash is optionally needed to fetch the state of table from particular time (mapping to the hash). Without hash, we always fetch the latest state. Old state can be useful for time travel queries.
But as pointed out above, If it is not exposed to the user or integrated fully. We can remove for time being and add it later when we fully want to support it.
But as @nastra suggested better to keep the current PR scope as small as possible.
There was a problem hiding this comment.
@ajantha-bhat thanks for your reply. But according to https://projectnessie.org/tools/client_config/ this ref hash is the hash of the ref name.
| nessie.ref | Mandatory | Name of the Nessie reference, usually main. |
|---|---|---|
| nessie.ref.hash | Optional | Hash on nessie.ref, usually not specified. |
What is the ref name used for? If ref name is already part of the cache key, do we still need the ref hash?
There was a problem hiding this comment.
Nessie supports catalog level branching and tagging capabilities. ref name is (branch or tag name).
If the hash is not specified, we use the latest state of the table from that branch or tag.
catalog level hash configuration is mainly used for time travel queries. Once, we support that we need this config exposed.
If it is not really causing much problem, we can even keep this code to avoid adding it back again in the future.
There was a problem hiding this comment.
@tdcmeehan
For the catalog cache, the original key was like "User:username,Principle:principle,Role:role,ExtraCredentialKey1:key1,...", and the value was the catalog object. There were several issues here
- Since the key uniquely identify a <user, principle, role> tuple, it assumes that each tuple maps to ONE catalog. This doesn't seem correct to me. A user can have multiple Iceberg catalogs.
- Multiple users may have the same <principle, role> tuple, so the other user won't be able retrieve the catalog from the cache even though he/she has the same authentication level.
- Adding the "User:", "Principle:" string headers to the key is not good for performance. We want to only hash necessary information. The longer the string, the slower the hashing and probing will be.
Actually, the check of access to catalogs were done in SystemAccessControl, which was done on the metadata level and happened earlier than this loadCatalog(). It seems to me having the principle and role in the catalog cache is unnecessary. If a <user, principle, role> tries to access a catalog he/she is not supposed to see, SystemAccessControl would deny it before getting to here.
cc @ChunxuTang
yingsu00
left a comment
There was a problem hiding this comment.
@agrawalreetika For the deprecated Nessie basic authentication type using
username+password, let's just mark @deprecated on the related properties in NessieConfig.java for now. We can remove them in the future. Please remember to update the documentation about it.
ChunxuTang
left a comment
There was a problem hiding this comment.
@agrawalreetika Thanks for your contribution! This is a super useful feature.
I just left some minor comments.
There was a problem hiding this comment.
Just a quick question (probably beyond the scope of this PR): Besides native catalogs (E.g. Hadoop and Nessie), could other catalogs support manifest file caching?
There was a problem hiding this comment.
@ChunxuTang As of now from Presto no. Since manifest file caching is supported with HadoopFileIO in Iceberg. But in Presto for Hive Catalog we have custom HdfsFileIO
Do we know why do we use this custom HdfsFileIO from Presto for Iceberg Hive Catalog?
There was a problem hiding this comment.
Suggest directing importing the constant variables. E.g.
CatalogProperties.IO_MANIFEST_CACHE_MAX_TOTAL_BYTES_DEFAULT => IO_MANIFEST_CACHE_MAX_TOTAL_BYTES_DEFAULT
There was a problem hiding this comment.
Is there a case that the nessie reference name itself is not unique and we have to use the reference hash to differentiate them? Or say, is it feasible to expect that reference names are unique in practical use cases?
Using a hash in a key for hashing sounds a bit uncommon to me...
|
Gently pinging @nastra in case you want to review the nessie-related parts. |
26c6b3a to
9c9e7d7
Compare
000a4e0 to
69018dd
Compare
There was a problem hiding this comment.
can you elaborate why this needs to be set for manifest caching?
I would have expected this to use whatever FileIO the underlying catalog is using. Otherwise there's a risk that the FileIO specified here differs from the one used by the catalog
There was a problem hiding this comment.
@nastra As I checked in Iceberg core, In case when we query the Hadoop catalog from Presto. here in HadoopCatalog.initialize() Catalog configuration as not being set via HadoopFileIO.initialize() but if fileIOImpl config is set explicaity then it creates HadoopFileIO object via CatalogUtil.loadFileIO() which is also responsible of initialising Catalog configuration (including which are being passed from Presto io.manifest.cache-enabled). That is why I had set org.apache.iceberg.hadoop.HadoopFileIO explicitely here since Hadoop & Nessie both uses HadoopFileIO.
Please let me know if there is any better way of handling this?
There was a problem hiding this comment.
hm I see and I was not aware of that limitation in the HadoopCatalog/ HiveCatalog. The initialization of FileIO should actually be like it's done in https://github.com/apache/iceberg/blob/39239a19555f55e0facc0a585086791a606a4b32/nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java#L86-L88.
@agrawalreetika this might be a good point to raise a PR in Iceberg and I'll make sure to find somebody to review it
There was a problem hiding this comment.
Exactly, NessieCatalog takes care of initializing it. So it's not an issue there. Thanks for confirming. Sure, I will raise a PR for this.
There was a problem hiding this comment.
There was a problem hiding this comment.
@agrawalreetika With apache/iceberg#9283 merged, do we still need to set iceberg.io-impl?
There was a problem hiding this comment.
@yingsu00 No after apache/iceberg#9283 and once we upgrade Iceberg in Presto, we don't have to explicitely set iceberg.io-impl for Native catalogs. But for HiveCatalog we may still need it since Presto has custom FILEIO implementation.
There was a problem hiding this comment.
Is this waiting a response from @ChunxuTang on this question? #21399 (comment)
There was a problem hiding this comment.
Is this waiting a response from @ChunxuTang on this question? #21399 (comment)
I believe not. @agrawalreetika Can you please confirm?
There was a problem hiding this comment.
No this is good to go for Native catalogs support.
One we upgrade Iceberg version on Presto which has apache/iceberg#9283 change then we will not have to set iceberg.io-impl for Native catalogs explicitely to enable Manifest file caching.
There was a problem hiding this comment.
do we absolutely need to remove the ref hash as part of this PR? To me it seems that introducing manifest file caching and nessie changes are intertwined.
I believe the ref hash was needed in the cache, although I would have to check again why exactly (I haven't been working on Nessie lately anymore and @ajantha-bhat took that part over)
There was a problem hiding this comment.
minor: this isn't really tied to manifest caching. Setting io-impl for Iceberg is optional and should be handled separately:
if (ioImpl != null) {
properties.put(FILE_IO_IMPL, ioImpl);
}
There was a problem hiding this comment.
@agrawalreetika Do you think what @nastra suggested make sense? If yes could you please update the code?
There was a problem hiding this comment.
@nastra Currently fileIOImpl is being picked up from IcebergConfig which is marked as NotNull currently. We can make this optional once we upgrade Iceberg to version which has changes from apache/iceberg#9283
There was a problem hiding this comment.
Yes, this should be optional, because it's also an optional configuration in Iceberg itself. Also I still think it FILE_IO_IMPL should be configured outside of the loadCachingProperties() method (independently from whether it's optional/required right now)
There was a problem hiding this comment.
Sure, I have taken it out from loadCachingProperties() method
There was a problem hiding this comment.
these changes don't seem to be related to introducing manifest caching. I'm not saying we shouldn't deprecate this, but it seems more appropriate to handle this in a separate PR
There was a problem hiding this comment.
Sure, will take out from this. Handling it here #21476
There was a problem hiding this comment.
Creating a separate PR is fine. @agrawalreetika Can you please do it and let me know the PR number?
steveburnett
left a comment
There was a problem hiding this comment.
Only a couple of very small suggestions. As always, if my suggestions change the meaning in a way you disagree with, let me know and we can figure out a solution.
69018dd to
f7ab536
Compare
@nastra it's not absolutely needed to remove the ref hash in this PR. We can do it in future PRs. But we'd appreciate it if you can check how the ref hash was needed in the cache. If the Nessie ref names for catalog A and B are the same, the ref hash cannot be different. How can the ref hash help in that case? if the names are different, the hash should be different and therefore not needed. The documentation also says it's optional. And it is ALWAYS null in Presto code since it's not settable and defaulted to null. How can a value that's always null required in the cache? |
f7ab536 to
beec3f6
Compare
steveburnett
left a comment
There was a problem hiding this comment.
LGTM! (docs)
Local build of docs, everything looks good. Thanks!
presto-iceberg/src/test/java/com/facebook/presto/iceberg/nessie/TestNessieMultiBranching.java
Outdated
Show resolved
Hide resolved
beec3f6 to
8abc232
Compare
nastra
left a comment
There was a problem hiding this comment.
LGTM with one comment about pulling the configuration of FileIO out of loadCachingProperties() as it isn't related to caching
There was a problem hiding this comment.
minor: I would probably still pull this out of loadCachingProperties as this makes readers of the code think that this is related to caching (which it isn't)
There was a problem hiding this comment.
@nastra As of now I have kept it to set for loadCachingProperties, since setting io-impl is required for initialization of FileIO to set catalog properties on iceberg side. And if manifest file caching is not enabled we don't have to set this explicitly.
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
8abc232 to
88d420f
Compare
88d420f to
d53a1c1
Compare
There was a problem hiding this comment.
@ajantha-bhat thanks for your reply. But according to https://projectnessie.org/tools/client_config/ this ref hash is the hash of the ref name.
| nessie.ref | Mandatory | Name of the Nessie reference, usually main. |
|---|---|---|
| nessie.ref.hash | Optional | Hash on nessie.ref, usually not specified. |
What is the ref name used for? If ref name is already part of the cache key, do we still need the ref hash?
There was a problem hiding this comment.
@agrawalreetika With apache/iceberg#9283 merged, do we still need to set iceberg.io-impl?
|
@ajantha-bhat I checked Nessie documentation and source code, and it seems the ref hash is something like the commit hash on a git branch. If there are 2 new commits on this branch, lets say branch A, then the reference to the first commit would be (name="A", hash=[hash_commit_1]), and the reference to the latest commit is (name="A", hash=[hash_commit_2]) or just (name="A"). Is my understanding correct? So the "Hash on nessie.ref" is not "Hash OF the nessie.ref name", but the hash of one of the commits on the reference branch. I mistakenly thought it was "Hash OF the nessie.ref name" before. If my current understanding is correct, then we should not remove the ref hash property, but instead, we shall make it settable and validate the user's input. cc @agrawalreetika |
|
@yingsu00: Yes. Reference (Branch/tag) will have name and hash. If hash is not specified, Nessie consider the head commit (latest hash). |
d53a1c1 to
7fc9f1d
Compare
7fc9f1d to
99cb37f
Compare

Description
This PR introduces the below addition to Iceberg Connector:
User:<user-name>,<Principle:>,Role:, which would load cache entry i.e. catalog CatalogUtil.loadCatalog() for every new user & catalog file while loading cache.Adding support for - #20820
Motivation and Context
Starting from Iceberg version 1.1.0, Apache Iceberg provides a mechanism to cache the contents of Iceberg manifest files in memory. This manifest caching feature helps to reduce repeated reads of small Iceberg manifest files from remote storage.
Reference: apache/iceberg#4518
Impact
No Impact. Default behaviour, Disabled. Feature can be enabled via catalog configuration changes.
Test Plan
Contributor checklist
Release Notes