-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Flush transaction log cache in Delta flush_metadata_cache procedure
#16466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
flush_metadata_cache procedure
b5015ef to
aa7008d
Compare
...duct-tests/src/main/java/io/trino/tests/product/deltalake/TestDeltaLakeActiveFilesCache.java
Outdated
Show resolved
Hide resolved
...duct-tests/src/main/java/io/trino/tests/product/deltalake/TestDeltaLakeActiveFilesCache.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/trino/plugin/deltalake/metastore/HiveMetastoreBackedDeltaLakeMetastore.java
Outdated
Show resolved
Hide resolved
...in/trino-hive/src/main/java/io/trino/plugin/hive/metastore/DecoratedHiveMetastoreModule.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this procedure installed for iceberg ?
tbh, I didn't know that it existed in iceberg already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have the method always available in the connector (register it on HiveProcedureModule) and throw an exception (as your implementation does for delta) in case that the caching metastore is not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added another commit to avoid installing the procedure in Iceberg.
I'd rather have the method always available in the connector
What's the rationale or motivation of this? My opinion is basically the opposite (no need to install a procedure when it's unusable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale or motivation of this?
I was thinking more from a user perspective where a user may be confused that the procedure is not available (e.g. : while using AWS Glue metastore) for the hive/delta connector even though is advertised to be available in the documentation.
...ino-delta-lake/src/main/java/io/trino/plugin/deltalake/procedure/FlushMetadataProcedure.java
Outdated
Show resolved
Hide resolved
aa7008d to
8e50366
Compare
0e7d6ac to
266aec9
Compare
...ino-delta-lake/src/main/java/io/trino/plugin/deltalake/procedure/FlushMetadataProcedure.java
Outdated
Show resolved
Hide resolved
...ino-delta-lake/src/main/java/io/trino/plugin/deltalake/procedure/FlushMetadataProcedure.java
Outdated
Show resolved
Hide resolved
266aec9 to
5a09dfb
Compare
...elta-lake/src/main/java/io/trino/plugin/deltalake/procedure/FlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/main/java/io/trino/plugin/deltalake/procedure/FlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
...a-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeFlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
...a-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeFlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
5a09dfb to
e08137e
Compare
...in/trino-hive/src/main/java/io/trino/plugin/hive/metastore/DecoratedHiveMetastoreModule.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the flush procedure should be installed whenever caching is supported
what about changing the parameter name and logic.
for example, we could skip binding CachingHiveMetastoreConfig for iceberg (and maybe some others)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the flush procedure should be installed whenever caching is supported
This makes it hard to exclude Hive's flush_metadata_cache from Delta Lake connector. I initially tried to exclude the procedure from DeltaLakeConnector#getProcedures, but Procedure class doesn't contain catalog or connector information (MethodHandle doesn't work nicely to compare objects).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to try to delegate to Hive's flush procedure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to delegate to the connector at first, but the code was a little redundant. Let me merge as-is. I will take another look later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a product test?
maybe file: -based tests are enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileHiveMetastore has the additional cache with listTablesCache #13115. It requires a refactoring to disable the cache (the duration is hard-coded now) or wait until the internal cache is expired.
Additionally, replace toUnmodifiableList with toImmutableList.
7892030 to
118e67d
Compare
|
Rebased on master to resolve conflicts. |
c6ae7f8 to
5bb3280
Compare
The procedure was unusable because Iceberg connector always disables caching metastore.
Co-Authored-By: Marius Grama <[email protected]>
5bb3280 to
3f88633
Compare
Description
Relates to #13737
Release notes
(x) Release notes are required, with the following suggested text: