Improve Delta lake caching of metadata #17516
Closed
Description
Currently, when a new commit is made to a Delta table, the cached metadata entry of the most up-to-date TableSnapshot is invalidated. This means that the metadata entry must be re-read from the checkpoint and from the commits made after the checkpoint (please see c0d0937). This is unnecessary work: we always read the new commits anyway, so we could instead reconcile the cached metadata entry with any metadata entries loaded from the new commits.
This PR modifies the TransactionLogTail to keep track of any metadata entries it may contain. In the TableSnapshot, the cached metadata entry is reconciled with any metadata entry of the TransactionLogTail.
This fixes the second part of #17406.
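The reconciliation idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `MetadataEntry`, `Commit`, `latestMetadata`, and `reconcile` are hypothetical stand-ins, and the real TransactionLogTail and TableSnapshot classes carry considerably more state. The sketch assumes the new commits are supplied in ascending version order, so the last metadata entry seen is the current one.

```java
import java.util.List;
import java.util.Optional;

public class MetadataReconcileSketch {
    // Hypothetical, simplified stand-in for a Delta metadata entry.
    record MetadataEntry(String id, String schemaString) {}

    // Hypothetical stand-in for one commit in the log tail; most commits
    // carry no metadata entry at all.
    record Commit(long version, Optional<MetadataEntry> metadata) {}

    // Scan the new commits (assumed to be in ascending version order) and
    // remember the most recent metadata entry, if any commit contains one.
    static Optional<MetadataEntry> latestMetadata(List<Commit> newCommits) {
        Optional<MetadataEntry> latest = Optional.empty();
        for (Commit commit : newCommits) {
            if (commit.metadata().isPresent()) {
                latest = commit.metadata();
            }
        }
        return latest;
    }

    // Keep the cached entry unless the new commits replaced it, so no
    // checkpoint read is needed in either case.
    static MetadataEntry reconcile(MetadataEntry cached, List<Commit> newCommits) {
        return latestMetadata(newCommits).orElse(cached);
    }
}
```

With this shape, a commit that only adds data files leaves the cached entry untouched, while a commit that changes the table metadata replaces it.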
Similar code can be added to cache the protocol entry, to avoid any reads from the checkpoint files except when new checkpoints are made. I'd be happy to contribute this as well.
Additional context and related issues
Please see #17406.
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: