Use data size for delta metadata cache#24432

Merged
raunaqmorarka merged 2 commits into master from delta-cache-accounting on Dec 13, 2024

@raunaqmorarka
Member

Description

Reduces chances of coordinator OOM by accounting
for retained size of objects in delta metadata cache.
TTL can be higher because the cached metadata is immutable
and the space occupied by it in memory is accounted for.
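
To illustrate what accounting for retained size means, here is a minimal sketch of a cache bounded by the total size of its values rather than by entry count. The class `SizeBoundedCache`, its methods, and the caller-supplied size function are hypothetical names chosen for illustration; this is not the code changed in this PR, and the connector's actual cache may be built quite differently.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical illustration only: a least-recently-used cache whose limit is
// expressed in retained bytes instead of number of entries.
final class SizeBoundedCache<K, V> {
    private final long maxRetainedBytes;
    // Caller-supplied estimate of how many bytes a cached value retains (assumption).
    private final ToLongFunction<V> sizeOf;
    // accessOrder=true keeps the least recently used entry first in iteration order.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long retainedBytes;

    SizeBoundedCache(long maxRetainedBytes, ToLongFunction<V> sizeOf) {
        this.maxRetainedBytes = maxRetainedBytes;
        this.sizeOf = sizeOf;
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        V previous = entries.put(key, value);
        if (previous != null) {
            retainedBytes -= sizeOf.applyAsLong(previous);
        }
        retainedBytes += sizeOf.applyAsLong(value);

        // Evict least recently used entries until the byte budget is respected.
        // A value larger than the whole budget ends up evicting itself as well.
        Iterator<Map.Entry<K, V>> iterator = entries.entrySet().iterator();
        while (retainedBytes > maxRetainedBytes && iterator.hasNext()) {
            Map.Entry<K, V> eldest = iterator.next();
            retainedBytes -= sizeOf.applyAsLong(eldest.getValue());
            iterator.remove();
        }
    }

    synchronized long retainedBytes() {
        return retainedBytes;
    }

    synchronized int entryCount() {
        return entries.size();
    }
}
```

With a count-based limit, a handful of very large entries can exhaust the heap while the cache still looks nearly empty. With a byte-based limit, the same entries push out older ones once the budget is reached.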

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Delta Lake
* The configuration property `delta.metadata.cache-size` has been replaced by `delta.metadata.cache-max-retained-size`, which makes it possible to control the memory usage of the Delta Lake table metadata cache. By default, up to 5% of the coordinator's JVM max heap may be used for caching Delta Lake table metadata. ({issue}`issuenumber`)
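
To get a feel for what the 5% default means on a given coordinator, the following sketch computes that share of the JVM max heap. The class `MetadataCacheBudget` and its method are hypothetical helpers for illustration. Using `Runtime.getRuntime().maxMemory()` as the heap figure is an assumption and is not necessarily how the connector derives its default.

```java
// Hypothetical helper, for illustration only.
final class MetadataCacheBudget {
    private MetadataCacheBudget() {}

    // Returns the given whole-number percentage of a heap size, rounded down.
    static long percentOfHeap(long maxHeapBytes, int percent) {
        return Math.multiplyExact(maxHeapBytes, (long) percent) / 100;
    }

    public static void main(String[] args) {
        // Max heap of the JVM running this program (assumed stand-in for the coordinator's heap).
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap bytes: " + maxHeapBytes);
        System.out.println("5% budget bytes: " + percentOfHeap(maxHeapBytes, 5));
    }
}
```

For example, a coordinator with a 16 GiB max heap would allow roughly 819 MiB for the metadata cache under the default.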

@cla-bot cla-bot bot added the cla-signed label Dec 10, 2024
@raunaqmorarka raunaqmorarka requested a review from ebyhr December 10, 2024 12:28
@github-actions github-actions bot added docs delta-lake Delta Lake connector labels Dec 10, 2024
@raunaqmorarka raunaqmorarka merged commit af3a6f3 into master Dec 13, 2024
@raunaqmorarka raunaqmorarka deleted the delta-cache-accounting branch December 13, 2024 09:11
@github-actions github-actions bot added this to the 468 milestone Dec 13, 2024
@mosabua
Member

mosabua commented Dec 16, 2024

Just in general and with regards to the release notes.. this is a breaking change since you are removing a property.

@raunaqmorarka
Member Author

> Just in general and with regards to the release notes.. this is a breaking change since you are removing a property.

Yes, it is intentionally a breaking change. We need users who were overriding the previous property to make a decision about how much coordinator memory they want to consume for the metadata cache. The old approach could lead to potentially unbounded memory usage due to large transaction log JSON files.

@mosabua
Member

mosabua commented Dec 17, 2024

Great.. please check my reworded commit message in the release notes PR ..
