
Add config for the caching duration of Delta table active data files #13316

Merged
ebyhr merged 2 commits into trinodb:master from findinpath:delta-live-files-cache-ttl on Jul 27, 2022

Contributor

@findinpath findinpath commented Jul 22, 2022

Description

Add config delta.metadata.live-files.cache-ttl for the caching duration of Delta table active data files.
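
For illustration, a minimal sketch of how the property might be set in a Delta Lake catalog properties file. The `30m` value is the default listed in this PR's documentation change; the file name `etc/catalog/delta.properties` is an assumption, and the other properties a catalog needs are omitted.

```properties
# etc/catalog/delta.properties (hypothetical file name; other catalog properties omitted)
# Caching duration for the active data files of Delta Lake tables.
delta.metadata.live-files.cache-ttl=30m
```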

Is this change a fix, improvement, new feature, refactoring, or other?

Improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Delta Lake connector

How would you describe this change to a non-technical end user or system administrator?

Avoid indefinitely caching the active data files of Delta Lake tables that have already been used in SQL queries.

Related issues, pull requests, and links

Had this setting existed, issue #13181 would probably have occurred far less often.

Documentation

( ) No documentation is needed.
(x) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Delta Lake
* Add config for the caching duration of Delta table active data files

@cla-bot cla-bot bot added the cla-signed label Jul 22, 2022
@findinpath findinpath requested review from ebyhr, findepi and homar July 22, 2022 15:36
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch from 91a8103 to 95783c0 Compare July 22, 2022 15:37
@github-actions github-actions bot added the docs label Jul 22, 2022
@findinpath findinpath marked this pull request as draft July 22, 2022 16:03
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch from 95783c0 to f4aeba0 Compare July 22, 2022 16:33
@findinpath findinpath marked this pull request as ready for review July 22, 2022 16:33
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch 2 times, most recently from 7393dbb to 05c4ecb Compare July 25, 2022 03:53
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch 2 times, most recently from 99549a0 to 5cd0292 Compare July 26, 2022 09:20
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch from 5cd0292 to 7ba6c68 Compare July 26, 2022 10:43
@findinpath findinpath force-pushed the delta-live-files-cache-ttl branch from 7ba6c68 to 50c643d Compare July 27, 2022 07:29
   * - ``delta.metadata.live-files.cache-ttl``
     - Caching duration for active files which correspond to the Delta Lake
       tables.
     - ``30m``
Contributor Author

I'm not sure about what the default value should be.
Feedback is welcome.

Member

I don't have a strong opinion here. 30m looks fine.

cc @alexjo2144 @claudiusli

Member

Yeah that seems reasonable

@ebyhr ebyhr merged commit 0729d78 into trinodb:master Jul 27, 2022
Member

ebyhr commented Jul 27, 2022

Merged, thanks!

@github-actions github-actions bot added this to the 392 milestone Jul 27, 2022
@ebyhr ebyhr mentioned this pull request Jul 27, 2022