Add option to disable filesystem caching of /_delta_log/ directory #23408
raunaqmorarka merged 1 commit into trinodb:master from sdaberdaku:master
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
@cla-bot check
The cla-bot has been summoned, and re-checked this pull request!
jkylling left a comment:
Thank you for the PR! It would be good if @raunaqmorarka and @wendigo could take a look as well.
.../src/main/java/io/trino/plugin/deltalake/cache/MutableDeltaLogDeltaLakeCacheKeyProvider.java (outdated, resolved)
.../test/java/io/trino/plugin/deltalake/cache/TestMutableDeltaLogDeltaLakeCacheKeyProvider.java (outdated, resolved)
Since the delta lake support
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeConfig.java (outdated, resolved; 3 threads)
.../src/main/java/io/trino/plugin/deltalake/cache/MutableDeltaLogDeltaLakeCacheKeyProvider.java (outdated, resolved)
.../test/java/io/trino/plugin/deltalake/cache/TestMutableDeltaLogDeltaLakeCacheKeyProvider.java (outdated, resolved)
@raunaqmorarka Overwriting the table with some Spark operations, or externally deleting the Delta table (i.e. manually deleting its files on the object storage) and recreating it, makes the delta log mutable.
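To make the failure mode concrete, here is a minimal, self-contained sketch of how a cache keyed only by file path can serve a stale commit once a table is deleted and recreated. It is not Trino code: the class `StaleDeltaLogDemo`, the in-memory `storage` and `cache` maps, and the bucket path are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleDeltaLogDemo {
    // Hypothetical stand-in for object storage: path -> file content.
    static final Map<String, String> storage = new HashMap<>();
    // Hypothetical stand-in for a filesystem cache keyed only by path.
    static final Map<String, String> cache = new HashMap<>();

    // Read through the cache; only the first read of a path hits storage.
    static String cachedRead(String path) {
        return cache.computeIfAbsent(path, storage::get);
    }

    public static void main(String[] args) {
        String commit = "s3://bucket/table/_delta_log/00000000000000000000.json";

        // The table is created and its first commit is read (and cached).
        storage.put(commit, "schema: (id INT)");
        System.out.println("first read:  " + cachedRead(commit));

        // The table is deleted externally and recreated, so the same
        // commit file name now holds different content.
        storage.remove(commit);
        storage.put(commit, "schema: (id INT, name STRING)");

        // The path-keyed cache still returns the old commit.
        System.out.println("cached read: " + cachedRead(commit));
        System.out.println("storage has: " + storage.get(commit));
    }
}
```

The sketch only shows why the first commit file of a recreated table is a problem for a cache that assumes files under a given path never change; a real cache may also key on other attributes.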
Dear @jkylling, @raunaqmorarka, and @Praveen2112, I implemented the suggested changes. Best, Sebastian PS: For some reason, the verification/cla-signed check is still failing although I submitted the CLA on Friday. Maybe it needs more time to get processed.
Could you clarify which Spark operations we would need to run to encounter this situation?
The CLA is processed manually, so it will take some time.
It has happened to me when running: With Spark 3.5.1 and Delta 3.1.0.
@cla-bot check
The cla-bot has been summoned, and re-checked this pull request!
Hello @jkylling, @Praveen2112, @raunaqmorarka, and @wendigo! My CLA has finally been processed! Best, S.
.../io/trino/plugin/deltalake/TestDeltaLakeAlluxioCacheFileOperationsMutableTransactionLog.java (outdated, resolved)
...rino-delta-lake/src/main/java/io/trino/plugin/deltalake/cache/DeltaLakeCacheKeyProvider.java (outdated, resolved)
Hello @jkylling and @raunaqmorarka, I don't know if you guys had a chance to review the minimal test I wrote. Thanks again for your support! Best,
raunaqmorarka left a comment:
minor comment about test, lgtm otherwise
Description
Added a configuration option to disable caching of files with /_delta_log/ in their path, to avoid issues with Delta tables whose commits are mutable. This is useful when Delta tables are deleted and re-created: the files inside the _delta_log folder can no longer be considered immutable, and are therefore unsafe to cache.
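As a rough illustration of the behavior described above, the sketch below returns no cache key for any path containing /_delta_log/ when transaction log caching is switched off, so such files would always be read from storage. It is a simplified stand-in rather than the code merged in this PR: the class name `DeltaLogAwareCacheKeys`, the `cacheTransactionLog` flag, and the use of plain `String` paths are assumptions made for the example.

```java
import java.util.Optional;

final class DeltaLogAwareCacheKeys {
    // Hypothetical flag: false means the option to disable
    // transaction log caching has been turned on.
    private final boolean cacheTransactionLog;

    DeltaLogAwareCacheKeys(boolean cacheTransactionLog) {
        this.cacheTransactionLog = cacheTransactionLog;
    }

    // An empty result means "do not cache this file".
    Optional<String> cacheKey(String path) {
        if (!cacheTransactionLog && path.contains("/_delta_log/")) {
            return Optional.empty();
        }
        // Assumption for this sketch: the path itself serves as the key.
        return Optional.of(path);
    }

    public static void main(String[] args) {
        DeltaLogAwareCacheKeys keys = new DeltaLogAwareCacheKeys(false);
        // Transaction log file: bypasses the cache.
        System.out.println(keys.cacheKey("s3://bucket/table/_delta_log/00000000000000000000.json"));
        // Data file: still cached under its path.
        System.out.println(keys.cacheKey("s3://bucket/table/part-00000.parquet"));
    }
}
```

Matching on the /_delta_log/ path segment keeps data files cacheable while the transaction log is always read fresh, which is the trade-off the description argues for when tables may be dropped and recreated in place.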
Additional context and related issues
Fixes #21451
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: