Skip to content

Comments

Fix to avoid retrying the metadata loading once a FileNotFoundException is thrown, when trying to drop Iceberg tables, whose metadata is deleted from its S3 storage#23510

Merged
hantangwangd merged 1 commit intoprestodb:masterfrom
dmariamgeorge:fix_drop_lag
Aug 31, 2024

Conversation

@dmariamgeorge
Copy link
Contributor

@dmariamgeorge dmariamgeorge commented Aug 23, 2024

Description

According to the current behaviour, when a table's metadata is deleted from S3, it repeatedly tries to access the metadata file, leading to multiple retries and delays in dropping the table. with this fix, upon encountering a FileNotFoundException, it immediately halts retries and proceeds with the drop operation, completing the task in a fraction of the time.

Motivation and Context

Fixes issue: #23511

Impact

Users will experience significantly faster drop operations for Iceberg tables whose metadata has already been deleted from S3 storage. Instead of encountering long delays due to unnecessary retries, it will now recognize when the metadata is missing and complete the drop operation more quickly.

Test Plan

  1. Connected Presto to an S3 Minio storage through the Iceberg connector.
  2. Created a schema, and within the schema created a table.
  3. Now, deleted the metadata and the bucket.
  4. Tried dropping the associated table.
  5. The table drop operation got completed quickly.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Fix to reduce drop time for Iceberg tables with deleted metadata in S3 storage. :pr:`23510`

@tdcmeehan
Copy link
Contributor

Nice change @dmariamgeorge . Can you please try to create a test case for this?

@dmariamgeorge dmariamgeorge force-pushed the fix_drop_lag branch 2 times, most recently from 1f391ba to 258c4dc Compare August 24, 2024 08:29
@dmariamgeorge
Copy link
Contributor Author

dmariamgeorge commented Aug 24, 2024

Nice change @dmariamgeorge . Can you please try to create a test case for this?

@tdcmeehan - There exists a testcase testTableDropWithMissingMetadata() for this:

@dmariamgeorge dmariamgeorge changed the title Fix to avoid retrying the metadata loading once a FileNotFoundException is thrown, when trying to drop Iceberg/Hive tables, whose metadata is deleted from its S3 storage Fix to avoid retrying the metadata loading once a FileNotFoundException is thrown, when trying to drop Iceberg tables, whose metadata is deleted from its S3 storage Aug 24, 2024
Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, the change looks good to me!

Maybe in TestIcebergMetadataListing we can set iceberg.hive.table-refresh.max-retry-time to a longer period, and validate the timeOut of testTableDropWithMissingMetadata() to a shorter period, so that we can confirm that it did not trigger the retry?

@dmariamgeorge dmariamgeorge force-pushed the fix_drop_lag branch 3 times, most recently from 85c35cb to 595a315 Compare August 30, 2024 12:59
@steveburnett
Copy link
Contributor

Nit of formatting on the release note entry - need a space before :pr:

== RELEASE NOTES ==

Iceberg Connector Changes
* Fix to reduce drop time for Iceberg tables with deleted metadata in S3 storage. :pr:`23510`

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

@hantangwangd hantangwangd merged commit 2bceaa0 into prestodb:master Aug 31, 2024
@jaystarshot jaystarshot mentioned this pull request Nov 1, 2024
25 tasks
@ethanyzhang ethanyzhang added the from:IBM PR from IBM label Dec 13, 2024
@prestodb-ci prestodb-ci requested review from a team, bibith4 and namya28 and removed request for a team December 13, 2024 15:13
@ethanyzhang ethanyzhang removed the from:IBM PR from IBM label Dec 13, 2024
@prestodb prestodb deleted a comment from prestodb-ci Dec 13, 2024
@tdcmeehan tdcmeehan added the from:IBM PR from IBM label Dec 13, 2024
@prestodb-ci prestodb-ci requested a review from a team December 13, 2024 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants