Skip to content

Iceberg DML operation timeout avoids corrupting the table#14118

Merged
findepi merged 1 commit intotrinodb:masterfrom
findinpath:iceberg-insert-timeout-corrupts-the-table
Sep 16, 2022
Merged

Iceberg DML operation timeout avoids corrupting the table#14118
findepi merged 1 commit intotrinodb:masterfrom
findinpath:iceberg-insert-timeout-corrupts-the-table

Conversation

@findinpath
Copy link
Copy Markdown
Contributor

@findinpath findinpath commented Sep 13, 2022

Description

In case of dealing with an CommitFailedException while
commiting an Iceberg transaction, the Iceberg framework
will attempt to retry for COMMIT_NUM_RETRIES times the
operation and if the operation still fails, it will clean up
the metadata file corresponding to the transaction.
In case of a metastore client timeout operation the
Iceberg library can therefore delete metadata files
which eventually get referenced from the configuration
of the table persisted on the metastore for the table
which leaves the table in a corrupt state.

Throw CommitStateUnknownException to ensure that the
table is not left in a corrupt state after the erroneous
completion of the DML operation.

Fixes #14104
probably fixes #12581 too

Non-technical explanation

Avoid deleting metadata files in case of a DML operation timeout because these filesmay eventually get referenced from the Hive metastore configuration of the Iceberg table.

Release notes

( ) This is not user-visible and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Iceberg
* Iceberg DML operation timeout avoids corrupting the table

@cla-bot cla-bot bot added the cla-signed label Sep 13, 2022
Copy link
Copy Markdown
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to squash the first commit?

@findinpath findinpath force-pushed the iceberg-insert-timeout-corrupts-the-table branch from 02949c4 to 35b52ad Compare September 13, 2022 19:26
@findinpath
Copy link
Copy Markdown
Contributor Author

Do you plan to squash the first commit?

The added value of the first commit is only for maintenance reasons. Squashing the commits would leave a naive reader of the code without any clear context related to the problem being addressed in the commit.
I'm fine either way. I wanted to have a test at hand that showcases the issue and not blindly add a fix.

@findinpath findinpath force-pushed the iceberg-insert-timeout-corrupts-the-table branch from 35b52ad to 2e46887 Compare September 13, 2022 19:40
Comment on lines 67 to 69
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove try, catch.
createQueryRunner() can throw any Exception

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

@findinpath findinpath force-pushed the iceberg-insert-timeout-corrupts-the-table branch from 2e46887 to eb6a0df Compare September 15, 2022 10:16
@findinpath findinpath requested a review from findepi September 15, 2022 10:16
Copy link
Copy Markdown
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Marius!

In case of dealing with an `CommitFailedException` while
commiting an Iceberg transaction, the Iceberg framework
will attempt to retry for `COMMIT_NUM_RETRIES` times the
operation and if the operation still fails, it will clean up
the metadata file corresponding to the transaction.
In case of a metastore client timeout operation the
Iceberg library can therefore delete metadata files
which eventually get referenced from the configuration
of the table persisted on the metastore for the table
which leaves the table in a corrupt state.

Throw `CommitStateUnknownException` to ensure that the
table is not left in a corrupt state after the erroneous
completion of the DML operation.
@findinpath findinpath force-pushed the iceberg-insert-timeout-corrupts-the-table branch from 3cbe297 to ada3ca1 Compare September 16, 2022 15:27
@findepi findepi merged commit 15dd728 into trinodb:master Sep 16, 2022
@findepi findepi mentioned this pull request Sep 16, 2022
@findepi
Copy link
Copy Markdown
Member

findepi commented Sep 16, 2022

@findinpath under what circumstances should the GlueIcebergTableOperations throw CommitStateUnknownException?

@github-actions github-actions bot added this to the 397 milestone Sep 16, 2022
@findinpath
Copy link
Copy Markdown
Contributor Author

under what circumstances should the GlueIcebergTableOperations throw CommitStateUnknownException?

com.amazonaws.services.glue.AWSGlue#updateTable

From the javadoc

     * @throws ConcurrentModificationException
     *         Two processes are trying to modify a resource simultaneously.

@findepi
Copy link
Copy Markdown
Member

findepi commented Sep 17, 2022

That indicates that ConcurrentModificationException handling is (probably) right.

Under what circumstances should we throw CommitStateUnknownException in GlueIcebergTableOperations?

@findinpath
Copy link
Copy Markdown
Contributor Author

Good, I see where you are probably hinting. I've been too focused on the HMS part and missed the fact that Glue is also affected by the same issue.

Here is the list of exceptions thrown by updateTable operation:

EntityNotFoundException – A specified entity does not exist
InvalidInputException – The input provided was not valid.
InternalServiceException – An internal service error occurred.
OperationTimeoutException – The operation timed out.
ConcurrentModificationException – Two processes are trying to modify a resource simultaneously.
ResourceNumberLimitExceededException – A resource numerical limit was exceeded.
GlueEncryptionException – An encryption operation failed.

Except ConcurrentModificationException, ResourceNumberLimitExceededException (should be happening before hitting the internal metastore service so no risk of corruption here) , InvalidInputException (not very sure here if there may happen that the processing of the input is not half-way through before throwing the exception) and EntityNotFoundException (quite unlikely to be thrown in our specific case) I'd opt to throw CommitStateUnknownException.

So ResourceNumberLimitExceededException & ConcurrentModificationException ask for a retry -> let's throw a CommitFailedException. I'll make a follow-up PR on this topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Missing iceberg manifest list file after HMS read times out Metadata file missing after writing to Iceberg table on GCS

7 participants