Iceberg DML operation timeout avoids corrupting the table #14118
...iceberg/src/main/java/io/trino/plugin/iceberg/catalog/file/FileMetastoreTableOperations.java
electrum left a comment:
Do you plan to squash the first commit?
.../trino/plugin/iceberg/catalog/file/TestIcebergFileMetastoreTableOperationsInsertFailure.java
02949c4 to 35b52ad
The first commit is kept separate only for maintenance reasons. Squashing the commits would leave a reader of the code without clear context about the problem the change addresses.
35b52ad to 2e46887
...-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/HiveMetastoreTableOperations.java
Remove the try/catch.
`createQueryRunner()` can throw any `Exception`.
2e46887 to eb6a0df
When a `CommitFailedException` is thrown while committing an Iceberg transaction, the Iceberg framework retries the operation up to `COMMIT_NUM_RETRIES` times and, if the operation still fails, cleans up the metadata file corresponding to the transaction. When a metastore client operation times out, the Iceberg library can therefore delete a metadata file that the table configuration persisted in the metastore ends up referencing, which leaves the table in a corrupt state. Throw `CommitStateUnknownException` to ensure that the table is not left in a corrupt state after the erroneous completion of the DML operation.
3cbe297 to ada3ca1
@findinpath under what circumstances should the
From the javadoc
That indicates that Under what circumstances should we throw
Good, I see what you are probably hinting at. I've been too focused on the HMS part and missed the fact that Glue is also affected by the same issue. Here is the list of exceptions thrown by Except So
Description
When a `CommitFailedException` is thrown while committing an Iceberg transaction,
the Iceberg framework retries the operation up to `COMMIT_NUM_RETRIES` times and,
if the operation still fails, cleans up the metadata file corresponding to the transaction.
When a metastore client operation times out, the Iceberg library can therefore
delete a metadata file that the table configuration persisted in the metastore
ends up referencing, which leaves the table in a corrupt state.
Throw `CommitStateUnknownException` to ensure that the table is not left in a
corrupt state after the erroneous completion of the DML operation.
Fixes #14104
probably fixes #12581 too
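The failure mode described above can be illustrated with a small, self-contained sketch. It does not use the Iceberg or Trino APIs: the two exception classes are local stand-ins named after Iceberg's `CommitFailedException` and `CommitStateUnknownException`, and `MetastoreTimeoutException`, `commitToMetastore` and `metadataFileKeptAfter` are hypothetical names introduced only for illustration. The assumption, as the description implies, is that the caller cleans up the new metadata file only after a definite commit failure and leaves it in place when the outcome is unknown.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: nothing here is the actual Trino or Iceberg code.
final class CommitOutcomeSketch
{
    // Local stand-in: the commit definitely did not happen.
    static class CommitFailedException extends RuntimeException {}

    // Local stand-in: the commit may or may not have happened.
    static class CommitStateUnknownException extends RuntimeException
    {
        CommitStateUnknownException(Throwable cause)
        {
            super(cause);
        }
    }

    // Hypothetical stand-in for a metastore client timeout.
    static class MetastoreTimeoutException extends RuntimeException {}

    // Hypothetical commit step: a timeout is reported as "state unknown",
    // because the metastore may have applied the update before the client gave up.
    static void commitToMetastore(Runnable replaceTable)
    {
        try {
            replaceTable.run();
        }
        catch (MetastoreTimeoutException e) {
            throw new CommitStateUnknownException(e);
        }
    }

    // Plays the role of the calling framework: returns true if the newly
    // written metadata file still exists after the commit attempt.
    static boolean metadataFileKeptAfter(Runnable replaceTable)
    {
        List<String> metadataFiles = new ArrayList<>(List.of("new.metadata.json"));
        try {
            commitToMetastore(replaceTable);
        }
        catch (CommitFailedException e) {
            // Definite failure: nothing references the file, so cleanup is safe.
            metadataFiles.clear();
        }
        catch (CommitStateUnknownException e) {
            // Unknown outcome: the metastore may reference the file, so keep it.
        }
        return !metadataFiles.isEmpty();
    }

    public static void main(String[] args)
    {
        System.out.println(metadataFileKeptAfter(() -> {}));
        System.out.println(metadataFileKeptAfter(() -> { throw new MetastoreTimeoutException(); }));
        System.out.println(metadataFileKeptAfter(() -> { throw new CommitFailedException(); }));
    }
}
```

In this sketch the file is removed only in the third case. If the timeout were reported as `CommitFailedException` instead, the second case would also delete a file that the metastore might already point to, which is the corruption the description refers to.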
Non-technical explanation
Avoid deleting metadata files in case of a DML operation timeout because these files may eventually get referenced from the Hive metastore configuration of the Iceberg table.
Release notes
( ) This is not user-visible and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: