Use metastore locking around read-modify-write operations for transaction commit #9584
Conversation
This breaks abstractions, since the class was supposed to depend on HiveMetastore only.
This is intentional though, because Thrift HMS and Glue will use different locking protocols, and so we will have different implementations of table operations. This will be made simpler if we flatten the configuration, i.e. remove the hive.metastore config from Iceberg and configure everything via iceberg.catalog.type (#9577).
In this case, I wonder if we can just name this class ThriftHiveTableOperations, where we can directly guarantee the existence of a Thrift metastore. For the file metastore we don't need the stub, and we can have a FileHiveTableOperations that inherits from the same base class for all the shared functionality.
That's a good point. I knew this refactor was imminent, but I didn't want to do it yet (I understood you may be working on this already). Anyway, since you asked, I did that.
CI #9300
I did not notice originally that we are passing thriftMetastore for the sake of obtaining a "fresh" table here.
Maybe a cleaner approach would be to extend the HiveMetastore interface with getTableUncached. Then for all but CachingHiveMetastore the method would just return getTable, and for CachingHiveMetastore we would go straight to the backend.
Then you would not need to pass both HiveMetastore and ThriftMetastore to HiveTableOperations.
Then you would not need to pass both HiveMetastore and ThriftMetastore to HiveTableOperations
For that you would also need to expose acquireTableExclusiveLock via the HiveMetastore interface. But it is not bad, I think.
Then you would not need to pass both HiveMetastore and ThriftMetastore to HiveTableOperations
To alleviate the need for this, I would also need to expose table lock/unlock functions in all HiveMetastore layers, which seems to work against the abstraction we are trying to defend.
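For illustration, the getTableUncached idea discussed above could be sketched roughly like this. The interface, class, and method names come from the thread, but the simplified signatures (a single String table name, String payloads) are assumptions, not Trino's actual API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

interface HiveMetastore
{
    Optional<String> getTable(String tableName);

    // Non-caching implementations simply delegate, so only
    // CachingHiveMetastore has to do anything special
    default Optional<String> getTableUncached(String tableName)
    {
        return getTable(tableName);
    }
}

class CachingHiveMetastore implements HiveMetastore
{
    private final HiveMetastore delegate;
    private final Map<String, Optional<String>> cache = new HashMap<>();

    CachingHiveMetastore(HiveMetastore delegate)
    {
        this.delegate = delegate;
    }

    @Override
    public Optional<String> getTable(String tableName)
    {
        // Serve from the cache, populating it on first access
        return cache.computeIfAbsent(tableName, delegate::getTable);
    }

    @Override
    public Optional<String> getTableUncached(String tableName)
    {
        // Bypass the cache, go straight to the backend, and refresh the entry
        Optional<String> fresh = delegate.getTable(tableName);
        cache.put(tableName, fresh);
        return fresh;
    }
}
```

With this shape, HiveTableOperations could call getTableUncached through the HiveMetastore interface alone, instead of holding a separate ThriftMetastore reference just for freshness.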
This replaces `HiveTableOperations` with `HiveMetastoreTableOperations` and `FileMetastoreTableOperations`, suitable for the `HIVE_METASTORE` and `TESTING_FILE_METASTORE` Iceberg catalogs respectively, along with the necessary interfaces.
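A rough sketch of that split, under stated assumptions: the class names follow the description above, but the bodies, the String-based metadata pointer, and the in-process ReentrantLock standing in for the metastore's exclusive table lock are all illustrative, not Trino's actual implementation:

```java
import java.util.Objects;
import java.util.concurrent.locks.ReentrantLock;

abstract class AbstractMetastoreTableOperations
{
    // Stands in for the metadata-location pointer kept in the metastore
    protected String currentMetadataLocation;

    // Shared flow: write a new metadata file, then swap the metastore pointer
    public final String commit(String baseLocation, String updatedMetadata)
    {
        String newLocation = writeNewMetadata(updatedMetadata);
        commitToExistingTable(baseLocation, newLocation);
        return newLocation;
    }

    protected String writeNewMetadata(String metadata)
    {
        // The real implementation writes a metadata JSON file to storage
        return "metadata/" + metadata + ".json";
    }

    protected abstract void commitToExistingTable(String baseLocation, String newLocation);
}

class FileMetastoreTableOperations extends AbstractMetastoreTableOperations
{
    @Override
    protected void commitToExistingTable(String baseLocation, String newLocation)
    {
        // TESTING_FILE_METASTORE: a plain replace-table call, no lock available
        currentMetadataLocation = newLocation;
    }
}

class HiveMetastoreTableOperations extends AbstractMetastoreTableOperations
{
    // Stands in for the exclusive table lock acquired from the Thrift HMS
    private final ReentrantLock tableLock = new ReentrantLock();

    @Override
    protected void commitToExistingTable(String baseLocation, String newLocation)
    {
        tableLock.lock();
        try {
            // Re-read the pointer under the lock; refuse the commit if another
            // writer moved it since we read baseLocation
            if (!Objects.equals(currentMetadataLocation, baseLocation)) {
                throw new IllegalStateException("Metadata location changed concurrently");
            }
            currentMetadataLocation = newLocation;
        }
        finally {
            tableLock.unlock();
        }
    }
}
```

The design point is that only the Hive subclass pays for the lock round-trips, while the file-based subclass keeps the plain pointer swap it had before.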
nit: it would be cool to have a separate commit which just inlines this into the subclasses, and then do the modifications in a following commit.
It looks like FileMetastoreTableOperations is just inlined, and locking is added in the Hive version. But it is hard to see exactly.
Extracted for visibility. The Hive version looks like a bigger change because the nesting changed.
hdfsContext and queryId are both part of the session, so I think we can remove them from the interface.
HiveMetastore can be part of the injected dependencies, so we can probably remove that as well.
Sounds like potential follow-up candidates, right?
Actually, I misunderstood you, sorry. -> #9661
In the following commit, the implementation will diverge.
Previously, no locking was applied when writing Iceberg data, so in the case of concurrent writes (from the same cluster, from multiple Trino clusters, or from different applications) successfully committed data could be made unreachable by a concurrent transaction's commit. This behavior is illustrated by a test added here. Before the fix, the writes would always succeed, but part of the written data would not be visible in the final table state.
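The lost-update scenario described above can be sketched with an in-memory stand-in for the metastore's metadata pointer. The ReentrantLock here is only a stand-in for the exclusive table lock the Thrift metastore provides, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class MetadataPointer
{
    private List<String> snapshots = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock(); // stands in for the HMS table lock

    List<String> read()
    {
        return new ArrayList<>(snapshots);
    }

    void write(List<String> newSnapshots)
    {
        snapshots = newSnapshots;
    }

    // Unsafe commit: read-modify-write against a possibly stale base; a
    // concurrent commit that landed after `base` was read is silently dropped
    void commitUnsafe(List<String> base, String snapshot)
    {
        base.add(snapshot);
        write(base);
    }

    // Safe commit: re-read the latest state under the exclusive lock, so the
    // new snapshot is always appended to the current table state
    void commitLocked(String snapshot)
    {
        lock.lock();
        try {
            List<String> fresh = read();
            fresh.add(snapshot);
            write(fresh);
        }
        finally {
            lock.unlock();
        }
    }
}
```

If two writers each call read() before either writes, the second commitUnsafe overwrites the first writer's snapshot even though both "succeeded", which is exactly the unreachable-data symptom the test demonstrates; commitLocked serializes the read-modify-write so both snapshots survive.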
Added additional test in
CI #9658
Fixes #9583