Add support to redirect table operations from Iceberg to Hive #11356
findepi merged 1 commit into trinodb:master
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
...product-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergRedirectionToHive.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoHiveCatalog.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoHiveCatalog.java
9501f3e to ebd2fbb
...t-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergHiveTablesCompatibility.java
0d33751 to acc1604
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoCatalog.java
This doesn't seem like an Iceberg-catalog level method. Can we narrow this down to just the things that are dependent on the catalog implementation and move the rest up to IcebergMetadata?
This doesn't seem like an Iceberg-catalog level method.
Initially I created a TableRedirectionHandler abstraction, instantiated at the same time as the TrinoCatalog instance, but I later dropped the idea because redirection is tightly coupled to the metastore implementation (Hive/Glue).
The same boilerplate code used for creating the TrinoCatalog would have been needed for redirection handling as well.
On second thought, it probably makes sense to take this operation out of TrinoCatalog, in view of the JDBC / REST catalogs, which will no longer have any Hive-related content in them.
I've modified the code again so that the redirectTable method lives in TrinoHiveCatalog and TrinoGlueCatalog but is not exposed in the TrinoCatalog interface. This keeps the interface free of the concept of table redirection while still providing the functionality for Hive and Glue.
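The placement described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Catalog`, `MetastoreBackedCatalog`, `OtherCatalog` and the `instanceof` dispatch in `redirect` are hypothetical stand-ins. Only the idea that `redirectTable` lives on the Hive/Glue implementations and not on the shared interface comes from the comment.

```java
import java.util.Optional;

class RedirectPlacementSketch {
    // Hypothetical stand-in for the shared catalog interface: no notion of redirection here.
    interface Catalog {
        boolean tableExists(String schema, String table);
    }

    // Hypothetical stand-in for a metastore-backed catalog (Hive or Glue).
    static final class MetastoreBackedCatalog implements Catalog {
        private final Optional<String> hiveCatalogName;

        MetastoreBackedCatalog(Optional<String> hiveCatalogName) {
            this.hiveCatalogName = hiveCatalogName;
        }

        @Override
        public boolean tableExists(String schema, String table) {
            return true; // placeholder, a real catalog would ask the metastore
        }

        // Declared only on the implementation, not on the Catalog interface.
        Optional<String> redirectTable(String schema, String table) {
            return hiveCatalogName.map(catalog -> catalog + "." + schema + "." + table);
        }
    }

    // Hypothetical catalog with no Hive-related content (think JDBC / REST).
    static final class OtherCatalog implements Catalog {
        @Override
        public boolean tableExists(String schema, String table) {
            return true; // placeholder
        }
    }

    // Assumed caller-side dispatch: only metastore-backed catalogs can redirect.
    static Optional<String> redirect(Catalog catalog, String schema, String table) {
        if (catalog instanceof MetastoreBackedCatalog) {
            return ((MetastoreBackedCatalog) catalog).redirectTable(schema, table);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Catalog hms = new MetastoreBackedCatalog(Optional.of("hive"));
        System.out.println(redirect(hms, "default", "hive_table_name"));
        System.out.println(redirect(new OtherCatalog(), "default", "hive_table_name"));
    }
}
```

The trade-off in this sketch is that the caller needs to know which implementations support redirection, in exchange for an interface that other catalog types can implement without carrying Hive concepts.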
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
acc1604 to 6b75248
6b75248 to d863ff6
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
152eafb to 909f1cc
Note the change in behavior here for the Iceberg connector. Please let me know if the new behavior looks incorrect from your perspective.
1058105 to 50e3430
Rebased on
When table redirection to the Hive connector is not enabled, querying a metadata table of a Hive connector table through the Iceberg connector fails with a table-not-found exception.
From io.trino.tests.product.iceberg.TestIcebergHiveTablesCompatibility#testIcebergSelectFromHiveTable
assertQueryFailure(() -> onTrino().executeQuery("SELECT * FROM iceberg.default.\"" + tableName + "$data\""))
        .hasMessageMatching("Query failed \\(#\\w+\\):\\Q Not an Iceberg table: default." + tableName);
assertQueryFailure(() -> onTrino().executeQuery("SELECT * FROM iceberg.default.\"" + tableName + "$files\""))
        .hasMessageMatching("Query failed \\(#\\w+\\):\\Q line 1:15: Table 'iceberg.default." + tableName + "$files' does not exist");
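The expected-message patterns above rely on `\Q` in `java.util.regex.Pattern`: the query id is matched with a regex, and everything after `\Q` is taken literally, so the `$` and `.` in the table name need no escaping. A small standalone sketch of that behavior, with a made-up query id and table name (both are assumptions for illustration):

```java
import java.util.regex.Pattern;

class QuotedTailPatternSketch {
    // Regex prefix for the variable query id, then \Q so the rest is matched literally.
    static boolean matchesQuoted(String literalTail, String actual) {
        return Pattern.matches("Query failed \\(#\\w+\\):\\Q" + literalTail, actual);
    }

    // Same pattern without \Q: '$' and '.' keep their regex meaning.
    static boolean matchesUnquoted(String tail, String actual) {
        return Pattern.matches("Query failed \\(#\\w+\\):" + tail, actual);
    }

    public static void main(String[] args) {
        // Hypothetical query id and table name, for illustration only.
        String tail = " line 1:15: Table 'iceberg.default.hive_t$files' does not exist";
        String actual = "Query failed (#20220307_101500_00001_abcde):" + tail;

        System.out.println(matchesQuoted(tail, actual));
        System.out.println(matchesUnquoted(tail, actual));
    }
}
```

Without `\Q`, the `$` in `hive_t$files` acts as an end-of-input anchor in the middle of the pattern, so the unquoted variant cannot match this message.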
Keeping verify in such a case seems "OK" too, wdyt?
Keeping verify would make the statement
onTrino().executeQuery("SELECT * FROM iceberg.default.\"" + hiveTableName + "$files\"")
fail with error messages like the following:
Wrong table type: test_iceberg_select_from_hive_63u5u11q3c70$files
I tend to say that the error message
Table 'iceberg.default." + hiveTableName + "$files' does not exist"
fits better (not ideal, but better).
Just to make it clear from the code perspective:
A SELECT from hive_table_name$files reaches io.trino.plugin.iceberg.IcebergMetadata#getTableHandle because no Iceberg system table is found for hive_table_name (IcebergMetadata#getSystemTable returns Optional.empty() in such cases).
StatementAnalyzer#visitTable assumes that if the identifier doesn't correspond to a materialized view or a view, it must be a table. For this reason IcebergMetadata#getTableHandle is called with the rather unexpected argument hive_table_name$files.
Refactoring StatementAnalyzer#visitTable so that it checks up front whether it is dealing with a redirected table, and acts accordingly, would probably be the right way to go, but such a change (if it makes sense) belongs in a separate PR.
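The lookup behavior discussed in this thread can be modelled in a few lines. The sketch below is a simplification under stated assumptions, not the connector's code: table handles are plain strings, the set of Iceberg tables is passed in, and treating the `$data` suffix like the base table is inferred from the two test assertions quoted earlier in the thread.

```java
import java.util.Optional;
import java.util.Set;

class MetadataTableLookupSketch {
    // Hypothetical model of a table-handle lookup; Optional.empty() means "table does not exist".
    static Optional<String> getTableHandle(Set<String> icebergTables, String requestedName) {
        int separator = requestedName.indexOf('$');
        String baseName = separator < 0 ? requestedName : requestedName.substring(0, separator);
        String suffix = separator < 0 ? "data" : requestedName.substring(separator + 1);

        if (icebergTables.contains(baseName)) {
            return Optional.of(requestedName);
        }
        if (!suffix.equals("data")) {
            // Pretend the table does not exist, so the engine reports
            // "Table ... does not exist" instead of a wrong-table-type error.
            return Optional.empty();
        }
        // Assumption: the plain table (or its $data alias) reports a non-Iceberg table.
        throw new IllegalArgumentException("Not an Iceberg table: " + baseName);
    }

    public static void main(String[] args) {
        Set<String> icebergTables = Set.of("iceberg_orders");
        System.out.println(getTableHandle(icebergTables, "iceberg_orders"));
        System.out.println(getTableHandle(icebergTables, "hive_table_name$files"));
    }
}
```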
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadataFactory.java
We should redirect non-Iceberg tables, rather than Iceberg tables.
I missed this bug because I didn't know I could run the Glue tests locally.
@findepi Once the PR is in good shape, please run it against the CI with AWS Glue secrets.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/TrinoHiveCatalog.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
...t-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergHiveTablesCompatibility.java
50e3430 to 13196c8
Rebased on
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestSharedHiveMetastore.java
863c172 to bc81a1a
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Nit: I'd use get instead of map here; it makes it clear that an empty catalog name should never show up at this point.
We do have this check a few lines above:
Optional<String> targetCatalogName = getHiveCatalogName(session);
if (targetCatalogName.isEmpty()) {
    return Optional.empty();
}
Also note that the method returns an Optional, so I'd have to call .get() and then wrap the result back in an Optional.
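The two styles being weighed here can be put side by side. This is a hedged sketch: the method names, the string-based "redirected name" and the parameters are hypothetical, and only the early `isEmpty()` return and the `Optional` return type come from the comment.

```java
import java.util.Optional;

class OptionalMapVersusGetSketch {
    // Variant using map: the Optional stays wrapped all the way through.
    static Optional<String> redirectWithMap(Optional<String> targetCatalogName, String schema, String table) {
        if (targetCatalogName.isEmpty()) {
            return Optional.empty();
        }
        return targetCatalogName.map(catalog -> catalog + "." + schema + "." + table);
    }

    // Variant using get: the value is unwrapped and then wrapped again for the return type.
    static Optional<String> redirectWithGet(Optional<String> targetCatalogName, String schema, String table) {
        if (targetCatalogName.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(targetCatalogName.get() + "." + schema + "." + table);
    }

    public static void main(String[] args) {
        System.out.println(redirectWithMap(Optional.of("hive"), "default", "hive_table_name"));
        System.out.println(redirectWithGet(Optional.of("hive"), "default", "hive_table_name"));
    }
}
```

Both variants return the same result for every input; the difference is only in how explicitly the code states that the value is known to be present after the early return.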
Now that Glue views are merged we should probably match what the Hive catalog has for this line: (table.isEmpty() || VIRTUAL_VIEW.name().equals(table.get().getTableType()))
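Put together with the earlier remark that non-Iceberg tables are the ones to redirect, the decision could look roughly like the sketch below. The `Table` class is a hypothetical stand-in for a metastore table, and detecting Iceberg tables through a `table_type` parameter equal to `ICEBERG` is an assumption made for this illustration; only the `table.isEmpty() || VIRTUAL_VIEW...` condition is taken from the comment.

```java
import java.util.Map;
import java.util.Optional;

class RedirectDecisionSketch {
    // Hypothetical stand-in for a metastore table entry.
    static final class Table {
        final String tableType;
        final Map<String, String> parameters;

        Table(String tableType, Map<String, String> parameters) {
            this.tableType = tableType;
            this.parameters = parameters;
        }
    }

    // Assumption: Iceberg tables carry a table_type=ICEBERG parameter.
    static boolean isIcebergTable(Table table) {
        return "ICEBERG".equalsIgnoreCase(table.parameters.get("table_type"));
    }

    static boolean shouldRedirect(Optional<Table> table) {
        // Missing tables and views are never redirected.
        if (table.isEmpty() || "VIRTUAL_VIEW".equals(table.get().tableType)) {
            return false;
        }
        // Redirect non-Iceberg tables, not Iceberg tables.
        return !isIcebergTable(table.get());
    }

    public static void main(String[] args) {
        Table hiveTable = new Table("MANAGED_TABLE", Map.of());
        Table icebergTable = new Table("EXTERNAL_TABLE", Map.of("table_type", "iceberg"));
        System.out.println(shouldRedirect(Optional.of(hiveTable)));
        System.out.println(shouldRedirect(Optional.of(icebergTable)));
    }
}
```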
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/TrinoHiveCatalog.java
bc81a1a to 58426ff
Rebased on top of
// Pretend the table does not exist to produce better message in case of redirects to Hive
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
fd3366f to 101291d
Rebased on top of
The Iceberg connector can use the `iceberg.hive-catalog-name` configuration property to enable table redirects to Hive tables. When table redirection to the Hive connector is not enabled, querying a metadata table of a Hive connector table through the Iceberg connector fails with a table-not-found exception.
101291d to b2a73ff
Description
New property introduced for the Iceberg connector: `iceberg.hive-catalog-name`
New feature
The changes in this PR mainly affect the Iceberg connector.
In an environment that uses a shared metastore, table redirects can come in handy: they let Trino automatically translate a table name like `iceberg.default.hive_table_name` to the name `hive.default.hive_table_name`.
Note that the translation can happen quite transparently when the user connects to a predefined catalog and schema (e.g. `iceberg.default`) and the table operation looks like: `SELECT * FROM hive_table_name`.
Related issues, pull requests, and links
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: