Add support to redirect table reads from Hive to Iceberg#8340
phd3 wants to merge 3 commits into trinodb:master
4d6f0f9 to a303baa
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveConfig.java
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveMetadata.java
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveMetadata.java
Do you mean that redirects on modification operations are a bad idea?
then why do we support them at all?
if they are a good idea in general, why would we want to opt out here?
> Do you mean that redirects on modification operations are a bad idea?
> then why do we support them at all?

Does it look better with the "Fail modifications for redirected tables in engine" commit? If so, I can put it as the first commit. Note that the Hive connector check is still required because of procedures.
> does it look better with the "Fail modifications for redirected tables in engine" commit?

I am not very fluent with this code, but if this implements @electrum's thinking in #7606 (comment), it should go in a separate PR and be debated separately. There is hopefully nothing special about redirects in the Iceberg or Hive connector to warrant DDL-specific checks in the connector code (except, maybe, procedures -- which ones?)
Makes sense to do it in a separate PR. I was trying to gather feedback on whether the exceptions thrown in the Hive redirection tests feel more natural with this change. And yes, that commit doesn't have anything specific to the Hive connector.
The Hive connector checks become sort of "illegal state checks" in non-procedure calls, but it wouldn't hurt to still keep a proper error message there.
> but wouldn't hurt to still keep proper error message there.

except that it suggests -- to the future reader -- that we're coding some connector-specific behavior.
(and I do care about future readers quite a lot :)
Good point. How about changing all connector checks to checkState, and adding special handling for procedures to throw TrinoException?
or TrinoException here, but with a comment. up to you
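The two options discussed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from this PR: `UserFacingException` is a hypothetical stand-in for `TrinoException`, the method names are invented, and the plain `IllegalStateException` mirrors what Guava's `checkState` throws.

```java
import java.util.Optional;

public class RedirectionChecksSketch
{
    // Hypothetical stand-in for TrinoException, so the sketch compiles without Trino on the classpath
    static class UserFacingException
            extends RuntimeException
    {
        UserFacingException(String message)
        {
            super(message);
        }
    }

    // Non-procedure path: assumes the engine has already dealt with redirection,
    // so a redirected table showing up here is treated as a bug (what checkState would express)
    public static void checkNotRedirected(String tableName, Optional<String> redirectionTarget)
    {
        if (redirectionTarget.isPresent()) {
            throw new IllegalStateException("Unexpected redirected table: " + tableName);
        }
    }

    // Procedure path: assumed to bypass the engine-level check, so the user gets a proper error message
    public static void failProcedureIfRedirected(String tableName, Optional<String> redirectionTarget)
    {
        if (redirectionTarget.isPresent()) {
            throw new UserFacingException(
                    "Table " + tableName + " is redirected to " + redirectionTarget.get() + "; procedure calls are not supported on it");
        }
    }

    public static void main(String[] args)
    {
        // A table without a redirection target passes the check silently
        checkNotRedirected("hive.db.plain_table", Optional.empty());
        try {
            failProcedureIfRedirected("hive.db.iceberg_table", Optional.of("iceberg.db.iceberg_table"));
        }
        catch (UserFacingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the user-facing exception only on the procedure path is one way to avoid suggesting connector-specific behavior to future readers; a comment next to the check would serve the same purpose.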
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveSessionProperties.java
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveSessionProperties.java
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
...no-product-tests/src/main/java/io/trino/tests/product/hive/TestHiveRedirectionToIceberg.java
a303baa to 4cd0b46
Queries on information_schema.tables failed when the filters pointed to a specific table, and the table was redirected.
Hive Connector redirects Iceberg table reads to the configured Iceberg catalog.
Co-authored-by: Xingyuan Lin <linxingyuan1102@gmail.com>
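As a rough illustration of the behavior this commit message describes, the sketch below decides whether a table read should be redirected. It is not the PR's implementation: it assumes that Iceberg tables registered in a Hive metastore carry the table parameter `table_type=ICEBERG`, and the class, method, and catalog names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

public class HiveToIcebergRedirectSketch
{
    // Assumption: Iceberg tables in a Hive metastore are marked with table_type=ICEBERG
    public static boolean isIcebergTable(Map<String, String> tableParameters)
    {
        return "ICEBERG".equalsIgnoreCase(tableParameters.get("table_type"));
    }

    // Returns the fully qualified redirection target, or empty when no redirect applies.
    // targetCatalog models an optional "Iceberg catalog name" setting (hypothetical).
    public static Optional<String> redirectTable(
            Optional<String> targetCatalog,
            String schemaName,
            String tableName,
            Map<String, String> tableParameters)
    {
        if (!targetCatalog.isPresent() || !isIcebergTable(tableParameters)) {
            return Optional.empty();
        }
        return Optional.of(targetCatalog.get() + "." + schemaName + "." + tableName);
    }

    public static void main(String[] args)
    {
        Map<String, String> icebergParameters = Map.of("table_type", "ICEBERG");
        Map<String, String> hiveParameters = Map.of();

        // Redirected: Iceberg table and a target catalog is configured
        System.out.println(redirectTable(Optional.of("iceberg"), "db", "events", icebergParameters));
        // Not redirected: plain Hive table
        System.out.println(redirectTable(Optional.of("iceberg"), "db", "events", hiveParameters));
        // Not redirected: no target catalog configured
        System.out.println(redirectTable(Optional.empty(), "db", "events", icebergParameters));
    }
}
```

Returning an empty `Optional` when no target catalog is configured keeps redirection opt-in under these assumptions.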
4cd0b46 to bb368d1
bb368d1 to 0d7d0b6
@findepi @losipiuk @raunaqmorarka @electrum this is ready for review.
@phd3 can you please rebase?
  Session session = stateMachine.getSession();
  QualifiedObjectName tableName = createQualifiedObjectName(session, statement, statement.getName());
- Optional<TableHandle> tableHandle = metadata.getTableHandle(session, tableName);
+ Optional<TableHandle> tableHandle = metadata.getOriginalTableHandle(session, tableName, Optional.of(getName()));
> Fail modifications for redirected tables in engine

I am not convinced we should do that
Here "modifications" is bad wording; I meant DDLs. We already added support for DMLs in #8683.
The last discussion that we had on this was #7606 (comment):
DDL operations are problematic since users need to be aware of which connector they are using, since they have different data types, partitioning, bucketing, etc. Our thought was that these are relatively rare operations by more advanced users and that trying to have hidden redirections would end up causing more confusion.
do you also feel otherwise for DDLs?
Users will sooner or later ask why ALTER ... ADD COLUMN (something varchar) does not work, and I won't be able to explain to them why not. Yes, from an engineering perspective, we may have a harder time supporting things like column properties (maybe, or maybe not), but I would assume these are not always used, so users who do not intend to use column properties will not accept this as a rational explanation for the limitation.
TL;DR yes, DDLs like ALTER .. ADD/DROP COLUMN should be routed as well.
    }

    @Override
    public Optional<CatalogSchemaTableName> redirectTable(ConnectorSession session, SchemaTableName tableName)
These credits belong to @MiguelWeezardo
Thanks @ssheikin for the comment. Added @MiguelWeezardo as co-author in the other PR.
Seems to be superseded by #10173
Hive plugin changes on top of #7606
Fixes #4442