Add table redirection support in SPI by phd3 · Pull Request #7606 · trinodb/trino

phd3 · 2021-04-16T05:49:16Z

Supersedes #7016 (SPI changes only).

Adds ConnectorMetadata#getRedirectedTable and ConnectorMetadata#getRedirectedName to redirect tables. Behavior for views and materialized views is unaltered.
Adds ConnectorMetadata#streamColumns method to introduce an alternate SPI method that is aware of redirection without breaking SPI.
Impacted operations are table scans, column listing, and CREATE TABLE with a LIKE clause. Others remain unaltered for now. A brief audit for getTableHandle calls here.

Listing method contracts (Courtesy @electrum and @dain):

listTables returns the list from the source connector. We special case filtering down to a single table -- just check if the name exists in the source connector -- it is returned even if the redirected table does not exist.
listTableColumns returns columns from the target of the redirect, since source connector does not know. If the redirect target does not exist, the table is skipped in the column list -- thus listing never fails. This is consistent with existing behavior for listing columns.
The list of table names comes solely from the source catalog, thus access control filterTables is only called for the source catalog. The target catalog is not involved at all with listing names. Because the list of columns comes from the target catalog, the access control filterColumns needs to be called for the target catalog for redirections. We do this by modifying MetadataListing.listTableColumns to do the redirection logic for listing columns, rather than doing it in MetadataManager.
listTablePrivileges remains unaware of redirection for now.
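
The listing contracts above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: ListingSketch, its TableName record, and the map-based catalogs are hypothetical stand-ins and are not the Trino SPI types. It shows that names come from the source catalog alone, while columns are taken from the redirect target and a missing target is skipped rather than failing the listing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins; these are not the Trino SPI types.
final class ListingSketch
{
    record TableName(String catalog, String table) {}

    // catalog -> (table -> column names): a toy model of connector metadata
    private final Map<String, Map<String, List<String>>> catalogs;
    // source table -> redirect target, as the source connector would report it
    private final Map<TableName, TableName> redirects;

    ListingSketch(Map<String, Map<String, List<String>>> catalogs, Map<TableName, TableName> redirects)
    {
        this.catalogs = catalogs;
        this.redirects = redirects;
    }

    // Names come solely from the source catalog; redirect targets are never consulted.
    List<String> listTables(String catalog)
    {
        return new ArrayList<>(catalogs.getOrDefault(catalog, Map.of()).keySet());
    }

    // Columns come from the redirect target but stay keyed by the source table name.
    // A table whose target cannot be found is skipped, so listing never fails.
    Map<String, List<String>> listTableColumns(String catalog)
    {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String table : listTables(catalog)) {
            TableName source = new TableName(catalog, table);
            TableName resolved = redirects.getOrDefault(source, source);
            Optional<List<String>> columns = Optional.ofNullable(catalogs.get(resolved.catalog()))
                    .map(tables -> tables.get(resolved.table()));
            columns.ifPresent(list -> result.put(table, list));
        }
        return result;
    }
}
```

In this model, an access-control filter on names would only ever see the source catalog, whereas a filter on columns would have to be asked about the resolved (target) table.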

findepi · 2021-04-16T10:56:32Z

@phd3 please see build failures

kokosing · 2021-04-16T11:56:20Z

core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java

Why do we return Stream? Stream in SPI does not look like a good idea. Is it a lazy collection?

@kokosing We just needed a redirection-aware method for listing table columns without breaking the current SPI, and we're combining the return type change with it. #5160 (comment)

the current implementation in this PR lets redirection go through Metadata instead of it being returned from ConnectorMetadata as suggested in the above comment.

kokosing · 2021-04-16T11:57:16Z

core/trino-spi/src/main/java/io/trino/spi/connector/ColumnsMetadata.java

io.trino.spi.connector.ConnectorTableSchema?

Ultimately we need ColumnMetadata objects from MetadataListing#listTableColumns, which seem to carry more information than ColumnSchema in ConnectorTableSchema right now. Since the purpose of ConnectorTableSchema seems to be avoiding fetching full metadata, maybe we can keep the two separate. wdyt?

also open to comments/suggestions for class/method naming here. Maybe we can use Stream<TableColumnsMetadata> streamTableColumns(...) instead of the current Stream<ColumnsMetadata> listTableColumnsStream(...).

kokosing · 2021-04-16T11:57:42Z

core/trino-spi/src/main/java/io/trino/spi/connector/ColumnsMetadata.java

What it means that it is Optional?

Added a comment

sopel39 · 2021-04-16T20:26:47Z

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

In case of error, will error message refer to redirected table, e.g:

SELECT * FROM table_to_be_redirected

would cause e.g

Cannot analyze `redirected_table` table

?

yeah, changes along the lines of #7134 would be helpful to propagate this information properly, when it's checked in.

for now, I think redirected table in error message helps keep the error message behavior consistent. i.e. any future errors (say during execution) will only refer to the target as the source won't be available. I agree though that it can be a bit confusing. wdyt?

#7134 is only partially helpful as warning/events might not be shown by all tools.

for now, I think redirected table in error message helps keep the error message behavior consistent. i.e. any future errors (say during execution) will only refer to the target as the source won't be available.

I'm not sure we actually refer to table names during execution. We do during optimization, but even there we already use "virtual" table names for pushed down TableHandles.
I think the biggest confusion will actually come during analysis as error messages won't match queries.

I also look at a way to reduce amount of changes required by new SPI. Currently, it's pretty extensive and touches a lot of components. Potentially, transparent redirection within getTableHandle could reduce scope of changes.

...trino-main/src/main/java/io/trino/connector/informationschema/InformationSchemaMetadata.java

sopel39 · 2021-04-16T20:37:26Z

core/trino-main/src/main/java/io/trino/execution/CreateTableTask.java

If we redirect here, then error messages below will be confusing for users, e.g:

CREATE TABLE foo LIKE source_catalog.bar

with error

LIKE table catalog `target_catalog` does not exist

If getTableHandle were redirection aware, we could run these checks on the original table name and throw redirection-specific errors only within getTableHandle.

With the new implementation:

CREATE TABLE foo LIKE source_catalog.bar

if source_catalog didn't exist, it's reported as is.

if target_catalog didn't exist, it'd be reported in the error message, along with the message that the table was redirected from a source_catalog table.

If target_catalog table is different from created table's catalog, the error message LIKE table across catalogs is not supported ... would also contain information about redirection.

However, any issues with ACL checks would still be reported on the target. (similar to StatementAnalyzer#visitTable)
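
A rough sketch of the reporting behavior described above, for illustration only. LikeTableCheckSketch, its validate method, and the exact "(redirected from ...)" suffix are hypothetical; only the base message texts are taken from the discussion, and the real checks live in CreateTableTask and MetadataManager.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; not the actual CreateTableTask logic.
final class LikeTableCheckSketch
{
    record TableName(String catalog, String table) {}

    // Returns an error message, or empty if the LIKE table passes these checks.
    static Optional<String> validate(
            String createdCatalog,
            TableName requested,
            TableName resolved,
            Set<String> existingCatalogs)
    {
        // The suffix wording is an assumption made for this sketch.
        String suffix = requested.equals(resolved)
                ? ""
                : " (redirected from " + requested.catalog() + "." + requested.table() + ")";

        // A missing source catalog is reported as is, without redirection details.
        if (!existingCatalogs.contains(requested.catalog())) {
            return Optional.of("LIKE table catalog '" + requested.catalog() + "' does not exist");
        }
        // A missing target catalog is reported together with the redirection source.
        if (!existingCatalogs.contains(resolved.catalog())) {
            return Optional.of("LIKE table catalog '" + resolved.catalog() + "' does not exist" + suffix);
        }
        // The cross-catalog check runs against the resolved (target) table.
        if (!resolved.catalog().equals(createdCatalog)) {
            return Optional.of("LIKE table across catalogs is not supported" + suffix);
        }
        return Optional.empty();
    }
}
```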

sopel39 · 2021-04-26T14:02:32Z

...trino-main/src/main/java/io/trino/connector/informationschema/InformationSchemaMetadata.java

extract this to separate method (starting from try)

IMO extracting it to a separate method may not be very useful, because exception catching here is context dependent, and seems easier to read inline. wdyt?

sopel39 · 2021-04-26T14:02:48Z

...trino-main/src/main/java/io/trino/connector/informationschema/InformationSchemaMetadata.java

the ignoring logic assumes that we wouldn't want to break listing in general if one table has redirection issues. (Similar to HiveMetadata#getViews, HiveMetadata#listTableColumns)

In this particular case, do you mean that Optional<Set<String>> tables = filterString(constraint, TABLE_NAME_COLUMN_HANDLE) will return non-empty only if user has requested specific tables explicitly and we should fail in that case if redirection errors are observed?

core/trino-main/src/main/java/io/trino/metadata/Metadata.java

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

sopel39 · 2021-04-26T15:42:37Z

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

Why Optional<List<ColumnMetadata>>? If a table has no columns, then we should not list it at all as part of tableColumns

The connector needs to tell the engine that the table is redirected so the engine needs to find the columns from the redirected table. e.g. If user lists columns for table A that's redirected to B, they should get the metadata from B, but the table name displayed is still A.
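
To make the Optional semantics concrete, here is a minimal sketch. The TableColumns record, its factory methods, and the resolve helper are hypothetical names chosen for illustration and are not the SPI class under review. An empty Optional signals "redirected", and the engine fills in the target's columns while keeping the source table name.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class TableColumnsSketch
{
    // Hypothetical analogue of the per-table listing entry discussed above.
    record TableColumns(String table, Optional<List<String>> columns)
    {
        static TableColumns ofColumns(String table, List<String> columns)
        {
            return new TableColumns(table, Optional.of(List.copyOf(columns)));
        }

        // Empty columns mean "this table is redirected; the engine must resolve it".
        static TableColumns redirected(String table)
        {
            return new TableColumns(table, Optional.empty());
        }
    }

    // Engine side: keep the source name and take the columns from the target when needed.
    // targetLookup returns empty when the redirect target cannot be resolved,
    // in which case the table is dropped from the listing.
    static Optional<TableColumns> resolve(
            TableColumns entry,
            Function<String, Optional<List<String>>> targetLookup)
    {
        if (entry.columns().isPresent()) {
            return Optional.of(entry);
        }
        return targetLookup.apply(entry.table())
                .map(columns -> TableColumns.ofColumns(entry.table(), columns));
    }
}
```

For a table A redirected to B, resolve yields an entry still named A that carries the columns of B.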

It's not immediately clear that this is for a single catalog (the original code had this problem as well). My first thought is that it wasn't safe since table names could overlap. How about

List<TableColumnsMetadata> catalogColumns = getOnlyElement(metadata.listTableColumns(session, prefix).values(), List.of());
Map<SchemaTableName, Optional<List<ColumnMetadata>>> tableColumns = catalogColumns.stream()
        .collect(toImmutableMap(TableColumnsMetadata::getTable, TableColumnsMetadata::getColumns));

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

sopel39 · 2021-04-26T15:48:15Z

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

I think we can force connector to not redirect the storage table, since it can be writable and we've avoided similar usecases for now.

sopel39 · 2021-04-26T15:50:43Z

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

We can leave addEmptyColumnReferencesForTable as is (see also #7606 (comment))

Could you elaborate why we'd want to keep it that way?

my reasoning was that: tableColumnReferences are used for ACL checks. in case of views/materialized views, that should be the source, but for tables, that should be redirected table. so separating them makes this more readable.

If we consider that column metadata responsibility is always on target, the currently implemented behavior would make sense right?

sopel39 · 2021-04-26T15:51:39Z

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

Let's leave this if as is and add an additional check (for the redirected table name) that the source catalog and schema exist

for tables that haven't been redirected, the flow is untouched. targetTableName is same as name and exception handling happens here for catalog, schema, table existence (like before).

for redirected tables, imo source schema existence shouldn't matter, as long as the connector can redirect to a target. MetadataManager#getRedirectedTableHandle throws an exception if (1) a catalog doesn't exist at any step of the redirection OR (2) getTableHandle fails on the target table due to a catalog/schema/table existence issue, and the failure is reported accordingly.

If the source catalog doesn't exist, the redirection will stop there and be reported as "source catalog doesn't exist", which is the current behavior.

even though there's some code-change, I think scope of the flow-change is small here, as you had suggested earlier. Redirection related logic/exception-handling is contained in the MetadataManager redirection apis.

So IMO it'd be better to not duplicate the checks here for source and target both. Thoughts?
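
The flow described in this reply could look roughly like the sketch below. It is a simplified model under stated assumptions: RedirectionResolverSketch, Catalog, and InMemoryCatalog are hypothetical, the error texts are invented for illustration, and the loop detection is an addition of this sketch rather than something confirmed by the discussion.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-ins; none of these are the actual Trino classes.
final class RedirectionResolverSketch
{
    record TableName(String catalog, String schema, String table) {}

    interface Catalog
    {
        Optional<TableName> redirectTable(TableName name);

        boolean tableExists(TableName name);
    }

    // Toy catalog backed by in-memory collections.
    record InMemoryCatalog(Map<TableName, TableName> redirects, Set<TableName> tables)
            implements Catalog
    {
        @Override
        public Optional<TableName> redirectTable(TableName name)
        {
            return Optional.ofNullable(redirects.get(name));
        }

        @Override
        public boolean tableExists(TableName name)
        {
            return tables.contains(name);
        }
    }

    // Follows redirections until a catalog stops redirecting. A non-redirected
    // table is returned untouched, so the caller's existing checks still apply.
    static TableName resolve(Map<String, Catalog> catalogs, TableName source)
    {
        Set<TableName> visited = new HashSet<>();
        TableName current = source;
        while (true) {
            Catalog catalog = catalogs.get(current.catalog());
            if (catalog == null) {
                throw new IllegalStateException(current.equals(source)
                        ? "Catalog does not exist: " + source.catalog()
                        : "Table " + source + " redirected to " + current + ", but the target catalog does not exist");
            }
            if (!visited.add(current)) {
                // Loop detection is an assumption of this sketch.
                throw new IllegalStateException("Table redirections form a loop: " + source);
            }
            Optional<TableName> next = catalog.redirectTable(current);
            if (next.isEmpty()) {
                // Fail fast when a redirection points at a table that is not there.
                if (!current.equals(source) && !catalog.tableExists(current)) {
                    throw new IllegalStateException(
                            "Table " + source + " redirected to " + current + ", but the target table does not exist");
                }
                return current;
            }
            current = next.get();
        }
    }
}
```

Keeping all redirection-specific failures inside one resolver is what lets callers such as the analyzer avoid duplicating existence checks for both source and target.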

sopel39 · 2021-04-26T15:52:19Z

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

Let's make analyzeFiltersAndMasks handle redirections itself

I think we need to handle at a higher level, because we are not redirecting views, and createScopeForView also uses this method.

phd3 · 2021-04-30T18:45:38Z

@sopel39 PTAL, I had split this out in separate commits and responded to comments.

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

core/trino-main/src/main/java/io/trino/metadata/Metadata.java

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

raunaqmorarka · 2021-05-13T15:59:35Z

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

getTableHandle returns empty when catalog, schema or table in input QualifiedObjectName tableName is empty or if table was not found by connector.
Why do we throw in the redirection case? Could we fall back to the original table (maybe with a warning)?

If I'm understanding the scenario, this means a connector returned a redirection, then the target did not exist. Why would we want to fallback?

In the case of the Hive connector redirecting to Iceberg, the subsequent read operation would fail (maybe in a confusing way). Though this might be a bad example, since they share the same metastore and thus this should not occur.

I was wondering whether in case of problems in finding the target table (could be a bad redirection config or target connector metastore is unavailable temporarily), a silent fallback (with warning) is useful to avoid disrupting normal operations.
A contrived example would be that the target table is in memory connector, then the coordinator was restarted, now the target table doesn't exist. The source connector probably can't easily detect status of target table.
However, it's also useful to get an explicit error in case one doesn't want to fallback to source at all.
Would each connector that implements this redirection have a catalog session property to disable the redirection if needed?

IMO it'd be better to fail-fast in this case with a clear error, since the target table is "externally" provided by the connector, so that the engine code doesn't need to be aware of throwing redirection-related errors - whenever it needs a handle.

Having a catalog property switch seems like a good idea.
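
One way such a switch might look on the connector side is sketched below. Everything here is hypothetical: RedirectionSwitchSketch, the redirectionEnabled flag, and the string-based table names are illustrative only, and no actual connector property name is implied. With the switch off, the connector reports no redirection and the engine keeps using the source table; with it on, a bad target still fails fast in the engine.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical connector-side sketch; the flag and class names are illustrative only.
final class RedirectionSwitchSketch
{
    // Would be populated from a catalog property (the name is left unspecified here).
    private final boolean redirectionEnabled;
    // source table -> fully qualified target table
    private final Map<String, String> targets;

    RedirectionSwitchSketch(boolean redirectionEnabled, Map<String, String> targets)
    {
        this.redirectionEnabled = redirectionEnabled;
        this.targets = Map.copyOf(targets);
    }

    // Empty means "not redirected": the engine uses the source table as before.
    Optional<String> redirectTable(String table)
    {
        if (!redirectionEnabled) {
            return Optional.empty();
        }
        return Optional.ofNullable(targets.get(table));
    }
}
```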

.../trino-main/src/main/java/io/trino/sql/planner/iterative/rule/ApplyTableScanRedirection.java

testing/trino-tests/src/test/java/io/trino/execution/TestTableRedirection.java

raunaqmorarka · 2021-05-13T16:43:14Z

testing/trino-tests/src/test/java/io/trino/execution/TestTableRedirection.java

I think it would be better for the tests to be part of commit which introduces the tested functionality rather than a separate commit.

I'll split it in the two commits that do name and listing redirection.

Since all the changes are closely related, I'm leaning towards going with a single commit after the review.

testing/trino-tests/src/test/java/io/trino/execution/TestTableRedirection.java

raunaqmorarka · 2021-05-13T17:30:32Z

testing/trino-tests/src/test/java/io/trino/execution/TestTableRedirection.java

maybe move this to AbstractTestQueryFramework

have kept this in the test class for now as a quick utility, as ATQF has more generic methods around query assertions. I can generalize this method and do a followup if you feel strongly.

testing/trino-tests/src/test/java/io/trino/execution/TestTableRedirection.java

anjalinorwood · 2021-05-18T18:45:10Z

At Netflix, we rely heavily on redirecting iceberg tables referenced in hive connector session to iceberg connector and vice versa (currently using a Netflix-specific patch), since the users are largely unaware of the table format used for the table. This patch aligns well with how we redirect. A few high-level questions/comments, mostly around scope.

In our environment the redirection needs to work for all possible operations (such as drop table, alter table/column, insert, view and materialized view operations, including table and column references in the view/mv definition etc). Is there a specific reason this PR limits the scope to select, describe and a couple other operations?
Are hive and iceberg connector changes planned in a follow-up PR?
Can you please give an example of more than one level of table redirection?
Thanks!
cc: @electrum @Parth-Brahmbhatt

electrum · 2021-05-18T20:37:05Z

There are a few things:

DDL operations are problematic since users need to be aware of which connector they are using, since they have different data types, partitioning, bucketing, etc. Our thought was that these are relatively rare operations by more advanced users and that trying to have hidden redirections would end up causing more confusion.
Table rename is problematic, since the target table can be a different name than the source (or of any in the redirection chain), and it isn't clear what the rename target should be, since the target name could be fully qualified, or qualified against the session. This also falls into the category of a rare operation, and one that would break any existing queries, so it doesn't seem useful to support.
Views shouldn't be a problem, since they're just queries, no different than any other query. I think this would be the same for the materialized view source query.
There should be no problem with supporting DML operations. It's not implemented in this PR, but I don't see any issue, conceptually or API wise, that would prevent it.

I believe that Hive changes to redirect to Iceberg are planned by @phd3. I'm not sure if Iceberg to Hive is needed, but having symmetry could be useful, although it would add extra complexity in the Iceberg connector and make alternative catalog support more difficult.
If you're asking for a concrete use case, then I don't have one, but there doesn't seem to be any harm in supporting it at the API layer. Making it only support a single level seems like more of a special case, but I'll consider this when reviewing code, to see if limiting to a single redirection could simplify things.

core/trino-main/src/main/java/io/trino/execution/CreateTableTask.java

core/trino-main/src/main/java/io/trino/metadata/Metadata.java

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

electrum · 2021-05-18T22:22:06Z

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

.../trino-main/src/main/java/io/trino/sql/planner/iterative/rule/ApplyTableScanRedirection.java

core/trino-spi/src/main/java/io/trino/spi/StandardErrorCode.java

electrum

This looks good overall

...trino-main/src/main/java/io/trino/connector/informationschema/InformationSchemaMetadata.java

electrum · 2021-05-19T17:56:12Z

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

This method naming and semantics confused me when reading the listing code today (after I forgot how this method worked from reading it yesterday). I was thinking it would return empty when the table is not redirected, like redirectTable. But it would return a handle to the normal, non-redirected table in that case.

Maybe call this getTableHandleWithRedirection to better explain the behavior of "return a table with redirection semantics".

We should also consider renaming getTableHandle to getRawTableHandle or getTableHandleWithoutRedirection, so you have to make a choice of which to use.

I see, yeah the semantics of connector-level and engine-level methods are confusing and different. Will change the method name. Changing getTableHandle name seems like a good idea, will do in followup.

...trino-main/src/main/java/io/trino/connector/informationschema/InformationSchemaMetadata.java

core/trino-spi/src/main/java/io/trino/spi/connector/TableColumnsMetadata.java

core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java

core/trino-main/src/main/java/io/trino/metadata/Metadata.java

electrum · 2021-05-19T19:41:11Z

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java

phd3 · 2021-05-27T06:49:45Z

@electrum @raunaqmorarka thanks for your thorough reviews. AC.

electrum

Looks great!

core/trino-main/src/main/java/io/trino/metadata/MetadataListing.java

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

.../trino-main/src/main/java/io/trino/sql/planner/iterative/rule/ApplyTableScanRedirection.java

core/trino-main/src/main/java/io/trino/execution/CreateTableTask.java

pnmatthe · 2021-05-28T15:36:06Z

Looks like we're ready to merge?

The engine redirects table scans and column listing to the target table if the connector indicates this using ConnectorMetadata#redirectTable implementation.

phd3 · 2021-06-01T16:55:53Z

Merged #7606 into master.

cla-bot bot added the cla-signed label Apr 16, 2021

phd3 requested review from electrum, raunaqmorarka and sopel39 April 16, 2021 05:49

kokosing reviewed Apr 16, 2021

phd3 force-pushed the iceberg-redirect branch 2 times, most recently from e8a8546 to 8b77514 Compare April 16, 2021 20:09

sopel39 reviewed Apr 16, 2021

This was referenced Apr 19, 2021

Add table redirection support to SPI and Hive's redirection to Iceberg #5160

Closed

[Simplified Version] Add table redirection support to SPI and Hive's redirection to Iceberg #7016

Closed

phd3 force-pushed the iceberg-redirect branch 12 times, most recently from 1a9c3d1 to 0ee134d Compare April 22, 2021 23:54

sopel39 reviewed Apr 26, 2021

phd3 force-pushed the iceberg-redirect branch from 0ee134d to 285997d Compare April 27, 2021 00:28