Support views in Delta lake Connector#11763
Conversation
|
@mdesmet the PR seems to cover AC changes that are unrelated to views. |
Some view tests within BaseConnectorTest failed because of existing AC setup, I looked at how it's done in iceberg and based myself on that but haven't really tested all AC use cases and the implications of this change. I will setup a separate PR. Also for clarity. in the original bug report Spark views were mentioned. I have not investigated that for now. I think If we want to support that we might need a translation layer similar like what Linkedin did for the HIve views (Coral). My focus has only been to support creation of views as in Iceberg/Hive. |
There was a problem hiding this comment.
This is related to supporting all access modes. See #11782
I have added delta.security = system and that causes the metadata to be retrieved in MetadataManager (before the access control was set to SYSTEM),
There was a problem hiding this comment.
please reenable the test.
then let's think how to make it pass. maybe we don't need delta.security = system here.
or maybe we need to merge the other PR first.
IDK yet, but let's have the test remind us (ie be failing until we can fix it)
There was a problem hiding this comment.
This change is related to the other PR. I will remove that commit here, but as I said before it seems that more tests are failing because of the AC setup in the BaseConnectorTest.
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeSecurityModule.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
one check per line pls (for readability sake)
There was a problem hiding this comment.
Might be better to split these checks into two static util methods so that this line reads !isDeltaDataTable(table) && !isDeltaView(table)
|
Delta, Hive and Iceberg can use the same catalog (eg Glue or HMS). |
Understood. Let's separate and iterate from there
This is totally different story. I don't know what's the level of support for Spark SQL in Coral. So far we were reading only HiveQL-based views with Coral, and that was really challenging to get right-ish. We can invest into SparkSQL-based views for Delta Lake, but that's certainly a project on its own. Let's have Trino Views supported first, just as we did for Iceberg. |
02a785a to
2737614
Compare
|
So basically the issue is with following check in Then in the That's why i had implemented the Access Control modes first in order to be able to leverage the existing BaseConnector tests as the system AC doesn't support DEFINER views. |
There was a problem hiding this comment.
Wouldn't it be simpler to have a Set with these 3 error codes(you can put comments in the place where set is created) and here only one if, something like:
if(errorCoded.contains(e.getErrorCode()) { //do nothing } else { throw e}
There was a problem hiding this comment.
HIVE_VIEW_TRANSLATION_ERROR should never be thrown here anyway, since we're not translating views.
2737614 to
1701125
Compare
2708f08 to
284cbe7
Compare
955b6bf to
d8c4da0
Compare
There was a problem hiding this comment.
, which can be useful under certain conditions. -> this part of the comment does add a bit of mistery to the statement, but has no real gain in understanding for the reader.
There was a problem hiding this comment.
Is this a question for the review?
There was a problem hiding this comment.
The logic from this method seems to be 1 to 1 copied from io.trino.plugin.iceberg.catalog.hms.TrinoHiveCatalog#createView from trino-iceberg.
This applies also for the other methods of the class newly introduced in this PR.
Can you think of a way to avoid such kind of code duplication?
There was a problem hiding this comment.
The code in Delta and Iceberg is indeed similar but different from the Hive connector as the Hive connector acts upon a SemiTransactionalhiveMetastore with slightly different method calls. In Iceberg and Delta we don't have the SemiTransactionalhiveMetastore.
To share the code we would need to create a helper class in the Hive connector (the easier part) with all views related operations. This helper class would need access to the HiveMetastore implementation (which will relay to either actual Hive Metastore or GlueHiveMetastore (Glue API calls) due to Guice wiring). This works well for Delta. However on Iceberg the implementation is different. It has two factories to create either a TrinoCatalog for Hive metastore, which has access to the HiveMetastore implementation, either to The Glue catalog which directly calls the Glue API.
We could fix the latter by also using Hive's GlueHiveMetastore in the Icebergs TrinoGlueCatalogin order for the helper class to be used in both cases. There are however also differences between theTrinoGlueCatalogandGlueHiveMetastore` that will need to be investigated and properly tested.
Code reuse is possible but at definitely at a much higher cost also it would probably be better handled in a separate PR. The question arises what we do first: provide the views feature and do the internal refactoring after or first refactor and delay the views feature?
There was a problem hiding this comment.
I'd go for trying out (as discussed offline) for grouping together the code which would be otherwise repetitive after landing your PR. This seems rather cheap to do and shouldn't add an increased complexity in the code of Delta and Iceberg connectors.
Regarding TrinoGlueCatalog I'd suggest to leave it as it is (at least for now).
The main purpose of my comment was DRY (Don't repeat yourself) and not a more complex refactoring (which in any case would not be in the scope of this PR).
There was a problem hiding this comment.
I have followed your advice and added a ViewsMetastoreHelper class, potentially it could also be used for Hive but then we would have to use callbacks. I didn't do that for now.
There was a problem hiding this comment.
As a follow-up to this PR, could you please create an issue (with good-first-issue label) to add support for comments on Delta views ?
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
The PR is missing a test where Trino views for Delta gets created on a Glue backed metastore.
Consider either such a test either in trino-delta-lake.
c99bf35 to
a016c9c
Compare
a016c9c to
e00f037
Compare
There was a problem hiding this comment.
unrelated change - please restore.
There was a problem hiding this comment.
nit: The paranthesis in the expression VALUES (1) are not necessary.
There was a problem hiding this comment.
Why protected? I think private will do just fine.
There was a problem hiding this comment.
This method can stay private
There was a problem hiding this comment.
ConnectorSession session parameter seems unused.
There was a problem hiding this comment.
ConnectorSession session parameter seems to be unused.
Make sure that also the other methods calling this method drop it as well : dropView , getViews.
There was a problem hiding this comment.
Can we use createViewProperties(ConnectorSession session, String trinoVersion, String connectorName) call instead (to avoid code duplication) ?
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
@AfterClass(alwaysRun = true)
There was a problem hiding this comment.
Can be converted to local variable.
@ebyhr : The only way to test that all three together is through a product test. It seems we currently don't have any configuration that has both Hive, Iceberg and Delta. Is that what we want or do you see another way to test it? |
b8fc01a to
5f8597e
Compare
I think you can use BaseSharedMetastoreTest from Side note: @mdesmet the product test enviroment |
d335361 to
e27141c
Compare
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeSharedFileMetastoreViews.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeSharedGlueMetastoreViews.java
Outdated
Show resolved
Hide resolved
...-lake/src/test/java/io/trino/plugin/deltalake/metastore/glue/TestDeltaLakeGlueMetastore.java
Outdated
Show resolved
Hide resolved
.../src/test/java/io/trino/plugin/deltalake/metastore/glue/TestDeltaLakeViewsGlueMetastore.java
Outdated
Show resolved
Hide resolved
.../src/test/java/io/trino/plugin/deltalake/metastore/glue/TestDeltaLakeViewsGlueMetastore.java
Outdated
Show resolved
Hide resolved
e27141c to
2f14321
Compare
8792f4c to
0cfc576
Compare
|
@ebyhr : PTAL |
|
/test-with-secrets sha=0cfc5761b4c02ec5137f9dfb751137f637bf42d0 |
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
...elta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeSharedGlueMetastoreViews.java
Outdated
Show resolved
Hide resolved
.../src/test/java/io/trino/plugin/deltalake/metastore/glue/TestDeltaLakeViewsGlueMetastore.java
Outdated
Show resolved
Hide resolved
.../src/test/java/io/trino/plugin/deltalake/metastore/glue/TestDeltaLakeViewsGlueMetastore.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/TrinoViewHiveMetastore.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/TrinoViewUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
0cfc576 to
cd69957
Compare
|
/test-with-secrets sha=cd69957b50ee03ac83f57a86fa77017c47d65c4f https://github.com/trinodb/trino/actions/runs/3615689927 |
...elta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeSharedMetastoreViewsTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
Outdated
Show resolved
Hide resolved
cd69957 to
94226a3
Compare
|
/test-with-secrets sha=94226a3573dde2ac87ba7cadb5cf607174f4bf8f https://github.com/trinodb/trino/actions/runs/3644481861 |
|
|
||
| if (!isPrestoView(tableParameters)) { | ||
| // Hive views are not compatible | ||
| throw new HiveViewNotSupportedException(viewName); |
There was a problem hiding this comment.
What are the situations where isView above return true (the method is defined as isHiveOrPrestoView(tableType) && PRESTO_VIEW_COMMENT.equals(tableParameters.get(TABLE_COMMENT))) but isPrestoView will return false?
intuitively this line is not reachable?
Description
Support views in Delta lake connector
Improvement
Delta lake connector
Related issues, pull requests, and links
Release notes
(x) Release notes entries required with the following suggested text: