Don't write to delta lake table with nonnullable/with invariants columns #13353

Merged
ebyhr merged 3 commits into trinodb:master from homar:homar/delta_lake_block_writes_to_table_with_expression_and_non_null_columns
Jul 28, 2022

@homar
Member

@homar homar commented Jul 26, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

#12635

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Delta Lake
* Prevent writing to a table with `NOT NULL` or [invariants](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-invariants) columns. ({issue}`12635`)
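
For illustration, the guard described in the release note entry could be sketched roughly as follows. This is not the code merged in this PR: `WriteGuard`, `Column`, `constrainedColumns`, and `checkWritable` are hypothetical names, and the column model is a simplified stand-in for the per-field `nullable` flag and the `delta.invariants` metadata key described in the Delta protocol.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical sketch; these are not Trino's actual classes.
final class WriteGuard
{
    private WriteGuard() {}

    // Simplified view of a Delta Lake column: its name, whether it is nullable,
    // and an optional invariant expression (assumed to come from "delta.invariants").
    static final class Column
    {
        final String name;
        final boolean nullable;
        final Optional<String> invariant;

        Column(String name, boolean nullable, Optional<String> invariant)
        {
            this.name = name;
            this.nullable = nullable;
            this.invariant = invariant;
        }
    }

    // Returns the names of columns that are NOT NULL or carry an invariant.
    static List<String> constrainedColumns(List<Column> columns)
    {
        return columns.stream()
                .filter(column -> !column.nullable || column.invariant.isPresent())
                .map(column -> column.name)
                .collect(Collectors.toList());
    }

    // Rejects the write when any column has a constraint the writer would not enforce.
    static void checkWritable(List<Column> columns)
    {
        List<String> constrained = constrainedColumns(columns);
        if (!constrained.isEmpty()) {
            throw new UnsupportedOperationException(
                    "Writing is not supported: columns " + constrained + " are NOT NULL or have invariants");
        }
    }
}
```

A caller would run `WriteGuard.checkWritable(columns)` before starting an insert. The sketch rejects the write outright rather than validating each row, which matches the intent stated in the release note entry.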

Member

@ebyhr ebyhr left a comment

Could you change the commit title to something like this?

Prevent writing to delta lake table with non-nullable or invariants columns

@homar homar force-pushed the homar/delta_lake_block_writes_to_table_with_expression_and_non_null_columns branch from e8bf959 to 9dfd3f6 on July 26, 2022 11:50
@homar homar force-pushed the homar/delta_lake_block_writes_to_table_with_expression_and_non_null_columns branch from 9dfd3f6 to 86c4ef3 on July 26, 2022 12:23
@alexjo2144
Member

There are a few other places where the Metadata entry is rewritten and existing nullability or invariants are probably dropped. Check out DeltaLakeMetadata#appendTableEntries

@alexjo2144
Member

> There are a few other places where the Metadata entry is rewritten and existing nullability or invariants are probably dropped. Check out DeltaLakeMetadata#appendTableEntries

Addressed that issue in #13368

@homar
Member Author

homar commented Jul 27, 2022

> > There are a few other places where the Metadata entry is rewritten and existing nullability or invariants are probably dropped. Check out DeltaLakeMetadata#appendTableEntries
>
> Addressed that issue in #13368

You didn't even give me a chance. But thanks!

@homar homar force-pushed the homar/delta_lake_block_writes_to_table_with_expression_and_non_null_columns branch 3 times, most recently from 930e4ec to 86c4ef3 on July 27, 2022 11:45
@findepi findepi changed the title from "Dont write to delta lake table with nonnullable/with invariants columns" to "Don't write to delta lake table with nonnullable/with invariants columns" on Jul 27, 2022
@findepi
Member

findepi commented Jul 27, 2022

@homar please take a look at the red CI.

@homar @alexjo2144 can you propose RN wording?

@ebyhr
Member

ebyhr commented Jul 28, 2022

How about the entry below?

# Delta Lake
* Prevent writing to a table with `NOT NULL` or [invariants](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-invariants) columns. ({issue}`12635`)

@ebyhr
Member

ebyhr commented Jul 28, 2022

CI hit #13199

@homar
Member Author

homar commented Jul 28, 2022

> CI hit #13199

Do I understand correctly that this is not because of my PR?

@findepi
Member

findepi commented Jul 28, 2022

> CI hit #13199

Why does it even try to access Glue? There should be no secrets to do that.

For the record, the failure is:

tests               | 2022-07-28 10:27:50 INFO: FAILURE     /    io.trino.tests.product.deltalake.TestDeltaLakeDatabricksInsertCompatibility.testCompression [ZSTD] (Groups: profile_specific_tests, delta-lake-databricks) took 4.1 seconds
tests               | 2022-07-28 10:27:50 SEVERE: Failure cause:
tests               | io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20220728_044250_00294_q2ax2): Table being modified concurrently. (Service: AWSGlue; Status Code: 400; Error Code: ConcurrentModificationException; Request ID: 70168e94-5180-4b28-9818-e3b6964dfb75; Proxy: null)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
tests               | 	at io.trino.tests.product.utils.QueryExecutors$1.lambda$executeQuery$0(QueryExecutors.java:60)
tests               | 	at net.jodah.failsafe.Functions.lambda$get$0(Functions.java:48)
tests               | 	at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:62)
tests               | 	at net.jodah.failsafe.Execution.executeSync(Execution.java:129)
tests               | 	at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
tests               | 	at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:67)
tests               | 	at io.trino.tests.product.utils.QueryExecutors$1.executeQuery(QueryExecutors.java:60)
tests               | 	at io.trino.tests.product.deltalake.TestDeltaLakeDatabricksInsertCompatibility.testCompression(TestDeltaLakeDatabricksInsertCompatibility.java:357)
tests               | 	at io.trino.tests.product.deltalake.TestDeltaLakeDatabricksInsertCompatibility.testCompression(TestDeltaLakeDatabricksInsertCompatibility.java:304)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
tests               | 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
tests               | 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
tests               | 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
tests               | 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
tests               | 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
tests               | 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:833)
tests               | Caused by: java.sql.SQLException: Query failed (#20220728_044250_00294_q2ax2): Table being modified concurrently. (Service: AWSGlue; Status Code: 400; Error Code: ConcurrentModificationException; Request ID: 70168e94-5180-4b28-9818-e3b6964dfb75; Proxy: null)
tests               | 	at io.trino.jdbc.AbstractTrinoResultSet.resultsException(AbstractTrinoResultSet.java:1937)
tests               | 	at io.trino.jdbc.TrinoResultSet.getColumns(TrinoResultSet.java:285)
tests               | 	at io.trino.jdbc.TrinoResultSet.create(TrinoResultSet.java:61)
tests               | 	at io.trino.jdbc.TrinoStatement.internalExecute(TrinoStatement.java:262)
tests               | 	at io.trino.jdbc.TrinoStatement.execute(TrinoStatement.java:240)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:128)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:112)
tests               | 	... 23 more
tests               | 	Suppressed: java.lang.Exception: Query: DROP TABLE delta.default.test_compression_ZSTD_k7ta87ju8tuf
tests               | 		at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:136)
tests               | 		... 24 more
tests               | Caused by: io.trino.spi.TrinoException: Table being modified concurrently. (Service: AWSGlue; Status Code: 400; Error Code: ConcurrentModificationException; Request ID: 70168e94-5180-4b28-9818-e3b6964dfb75; Proxy: null)
tests               | 	at io.trino.plugin.hive.metastore.glue.GlueHiveMetastore.dropTable(GlueHiveMetastore.java:583)
tests               | 	at io.trino.plugin.hive.metastore.cache.CachingHiveMetastore.dropTable(CachingHiveMetastore.java:512)
tests               | 	at io.trino.plugin.deltalake.metastore.HiveMetastoreBackedDeltaLakeMetastore.dropTable(HiveMetastoreBackedDeltaLakeMetastore.java:173)
tests               | 	at io.trino.plugin.deltalake.DeltaLakeMetadata.dropTable(DeltaLakeMetadata.java:1819)
tests               | 	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.dropTable(ClassLoaderSafeConnectorMetadata.java:390)
tests               | 	at io.trino.metadata.MetadataManager.dropTable(MetadataManager.java:741)
tests               | 	at io.trino.execution.DropTableTask.execute(DropTableTask.java:89)
tests               | 	at io.trino.execution.DropTableTask.execute(DropTableTask.java:37)
tests               | 	at io.trino.execution.DataDefinitionExecution.start(DataDefinitionExecution.java:145)
tests               | 	at io.trino.execution.SqlQueryManager.createQuery(SqlQueryManager.java:249)
tests               | 	at io.trino.dispatcher.LocalDispatchQuery.lambda$startExecution$7(LocalDispatchQuery.java:143)
tests               | 	at io.trino.$gen.Trino_391_66_ge73263d____20220728_042120_2.run(Unknown Source)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:833)
tests               | Caused by: com.amazonaws.services.glue.model.ConcurrentModificationException: Table being modified concurrently. (Service: AWSGlue; Status Code: 400; Error Code: ConcurrentModificationException; Request ID: 70168e94-5180-4b28-9818-e3b6964dfb75; Proxy: null)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
tests               | 	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
tests               | 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
tests               | 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
tests               | 	at com.amazonaws.services.glue.AWSGlueClient.doInvoke(AWSGlueClient.java:11444)
tests               | 	at com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:11411)
tests               | 	at com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:11400)
tests               | 	at com.amazonaws.services.glue.AWSGlueClient.executeDeleteTable(AWSGlueClient.java:3688)
tests               | 	at com.amazonaws.services.glue.AWSGlueClient.deleteTable(AWSGlueClient.java:3657)
tests               | 	at io.trino.plugin.hive.metastore.glue.GlueHiveMetastore.lambda$dropTable$19(GlueHiveMetastore.java:578)
tests               | 	at io.trino.plugin.hive.metastore.glue.GlueMetastoreApiStats.call(GlueMetastoreApiStats.java:35)
tests               | 	at io.trino.plugin.hive.metastore.glue.GlueHiveMetastore.dropTable(GlueHiveMetastore.java:577)
tests               | 	... 14 more

@ebyhr
Member

ebyhr commented Jul 28, 2022

> Why does it even try to access Glue?

Because I opened #13381 to run the newly added product test, which depends on Databricks.

@homar
Member Author

homar commented Jul 28, 2022

@findepi @ebyhr can we merge this?

@ebyhr ebyhr merged commit b39ac7c into trinodb:master Jul 28, 2022
@ebyhr
Member

ebyhr commented Jul 28, 2022

Merged, thanks!

@ebyhr ebyhr mentioned this pull request Jul 28, 2022
@github-actions github-actions bot added this to the 392 milestone Jul 28, 2022