Skip to content

Prevent inserting nulls coming from expressions into non nullable columns#13462

Merged
findepi merged 4 commits intotrinodb:masterfrom
homar:homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns
Aug 10, 2022
Merged

Prevent inserting nulls coming from expressions into non nullable columns#13462
findepi merged 4 commits intotrinodb:masterfrom
homar:homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns

Conversation

@homar
Copy link
Member

@homar homar commented Aug 2, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

a fix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

core query engine

How would you describe this change to a non-technical end user or system administrator?

it prevents users from inserting nulls into non nullable columns

Related issues, pull requests, and links

fixes: #13434

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Core
* Prevents inserting nulls that are results of expressions into non nullable columns

@cla-bot cla-bot bot added the cla-signed label Aug 2, 2022
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch 2 times, most recently from 238d587 to 296a85b Compare August 2, 2022 23:04
@findepi
Copy link
Member

findepi commented Aug 3, 2022

#13434

If the PR is supposed to fix (close) the issue, prepend the issue number with "fixes: ". Thanks

@homar homar requested a review from kasiafi August 3, 2022 13:38
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch 7 times, most recently from 6ab803e to 9a21e25 Compare August 7, 2022 15:52
@homar homar changed the title [WIP] Prevent inserting nulls coming from expressions into non nullable columns Prevent inserting nulls coming from expressions into non nullable columns Aug 7, 2022
@homar homar requested review from findepi and kasiafi August 7, 2022 18:45
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch from 9a21e25 to 1d2ed09 Compare August 8, 2022 10:54
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch 4 times, most recently from 3e14c3e to f5adf56 Compare August 8, 2022 21:11
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch from 3e14c3e to f5adf56 Compare August 8, 2022 21:11
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( #13551 )

// TODO (https://github.com/trinodb/trino/issues/13551) This doesn't fail for other connectors so
//  probably shouldn't fail for MariaDB either. Once fixed, remove test override.

(mind two spaces after // on the second comment line)

@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch from f5adf56 to 7a0d69b Compare August 9, 2022 09:37
@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch from 7a0d69b to 6321160 Compare August 9, 2022 11:28
assertQueryFails(format("INSERT INTO %s (nullable_col) SELECT nationkey FROM nation", table.getName()), errorMessageForInsertIntoNotNullColumn("not_null_col"));
// TODO (https://github.com/trinodb/trino/issues/13551) This doesn't fail for other connectors so
// probably shouldn't fail for MariaDB either. Once fixed, remove test override.
assertQueryFails(format("INSERT INTO %s (nullable_col) SELECT nationkey FROM nation WHERE regionkey < 0", table.getName()), ".*Field 'not_null_col' doesn't have a default value.*");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reuse errorMessageForInsertIntoNotNullColumn("not_null_col")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that won't work here

assertQueryFails(format("INSERT INTO %s (nullable_col) SELECT nationkey FROM nation", table.getName()), errorMessageForInsertIntoNotNullColumn("not_null_col"));
// TODO (https://github.com/trinodb/trino/issues/13551) This doesn't fail for other connectors so
// probably shouldn't fail for MariaDB either. Once fixed, remove test override.
assertQueryFails(format("INSERT INTO %s (nullable_col) SELECT nationkey FROM nation WHERE regionkey < 0", table.getName()), ".*Field 'not_null_col' doesn't have a default value.*");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a problem here caused by the current implementation

I modified the test to run like this:

    public void testInsertIntoNotNullColumn()
    {
        try (TestTable table = new TestTable(getQueryRunner()::execute, "insert_not_null", "(nullable_col INTEGER, not_null_col INTEGER NOT NULL)")) {
            assertQuerySucceeds(format("INSERT INTO %s (nullable_col) SELECT nationkey FROM nation WHERE regionkey < 0", table.getName()));
        }

and I notice the following exception in the stacktrace:

java.lang.AssertionError: Expected query to succeed: INSERT INTO insert_not_null1evdhkhr9c (nullable_col) SELECT nationkey FROM nation WHERE regionkey < 0
...
    Suppressed: java.lang.Exception: SQL: INSERT INTO insert_not_null1evdhkhr9c (nullable_col) SELECT nationkey FROM nation WHERE regionkey < 0
.........
         Suppressed: java.lang.RuntimeException: Query: INSERT INTO `tpch`.`insert_not_null1evdhkhr9c` (`nullable_col`) SELECT `nullable_col` FROM `tpch`.`tmp_trino_2dc66901e1fa421f9cf3064225e9df13

There seems to be some side-effect of your changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't understand your point here. This query fails because MariaDB does additional check that fails fast which is similar to what I did in LogicalPlanner here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it now. The exception originates from MariaDB which complaints while copying the content from the temporary table towards the target table.

@homar
Copy link
Member Author

homar commented Aug 9, 2022

Test failure not related and same as here #13556

@homar homar force-pushed the homar/prevent_inserting_nulls_from_expressions_into_non_nullable_columns branch from 23da3f3 to 6321160 Compare August 9, 2022 13:56
@homar
Copy link
Member Author

homar commented Aug 9, 2022

raptor failure is a know flaky #12385

@homar
Copy link
Member Author

homar commented Aug 9, 2022

Second failure is io.trino.tests.product.iceberg.TestIcebergSparkCompatibility.testStatsAfterAddingPartitionField which is not a known flaky but root cause seems like a transient issue:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hive/warehouse/test_stats_after_adding_partition_field_co9knj4i0fkp/data/col1=4/00000-2016-70242f10-85f4-447a-ab96-8776b4233d2e-00001.parquet could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.

@homar
Copy link
Member Author

homar commented Aug 10, 2022

trino-hive failure seems not to be related - I opened new issue #13600

@homar
Copy link
Member Author

homar commented Aug 10, 2022

In default, suite-7-non-generic TestIcebergSparkCompatibility.testTrinoReadsTrinoTableWithSparkDeletesAfterOptimizeAndCleanUp failed which is not a known issue but root cause seems to be a transient issue(like the one above - maybe there is some hidden problem somewhere?)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hive/warehouse/test_spark_reads_trino_partitioned_table_with_deletes_after_expiring_snapshots_after_optimizeparquet-55279a4a1bd84f34b032479abbb27345/metadata/00003-fa053566-1af5-424f-af09-befb86799553.metadata.json could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.

@findepi
Copy link
Member

findepi commented Aug 10, 2022

TestIcebergSparkCompatibility failed twice in a row (at different methods) with could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation..
We have some retries for that in Hive product tests.
Can this be applied in this class too?

@homar
Copy link
Member Author

homar commented Aug 10, 2022

TestIcebergSparkCompatibility failed twice in a row (at different methods) with could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.. We have some retries for that in Hive product tests. Can this be applied in this class too?

I think it can be. I will create a ticket for it

@homar
Copy link
Member Author

homar commented Aug 10, 2022

TestIcebergSparkCompatibility failed twice in a row (at different methods) with could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.. We have some retries for that in Hive product tests. Can this be applied in this class too?

#13601

@findepi findepi merged commit a631276 into trinodb:master Aug 10, 2022
@github-actions github-actions bot added this to the 393 milestone Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

NULL value may be inserted into non-nullable column if it is a result of an expression

4 participants