
Conversation

@djsagain
Member

This commit enforces NOT NULL column declarations on write
in the Presto engine, so it applies to all connectors. The
existing Postgres and MySQL tests named testInsertIntoNotNullColumn
were changed to check for the new error message, and a new test
with the same name was added to TestIcebergSmoke.

One possible concern with this commit is that the error message
issued by the Presto engine when writing a null to a NOT NULL
column differs from the message a connector might issue
if no value was supplied for the NOT NULL column. I think this
is OK, because the error messages supplied by connectors are
completely connector-specific anyway.
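
As an illustration of the kind of test described above, a testInsertIntoNotNullColumn check might look roughly like the sketch below. The table name, column names, and message pattern are placeholders rather than values taken from the actual commit, and the method assumes the query-runner test helpers (assertUpdate, assertQueryFails) available to connector tests.

@Test
public void testInsertIntoNotNullColumn()
{
    // Placeholder table and column names, not those used by the real tests.
    assertUpdate("CREATE TABLE test_not_null (nullable_col BIGINT, not_null_col BIGINT NOT NULL)");

    // Writing no value for the NOT NULL column should now fail in the engine,
    // with one message for all connectors; the regex below is only a placeholder for it.
    assertQueryFails("INSERT INTO test_not_null (nullable_col) VALUES (1)", ".*not_null_col.*");

    assertUpdate("DROP TABLE test_not_null");
}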

@cla-bot added the cla-signed label Jun 22, 2020
@djsagain requested a review from electrum June 22, 2020 14:10
Comment on lines 456 to 457
Member

plan.getFieldMappings() is guaranteed to match the order of columns in plan.getScope().getRelationType() (i.e., the output "shape" of the query).

The planner is not the right place to match ColumnMetadata to fields, as there is no correspondence between those fields and the order in which the fields appear in the plan.

For instance, given a table t (a BIGINT, b BIGINT), the following query will see the fields in a different order: INSERT INTO t(b, a) VALUES (1, 10)

This should be handled during analysis. The analyzer should record which field ordinals are supposed to be not null. Take a look at Analysis.JoinUsingAnalysis and its callers for an example of how you might record that.
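
A minimal sketch of that bookkeeping, assuming a hypothetical holder class rather than the actual Analysis API (all names below are illustrative):

import java.util.List;

// Hypothetical record the analyzer could attach to Analysis for each INSERT:
// the output-field ordinals that must not be null, so the planner never needs
// to match ColumnMetadata against the order of fields in the plan.
public final class NotNullAnalysis
{
    private final List<Integer> notNullFieldOrdinals;

    public NotNullAnalysis(List<Integer> notNullFieldOrdinals)
    {
        this.notNullFieldOrdinals = List.copyOf(notNullFieldOrdinals);
    }

    public List<Integer> getNotNullFieldOrdinals()
    {
        return notNullFieldOrdinals;
    }
}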

Member Author

Ok, I'll look there.

Member

@martint The two callers of this method in this class fetch the column names from table metadata. Are they wrong as well?

Member

For CREATE TABLE AS ... it kind of works out because the metadata is derived from the CREATE TABLE statement. For INSERT INTO, it works because it creates a projection to match the layout of the table metadata. Unfortunately, it relies on projectNode.getOutputSymbols(), which is not guaranteed to come out in the desired order -- it just happens to work today.

Member Author

@martint, I believe I've addressed your concern in the latest force push of the commit. The code no longer expects channel order to match column order, and I added a unit test that exercises your INSERT INTO t(b, a) example.

@electrum left a comment (Member)

Generally looks good. I'll defer to @martint to review the planner changes.

Member

We should do this inside the check for not-null channels, since this is logically an optimization of checking the block for nulls. For nullable channels, we don't need to look at this.
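
Roughly the shape being suggested, sketched against the Trino SPI Page and Block types (at the time of this PR the SPI packages were still io.prestosql.spi, and the real code throws the engine's error type and message, for which a stand-in exception is used here):

import io.trino.spi.Page;
import io.trino.spi.block.Block;

import java.util.List;

final class NotNullChecker
{
    private NotNullChecker() {}

    static void checkNotNullChannels(Page page, List<Integer> notNullChannels, List<String> notNullColumnNames)
    {
        for (int i = 0; i < notNullChannels.size(); i++) {
            Block block = page.getBlock(notNullChannels.get(i));
            // Optimization: skip the per-position scan when the block reports it cannot contain nulls.
            if (!block.mayHaveNull()) {
                continue;
            }
            for (int position = 0; position < block.getPositionCount(); position++) {
                if (block.isNull(position)) {
                    // Stand-in for the engine's constraint-violation error; message text is illustrative.
                    throw new IllegalStateException("NULL value not allowed for NOT NULL column: " + notNullColumnNames.get(i));
                }
            }
        }
    }
}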

Member Author

Done in the latest force push.


@djsagain force-pushed the david.stryker/enforce-not-null-columns branch 2 times, most recently from 038cbb4 to 7c133be on June 24, 2020 19:01
@martint left a comment (Member)

Just a minor comment, but the planner bits look good now.

Comment on lines 462 to 463
Member

I'd just do this:

Suggested change
-    .map(column -> requireNonNull(columnToSymbolMap.get(column.getName()), "columnToSymbolMap is missing column " + column.getName()))
-    .collect(Collectors.toSet());
+    .map(columnToSymbolMap::get)
+    .collect(toImmutableSet());

It's more concise, and the immutable set will catch any nulls that might result from the lookup (which would be a bug somewhere in the implementation anyway).

Member Author

Indeed, that's much nicer; changed as you suggested in the latest forced push. Thanks very much for looking!

Member

It looks like this caused some tests to start failing with NullPointerException.

Member

Oh, yeah, sorry, I missed the following in my suggestion above:

.map(ColumnMetadata::getName)
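
Putting the two pieces together, the chain presumably ends up along these lines (reconstructed from the review comments, not copied from the final commit):

.map(ColumnMetadata::getName)
.map(columnToSymbolMap::get)
.collect(toImmutableSet());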

Member Author

Doh, I should have looked more closely rather than blindly applying the change. I need to be more careful pushing changes at an hour when I'm no longer sentient ;)

It's fixed now in the latest forced push.

@djsagain force-pushed the david.stryker/enforce-not-null-columns branch from 7c133be to c151591 on June 27, 2020 04:37
@djsagain force-pushed the david.stryker/enforce-not-null-columns branch from c151591 to 0674742 on June 27, 2020 14:17
@electrum merged commit 078b680 into trinodb:master Jun 29, 2020
@electrum
Member

Thanks!

@djsagain deleted the david.stryker/enforce-not-null-columns branch June 30, 2020 12:45
@findepi added this to the 338 milestone Jul 3, 2020
@electrum mentioned this pull request Jul 7, 2020
v-jizhang added a commit to v-jizhang/presto that referenced this pull request Mar 29, 2021
This PR is backported from Trino
trinodb/trino#4144, originally authored by
djsstarburst. Quoting the description from Trino PR 4144:
"This commit enforces NOT NULL column declarations on write
in the Presto engine, so it applies to all connectors. The
existing Postgres and Mysql tests named testInsertIntoNotNullColumn
were changed to check for the new error message, and a new test
with the same name was added to TestIcebergSmoke.

One possible concern with this commit is that the error message
issued by the Presto engine when writing a null to a NOT NULL
column is a different message than the Connector might issue
if no value was supplied for the NOT NULL column. I think this
is ok, because the error messages supplied by the Connectors are
completely specific to the Connector."

Because of the gap between Presto and Trino, the original PR had to be
modified.

Cherry-pick of trinodb/trino#4144

Co-authored-by: djsstarburst <[email protected]>
pettyjamesm pushed a commit to prestodb/presto that referenced this pull request Apr 2, 2021
This PR is backported from Trino
trinodb/trino#4144, originally authored by
djsstarburst. Quoting the description from Trino PR 4144:
"This commit enforces NOT NULL column declarations on write
in the Presto engine, so it applies to all connectors. The
existing Postgres and Mysql tests named testInsertIntoNotNullColumn
were changed to check for the new error message, and a new test
with the same name was added to TestIcebergSmoke.

One possible concern with this commit is that the error message
issued by the Presto engine when writing a null to a NOT NULL
column is a different message than the Connector might issue
if no value was supplied for the NOT NULL column. I think this
is ok, because the error messages supplied by the Connectors are
completely specific to the Connector."

Because of the gap between Presto and Trino, the original PR had to be
modified.

Cherry-pick of trinodb/trino#4144

Co-authored-by: djsstarburst <[email protected]>
aaneja added a commit to aaneja/presto2 that referenced this pull request Aug 18, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
aaneja added a commit to aaneja/presto2 that referenced this pull request Aug 19, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
aaneja added a commit to prestodb/presto that referenced this pull request Aug 20, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 2, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>
lga-zurich pushed a commit to lga-zurich/presto-exchange that referenced this pull request Sep 8, 2025
Original support was added via #2ad67dcf,
but missed adding it for Iceberg tables

Cherry-pick of trinodb/trino#4144
Co-authored-by: djsstarburst <[email protected]>