Skip to content

Add support for not nullable columns in delta lake #13436

Merged
ebyhr merged 1 commit intotrinodb:masterfrom
homar:homar/delta_lake_add_support_for_nullability
Aug 6, 2022
Merged

Add support for not nullable columns in delta lake #13436
ebyhr merged 1 commit intotrinodb:masterfrom
homar:homar/delta_lake_add_support_for_nullability

Conversation

@homar
Copy link
Copy Markdown
Member

@homar homar commented Aug 1, 2022

Description

In delta lake add support for tables that have non nullable columns.

Is this change a fix, improvement, new feature, refactoring, or other?

new feature

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

a connector

How would you describe this change to a non-technical end user or system administrator?

During creation of table you can specify if column can store NULL values.

Related issues, pull requests, and links

#12635

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Delta Lake
* Add support for `NOT NULL` column constraint. ({issue}`13436`)

@cla-bot cla-bot bot added the cla-signed label Aug 1, 2022
@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 1, 2022

@mosabua @colebow I know that it will be good to add this to the doc but I didn't figure out best way to do it. Could you suggest something?

@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch from 8d63629 to 9d0b2ef Compare August 1, 2022 09:08
@homar homar changed the title delta lake add support for nullability [WIP] delta lake add support for nullability Aug 1, 2022
@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch from 9d0b2ef to 228f85b Compare August 1, 2022 10:32
@homar homar changed the title [WIP] delta lake add support for nullability Add support for not nullable columns in delta lake Aug 1, 2022
@findepi findepi requested review from alexjo2144 and ebyhr August 1, 2022 13:25
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please uncomment the assertion and let the query succeed?
and add a TODO explaining what the expected result should be?

this would ensure this code gets updated when the bug is fixed

@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch from 228f85b to 2731828 Compare August 2, 2022 10:39
@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch 2 times, most recently from 104423d to 19fae37 Compare August 3, 2022 20:32
@alexjo2144
Copy link
Copy Markdown
Member

Can you add a compatibility test that Databricks respects when Trino sets the non-null flag and vise versa. Besides that, LGTM

@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Aug 4, 2022

Question: What happens if the nullability is changed remotely and the connector writes rows within delta.metadata.cache-ttl?

@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch from 19fae37 to 6e52723 Compare August 4, 2022 15:34
@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 4, 2022

Question: What happens if the nullability is changed remotely and the connector writes rows within delta.metadata.cache-ttl?

So I expected it will be a problem but I added a test and it works - trino can't insert null into non nullable column.
In the test TTL is set to 30 mins.

@alexjo2144
Copy link
Copy Markdown
Member

The cache auto-invalidates if a new transaction has been written to the log

@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 4, 2022

The cache auto-invalidates if a new transaction has been written to the log

Yes I know but in this case nothing new is written, or altering column is a transaction ? I guess that must be the case

@alexjo2144
Copy link
Copy Markdown
Member

altering column is a transaction

Yep, it is in Delta. Still good to have the test through

@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch 2 times, most recently from fdb69ef to 65347b7 Compare August 4, 2022 22:37
@homar homar force-pushed the homar/delta_lake_add_support_for_nullability branch from 65347b7 to aa07dfe Compare August 5, 2022 08:20
@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 5, 2022

@ebyhr thanks for running these tests I forgot to remove test that checks if we can write into tables with non nullable columns. Now it should be fine

@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 5, 2022

@ebyhr tests passed can we merge it now ?

@ebyhr ebyhr force-pushed the homar/delta_lake_add_support_for_nullability branch from aa07dfe to 315b099 Compare August 5, 2022 23:47
@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Aug 5, 2022

Rebased on upstream to resolve conflicts. Let me merge after CI completion.

@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Aug 6, 2022

I confirmed new MERGE statement respects NOT NULL constraints correctly in Delta Lake connector. Let's add the test in a follow-up PR.

@ebyhr ebyhr merged commit 8ca27d3 into trinodb:master Aug 6, 2022
@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Aug 6, 2022

Merged, thanks!

@ebyhr ebyhr mentioned this pull request Aug 6, 2022
@github-actions github-actions bot added this to the 393 milestone Aug 6, 2022
@colebow
Copy link
Copy Markdown
Member

colebow commented Aug 8, 2022

Are there other column constraints that would make sense to document alongside this? Or is it a unique/one-off thing?

@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 8, 2022

Are there other column constraints that would make sense to document alongside this? Or is it a unique/one-off thing?

I'd say this is a unique case, at least for now. Also keep in mind this is not delta specific. Non nullness should be supported for many connectors

@colebow
Copy link
Copy Markdown
Member

colebow commented Aug 10, 2022

Our CREATE TABLE and ALTER TABLE docs do reference NOT NULL as an option on a column. Based on that, I think it would be assumed to work everywhere? Seems very standard to me.

@homar
Copy link
Copy Markdown
Member Author

homar commented Aug 10, 2022

Our CREATE TABLE and ALTER TABLE docs do reference NOT NULL as an option on a column. Based on that, I think it would be assumed to work everywhere? Seems very standard to me.

oh sorry, I must have missed that. BTW assumption is not correct. It is connector specific as connector must support nullability

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants