Skip to content

add warning when map with double or real as key is created#17246

Merged
rschlussel merged 1 commit intoprestodb:masterfrom
singcha:Warn_on_MAP_types_with_rea_double_as_keys
Feb 11, 2022
Merged

add warning when map with double or real as key is created#17246
rschlussel merged 1 commit intoprestodb:masterfrom
singcha:Warn_on_MAP_types_with_rea_double_as_keys

Conversation

@singcha
Copy link
Copy Markdown
Contributor

@singcha singcha commented Feb 1, 2022

Test plan -
Unit tests & manual tests on cluster

== RELEASE NOTES ==
General Changes
Creating a map with double/real as keys will generate warnings

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love the test cases!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Node: in the warning, I would say Expression: since that's more comprehensible to users

@kaikalur
Copy link
Copy Markdown
Contributor

kaikalur commented Feb 1, 2022

Hmm so this is a bit tricky. Ideally we don't want to warn on intermediate/inferred types like histogram. SO I suggest doing it only for cases where the user explicitly creates this type like column definitions or cast expression (whereve syntactically a Type is allowed).

@singcha
Copy link
Copy Markdown
Contributor Author

singcha commented Feb 1, 2022

Creating a histogram with double buckets is going to have non-deterministic behavior. Thats a user action trying to create map equivalent indirectly with double keys.
I think it still adds value from end user perspective. thoughts?

@kaikalur
Copy link
Copy Markdown
Contributor

kaikalur commented Feb 1, 2022

Creating a histogram with double buckets is going to have non-deterministic behavior. Thats a user action trying to create map equivalent indirectly with double keys. I think it still adds value from end user perspective. thoughts?

Not really as the user can't fix it. The map in this case is just a convenience for presenting histogram results. Complaining about it will be too noisy.

I suggest just doing it in the AstBuilder when you are building a type node.

@kaikalur
Copy link
Copy Markdown
Contributor

kaikalur commented Feb 1, 2022

Creating a histogram with double buckets is going to have non-deterministic behavior. Thats a user action trying to create map equivalent indirectly with double keys. I think it still adds value from end user perspective. thoughts?

Not really as the user can't fix it. The map in this case is just a convenience for presenting histogram results. Complaining about it will be too noisy.

I suggest just doing it in the AstBuilder when you are building a type node.

Also maybe warn if they try to "modify" or materialize (write to table) the map - like concat/union etc which can end up with duplicate keys. So if the histogram is not written to a table, but just used for analysis it's fine. But if it is written to a table then there should be a warning.

@rschlussel
Copy link
Copy Markdown
Contributor

So if the histogram is not written to a table, but just used for analysis it's fine. But if it is written to a table then there should be a warning.

I'm not sure that makes sense. There isn't really an issue with writing to a table. The whole problem is that the user could non-deterministically create duplicate keys. So i would either warn all the time with histogram or not at all.

@kaikalur
Copy link
Copy Markdown
Contributor

kaikalur commented Feb 1, 2022

So if the histogram is not written to a table, but just used for analysis it's fine. But if it is written to a table then there should be a warning.

I'm not sure that makes sense. There isn't really an issue with writing to a table. The whole problem is that the user could non-deterministically create duplicate keys. So i would either warn all the time with histogram or not at all.

The main thing is the map if it is created is "correct" - only when try to manipulate it you can get it into trouble. But yeah I see your point - writing to table is ok. But if we see the CREATE TABLE in Presto, we can warn on column type being a bad map. Or always warn on write to such tables to handle Spark created tables.

@singcha singcha force-pushed the Warn_on_MAP_types_with_rea_double_as_keys branch from bf7eac7 to f667ff5 Compare February 3, 2022 19:24
@singcha singcha requested a review from rschlussel February 3, 2022 19:25
@singcha singcha changed the title add warning when map with double,real or decimal as key is created add warning when map with double or real as key is created Feb 3, 2022
@singcha singcha force-pushed the Warn_on_MAP_types_with_rea_double_as_keys branch from f667ff5 to ad4a443 Compare February 3, 2022 22:48
@singcha singcha requested a review from kaikalur February 3, 2022 22:49
@singcha singcha force-pushed the Warn_on_MAP_types_with_rea_double_as_keys branch from ad4a443 to 58a3aba Compare February 4, 2022 23:25
@singcha singcha requested a review from kaikalur February 4, 2022 23:26
Copy link
Copy Markdown
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment, otherwise looks good to me. @kaikalur any other comments?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing a space before "Please"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed!

@singcha singcha force-pushed the Warn_on_MAP_types_with_rea_double_as_keys branch from 58a3aba to a1ea2f1 Compare February 10, 2022 18:43
@singcha singcha requested a review from rschlussel February 10, 2022 18:44
@singcha singcha force-pushed the Warn_on_MAP_types_with_rea_double_as_keys branch from a1ea2f1 to b486b17 Compare February 11, 2022 16:37
Copy link
Copy Markdown
Contributor

@kaikalur kaikalur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@rschlussel rschlussel merged commit 211390b into prestodb:master Feb 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants