Fix bugs in duplicate items in UNNEST expression #18909
feilong-liu merged 1 commit into prestodb:master
kaikalur left a comment
How about the simple UNNEST? It seems to work OK, but it would be good to add a test:
SELECT * FROM UNNEST(ARRAY[1,2], ARRAY[1,2])
rschlussel left a comment
Why can't we just use a multimap instead?
That was my first suggestion too :) except there is a canonicalize method further down that removes the duplicates.
Can you clarify this more? Would it make more sense to fix that method so it supports duplicate entries? It just seems weird to make plan choices based on the data structure representations we've chosen.
The main thing is that it's not done in a principled way: the outputs are just the map keys! The code relies on the Builder adding entries in order, which I'm not a big fan of. So yes, we could make a bigger change to carry explicit lists of inputs and outputs, but this change is also a small optimization, since we unnest only one array :), and it is self-contained (it's a corner case too: only 2 failures in the last year+).
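To make the concern in this thread concrete, here is a minimal sketch of why a map keyed by expression cannot carry two outputs for the same expression. The class and method names are hypothetical, and plain strings stand in for the planner's expression and symbol types; this is not the code in RelationPlanner.java.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnnestMapCollision {
    // Hypothetical stand-in for the planner's "UNNEST expression -> output symbol" map.
    static Map<String, String> mapExpressionsToOutputs(List<String> expressions, List<String> outputs) {
        // Insertion-ordered, so the key order doubles as the output order.
        Map<String, String> unnestSymbols = new LinkedHashMap<>();
        for (int i = 0; i < expressions.size(); i++) {
            // A duplicate expression hits the existing key and overwrites its value.
            unnestSymbols.put(expressions.get(i), outputs.get(i));
        }
        return unnestSymbols;
    }

    public static void main(String[] args) {
        Map<String, String> symbols = mapExpressionsToOutputs(
                List.of("ARRAY[2, 3]", "ARRAY[2, 3]"),
                List.of("r1", "r2"));
        // prints {ARRAY[2, 3]=r2}: two outputs were requested, one entry survives
        System.out.println(symbols);
    }
}
```

A multimap would keep both outputs under the one key, which is the alternative raised above; the sketch only shows what a plain map loses.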
rschlussel left a comment
Looks good, just the one nit.
I'll note that this means that if you have any non-deterministic functions in the duplicates, they will get executed only once, but since we do that already in other cases, I don't think it's something to worry about.
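The point about non-deterministic functions can be illustrated with a small sketch. It assumes a toy evaluator that is called once per distinct expression text; the names are hypothetical and the expression string is only a label, not something the sketch parses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class EvaluateOncePerDistinctExpression {
    // Evaluates each distinct expression once and reuses that value
    // for every output column that repeats the expression.
    static List<Object> evaluate(List<String> expressions, Map<String, Supplier<Object>> evaluators) {
        Map<String, Object> evaluated = new LinkedHashMap<>();
        List<Object> columns = new ArrayList<>();
        for (String expression : expressions) {
            if (!evaluated.containsKey(expression)) {
                evaluated.put(expression, evaluators.get(expression).get());
            }
            columns.add(evaluated.get(expression));
        }
        return columns;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for a non-deterministic expression: each call returns a new value.
        Supplier<Object> evaluator = () -> Integer.valueOf(calls.incrementAndGet());
        String expression = "shuffle(ARRAY[1, 2, 3])";
        List<Object> columns = evaluate(List.of(expression, expression), Map.of(expression, evaluator));
        // The evaluator ran once, so both columns carry the same value.
        System.out.println(calls.get() + " call(s), columns = " + columns);
    }
}
```

Without deduplication the two columns could differ; with it they are always identical, which is the behavior change noted in the comment.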
The current UNNEST node fails if there are duplicate expressions in UNNEST, for example in the query "SELECT * from (select * FROM (values 1) as t(k)) CROSS JOIN unnest(array[2, 3], ARRAY[2, 3]) AS r(r1, r2)". This PR fixes the bug.
What's the change?
The current UNNEST node fails if there are duplicate expressions in UNNEST, for example in the query
SELECT * from (select * FROM (values 1) as t(k)) CROSS JOIN unnest(array[2, 3], ARRAY[2, 3]) AS r(r1, r2)
This is because we use a map to store the mapping between the expressions in UNNEST and their output symbols, and duplicate expressions produce the same map key.
In this PR, I skip the duplicate items in the UNNEST node and add a projection node over its output to produce the outputs of the skipped items.
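The approach described here (keep the first occurrence of each expression in the UNNEST node, then fill in the skipped outputs with a projection) can be sketched roughly as follows. This is an illustration under assumptions, not the planner code: `Plan`, `plan`, `unnested`, and `projected` are hypothetical names, and strings stand in for expressions and symbols.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnnestDedupSketch {
    // What the UNNEST node keeps, and which outputs the projection on top must add.
    static final class Plan {
        // distinct expression -> output symbol produced by the UNNEST node
        final Map<String, String> unnested = new LinkedHashMap<>();
        // skipped output symbol -> kept output symbol it is copied from
        final Map<String, String> projected = new LinkedHashMap<>();
    }

    static Plan plan(List<String> expressions, List<String> outputs) {
        Plan plan = new Plan();
        for (int i = 0; i < expressions.size(); i++) {
            // putIfAbsent returns the already-mapped symbol when the expression is a duplicate.
            String kept = plan.unnested.putIfAbsent(expressions.get(i), outputs.get(i));
            if (kept != null) {
                plan.projected.put(outputs.get(i), kept);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Plan plan = plan(List.of("ARRAY[2, 3]", "ARRAY[2, 3]"), List.of("r1", "r2"));
        System.out.println("unnest:  " + plan.unnested);   // unnest:  {ARRAY[2, 3]=r1}
        System.out.println("project: " + plan.projected);  // project: {r2=r1}
    }
}
```

For the example query, the UNNEST node keeps only `r1`, and the projection adds `r2` as a copy of `r1`.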
This is basically a rewrite from
SELECT * from (select * FROM (values 1) as t(k)) CROSS JOIN unnest(array[2, 3], ARRAY[2, 3]) AS r(r1, r2)
to
SELECT *, r1 as r2 from (select * FROM (values 1) as t(k)) CROSS JOIN unnest(array[2, 3]) AS r(r1)
Test plan
Added unit test.
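As a sanity check on the rewrite described above, the toy model below evaluates both query shapes on the example data and compares the rows. It is a sketch, not the unit test added in the PR: it assumes equal-length arrays, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class UnnestRewriteCheck {
    // Toy UNNEST over equal-length arrays: row i holds the i-th element of each array.
    static List<List<Integer>> unnest(List<List<Integer>> arrays) {
        List<List<Integer>> rows = new ArrayList<>();
        for (int i = 0; i < arrays.get(0).size(); i++) {
            List<Integer> row = new ArrayList<>();
            for (List<Integer> array : arrays) {
                row.add(array.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    // SELECT * FROM t(k) CROSS JOIN unnest(a, a) AS r(r1, r2)
    static List<List<Integer>> original(List<Integer> ks, List<Integer> a) {
        List<List<Integer>> rows = new ArrayList<>();
        for (Integer k : ks) {
            for (List<Integer> r : unnest(List.of(a, a))) {
                rows.add(List.of(k, r.get(0), r.get(1)));
            }
        }
        return rows;
    }

    // SELECT *, r1 AS r2 FROM t(k) CROSS JOIN unnest(a) AS r(r1)
    static List<List<Integer>> rewritten(List<Integer> ks, List<Integer> a) {
        List<List<Integer>> rows = new ArrayList<>();
        for (Integer k : ks) {
            for (List<Integer> r : unnest(List.of(a))) {
                rows.add(List.of(k, r.get(0), r.get(0)));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> ks = List.of(1);
        List<Integer> a = List.of(2, 3);
        System.out.println(original(ks, a));                        // [[1, 2, 2], [1, 3, 3]]
        System.out.println(original(ks, a).equals(rewritten(ks, a))); // true
    }
}
```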