Coerce types for all union children plans when eliminating nesting #11386
Thanks @gruuya
This fix makes sense to me. Thank you
cc @erratic-pattern as I think you observed something similar at some point
@@ -60,7 +60,8 @@ impl OptimizerRule for EliminateNestedUnion {
let inputs = inputs
.into_iter()
.flat_map(extract_plans_from_union)
.collect::<Vec<_>>();
.map(|plan| coerce_plan_expr_for_schema(&plan, &schema))
This makes sense -- that the children were not all coerced with respect to the outermost union schema but rather with respect to the inner one.
@@ -135,6 +135,21 @@ SELECT SUM(d) FROM (
----
5

# three way union with aggregate and type coercion
I verified that this test covers the fix by running it without the code changes; it fails as expected:
Running "pg_compat/pg_compat_union.slt"
Running "union.slt"
External error: query failed: DataFusion error: External error: External error: External error: Arrow error: Invalid argument error: RowConverter column schema mismatch, expected Int32 got Int64
[SQL] SELECT c1, SUM(c2) FROM (
SELECT 1 as c1, 1::int as c2
UNION
SELECT 2 as c1, 2::int as c2
UNION
SELECT 3 as c1, COALESCE(3::int, 0) as c2
) as a
GROUP BY c1
at test_files/union.slt:139
Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`
#11258 is (possibly) a similar type coercion bug when
Thanks again!
Which issue does this PR close?
Closes #11385.
Rationale for this change
Investigating the above issue led me to identify a couple of aspects that need to align in order for the bug to manifest:
In particular here's a minimal repro of the above issue
What happens is that the nested union elimination unwraps the first two child plans and coerces their schemas, but the remaining plan isn't coerced. Upon physical planning, Union inherits the schema of the first child plan.
Consequently, during execution the RowConverter gets instantiated with the `Int32` type, whereas the last child will produce `Int64` elements, since `coalesce` enforces type coercion to align the left element with the right one (represented as `Int64(0)`).
What changes are included in this PR?
Coerce all child plans of the outer union as per its schema, not only plans in the inner union.
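To illustrate the difference, here is a minimal toy sketch of the two flattening strategies. It is not DataFusion code: `DataType`, `Plan`, `coerce`, `extract`, `flatten_inner_only`, `flatten_and_coerce`, and `example_inputs` are all hypothetical stand-ins, each plan is reduced to a single output column type, and the outer union's type is assumed to be `Int32` purely for illustration.

```rust
// Toy model only: every type and function here is hypothetical, not DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A leaf plan producing one column of the given type.
    Leaf(DataType),
    /// A cast around a child plan (stands in for a coercing projection).
    Cast { input: Box<Plan>, to: DataType },
    /// A union with its own declared output type.
    Union { inputs: Vec<Plan>, ty: DataType },
}

impl Plan {
    fn output_type(&self) -> DataType {
        match self {
            Plan::Leaf(ty) => *ty,
            Plan::Cast { to, .. } => *to,
            Plan::Union { ty, .. } => *ty,
        }
    }
}

/// Wrap `plan` in a cast unless it already produces `to`.
fn coerce(plan: Plan, to: DataType) -> Plan {
    if plan.output_type() == to {
        plan
    } else {
        Plan::Cast { input: Box::new(plan), to }
    }
}

/// Unwrap a nested union, coercing its children to the *inner* union's type.
/// Non-union plans pass through untouched.
fn extract(plan: Plan) -> Vec<Plan> {
    match plan {
        Plan::Union { inputs, ty } => inputs.into_iter().map(|p| coerce(p, ty)).collect(),
        other => vec![other],
    }
}

/// Old behaviour in this model: only children of inner unions get coerced.
fn flatten_inner_only(inputs: Vec<Plan>) -> Vec<Plan> {
    inputs.into_iter().flat_map(extract).collect()
}

/// New behaviour in this model: every flattened child is coerced to the outer type.
fn flatten_and_coerce(inputs: Vec<Plan>, outer_ty: DataType) -> Vec<Plan> {
    inputs
        .into_iter()
        .flat_map(extract)
        .map(|p| coerce(p, outer_ty))
        .collect()
}

/// Inputs of an outer union: an inner two-way Int32 union plus an Int64 plan.
fn example_inputs() -> Vec<Plan> {
    vec![
        Plan::Union {
            inputs: vec![Plan::Leaf(DataType::Int32), Plan::Leaf(DataType::Int32)],
            ty: DataType::Int32,
        },
        Plan::Leaf(DataType::Int64),
    ]
}
```

In this model, `flatten_inner_only(example_inputs())` leaves the third child producing `Int64` next to two `Int32` children, while `flatten_and_coerce(example_inputs(), DataType::Int32)` wraps it in a cast so that all three children agree on the outer type.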
Are these changes tested?
There's a new SLT.
Are there any user-facing changes?
No error in TPC-DS Q75