Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coerce types for all union children plans when eliminating nesting #11386

Merged
merged 1 commit into from
Jul 11, 2024

Conversation

gruuya
Copy link
Contributor

@gruuya gruuya commented Jul 10, 2024

Which issue does this PR close?

Closes #11385.

Rationale for this change

Investigating the above issue led me to identify a couple of aspects that need to align in order for the bug to manifest:

  • there's a (grouping) aggregation
  • of a >2 way union
  • with some type coercion

In particular here's a minimal repro of the above issue

> select c1, sum(c2) as sum_c2
from (select 1 as c1, 1::int as c2
    union
    select 2 as c1, 2::int as c2
    union
    select 3 as c1, coalesce(3::int, 0) as c2)
group by c1;
External error: External error: External error: Arrow error: Invalid argument error: RowConverter column schema mismatch, expected Int32 got Int64

What happens is that the nested union elimination unwraps the first two child plans and coerces their schema, however the remaining plan isn't being coerced. Upon physical planning Union inherits the schema of the first child plan.

Consequently during execution, the RowConverter gets instantiated with Int32 type, whereas the last child will produce Int64 elements, since coalesce enforces type coercion to align the left element with the right one (represented as Int64(0))

What changes are included in this PR?

Coerce all child plans of the outer union as per it's schema, not only plans in the inner union.

Are these changes tested?

There's a new SLT.

Are there any user-facing changes?

No error in TPC-DS Q75

@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Jul 10, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gruuya

This fix makes sense to me. Thank you

cc @erratic-pattern as I think you observed something similar at some point

@@ -60,7 +60,8 @@ impl OptimizerRule for EliminateNestedUnion {
let inputs = inputs
.into_iter()
.flat_map(extract_plans_from_union)
.collect::<Vec<_>>();
.map(|plan| coerce_plan_expr_for_schema(&plan, &schema))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense -- that the children were not all coerced with respect to the outermost union schema but rather with respect to the inner one.

@@ -135,6 +135,21 @@ SELECT SUM(d) FROM (
----
5

# three way union with aggregate and type coercion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that this test covers the fix by running this test without the code changes and it fails as expected

Running "pg_compat/pg_compat_union.slt"
Running "union.slt"
External error: query failed: DataFusion error: External error: External error: External error: Arrow error: Invalid argument error: RowConverter column schema mismatch, expected Int32 got Int64
[SQL] SELECT c1, SUM(c2) FROM (
    SELECT 1 as c1, 1::int as c2
    UNION
    SELECT 2 as c1, 2::int as c2
    UNION
    SELECT 3 as c1, COALESCE(3::int, 0) as c2
) as a
GROUP BY c1
at test_files/union.slt:139

Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

@erratic-pattern
Copy link
Contributor

cc @erratic-pattern as I think you observed something similar at some point

#11258 is (possibly) a similar type coercion bug when CASE is used inside an aggregate, but I think it will involve a different fix.

@alamb alamb merged commit d314ced into apache:main Jul 11, 2024
23 checks passed
@alamb
Copy link
Contributor

alamb commented Jul 11, 2024

Thanks again!

Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 12, 2024
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
@gruuya gruuya deleted the union-schema-coercion-fix branch July 31, 2024 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression in TPC-DS Q75
3 participants