Skip to content

The SQL Unparser does not correctly handle UNION #15813

@phillipleblanc

Description

@phillipleblanc

Describe the bug

The DataFusion SQL Unparser does not correctly roundtrip SQL queries that use UNION rather than UNION ALL. For example the following query:

SELECT col1 FROM footable
UNION
SELECT col1 FROM bartable

Should result in a query that filters out duplicate rows from the final result. DataFusion handles this by adding a LogicalPlan::Distinct node as the parent of the LogicalPlan::Union node.

Distinct:
  Union:
    TableScan
    TableScan

However, this is currently unparsed to the following SQL:

SELECT col1 FROM footable
UNION ALL
SELECT col1 FROM bartable

That will cause incorrect results when executed, because the duplicate rows will not be filtered out.

To Reproduce

Parse the following query into a DataFusion LogicalPlan and then immediately unparse it and note that it unparses to a UNION ALL instead of a UNION:

SELECT j1_string AS col1, j1_id AS id FROM j1
UNION
SELECT j2_string AS col1, j2_id AS id FROM j2
UNION
SELECT j3_string AS col1, j3_id AS id FROM j3

Expected behavior

The UNION is correctly preserved in the unparsed SQL from a LogicalPlan that adds a Distinct node directly above a Union node.

Additional context

I've already fixed this and will submit a PR shortly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions