Optimize plan created for join with USING clause #12193

Closed
Praveen2112 wants to merge 1 commit into prestodb:master from Praveen2112:plan_changes_for_join_using

@Praveen2112
Contributor

When a plan is created for a join with a USING clause, we create a ProjectNode on top of the Join with the following projections:

coalesce(l.k1, r.k1),
...,
coalesce(l.kn, r.kn),
l.v1,
...,
l.vn,
r.v1,
...,
r.vn

Actually, Coalesce is required only for a FULL join:

  1. For an INNER join, every output row has l.k1 = r.k1, so l.k1 can't be null; we don't need the Coalesce and can express it as l.k1.
  2. For a LEFT join, if l.k1 is null then r.k1 is also null, so we don't need the Coalesce and can express it as l.k1.
  3. For a RIGHT join, if r.k1 is null then l.k1 is also null, so it can be expressed as r.k1. (Please correct me if I am wrong.)

This patch adds Coalesce only for a FULL join.
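
The three claims above can be checked by brute force on small inputs. The following is only a sketch in Python, not Presto code: `join_rows`, `using_key` and `check` are hypothetical helpers written for this illustration, and the join is modelled on a single key column with SQL NULL represented as `None`.

```python
from itertools import product


def join_rows(left, right, join_type):
    """Equi-join two sequences of keys on l = r; None (NULL) never matches."""
    rows = []
    matched_right = set()
    for l in left:
        hit = False
        for j, r in enumerate(right):
            if l is not None and r is not None and l == r:
                rows.append((l, r))
                matched_right.add(j)
                hit = True
        # Outer side: keep unmatched left rows, padded with NULL.
        if not hit and join_type in ("LEFT", "FULL"):
            rows.append((l, None))
    # Outer side: keep unmatched right rows, padded with NULL.
    if join_type in ("RIGHT", "FULL"):
        for j, r in enumerate(right):
            if j not in matched_right:
                rows.append((None, r))
    return rows


def coalesce(*values):
    """First non-NULL argument, or NULL if there is none."""
    return next((v for v in values if v is not None), None)


def using_key(join_type, l, r):
    """Proposed projection for the USING column, per join type."""
    if join_type in ("INNER", "LEFT"):
        return l
    if join_type == "RIGHT":
        return r
    return coalesce(l, r)  # FULL join still needs the coalesce


def check(join_type, domain=(None, 1, 2), max_len=2):
    """True if using_key agrees with coalesce(l, r) on every small input."""
    for n, m in product(range(max_len + 1), repeat=2):
        for left in product(domain, repeat=n):
            for right in product(domain, repeat=m):
                for l, r in join_rows(left, right, join_type):
                    if using_key(join_type, l, r) != coalesce(l, r):
                        return False
    return True


for join_type in ("INNER", "LEFT", "RIGHT", "FULL"):
    print(join_type, check(join_type))
```

Under these assumptions the simplified projection matches `coalesce(l, r)` for every join type, while a FULL join such as `join_rows((1,), (2,), "FULL")` produces a row `(None, 2)` for which `l` alone would be wrong.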

@findepi
Contributor

findepi commented Jan 8, 2019

#12192 should solve this, since the user can write a JOIN .. ON .. and provide the coalesce(l.key, r.key) expression explicitly in the query.
See https://github.com/prestodb/presto/pull/12192/files#r246011890

@Praveen2112
Contributor Author

Praveen2112 commented Jan 8, 2019

@findepi #12192 will solve this for an INNER join. But if a user runs a query like SELECT custkey, totalprice FROM orders o LEFT JOIN customer c USING (custkey), we would still get a coalesce(custkey, custkey_0), which again is not required.
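
To illustrate the LEFT JOIN case, here is a small sketch that uses Python's built-in `sqlite3` module rather than Presto. The table contents are made up for the example, and the join is written with an explicit ON clause so that both key columns stay visible next to the coalesce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, including NULL keys and unmatched rows on both sides.
conn.executescript("""
    CREATE TABLE orders (orderkey INTEGER, custkey INTEGER, totalprice REAL);
    CREATE TABLE customer (custkey INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 20, 250.0), (3, NULL, 75.0), (4, 99, 10.0);
    INSERT INTO customer VALUES (10, 'a'), (20, 'b'), (30, 'c'), (NULL, 'd');
""")

rows = conn.execute("""
    SELECT o.custkey AS left_key,
           c.custkey AS right_key,
           coalesce(o.custkey, c.custkey) AS coalesced_key
    FROM orders o LEFT JOIN customer c ON o.custkey = c.custkey
    ORDER BY o.orderkey
""").fetchall()

# Rows where the coalesce differs from the plain left-side key.
mismatches = [row for row in rows if row[0] != row[2]]
print(mismatches)  # prints []
```

On this data the coalesced key is identical to the left-side key in every output row, which is the redundancy described above.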

@findepi
Contributor

findepi commented Jan 8, 2019

> LEFT JOIN [...] we could get a coalesce(custkey, custkey_0) which is again not required.

Here you want to eliminate that because whenever custkey_0 (from the right side) is not null, custkey (from the left side) is also not null.

This can be handled in code generation (to avoid creating these expressions), but it could also be addressed with a query transformation/simplification rule, since the required information can be derived from the plan.
Handling this later allows us to handle more cases.

At a technical level, this can be addressed by extending #12192 or by writing a simplification Rule (see, for example, SimplifyExpressions).
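
As a rough sketch of what such a simplification rule could look like, here is a toy model in Python. It does not use Presto's Rule or SimplifyExpressions APIs: `Symbol`, `Coalesce`, `Join` and `simplify_join_key_coalesce` are hypothetical names invented for this illustration, and the rule only handles a two-argument coalesce whose arguments are exactly one (left, right) equi-join pair.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class Coalesce:
    args: Tuple["Expr", ...]


Expr = Union[Symbol, Coalesce]


@dataclass(frozen=True)
class Join:
    join_type: str  # "INNER", "LEFT", "RIGHT" or "FULL"
    criteria: Tuple[Tuple[Symbol, Symbol], ...]  # (left, right) equi-join pairs


def simplify_join_key_coalesce(expr: Expr, join: Join) -> Expr:
    """Rewrite coalesce(l, r) above a join when (l, r) is an equi-join pair."""
    if not isinstance(expr, Coalesce) or len(expr.args) != 2:
        return expr
    pair = (expr.args[0], expr.args[1])
    if pair not in join.criteria:
        return expr  # not a join key pair: leave untouched
    left, right = pair
    if join.join_type in ("INNER", "LEFT"):
        return left   # l is null only when r is null as well
    if join.join_type == "RIGHT":
        return right  # r is null only when l is null as well
    return expr       # FULL join: either side may be null on its own


# Usage: the LEFT JOIN from the discussion, with custkey_0 as the right-side key.
criteria = ((Symbol("custkey"), Symbol("custkey_0")),)
projection = Coalesce((Symbol("custkey"), Symbol("custkey_0")))
print(simplify_join_key_coalesce(projection, Join("LEFT", criteria)))
```

Because the rule inspects the join below the projection instead of how the query was written, the same rewrite would apply to an explicit coalesce in a JOIN .. ON query as well as to one generated for USING.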

Praveen2112 force-pushed the plan_changes_for_join_using branch from 0d74315 to dcd01e1 on January 8, 2019 17:34
Praveen2112 closed this on Jun 2, 2019