Skip to content

Merge subqueries that "join" on lookup index columns.#10966

Merged
systay merged 7 commits intovitessio:mainfrom
arthurschreiber:arthur/improve-subquery-merging
Sep 30, 2022
Merged

Merge subqueries that "join" on lookup index columns.#10966
systay merged 7 commits intovitessio:mainfrom
arthurschreiber:arthur/improve-subquery-merging

Conversation

@arthurschreiber
Copy link
Member

Description

This pull request tries to improve the handling of queries that contain IN or = operations with one operant being a lookup vindex column, and the other operant being a subquery that selects a column from the same or a different table that uses the same lookup vindex.

Here's an example:

SELECT music.id FROM music WHERE music.id IN (SELECT music.id FROM music WHERE music.user_id IN (1, 2, 3))

Without this change, the planner breaks this query into two parts, one part executing the inner query and another for the outer query.

But because we're selecting data from music.id in the inner query, which has a unique lookup index defined, and comparing it to music.id in the outer query, we can merge the inner and outer queries together.

I'm fairly confident the code changes are bad, but I think they show that this change is viable. 😅

Related Issue(s)

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@vitess-bot
Copy link
Contributor

vitess-bot bot commented Aug 9, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@arthurschreiber arthurschreiber force-pushed the arthur/improve-subquery-merging branch from 5d54ce0 to 319401a Compare August 9, 2022 15:47
@arthurschreiber arthurschreiber force-pushed the arthur/improve-subquery-merging branch 2 times, most recently from 744ef9f to ee8ed2c Compare August 10, 2022 09:28
@glortho
Copy link

glortho commented Aug 19, 2022

Using this branch, we're seeing this error occasionally when trying to vtexplain some of these types of queries:

[BUG] unknown type encountered: *physical.Table (transformToLogicalPlan)

Eyeballing them, it seems all the problematic ones have a LEFT OUTER JOIN + at least one other join in common.

I'll send examples to you separately @arthurschreiber.

@arthurschreiber
Copy link
Member Author

Using this branch, we're seeing this error occasionally when trying to vtexplain some of these types of queries:

I believe this was fixed with some of the recent changes I pushed. 👍

@arthurschreiber
Copy link
Member Author

This pull request also contains the change described in #11072 (comment)

@arthurschreiber arthurschreiber force-pushed the arthur/improve-subquery-merging branch from c6be0b9 to a20a5c4 Compare August 26, 2022 09:27
@systay systay added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Query Serving labels Sep 26, 2022
@systay systay force-pushed the arthur/improve-subquery-merging branch from 66814b5 to 911e489 Compare September 29, 2022 10:41
@arthurschreiber arthurschreiber marked this pull request as ready for review September 29, 2022 12:28
This is a squash of a number of commits created by Arthur.

Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
@systay systay force-pushed the arthur/improve-subquery-merging branch from 8dcec53 to 2c31f14 Compare September 29, 2022 15:45
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great optimization.
See if some helpful comments can be added on top of the functions to help understand the merge rules.

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants