[vtgate planner] Routing & Merging refactor #12197
We now use the evalengine to check whether an expression can be evaluated at plan time. If it can, and the result is false, we can use the None opcode, which is very cheap.
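The plan-time check described in this comment can be sketched roughly as follows. This is a toy model, not the evalengine API: the `Expr` interface, its `Eval` method, the `Opcode` type, and `pickOpcode` are all hypothetical names invented for illustration. It only shows the idea that a predicate proven false during planning lets the planner fall back to the cheap `None` opcode.

```go
package main

import "fmt"

// Opcode is a hypothetical stand-in for the engine's route opcodes.
type Opcode string

const (
	None    Opcode = "None"    // route that never needs to reach a shard
	Scatter Opcode = "Scatter" // route sent to every shard
)

// Expr is a toy predicate. The real planner works on sqlparser.Expr
// and the evalengine; this interface is an assumption for the sketch.
type Expr interface {
	// Eval returns (value, true) when the expression can be evaluated
	// without runtime input, and (false, false) otherwise.
	Eval() (bool, bool)
}

// IntEq compares two integer constants, for example `1 = 2`.
type IntEq struct{ Left, Right int }

func (e IntEq) Eval() (bool, bool) { return e.Left == e.Right, true }

// BindVar is a placeholder such as `:id` whose value is unknown at plan time.
type BindVar string

func (BindVar) Eval() (bool, bool) { return false, false }

// pickOpcode returns None when the predicate is provably false at plan
// time, and the fallback opcode in every other case.
func pickOpcode(pred Expr, fallback Opcode) Opcode {
	if val, ok := pred.Eval(); ok && !val {
		return None
	}
	return fallback
}

func main() {
	fmt.Println(pickOpcode(IntEq{Left: 1, Right: 2}, Scatter)) // None
	fmt.Println(pickOpcode(BindVar("id"), Scatter))            // Scatter
}
```

A predicate that depends on a bind variable cannot be decided during planning, so the sketch leaves the original opcode untouched in that case.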
The old query was wrong - we should never use the given schema name. Instead we have to replace the literal value with the argument :__vtschemaname which is then filled in by the vttablet with the name of the underlying MySQL database.
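To make the rewrite in this comment concrete, here is a minimal sketch under stated assumptions. The `Comparison` type, `isSchemaColumn`, and `rewriteSchemaPredicate` are hypothetical helpers, and the list of schema columns is an assumption; only the argument name `:__vtschemaname` comes from the comment itself. The real planner rewrites the parsed AST, not strings.

```go
package main

import "fmt"

// schemaNameArg is the bind variable named in the review comment.
// vttablet fills it in with the name of the underlying MySQL database.
const schemaNameArg = ":__vtschemaname"

// Comparison is a toy `column = value` predicate (hypothetical type).
type Comparison struct {
	Column string
	Value  string // SQL text of the right-hand side
}

func (c Comparison) String() string { return c.Column + " = " + c.Value }

// isSchemaColumn reports whether a column holds a schema name.
// The column list here is an assumption made for the sketch.
func isSchemaColumn(col string) bool {
	switch col {
	case "table_schema", "constraint_schema":
		return true
	}
	return false
}

// rewriteSchemaPredicate drops the user-supplied schema literal and
// substitutes the bind variable; other predicates pass through unchanged.
func rewriteSchemaPredicate(c Comparison) Comparison {
	if isSchemaColumn(c.Column) {
		c.Value = schemaNameArg
	}
	return c
}

func main() {
	in := Comparison{Column: "table_schema", Value: "'user'"}
	fmt.Println(rewriteSchemaPredicate(in)) // table_schema = :__vtschemaname
}
```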
The old plan was wrong. Given WHERE kcu.table_schema = ? AND rc.constraint_schema = ?, we won't know until runtime if the user wants to look at information from the same or different keyspaces/schemas, and so merging these routes into a single one is invalid.
Since we have merged the two routes, there is no need to specify the table_schema twice.
Here we know that the two schemas being searched for are different - WHERE tc.table_schema = 'table_schema' AND ... cc.constraint_schema = 'constraint_schema'. Merging these two was wrong.
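One way to read the merge rule implied by the comments above is sketched below. It is a simplification, not the planner's actual code: `SchemaExpr` and `canMerge` are hypothetical, and the rule shown (merge only when both schema values are literals known to be equal) is an interpretation of the review comments.

```go
package main

import "fmt"

// SchemaExpr is a toy stand-in for the expression compared against a
// schema column such as table_schema or constraint_schema.
type SchemaExpr struct {
	Literal string // the schema name when it is a string literal
	IsArg   bool   // true for a `?` placeholder, known only at runtime
}

// canMerge reports whether two information_schema routes may be merged.
// With a runtime argument on either side, the planner cannot know whether
// both sides target the same schema, so it must not merge. Two literals
// can be merged only when they are equal.
func canMerge(a, b SchemaExpr) bool {
	if a.IsArg || b.IsArg {
		return false
	}
	return a.Literal == b.Literal
}

func main() {
	arg := SchemaExpr{IsArg: true}
	fmt.Println(canMerge(arg, arg)) // false: values unknown until runtime

	ts := SchemaExpr{Literal: "table_schema"}
	cs := SchemaExpr{Literal: "constraint_schema"}
	fmt.Println(canMerge(ts, cs)) // false: schemas are known to differ
	fmt.Println(canMerge(ts, ts)) // true: same schema on both sides
}
```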
ambiguous_ref_with_source exists both as an unsharded table in the main keyspace, and as a reference table in the user keyspace. The latter is a copy of the unsharded table spread out to all shards so that all joins can be local.
During route planning we have decided that we want to send the query to the unsharded main keyspace. The OpCode is more accurate if it's Unsharded for this route.
Signed-off-by: Andres Taylor <andres@planetscale.com>
type InfoSchemaRouting struct {
	SysTableTableSchema []sqlparser.Expr
	SysTableTableName   map[string]sqlparser.Expr
	Table               *QueryTable
}

func (isr *InfoSchemaRouting) UpdateRoutingParams(_ *plancontext.PlanningContext, rp *engine.RoutingParameters) error {
	rp.SysTableTableSchema = nil
	for _, expr := range isr.SysTableTableSchema {
Now that we do not merge when SysTableTableSchema differs, should SysTableTableSchema be a single Expr rather than a slice of Expr? Similarly, do we need SysTableTableName to be a map?
valid points. wdyt about doing these fixes in a separate PR? this one has grown enough for now :)
We should add a task for it
The rewriting on v16 didn't consider the case where we already had an extract subquery. In that case we don't extract again, to avoid infinite recursion. This does not affect v17 and later as this was fixed in the refactor in vitessio#12197.
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Description
This PR refactors how routing of queries is done during query planning.
Why?
The logic for which routes can be merged together is an important and complex part of the query planner.
Making the code easy to understand and talk about is critical to get this correct.
The old Route operator consisted of a set of fields. Of these, only Source, RouteOpCode and MergedWith are valid for all types of routes. All other fields only make sense for some of the OpCodes that the route represents.
The fields VindexPreds and Selected only make sense for sharded tables, which are represented by a lot of OpCodes, such as Scatter, EqualUnique, etc. SysTableTableSchema and SysTableTableName are only used for information_schema tables (OpCode DBA).
In a lot of places, we had to use a switch statement on the OpCode to handle things differently depending on the type of Route we were dealing with.
The Change
To me, this screamed for an interface with multiple implementations, one for each type of route we have.
The new operator now contains:
The Routing interface is then used for picking the best plan per table in the query, and then the merging of multiple Routes into as few as possible.
While doing this refactoring, I tried to keep the tests intact and only change the code behind. For the few exceptions to this rule, I have added comments in this PR explaining why the change was introduced.
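The interface-based design described under The Change can be sketched roughly as follows. This is a reduced illustration, not the code in this PR: the types use plain strings instead of sqlparser expressions, `RoutingParameters` is a toy stand-in for engine.RoutingParameters, and the single-method `Routing` interface and the `Params` helper are assumptions. The point is that each kind of route owns only the fields that make sense for it, so callers no longer need a switch on the OpCode.

```go
package main

import "fmt"

// RoutingParameters is a toy stand-in for engine.RoutingParameters.
type RoutingParameters struct {
	Opcode              string
	Vindex              string
	SysTableTableSchema []string
}

// Routing is a hypothetical, much-reduced version of the interface.
// Each implementation knows how to fill in its own parameters.
type Routing interface {
	UpdateRoutingParams(rp *RoutingParameters)
}

// ShardedRouting carries the fields that only make sense for sharded tables.
type ShardedRouting struct {
	Opcode   string // for example "Scatter" or "EqualUnique"
	Selected string // name of the chosen vindex, empty if none
}

func (sr *ShardedRouting) UpdateRoutingParams(rp *RoutingParameters) {
	rp.Opcode = sr.Opcode
	rp.Vindex = sr.Selected
}

// InfoSchemaRouting carries the fields only used for information_schema tables.
type InfoSchemaRouting struct {
	SysTableTableSchema []string
}

func (isr *InfoSchemaRouting) UpdateRoutingParams(rp *RoutingParameters) {
	rp.Opcode = "DBA"
	// Copy the slice so the route and the parameters do not share storage.
	rp.SysTableTableSchema = append([]string(nil), isr.SysTableTableSchema...)
}

// Route keeps what is common to every route and delegates the rest.
type Route struct {
	Source  string
	Routing Routing
}

// Params builds the routing parameters without inspecting the opcode.
func (r *Route) Params() RoutingParameters {
	var rp RoutingParameters
	r.Routing.UpdateRoutingParams(&rp)
	return rp
}

func main() {
	routes := []*Route{
		{Source: "user", Routing: &ShardedRouting{Opcode: "EqualUnique", Selected: "user_index"}},
		{Source: "information_schema.tables", Routing: &InfoSchemaRouting{SysTableTableSchema: []string{":__vtschemaname"}}},
	}
	for _, r := range routes {
		fmt.Printf("%s -> %s\n", r.Source, r.Params().Opcode)
	}
}
```

Adding a new kind of route in this style means adding one more implementation of the interface, while the code that walks over routes stays unchanged.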
Checklist