Partial Movetables: allow moving a keyspace one shard at a time #9987
rohit-nayak-ps merged 20 commits into vitessio:main
Squashed all commits since it was becoming tougher to fix conflicts each time with a large number of commits.
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…e multiple config.json files Signed-off-by: Rohit Nayak <rohit@planetscale.com>
mattlord left a comment
Nice work on this! There were a couple of things we should change before merging (example related), but otherwise it's a few minor questions and suggestions that you can make the final call on. I'll review your feedback tomorrow and quickly approve. Thanks!
go/test/endtoend/vtgate/misc_test.go
Outdated
I'm not sure what done means here. 🙂 We're making it a future ToDo post merge?
go/vt/vtgate/planbuilder/bypass.go
Outdated
I see. In that case, any reason not to export the flag variable enableShardRouting->vtgate.EnableShardRouting and reference that directly? That would make more sense to me than creating these new exported functions in the planbuilder package: EnableShardRoutingFlag() and IsShardRoutingEnabled().
    enableSchemaChangeSignal = flag.Bool("schema_change_signal", true, "Enable the schema tracker; requires queryserver-config-schema-change-signal to be enabled on the underlying vttablets for this to work")
    schemaChangeUser = flag.String("schema_change_signal_user", "", "User to be used to send down query to vttablet to retrieve schema changes")

    enableShardRouting = flag.Bool("enable_partial_keyspace_migration", false, "(Experimental) Follow shard routing rules: enable only while migrating a keyspace shard by shard. See documentation on Partial MoveTables for more. (default false)")
Minor thing, but I think we're supposed to use dashes for new flags.
…moving a keyspace one shard at a time vitessio#9987 Signed-off-by: Vilius Okockis <vilius.okockis@vinted.com>
TL;DR;
Description
This feature introduces the concept of partial keyspaces, where some shards are served from a different keyspace. This is
useful for a specific but critical use case: a large production Vitess setup (hundreds of shards) being migrated to
a new data center or provider. Migrating the entire cluster in one go using MoveTables could cause unacceptable
downtime because of the large number of primaries that need to be synced when writes are switched.
Sample Usage
A partial MoveTables is signalled by --source_shards:
    vtctlclient MoveTables -- -source customer --tables 'customer,corder' --source_shards '-80' Create customer2.partial1
VDiff works as-is:
    vtctlclient VDiff customer2.partial1
SwitchTraffic generates this shard routing rule:
    vtctlclient MoveTables -- SwitchTraffic customer2.partial1
    {"rules":[{"from_keyspace":"customer", "to_keyspace":"customer2", "shards":"-80"}]}
Demo that shard routing now works
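To make the shape of the shard routing rule generated by SwitchTraffic concrete, here is a minimal Go sketch that decodes the JSON shown in the sample usage. The type and function names (`shardRoutingRule`, `shardRoutingRules`, `parseShardRoutingRules`) are hypothetical and exist only for this illustration; Vitess defines its own types for these rules, which this sketch does not attempt to reproduce.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shardRoutingRule mirrors the JSON fields from the sample output above.
// Hypothetical type for illustration only, not the Vitess definition.
type shardRoutingRule struct {
	FromKeyspace string `json:"from_keyspace"`
	ToKeyspace   string `json:"to_keyspace"`
	Shards       string `json:"shards"`
}

// shardRoutingRules is the top-level document: a list of rules.
type shardRoutingRules struct {
	Rules []shardRoutingRule `json:"rules"`
}

// parseShardRoutingRules decodes a JSON document into the sketch types.
func parseShardRoutingRules(data []byte) (*shardRoutingRules, error) {
	var rules shardRoutingRules
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, err
	}
	return &rules, nil
}

func main() {
	sample := []byte(`{"rules":[{"from_keyspace":"customer", "to_keyspace":"customer2", "shards":"-80"}]}`)
	rules, err := parseShardRoutingRules(sample)
	if err != nil {
		panic(err)
	}
	// Print each rule as "<from keyspace>/<shards> -> <to keyspace>".
	for _, r := range rules.Rules {
		fmt.Printf("%s/%s -> %s\n", r.FromKeyspace, r.Shards, r.ToKeyspace)
	}
}
```

Read this way, the single rule says that shard -80 of keyspace customer is served from keyspace customer2.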
Summary of code changes
Core Changes
Workflow Show changes
While we had the possibility of partial reads being switched earlier, now writes can also be partially switched in a workflow.
Shard Routing Rules
Topo
Shard Routing Rules are a new concept introduced for this feature. Each rule maps a (keyspace, shard) tuple to another keyspace, and together the rules form a cluster-level map. They are set by SwitchTraffic in a partial MoveTables and used by vtgate while routing queries. The vtctlclient commands ApplyShardRoutingRules and GetShardRoutingRules allow setting and getting these rules.
vtgate Shard Targeting
Shard-targeted query routing goes through vtgate's bypass mechanism in go/vt/vtgate/planbuilder/bypass.go. We create a map from the SrvVSchema's shard routing rules object and check whether a specified shard destination needs to be rerouted.
vtgate Global Routing
Global query routing uses vtgate's ResolveDestinations() methods in go/vt/vtgate/vcursor_impl.go. We go through all selected shard destinations and modify those that are mapped in the shard routing rules.
New column workflow_sub_type in _vt.vreplication
There is a new bool column workflow_sub_type added to _vt.vreplication, set for partial movetables. It is used for visibility and for bypassing certain validations that expect a full keyspace.
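The lookup described under vtgate Shard Targeting and vtgate Global Routing can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in bypass.go or vcursor_impl.go: the types `shardKey` and `shardRoutingRule` and the functions `buildShardRoutingMap`, `rerouteShard`, and `rerouteAll` are hypothetical names, and real destinations in vtgate carry more than a keyspace and shard name.

```go
package main

import "fmt"

// shardKey identifies a (keyspace, shard) pair. Hypothetical type.
type shardKey struct {
	Keyspace string
	Shard    string
}

// shardRoutingRule maps one shard of a keyspace to another keyspace.
// Hypothetical type; it assumes one shard per rule for simplicity.
type shardRoutingRule struct {
	FromKeyspace string
	ToKeyspace   string
	Shard        string
}

// buildShardRoutingMap turns a list of rules into a lookup table
// keyed by (keyspace, shard).
func buildShardRoutingMap(rules []shardRoutingRule) map[shardKey]string {
	routing := make(map[shardKey]string, len(rules))
	for _, r := range rules {
		routing[shardKey{Keyspace: r.FromKeyspace, Shard: r.Shard}] = r.ToKeyspace
	}
	return routing
}

// rerouteShard returns the target to use for a single shard destination.
// When the feature is disabled or no rule matches, the target is unchanged.
func rerouteShard(routing map[shardKey]string, enabled bool, target shardKey) shardKey {
	if !enabled {
		return target
	}
	if to, ok := routing[target]; ok {
		return shardKey{Keyspace: to, Shard: target.Shard}
	}
	return target
}

// rerouteAll applies rerouteShard to every selected destination, which is
// the shape of the global routing case: only mapped shards are modified.
func rerouteAll(routing map[shardKey]string, enabled bool, targets []shardKey) []shardKey {
	out := make([]shardKey, 0, len(targets))
	for _, t := range targets {
		out = append(out, rerouteShard(routing, enabled, t))
	}
	return out
}

func main() {
	routing := buildShardRoutingMap([]shardRoutingRule{
		{FromKeyspace: "customer", ToKeyspace: "customer2", Shard: "-80"},
	})
	targets := []shardKey{
		{Keyspace: "customer", Shard: "-80"},
		{Keyspace: "customer", Shard: "80-"},
	}
	for _, t := range rerouteAll(routing, true, targets) {
		fmt.Printf("%s/%s\n", t.Keyspace, t.Shard)
	}
}
```

With the single rule from the sample usage, only the -80 destination moves to customer2, while 80- stays on customer. Passing `enabled` as false leaves every destination untouched, which corresponds to running vtgate without the flag.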
Flags
vtgate --enable-partial-keyspace-migration
Default: false. It is used when a cluster is set up with shard routing rules and tells vtgate to use these rules while routing queries.
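As a small illustration of the flag's semantics, the sketch below registers a boolean flag with the dashed name and a default of false using Go's standard `flag` package. The helper `newSketchFlags` and the FlagSet name are hypothetical, and the help text is abbreviated; how vtgate actually registers and consumes the flag is not shown here.

```go
package main

import (
	"flag"
	"fmt"
)

// newSketchFlags registers the flag on a private FlagSet so the example
// does not touch the process-wide flags. Hypothetical helper.
func newSketchFlags() (*flag.FlagSet, *bool) {
	fs := flag.NewFlagSet("vtgate-sketch", flag.ContinueOnError)
	enabled := fs.Bool(
		"enable-partial-keyspace-migration",
		false, // default: shard routing rules are ignored
		"(Experimental) Follow shard routing rules",
	)
	return fs, enabled
}

func main() {
	fs, enabled := newSketchFlags()
	// Go's flag package accepts both -name and --name forms.
	if err := fs.Parse([]string{"--enable-partial-keyspace-migration"}); err != nil {
		panic(err)
	}
	fmt.Println("shard routing enabled:", *enabled)
}
```

Without the flag on the command line, `*enabled` stays false and routing would ignore the shard routing rules.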
MoveTables --source_shards
The only flag needed to tell MoveTables to perform a partial movetables is --source_shards, for example --source_shards -80. This flag already exists and is used by Reshard.
Notes
Read and write traffic are both switched at the same time when a shard's migration is deemed complete (using SwitchTraffic), because that is when the shard routing rule is added. Switching reads and writes separately is done by updating the regular routing rules that target @replica and @rdonly.
Test Changes
e2e test TestPartialMoveTables
It creates a workflow that moves tables from only one shard to the target shard, ensures it completed correctly, and switches traffic. It also ensures that the shard routing rules and regular routing rules are set up correctly and that vtgate queries, both shard-targeted and global, are routed correctly.
Modified vtgate tests
A new set of tests has been added where the test cluster is set up as a partial cluster: one shard is in a different keyspace, with the ShardRoutingRules set up. The main change is that the vtgate params need to specify the default keyspace as the DbName to avoid ambiguity.
Some tests have to be skipped for partial keyspaces because they intrinsically affect tablets in the base cluster, which are now not all serving.
TODOs:
[ ] Add explicit reasons for Skipped vtgate unit tests during partial keyspace
Checklist