diff --git a/content/en/docs/18.0/reference/vreplication/movetables.md b/content/en/docs/18.0/reference/vreplication/movetables.md index 768af7dcc..652dd2e4c 100644 --- a/content/en/docs/18.0/reference/vreplication/movetables.md +++ b/content/en/docs/18.0/reference/vreplication/movetables.md @@ -2,11 +2,13 @@ title: MoveTables description: Move tables between keyspaces without downtime weight: 10 -aliases: ['/docs/reference/vreplication/v2/movetables/'] +aliases: [ '/docs/reference/vreplication/v2/movetables/' ] --- {{< warning >}} -These workflows can have a significant impact on the source tablets (which are often in production) — especially when a PRIMARY tablet is used as a source. You can limit the impact on the source tablets using the [`--vreplication_copy_phase_max_*` vttablet flags](../flags/#vreplication_copy_phase_max_innodb_history_list_length) +These workflows can have a significant impact on the source tablets (which are often in production) — especially when a +PRIMARY tablet is used as a source. You can limit the impact on the source tablets using +the [`--vreplication_copy_phase_max_*` vttablet flags](../flags/#vreplication_copy_phase_max_innodb_history_list_length) {{< /warning >}} ## Command @@ -14,6 +16,7 @@ These workflows can have a significant impact on the source tablets (which are o ``` MoveTables -- ``` + or ``` @@ -21,24 +24,29 @@ MoveTables -- [--source=] [--tables=] [--cells=] [--tablet_types=] [--all] [--exclude=] [--auto_start] [--stop_after_copy] [--timeout=timeoutDuration] [--reverse_replication] [--keep_data] [--keep_routing_rules] [--on-ddl=] [--source_time_zone=] - [--initialize-target-sequences] + [--initialize-target-sequences] [--atomic-copy] ``` ## Description -`MoveTables` is used to start and manage workflows to move one or more tables from an external database or an existing Vitess keyspace into a new Vitess keyspace. The target keyspace can be unsharded or sharded. 
+`MoveTables` is used to start and manage workflows to move one or more tables from an external database or an existing +Vitess keyspace into a new Vitess keyspace. The target keyspace can be unsharded or sharded. -`MoveTables` is typically used for migrating data into Vitess or to implement vertical sharding. You might use the former when you first start using Vitess and the latter if you want to distribute your load across servers without sharding tables. +`MoveTables` is typically used for migrating data into Vitess or to implement vertical sharding. You might use the +former when you first start using Vitess and the latter if you want to distribute your load across servers without +sharding tables. ## Parameters ### action `MoveTables` is an "umbrella" command. The `action` sub-command defines the operation on the workflow. -Action must be one of the following: `Create`, `Show`, `Progress`, `SwitchTraffic`, `ReverseTraffic`, `Cancel`, or `Complete`. +Action must be one of the following: `Create`, `Show`, `Progress`, `SwitchTraffic`, `ReverseTraffic`, `Cancel`, +or `Complete`. #### Create +
`Create` sets up and creates a new workflow. The workflow name should not conflict with that of an existing workflow. @@ -46,51 +54,71 @@ Action must be one of the following: `Create`, `Show`, `Progress`, `SwitchTraffi
#### Show +
-`Show` displays useful information about a workflow. (At this time the [Workflow](../workflow) Show command gives more information. This will be improved over time.) +`Show` displays useful information about a workflow. (At this time the [Workflow](../workflow) Show command gives more +information. This will be improved over time.)
#### Progress +
-`Progress` reports the progress of a workflow by showing the percentage of data copied across targets, if workflow is in copy state, and the replication lag between the target and the source once the copy phase is completed. +`Progress` reports the progress of a workflow by showing the percentage of data copied across targets, if workflow is in +copy state, and the replication lag between the target and the source once the copy phase is completed. -It is too expensive to get real-time row counts of tables, using _count(*)_, say. So we use the statistics available in the `information_schema` to approximate copy progress. This data can be significantly off (up to 50-60%) depending on the utilization of the underlying mysql server resources. You can manually run `analyze table` to update the statistics if so desired. +It is too expensive to get real-time row counts of tables, using _count(*)_, say. So we use the statistics available in +the `information_schema` to approximate copy progress. This data can be significantly off (up to 50-60%) depending on +the utilization of the underlying mysql server resources. You can manually run `analyze table` to update the statistics +if so desired.
#### SwitchTraffic +
-`SwitchTraffic` switches traffic forward for the `tablet_types` specified. This replaces the previous `SwitchReads` and `SwitchWrites` commands with a single one. It is now possible to switch all traffic with just one command, and this is the default behavior. Also, you can now switch replica, rdonly and primary traffic in any order: earlier you needed to `SwitchReads` (for replicas and rdonly tablets) before `SwitchWrites`. +`SwitchTraffic` switches traffic forward for the `tablet_types` specified. This replaces the previous `SwitchReads` +and `SwitchWrites` commands with a single one. It is now possible to switch all traffic with just one command, and this +is the default behavior. Also, you can now switch replica, rdonly and primary traffic in any order: earlier you needed +to `SwitchReads` (for replicas and rdonly tablets) before `SwitchWrites`.
#### ReverseTraffic +
-`ReverseTraffic` switches traffic in the reverse direction for the `tablet_types` specified. The traffic should have been previously switched forward using `SwitchTraffic` for the `cells` and `tablet_types` specified. +`ReverseTraffic` switches traffic in the reverse direction for the `tablet_types` specified. The traffic should have +been previously switched forward using `SwitchTraffic` for the `cells` and `tablet_types` specified.
#### Cancel +
-`Cancel` can be used if a workflow was created in error or was misconfigured and you prefer to create a new workflow instead of fixing this one. `Cancel` can only be called if no traffic has been switched. It removes vreplication-related artifacts like rows from the vreplication and copy_state tables in the sidecar `_vt` database along with routing rules and blacklisted tables from the topo and, by default, the target tables on the target keyspace +`Cancel` can be used if a workflow was created in error or was misconfigured and you prefer to create a new workflow +instead of fixing this one. `Cancel` can only be called if no traffic has been switched. It removes vreplication-related +artifacts like rows from the vreplication and copy_state tables in the sidecar `_vt` database along with routing rules +and blacklisted tables from the topo and, by default, the target tables on the target keyspace (see [`--keep_data`](./#--keep_data) and [`--rename_tables`](#--rename_tables)).
#### Complete +
{{< warning >}} This is a destructive command {{< /warning >}} -`Complete` is used after all traffic has been switched. It removes vreplication-related artifacts like rows from vreplication and copy_state tables in the sidecar `_vt` database along with routing rules and blacklisted tables from the topo. By default, the source tables are also dropped on the target keyspace +`Complete` is used after all traffic has been switched. It removes vreplication-related artifacts like rows from +vreplication and copy_state tables in the sidecar `_vt` database along with routing rules and blacklisted tables +from the topo. By default, the source tables are also dropped on the target keyspace (see [`--keep_data`](./#--keep_data) and [`--rename_tables`](#--rename_tables)).
@@ -99,7 +127,8 @@ This is a destructive command Each `action` has additional options/parameters that can be used to modify its behavior. -`actions` are common to both `MoveTables` and `Reshard` workflows. Only the `create` action has different parameters, all other actions have common options and similar semantics. +`actions` are common to both `MoveTables` and `Reshard` workflows. Only the `create` action has different parameters, +all other actions have common options and similar semantics. #### --all @@ -110,7 +139,31 @@ Move all tables from the source keyspace. +#### --atomic-copy +**optional**\ +**default** "false" + +When this flag is set, the workflow will run in a single copy phase, with no catchup/fastforward phases. A single +snapshot is taken of all tables in the source and copied to the target. This allows you to import an entire database +for example if it contains tables with parent/child relationships or for performance reasons, to speed up the copy +of a database with a large number of small tables. The single copy phase copies *all* tables in the source to the +target. This implies the following limitations +* if the workflow breaks at any time during the copy phase it will have to be cancelled and restarted as a new workflow. +* since the copy phase works with a single snapshot of all tables it could take a long time to complete if the tables + are large or the network is slow. You may want to increase the default `--vreplication_copy_phase_duration` + vttablet flag which is set to 1 hour by default. +* the read lock held by the snapshot could impact performance due to the InnoDB History Length on certain + hardware if the write-qps is high. 
+ +Implies: +* `--all` (all tables will be copied) + +Incompatible with: +* `--tables` (cannot specify tables to copy, all tables will be copied) +* `--exclude` (cannot exclude tables from copy, all tables will be copied) + #### --auto_start + **optional**\ **default** true @@ -124,15 +177,16 @@ to false then the workflow is in a Stopped state until you explicitly start it. ###### Uses * Allows updating the rows in `_vt.vreplication` after `MoveTables` has setup the -streams. For example, you can add some filters to specific tables or change the -projection clause to modify the values on the target. This -provides an easier way to create simpler Materialize workflows by first using -`MoveTables` with auto_start false, updating the BinlogSource as required by your -`Materialize` and then start the workflow. + streams. For example, you can add some filters to specific tables or change the + projection clause to modify the values on the target. This + provides an easier way to create simpler Materialize workflows by first using + `MoveTables` with auto_start false, updating the BinlogSource as required by your + `Materialize` and then start the workflow. * Changing the `copy_state` and/or `pos` values to restart a broken `MoveTables` workflow -from a specific point of time + from a specific point of time #### --cells + **optional**\ **default** local cell (of source tablet)\ **string** @@ -150,6 +204,7 @@ Comma seperated list of Cell(s) and/or CellAlias(es) to replicate from. * Select cells where replica lags are lower #### --defer-secondary-keys + **optional**\ **default** false @@ -159,7 +214,8 @@ If true, any secondary keys are dropped from the table definitions on the target tables for the [copy phase](../internal/life-of-a-stream/#copy). The exact same key definitions are then re-added when the copy phase completes for each table. -With this method all secondary index records for the table are generated in one bulk operation. 
This should significantly +With this method all secondary index records for the table are generated in one bulk operation. This should +significantly improve the overall copy phase execution time on large tables with many secondary keys — especially with [MySQL 8.0.31](https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-31.html) and later due to InnoDB's support for parallel index builds. This is logically similar to the @@ -168,6 +224,7 @@ parallel index builds. This is logically similar to the #### --drop_foreign_keys + **optional**\ **default** false @@ -178,6 +235,7 @@ If true, tables in the target keyspace will be created without any foreign keys #### --dry_run + **optional**\ **default** false @@ -189,6 +247,7 @@ but the command logs all the steps that would be taken. #### --exclude + **optional** only applies if `--all` is specified
@@ -198,12 +257,14 @@ If moving all tables, specifies tables to be skipped.
#### --initialize-target-sequences + **optional**\ **default** false
-If specified, when switching write (primary tablet) traffic for tables that are being moved from an unsharded keyspace to a +If specified, when switching write (primary tablet) traffic for tables that are being moved from an unsharded keyspace +to a sharded one, initialize any sequences being used by those tables on the target. They are initialized using the current maximum value for the column across all shards on the target. @@ -212,23 +273,30 @@ maximum value for the column across all shards on the target. ###### Uses * It's common that users import unsharded data into Vitess — sharding it in the process — or move -tables from an unsharded keyspace to a sharded one as they become too large for a single MySQL instance. -When doing either of these you would typically be leveraging [MySQL auto_increment](https://dev.mysql.com/doc/refman/en/example-auto-increment.html) -columns for primary keys on the unsharded tables (source). On the sharded target, however, you will then -need to use [Vitess Sequences](../../features/vitess-sequences/) in order to ensure that you continue having -automatically generated incrementing unique primary keys _across all shards_. When it comes to [switching the write traffic](#switchtraffic) -during this move you would need to manually ensure that you [initialize the sequences](../../features/vitess-sequences/#initializing-a-sequence) -so that the next values they provide are higher than any already used on the source (with ample buffer in between -to avoid potential identifier reuse and duplicate key errors immediately following the cutover). This flag tells Vitess -to manage this sequence initialization for you as part of the `SwitchTraffic` operation to ensure a seamless cutover -without any additional manual steps. For more information, please see [the feature request](https://github.com/vitessio/vitess/issues/13685). + tables from an unsharded keyspace to a sharded one as they become too large for a single MySQL instance. 
+ When doing either of these you would typically be + leveraging [MySQL auto_increment](https://dev.mysql.com/doc/refman/en/example-auto-increment.html) + columns for primary keys on the unsharded tables (source). On the sharded target, however, you will then + need to use [Vitess Sequences](../../features/vitess-sequences/) in order to ensure that you continue having + automatically generated incrementing unique primary keys _across all shards_. When it comes + to [switching the write traffic](#switchtraffic) + during this move you would need to manually ensure that + you [initialize the sequences](../../features/vitess-sequences/#initializing-a-sequence) + so that the next values they provide are higher than any already used on the source (with ample buffer in between + to avoid potential identifier reuse and duplicate key errors immediately following the cutover). This flag tells + Vitess + to manage this sequence initialization for you as part of the `SwitchTraffic` operation to ensure a seamless cutover + without any additional manual steps. For more information, please + see [the feature request](https://github.com/vitessio/vitess/issues/13685). {{< info >}} -You will still need to take the manual step of [creating each backing sequence table](../../features/vitess-sequences/#creating-a-sequence) +You will still need to take the manual step +of [creating each backing sequence table](../../features/vitess-sequences/#creating-a-sequence) in an unsharded keyspace of your choosing prior to the `SwitchTraffic` operation. {{< /info>}} #### --keep_data + **optional**\ **default** false @@ -239,30 +307,43 @@ Usually, the target tables are deleted by `Cancel`. If this flag is used the tar
#### --keep_routing_rules + **optional**\ **default** false
-Usually, any routing rules created by the workflow in the source and target keyspace are removed by `Complete` or `Cancel`. If this flag is used the routing rules will be left in place. +Usually, any routing rules created by the workflow in the source and target keyspace are removed by `Complete` +or `Cancel`. If this flag is used the routing rules will be left in place.
#### --max_replication_lag_allowed + **optional**\ **default** the value used for `--timeout`
-While executing `SwitchTraffic` we ensure that the VReplication lag for the workflow is less than this duration, otherwise report an error and don't attempt the switch. The calculated VReplication lag is the estimated maximum lag across workflow streams between the last event seen at the source and the last event processed by the target (which would be a heartbeat event if we're fully caught up). Usually, when VReplication has caught up, this lag should be very small (under a second). +While executing `SwitchTraffic` we ensure that the VReplication lag for the workflow is less than this duration, +otherwise report an error and don't attempt the switch. The calculated VReplication lag is the estimated maximum lag +across workflow streams between the last event seen at the source and the last event processed by the target (which +would be a heartbeat event if we're fully caught up). Usually, when VReplication has caught up, this lag should be very +small (under a second). -While switching write traffic, we temporarily make the source databases read-only, and wait for the targets to catch up. This means that the application can effectively be partially down for this cutover period as writes will pause or error out. While switching write traffic this flag can ensure that you only switch traffic if the current lag is low, thus limiting this period of write-unavailability and avoiding it entirely if we're not likely to catch up within the `--timeout` window. +While switching write traffic, we temporarily make the source databases read-only, and wait for the targets to catch up. +This means that the application can effectively be partially down for this cutover period as writes will pause or error +out. While switching write traffic this flag can ensure that you only switch traffic if the current lag is low, thus +limiting this period of write-unavailability and avoiding it entirely if we're not likely to catch up within +the `--timeout` window. 
-While switching read traffic this can also be used to set an approximate upper bound on how stale reads will be against the replica tablets when using `@replica` shard targeting. +While switching read traffic this can also be used to set an approximate upper bound on how stale reads will be against +the replica tablets when using `@replica` shard targeting.
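A hedged sketch of how the two guards described above might be combined for a write cutover: `--timeout` stays at its documented default of 30s, while `--max_replication_lag_allowed` is tightened so that the switch is refused unless the lag is already low. The workflow identifier `customer.commerce2customer`, the 10s value, and the `vtctlclient --server localhost:15999` invocation are assumptions chosen for illustration. The snippet prints the command rather than running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical workflow identifier (targetKeyspace.workflow).
WORKFLOW_ID="customer.commerce2customer"

# Assumed client invocation; adjust to match your cluster.
VTCTL=(vtctlclient --server localhost:15999)

# --timeout: how long SwitchTraffic waits for the target to catch up (default 30s).
TIMEOUT="30s"
# --max_replication_lag_allowed: refuse to switch if the lag exceeds this (assumed value).
MAX_LAG="10s"

# Build the command as an array so each argument stays intact.
SWITCH_CMD=("${VTCTL[@]}" MoveTables -- --timeout="${TIMEOUT}" --max_replication_lag_allowed="${MAX_LAG}" SwitchTraffic "${WORKFLOW_ID}")

# Print the command instead of running it; remove `echo` to execute.
echo "${SWITCH_CMD[*]}"
```

Keeping the allowed lag well below the timeout means the read-only window on the source is only entered when the target is likely to catch up quickly.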
#### --on-ddl + **optional**\ **default** IGNORE @@ -283,39 +364,49 @@ in the replication stream from the source. The values can be as follows: {{< warning >}} We caution against using `EXEC` or `EXEC_IGNORE` for the following reasons: - * You may want a different schema on the target - * You may want to apply the DDL in a different way on the target - * The DDL may take a long time to apply on the target and may disrupt replication, performance, and query execution while it is being applied (if serving traffic from the target) -{{< /warning >}} + +* You may want a different schema on the target +* You may want to apply the DDL in a different way on the target +* The DDL may take a long time to apply on the target and may disrupt replication, performance, and query execution + while it is being applied (if serving traffic from the target) + {{< /warning >}} #### --rename_tables + **optional**\ **default** false
-During `Complete` or `Cancel` operations, the tables are renamed instead of being deleted. Currently the new name is _<table_name>_old. +During `Complete` or `Cancel` operations, the tables are renamed instead of being deleted. Currently the new name is _ +<table_name>_old. -We use the same renaming logic used by [`pt-online-schema-change`](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html). +We use the same renaming logic used +by [`pt-online-schema-change`](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html). Such tables are automatically skipped by vreplication if they exist on the source.
#### --reverse_replication + **optional**\ **default** true
-`SwitchTraffic` for primary tablet types, by default, starts a reverse replication stream with the current target as the source, replicating back to the original source. This enables a quick and simple rollback mechanism using `ReverseTraffic`. This reverse workflow name is that of the original workflow concatenated with \_reverse. +`SwitchTraffic` for primary tablet types, by default, starts a reverse replication stream with the current target as the +source, replicating back to the original source. This enables a quick and simple rollback mechanism +using `ReverseTraffic`. This reverse workflow name is that of the original workflow concatenated with \_reverse. -If set to false these reverse replication streams will not be created and you will not be able to rollback once you have switched write traffic over to the target. +If set to false these reverse replication streams will not be created and you will not be able to rollback once you have +switched write traffic over to the target.
#### --source + **mandatory**
@@ -324,24 +415,32 @@ Name of existing keyspace that contains the tables to be moved.
#### --source_time_zone + **optional**\ **default** ""
-Specifying this flag causes all `DATETIME` fields to be converted from the given time zone into `UTC`. It is expected that the application has -stored *all* `DATETIME` fields, in all tables being moved, in the specified time zone. On the target these `DATETIME` values will be stored in `UTC`. +Specifying this flag causes all `DATETIME` fields to be converted from the given time zone into `UTC`. It is expected +that the application has +stored *all* `DATETIME` fields, in all tables being moved, in the specified time zone. On the target these `DATETIME` +values will be stored in `UTC`. -As a best practice, Vitess expects users to run their MySQL servers in `UTC`. So we do not specify a target time zone for the conversion. -It is expected that the [time zone tables have been pre-populated](https://dev.mysql.com/doc/refman/en/time-zone-support.html#time-zone-installation) on the target mysql servers. +As a best practice, Vitess expects users to run their MySQL servers in `UTC`. So we do not specify a target time zone +for the conversion. +It is expected that +the [time zone tables have been pre-populated](https://dev.mysql.com/doc/refman/en/time-zone-support.html#time-zone-installation) +on the target mysql servers. Any reverse replication streams running after a SwitchWrites will do the reverse date conversion on the source. -Note that selecting the `DATETIME` columns from the target will now give the times in UTC. It is expected that the application will +Note that selecting the `DATETIME` columns from the target will now give the times in UTC. It is expected that the +application will perform any conversions using, for example, `SET GLOBAL time_zone = 'US/Pacific'` or `convert_tz()`. -Also note that only columns of `DATETIME` data types are converted. If you store `DATETIME` values as `VARCHAR` or `VARBINARY` strings, -setting this flag will not convert them. +Also note that only columns of `DATETIME` data types are converted. 
If you store `DATETIME` values as `VARCHAR` +or `VARBINARY` strings, +setting this flag will not convert them.
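As an illustration only, a `Create` invocation that declares the source `DATETIME` values to be stored in `US/Pacific` might look like the sketch below. The keyspace names `commerce` and `customer`, the workflow name `commerce2customer`, and the `vtctlclient --server localhost:15999` invocation are hypothetical. The snippet prints the command rather than running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names, chosen only for illustration.
SOURCE_KEYSPACE="commerce"
WORKFLOW_ID="customer.commerce2customer"

# Time zone in which the application stored its DATETIME values on the source.
SOURCE_TZ="US/Pacific"

# Assumed client invocation; adjust to match your cluster.
VTCTL=(vtctlclient --server localhost:15999)

# Move all tables and convert DATETIME columns from SOURCE_TZ to UTC on the target.
TZ_CMD=("${VTCTL[@]}" MoveTables -- --source="${SOURCE_KEYSPACE}" --all --source_time_zone="${SOURCE_TZ}" Create "${WORKFLOW_ID}")

# Print the command instead of running it; remove `echo` to execute.
echo "${TZ_CMD[*]}"
```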
@@ -359,10 +458,12 @@ is small enough to start replicating, the workflow state will be set to Stopped. ###### Uses + * If you just want a consistent snapshot of all the tables you can set this flag. The workflow -will stop once the copy is done and you can then mark the workflow as `Complete`. + will stop once the copy is done and you can then mark the workflow as `Complete`. #### --tables + **optional** one of `--tables` or `--all` needs to be specified
@@ -384,9 +485,11 @@ _Or_
-#### --tablet_types +#### --tablet_types + **optional**\ -**default** `--vreplication_tablet_type` parameter value for the tablet. `--vreplication_tablet_type` has the default value of "in_order:REPLICA,PRIMARY".\ +**default** `--vreplication_tablet_type` parameter value for the tablet. `--vreplication_tablet_type` has the default +value of "in_order:REPLICA,PRIMARY".\ **string**
@@ -397,12 +500,14 @@ specified impacts [tablet selection](../tablet_selection/) for the workflow.
#### --timeout + **optional**\ **default** 30s
-For primary tablets, SwitchTraffic first stops writes on the source primary and waits for the replication to the target to +For primary tablets, SwitchTraffic first stops writes on the source primary and waits for the replication to the target +to catch up with the point where the writes were stopped. If the wait time is longer than timeout the command will error out. For setups with high write qps you may need to increase this value. @@ -412,38 +517,45 @@ the command will error out. For setups with high write qps you may need to incre
-All workflows are identified by `targetKeyspace.workflow` where `targetKeyspace` is the name of the keyspace to which the tables are being moved. `workflow` is a name you assign to the `MoveTables` workflow to identify it. +All workflows are identified by `targetKeyspace.workflow` where `targetKeyspace` is the name of the keyspace to which +the tables are being moved. `workflow` is a name you assign to the `MoveTables` workflow to identify it.
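A minimal sketch of the identifier format described above: the snippet composes `targetKeyspace.workflow` and prints a `Create` command that uses it. The keyspace names `commerce` and `customer`, the workflow name `commerce2customer`, the table names, and the `vtctlclient --server localhost:15999` invocation are all assumptions chosen for illustration; substitute your own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names, chosen only for illustration.
SOURCE_KEYSPACE="commerce"
TARGET_KEYSPACE="customer"
WORKFLOW_NAME="commerce2customer"

# Workflows are identified as targetKeyspace.workflow.
WORKFLOW_ID="${TARGET_KEYSPACE}.${WORKFLOW_NAME}"

# Assumed client invocation; adjust to match your cluster.
VTCTL=(vtctlclient --server localhost:15999)

# Build the Create command as an array so each argument stays intact.
CREATE_CMD=("${VTCTL[@]}" MoveTables -- --source="${SOURCE_KEYSPACE}" --tables="customer,corder" Create "${WORKFLOW_ID}")

# Print the command instead of running it; remove `echo` to execute.
echo "${CREATE_CMD[*]}"
```

The same `WORKFLOW_ID` value is then passed to the other actions (`Show`, `Progress`, `SwitchTraffic`, `Complete`) for that workflow.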
- ## The most basic MoveTables Workflow lifecycle 1. Initiate the migration using `Create`
-`MoveTables -- --source= --tables= Create ` + `MoveTables -- --source= --tables= Create ` 1. Monitor the workflow using `Show` or `Progress`
-`MoveTables Show ` _*or*_
-`MoveTables Progress `
+ `MoveTables Show ` _*or*_
+ `MoveTables Progress `
1. Confirm that data has been copied over correctly using [VDiff](../vdiff) 1. Cutover to the target keyspace with `SwitchTraffic`
-`MoveTables SwitchTraffic ` + `MoveTables SwitchTraffic ` 1. Cleanup vreplication artifacts and source tables with `Complete`
-`MoveTables Complete ` - + `MoveTables Complete ` ## Common use cases for MoveTables ### Adopting Vitess -For those wanting to try out Vitess for the first time, `MoveTables` provides an easy way to route part of their workload to Vitess with the ability to migrate back at any time without any risk. You point a vttablet to your existing MySQL installation, spin up an unsharded Vitess cluster and use a `MoveTables` workflow to start serving some tables from Vitess. You can also go further and use a Reshard workflow to experiment with a sharded version of a part of your database. +For those wanting to try out Vitess for the first time, `MoveTables` provides an easy way to route part of their +workload to Vitess with the ability to migrate back at any time without any risk. You point a vttablet to your existing +MySQL installation, spin up an unsharded Vitess cluster and use a `MoveTables` workflow to start serving some tables +from Vitess. You can also go further and use a Reshard workflow to experiment with a sharded version of a part of your +database. -See this [user guide](../../../user-guides/configuration-advanced/unmanaged-tablet/#move-legacytable-to-the-commerce-keyspace) for detailed steps. +See +this [user guide](../../../user-guides/configuration-advanced/unmanaged-tablet/#move-legacytable-to-the-commerce-keyspace) +for detailed steps. ### Vertical Sharding -For existing Vitess users you can easily move one or more tables to another keyspace, either for balancing load or as preparation for sharding your tables. +For existing Vitess users you can easily move one or more tables to another keyspace, either for balancing load or as +preparation for sharding your tables. -See this [user guide](../../../user-guides/migration/move-tables/) which describes how `MoveTables` works in the local example provided in the Vitess repo. 
+See this [user guide](../../../user-guides/migration/move-tables/) which describes how `MoveTables` works in the local +example provided in the Vitess repo. ### More Reading