From 1d66e5fd8410613e625e8a38746d96f8e686f21b Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 29 Jun 2023 14:06:05 +0300 Subject: [PATCH 01/17] Explaining full/incremental backups and restores Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../backup-and-restore/overview.md | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 1f7e98996..640ce143a 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -26,6 +26,36 @@ The engine is the techology used for generating the backup. Currently Vitess has * Builtin: Shutdown an instance and copy all the database files (default) * XtraBackup: An online backup using Percona's [XtraBackup](https://www.percona.com/software/mysql-database/percona-xtrabackup) +### Backup types + +Vitess supports Full Backups as well as Incremental Backups, and their respective counterparts full restores and point-in-time restores. + +* A full backup contains the entire data in the database. The backup represents a consistent state of the data, i.e. it is a snapshot of the data at some point in time. +* An incremental backup contains a changelog, or a transition of data from one state to another. Vitess implements incremental backups by making a copy of MySQL binary logs. + +Generally speaking and on most workloads, the cost of a full backup is higher, and the cost of incremental backups is lower. The time it takes to create a full backup is significant, and it is therefore impractical to take full backups in very small intervals. Moreover, a full backup onsumes the disk space needed for the entire dataset. 
Incremental backups, on the other hand, are quick to run, and have very little impact, if any, to the running servers. They only contain the changes in between two points in time, and on most workloads are more compact. + +Full and incremental backups are expected to be interleaved. For example: one would create a full backup once per day, and incremental backups once per hour. + +Full backups are simply states of the database. Incremental backups, however, need to start with some point and end with some point. The common practice is for an incremental backup to continue from the point of the last good backup, which can be a full or incremental backup. An inremental backup in Vitess end at the point in time of execution. + +The identity of the tablet on which a full backup or an incremental backup is taken is immaterial. It is possible to take a full backup on one tablet and incremental backups on another. It is possible to take full backups on two different tablets. It is also possible to take incremental backups, independently, on two different tablets, even though the contents of those incremental backups overlaps. Vitess uses MySQL GTID to determine positioning and prune duplicates. + +### Restores + +Restores are the counterparts of backups. A restore uses the engine utilized to create a backup. One may run a restore from a full backup, or a point-in-time restore (PITR) based on additional incremental backups. + +A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown durign the process. As a safety mechanism, it is not possible to restore onto a `PRIMARY` tablet. + +### Restore Types + +Vitess supports full restores and incremental (aka point in time) restores. The two serve different purposes. + +* A full restore loads the dataset from a full backup onto a non-`PRIMARY` tablet. 
Once the data is loaded, the restore process starts the MySQL service and makes it join the replication stream. It is expected that a freshly restored server will lag behind the shard's `PRIMARY` for a period of time. + The full restore flow is useful for seeding new replica tablets. It may also be used to fix replicas that have been corrupted. +* An incremental, or a point-in-time restore, restores a tablet/MySQL up to a specific position or time. This is done by first loading a full backup dataset, followed by applying the changelog captured in zero or more incremental backups. Once that is complete, the tablet type is set to `DRAINED` and the tablet does _not_ join the replication stream. + The purpose of point-in-time restore is to recover data from an accidental write/deletion. If the database administrator knows at about what time the accidental write took place, they can restore a replica tablet to a point in time shortly before the accidental write. Since the server does not join the replication stream, its data then remains static, and the administrator may review or copy the data as they please. Finally, it is then possible to change the tablet type back to `REPLICA` and have it join the shard's replication. + ## Vtbackup, VTTablet and Vtctld Vtbackup, VTTablet, and Vtctld may all participate in backups and restores. 
From 014e04a661896b71d6ce4d5bd6f0716661dd6c4c Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 29 Jun 2023 15:00:47 +0300 Subject: [PATCH 02/17] clarify the prevention of a backup on PRIMARY Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 640ce143a..9a1a86948 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -45,7 +45,7 @@ The identity of the tablet on which a full backup or an incremental backup is ta Restores are the counterparts of backups. A restore uses the engine utilized to create a backup. One may run a restore from a full backup, or a point-in-time restore (PITR) based on additional incremental backups. -A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown durign the process. As a safety mechanism, it is not possible to restore onto a `PRIMARY` tablet. +A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown durign the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. 
### Restore Types From 43adafe965b53bd991e64292d9b65eacb6883384 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 29 Jun 2023 15:01:02 +0300 Subject: [PATCH 03/17] Creating an incremental backup Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../backup-and-restore/creating-a-backup.md | 33 +++++++++++++++++-- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md index 665fba502..17e1b34a0 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md @@ -4,6 +4,12 @@ weight: 2 aliases: ['/docs/user-guides/backup-and-restore/'] --- +## Choosing the backup type + +As described in [Backup types](../overview/#backup-types), you choose to run a Full Backup (the default) or an Incremental Backup. + +Full backups will use the backup engine chosen in the tablet's [configuration](#configuration). Incremental backups will always copy MySQL's binary logs, irrespective of the configured backup engine. + ## Using xtrabackup The default backup implementation is `builtin`, however we strongly recommend using the `xtrabackup` engine as it is more robust and allows for non-blocking backups. Restores will always be done with whichever engine was used to create the backup. @@ -75,11 +81,11 @@ I0310 12:49:32.279773 215835 backup.go:163] I0310 20:49:32.279485 xtrabackupeng To continue with risk: Set `--xtrabackup_backup_flags=--no-server-version-check`. Note this occurs when your MySQL server version is technically unsupported by `xtrabackup`. 
-## Create backups with vtctl +## Create a full backup with vtctl __Run the following vtctl command to create a backup:__ -``` sh +```sh vtctldclient --server=: Backup ``` @@ -89,10 +95,31 @@ If the engine is `xtrabackup`, the tablet can continue to serve traffic while th __Run the following vtctl command to backup a specific shard:__ -``` sh +```sh vtctldclient --server=: BackupShard [--allow_primary=false] ``` +## Create an incremental backup with vtctl + +An incremental backup requires additional information: the point from which to start the backup. An incremental backup is taken by supplying `--incremental_from_pos` to the `Backup` command. The argument may either indicate a valid position, or the value `auto`. Examples: + +```sh +vtctlclient -- Backup --incremental_from_pos="MySQL56/0d7aaca6-1666-11ee-aeaf-0a43f95f28a3:1-53" zone1-0000000102 + +vtctlclient -- Backup --incremental_from_pos="auto" zone1-0000000102 +``` + +When `--incremental_from_pos="auto"`, Vitess chooses the position of the last successful backup as the starting point for the incremental backup. This is a convenient way to ensure a sequence of contiguous incremental backups. + +An incremental backup backs up one or more MySQL binary log files. These binary log files may begin with the requested position, or with an earlier position. They will necessarily include the requested position. When the incremental backup begins, Vitess rotates the MySQL binary logs on the tablet, so that it does not back up an active log file. + +An incremental backup fails in these scenarios: + +- It is unable to find binary log files that cover the requested position. This can happen if the binary logs were purged before the incremental backup was taken. It essentially means there's a gap in the changelog events. **Note** that while on one tablet the binary logs may be missing, another tablet may still have binary logs that cover the requested position. 
+- There is no change to the database since the requested position, i.e. the GTID position has not changed since. + +`v17` only supports `--incremental_from_pos` in the `Backup` command, not in `BackupShard`. Also, only `vtctlclient` supports the flag, whereas `vtctldclient` does not. `v18` is expected to support incremental backups for `BackupShard` and for `vtctldclient`. + ## Backing up Topology Server The Topology Server stores metadata (and not tablet data). It is recommended to create a backup using the method described by the underlying plugin: From 3c511f090f03a95b5bdeb3b035b7f9b8c904a50e Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Wed, 5 Jul 2023 12:51:48 +0300 Subject: [PATCH 04/17] bootstrap and restore: document --restore_to_pos Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../bootstrap-and-restore.md | 39 ++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md index 5ab721ed7..909a3c4f2 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md @@ -4,7 +4,8 @@ weight: 3 aliases: ['/docs/user-guides/backup-and-restore/'] --- -## Restoring a backup +Restores can be done automatically by way of seeding/bootstrapping new tablets, or they can be invoked manually on a tablet to restore a full backup or do a point-in-time recovery. +## Auto restoring a backup on startup When a tablet starts, Vitess checks the value of the `--restore_from_backup` command-line flag to determine whether to restore a backup to that tablet. Restores will always be done with whichever engine was used to create the backup. 
@@ -32,3 +33,39 @@ Bootstrapping a new tablet is almost identical to restoring an existing tablet. ``` The bootstrapped tablet will restore the data from the backup and then apply changes, which occurred after the backup, by restarting replication. + +## Manual restore + +A manual restore is done on a specific tablet. The tablet's MySQL server is shut down and its data is wiped out. + +### Restore a full backup + +To restore the tablet from the most recent full backup, run: + +```shell +vtctldclient --server=: RestoreFromBackup +``` + +Example: + +```shell +vtctldclient --server localhost:15999 --alsologtostderr RestoreFromBackup zone1-0000000101 +``` + +If successful, the tablet's MySQL server rejoins the shard's replication stream, to eventually catch up and be able to serve traffic. + +### Restore to a point-in-time + +`v17` supports restoring to a specific _position_: + +```shell +vtctlclient -- RestoreFromBackup --restore_to_pos +``` + +Example: + +```shell +vtctlclient -- RestoreFromBackup --restore_to_pos "MySQL56/0d7aaca6-1666-11ee-aeaf-0a43f95f28a3:1-60" zone1-0000000102 +``` + +`v18` will supports restore to a given timestamp. 
From 044d0a4e8dd11df0edd3ab7e6d94d948ddac073e Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Wed, 5 Jul 2023 13:49:23 +0300 Subject: [PATCH 05/17] more notes on incremental restore Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../backup-and-restore/bootstrap-and-restore.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md index 909a3c4f2..6ab431628 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md @@ -56,7 +56,7 @@ If successful, the tablet's MySQL server rejoins the shard's replication stream, ### Restore to a point-in-time -`v17` supports restoring to a specific _position_: +`v17` supports incremental restore, or restoring to a specific _position_: ```shell vtctlclient -- RestoreFromBackup --restore_to_pos @@ -68,4 +68,6 @@ Example: vtctlclient -- RestoreFromBackup --restore_to_pos "MySQL56/0d7aaca6-1666-11ee-aeaf-0a43f95f28a3:1-60" zone1-0000000102 ``` +This restore method assumes backups have been taken that cover the specified position. The restore process will first determine a restore path: a sequence of backups, starting with a full backup followed by zero or more incremental backups, that when combined, include the specified position. See more on [Restore Types](../overview/#restore-types) and on [Taking Incremental Backup](../creating-a-backup/#create-an-incremental-backup-with-vtctl). + `v18` will supports restore to a given timestamp. 
From d5678c7c8d1913d87160893477f1c1b587b3e1de Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Wed, 5 Jul 2023 13:49:53 +0300 Subject: [PATCH 06/17] Elaborating on the new PITR, with links to user guides Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../en/docs/17.0/reference/features/recovery.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/content/en/docs/17.0/reference/features/recovery.md b/content/en/docs/17.0/reference/features/recovery.md index 7d9786d50..b33c54781 100644 --- a/content/en/docs/17.0/reference/features/recovery.md +++ b/content/en/docs/17.0/reference/features/recovery.md @@ -6,6 +6,23 @@ aliases: ['/docs/recovery/pitr','/docs/reference/pitr/'] ## Point in Time Recovery +Vitess supports incremental backup and recoveries, aka point in time recoveries. `v17` offers restore-to-position functionality, and `v18` is slated to support restore-to-timestamp functionality. + +Point in time recoveries are based on Full and Incremental backups. It is possible to recover a database to a position that is _covered_ by some backup. + +See [Backup Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#backup-types) and [Restore Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#restore-types) for an overview of incremental backups and restores. + +See the user guides for how to [Create an Incremental Backup](../../../user-guides/operating-vitess/backup-and-restore/creating-a-backup/#create-an-incremental-backup-with-vtctl) and how to [Restore to a position](../../../user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore/#restore-to-a-point-in-time). + +### Supported Databases +- MySQL 5.7, 8.0 + +### Notes + +This functionality replaces a legacy functionality, based on binlog servers and transient binary logs. 
+ +## Point in Time Recovery: legacy functionality based on binlog server + ### Supported Databases - MySQL 8.0 From 495687b97f0d34597fa26178bd0b1d868cb29da6 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 06:57:14 +0300 Subject: [PATCH 07/17] capitalize. Clarify Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- content/en/docs/17.0/reference/features/recovery.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/reference/features/recovery.md b/content/en/docs/17.0/reference/features/recovery.md index b33c54781..9e7002e32 100644 --- a/content/en/docs/17.0/reference/features/recovery.md +++ b/content/en/docs/17.0/reference/features/recovery.md @@ -6,7 +6,7 @@ aliases: ['/docs/recovery/pitr','/docs/reference/pitr/'] ## Point in Time Recovery -Vitess supports incremental backup and recoveries, aka point in time recoveries. `v17` offers restore-to-position functionality, and `v18` is slated to support restore-to-timestamp functionality. +Vitess supports incremental backup and recoveries, AKA point in time recoveries. `v17` offers restore-to-position functionality, and `v18` is slated to support restore-to-timestamp functionality in addition. Point in time recoveries are based on Full and Incremental backups. It is possible to recover a database to a position that is _covered_ by some backup. 
From 6b91dcbd3178a2220175ef12812afdf9afa22b05 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 06:58:14 +0300 Subject: [PATCH 08/17] Decapitalize Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- content/en/docs/17.0/reference/features/recovery.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/reference/features/recovery.md b/content/en/docs/17.0/reference/features/recovery.md index 9e7002e32..0be939330 100644 --- a/content/en/docs/17.0/reference/features/recovery.md +++ b/content/en/docs/17.0/reference/features/recovery.md @@ -8,7 +8,7 @@ aliases: ['/docs/recovery/pitr','/docs/reference/pitr/'] Vitess supports incremental backup and recoveries, AKA point in time recoveries. `v17` offers restore-to-position functionality, and `v18` is slated to support restore-to-timestamp functionality in addition. -Point in time recoveries are based on Full and Incremental backups. It is possible to recover a database to a position that is _covered_ by some backup. +Point in time recoveries are based on full and incremental backups. It is possible to recover a database to a position that is _covered_ by some backup. See [Backup Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#backup-types) and [Restore Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#restore-types) for an overview of incremental backups and restores. 
From 09697faa420c5449ef98aeba0e370369bd5c3e2d Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:01:04 +0300 Subject: [PATCH 09/17] Decapitalize Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../operating-vitess/backup-and-restore/creating-a-backup.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md index 17e1b34a0..ff4ffaaba 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md @@ -6,7 +6,7 @@ aliases: ['/docs/user-guides/backup-and-restore/'] ## Choosing the backup type -As described in [Backup types](../overview/#backup-types), you choose to run a Full Backup (the default) or an Incremental Backup. +As described in [Backup types](../overview/#backup-types), you choose to run a full Backup (the default) or an incremental Backup. Full backups will use the backup engine chosen in the tablet's [configuration](#configuration). Incremental backups will always copy MySQL's binary logs, irrespective of the configured backup engine. 
From d80e27368ad652ae86442dbf174eed2b018a1404 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:03:17 +0300 Subject: [PATCH 10/17] Decapitalize Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 9a1a86948..8484c3d70 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -28,7 +28,7 @@ The engine is the techology used for generating the backup. Currently Vitess has ### Backup types -Vitess supports Full Backups as well as Incremental Backups, and their respective counterparts full restores and point-in-time restores. +Vitess supports full backups as well as incremental backups, and their respective counterparts full restores and point-in-time restores. * A full backup contains the entire data in the database. The backup represents a consistent state of the data, i.e. it is a snapshot of the data at some point in time. * An incremental backup contains a changelog, or a transition of data from one state to another. Vitess implements incremental backups by making a copy of MySQL binary logs. 
From 9499d20dd52cc579bab5930b717993ed2950e937 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:06:30 +0300 Subject: [PATCH 11/17] typo Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 8484c3d70..db968f9f3 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -33,7 +33,7 @@ Vitess supports full backups as well as incremental backups, and their respectiv * A full backup contains the entire data in the database. The backup represents a consistent state of the data, i.e. it is a snapshot of the data at some point in time. * An incremental backup contains a changelog, or a transition of data from one state to another. Vitess implements incremental backups by making a copy of MySQL binary logs. -Generally speaking and on most workloads, the cost of a full backup is higher, and the cost of incremental backups is lower. The time it takes to create a full backup is significant, and it is therefore impractical to take full backups in very small intervals. Moreover, a full backup onsumes the disk space needed for the entire dataset. Incremental backups, on the other hand, are quick to run, and have very little impact, if any, to the running servers. They only contain the changes in between two points in time, and on most workloads are more compact. +Generally speaking and on most workloads, the cost of a full backup is higher, and the cost of incremental backups is lower. 
The time it takes to create a full backup is significant, and it is therefore impractical to take full backups in very small intervals. Moreover, a full backup consumes the disk space needed for the entire dataset. Incremental backups, on the other hand, are quick to run, and have very little impact, if any, to the running servers. They only contain the changes in between two points in time, and on most workloads are more compact. Full and incremental backups are expected to be interleaved. For example: one would create a full backup once per day, and incremental backups once per hour. From a0626906b1ba8703a3e1223158e5cd7e5fabe72d Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:06:55 +0300 Subject: [PATCH 12/17] Update content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md Co-authored-by: Matt Lord Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index db968f9f3..6cb2ca01e 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -39,7 +39,7 @@ Full and incremental backups are expected to be interleaved. For example: one wo Full backups are simply states of the database. Incremental backups, however, need to start with some point and end with some point. The common practice is for an incremental backup to continue from the point of the last good backup, which can be a full or incremental backup. An inremental backup in Vitess end at the point in time of execution. 
-The identity of the tablet on which a full backup or an incremental backup is taken is immaterial. It is possible to take a full backup on one tablet and incremental backups on another. It is possible to take full backups on two different tablets. It is also possible to take incremental backups, independently, on two different tablets, even though the contents of those incremental backups overlaps. Vitess uses MySQL GTID to determine positioning and prune duplicates. +The identity of the tablet on which a full backup or an incremental backup is taken is immaterial. It is possible to take a full backup on one tablet and incremental backups on another. It is possible to take full backups on two different tablets. It is also possible to take incremental backups, independently, on two different tablets, even though the contents of those incremental backups overlaps. Vitess uses MySQL GTID sets to determine positioning and prune duplicates. ### Restores From 514210ceaed9c7810c249e58e2fdcd2326894c56 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:07:09 +0300 Subject: [PATCH 13/17] Update content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md Co-authored-by: Matt Lord Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 6cb2ca01e..5c07930cf 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -49,7 +49,7 @@ A Vitess restore operates on a tablet. 
The restore process completely wipes out ### Restore Types -Vitess supports full restores and incremental (aka point in time) restores. The two serve different purposes. +Vitess supports full restores and incremental (AKA point-in-time) restores. The two serve different purposes. * A full restore loads the dataset from a full backup onto a non-`PRIMARY` tablet. Once the data is loaded, the restore process starts the MySQL service and makes it join the replication stream. It is expected that a freshly restored server will lag behind the shard's `PRIMARY` for a period of time. The full restore flow is useful for seeding new replica tablets. It may also be used to fix replicas that have been corrupted. From 795ec4c88d61c7814cc1ae57ae7fca8269e693c5 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:07:19 +0300 Subject: [PATCH 14/17] Update content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md Co-authored-by: Matt Lord Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 5c07930cf..46b43e537 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -54,7 +54,7 @@ Vitess supports full restores and incremental (AKA point-in-time) restores. The * A full restore loads the dataset from a full backup onto a non-`PRIMARY` tablet. Once the data is loaded, the restore process starts the MySQL service and makes it join the replication stream. 
It is expected that a freshly restored server will lag behind the shard's `PRIMARY` for a period of time. The full restore flow is useful for seeding new replica tablets. It may also be used to fix replicas that have been corrupted. * An incremental, or a point-in-time restore, restores a tablet/MySQL up to a specific position or time. This is done by first loading a full backup dataset, followed by applying the changelog captured in zero or more incremental backups. Once that is complete, the tablet type is set to `DRAINED` and the tablet does _not_ join the replication stream. - The purpose of point-in-time restore is to recover data from an accidental write/deletion. If the database administrator knows at about what time the accidental write took place, they can restore a replica tablet to a point in time shortly before the accidental write. Since the server does not join the replication stream, its data then remains static, and the administrator may review or copy the data as they please. Finally, it is then possible to change the tablet type back to `REPLICA` and have it join the shard's replication. + The common purpose of point-in-time restore is to recover data from an accidental write/deletion. If the database administrator knows at about what time the accidental write took place, they can restore a replica tablet to a point in time shortly before the accidental write. Since the server does not join the replication stream, its data then remains static, and the administrator may review or copy the data as they please. Finally, it is then possible to change the tablet type back to `REPLICA` and have it join the shard's replication. 
## Vtbackup, VTTablet and Vtctld From 49d96422145aee0a287082c2394265b9bb41ffe8 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Thu, 6 Jul 2023 07:07:56 +0300 Subject: [PATCH 15/17] typo Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 46b43e537..757c07fe3 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -45,7 +45,7 @@ The identity of the tablet on which a full backup or an incremental backup is ta Restores are the counterparts of backups. A restore uses the engine utilized to create a backup. One may run a restore from a full backup, or a point-in-time restore (PITR) based on additional incremental backups. -A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown durign the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. +A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown during the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. 
### Restore Types From c8955457d3eae9c7ee37cfdf59a753cb7e467955 Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Tue, 11 Jul 2023 09:07:57 +0300 Subject: [PATCH 16/17] clarify non-PRIMARY is eligible to restore Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../user-guides/operating-vitess/backup-and-restore/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md index 757c07fe3..322edb1fc 100644 --- a/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/17.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -45,7 +45,7 @@ The identity of the tablet on which a full backup or an incremental backup is ta Restores are the counterparts of backups. A restore uses the engine utilized to create a backup. One may run a restore from a full backup, or a point-in-time restore (PITR) based on additional incremental backups. -A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown during the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. +A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shutdown during the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. Any non-`PRIMARY` tablet is otherwise eligible to restore. 
### Restore Types From c6bdede7ad20cf2899b1efa6a29713882a59a14e Mon Sep 17 00:00:00 2001 From: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Date: Wed, 12 Jul 2023 08:53:25 +0300 Subject: [PATCH 17/17] applying doc changes to 18.0 docs Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --- .../docs/18.0/reference/features/recovery.md | 17 ++++++++ .../bootstrap-and-restore.md | 41 ++++++++++++++++++- .../backup-and-restore/creating-a-backup.md | 33 +++++++++++++-- .../backup-and-restore/overview.md | 30 ++++++++++++++ 4 files changed, 117 insertions(+), 4 deletions(-) diff --git a/content/en/docs/18.0/reference/features/recovery.md b/content/en/docs/18.0/reference/features/recovery.md index 7d9786d50..0be939330 100644 --- a/content/en/docs/18.0/reference/features/recovery.md +++ b/content/en/docs/18.0/reference/features/recovery.md @@ -6,6 +6,23 @@ aliases: ['/docs/recovery/pitr','/docs/reference/pitr/'] ## Point in Time Recovery +Vitess supports incremental backup and recoveries, AKA point in time recoveries. `v17` offers restore-to-position functionality, and `v18` is slated to support restore-to-timestamp functionality in addition. + +Point in time recoveries are based on full and incremental backups. It is possible to recover a database to a position that is _covered_ by some backup. + +See [Backup Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#backup-types) and [Restore Types](../../../user-guides/operating-vitess/backup-and-restore/overview/#restore-types) for an overview of incremental backups and restores. + +See the user guides for how to [Create an Incremental Backup](../../../user-guides/operating-vitess/backup-and-restore/creating-a-backup/#create-an-incremental-backup-with-vtctl) and how to [Restore to a position](../../../user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore/#restore-to-a-point-in-time). 
+ +### Supported Databases +- MySQL 5.7, 8.0 + +### Notes + +This functionality replaces a legacy functionality, based on binlog servers and transient binary logs. + +## Point in Time Recovery: legacy functionality based on binlog server + ### Supported Databases - MySQL 8.0 diff --git a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md index 5ab721ed7..6ab431628 100644 --- a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md +++ b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/bootstrap-and-restore.md @@ -4,7 +4,8 @@ weight: 3 aliases: ['/docs/user-guides/backup-and-restore/'] --- -## Restoring a backup +Restores can be done automatically by way of seeding/bootstrapping new tablets, or they can be invoked manually on a tablet to restore a full backup or do a point-in-time recovery. +## Auto restoring a backup on startup When a tablet starts, Vitess checks the value of the `--restore_from_backup` command-line flag to determine whether to restore a backup to that tablet. Restores will always be done with whichever engine was used to create the backup. @@ -32,3 +33,41 @@ Bootstrapping a new tablet is almost identical to restoring an existing tablet. ``` The bootstrapped tablet will restore the data from the backup and then apply changes, which occurred after the backup, by restarting replication. + +## Manual restore + +A manual restore is done on a specific tablet. The tablet's MySQL server is shut down and its data is wiped out. 
+ +### Restore a full backup + +To restore the tablet from the most recent full backup, run: + +```shell +vtctldclient --server=: RestoreFromBackup +``` + +Example: + +```shell +vtctldclient --server localhost:15999 --alsologtostderr RestoreFromBackup zone1-0000000101 +``` + +If successful, the tablet's MySQL server rejoins the shard's replication stream, to eventually catch up and be able to serve traffic. + +### Restore to a point-in-time + +`v17` supports incremental restore, or restoring to a specific _position_: + +```shell +vtctlclient -- RestoreFromBackup --restore_to_pos +``` + +Example: + +```shell +vtctlclient -- RestoreFromBackup --restore_to_pos "MySQL56/0d7aaca6-1666-11ee-aeaf-0a43f95f28a3:1-60" zone1-0000000102 +``` + +This restore method assumes backups have been taken that cover the specified position. The restore process will first determine a restore path: a sequence of backups, starting with a full backup followed by zero or more incremental backups, that when combined, include the specified position. See more on [Restore Types](../overview/#restore-types) and on [Taking Incremental Backup](../creating-a-backup/#create-an-incremental-backup-with-vtctl). + +`v18` will support restoring to a given timestamp. diff --git a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md index 665fba502..ff4ffaaba 100644 --- a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md +++ b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/creating-a-backup.md @@ -4,6 +4,12 @@ weight: 2 aliases: ['/docs/user-guides/backup-and-restore/'] --- +## Choosing the backup type + +As described in [Backup types](../overview/#backup-types), you can choose to run a full Backup (the default) or an incremental Backup.
+ +Full backups will use the backup engine chosen in the tablet's [configuration](#configuration). Incremental backups will always copy MySQL's binary logs, irrespective of the configured backup engine. + ## Using xtrabackup The default backup implementation is `builtin`, however we strongly recommend using the `xtrabackup` engine as it is more robust and allows for non-blocking backups. Restores will always be done with whichever engine was used to create the backup. @@ -75,11 +81,11 @@ I0310 12:49:32.279773 215835 backup.go:163] I0310 20:49:32.279485 xtrabackupeng To continue with risk: Set `--xtrabackup_backup_flags=--no-server-version-check`. Note this occurs when your MySQL server version is technically unsupported by `xtrabackup`. -## Create backups with vtctl +## Create a full backup with vtctl __Run the following vtctl command to create a backup:__ -``` sh +```sh vtctldclient --server=: Backup ``` @@ -89,10 +95,31 @@ If the engine is `xtrabackup`, the tablet can continue to serve traffic while th __Run the following vtctl command to backup a specific shard:__ -``` sh +```sh vtctldclient --server=: BackupShard [--allow_primary=false] ``` +## Create an incremental backup with vtctl + +An incremental backup requires additional information: the point from which to start the backup. An incremental backup is taken by supplying `--incremental_from_pos` to the `Backup` command. The argument may either indicate a valid position, or the value `auto`. Examples: + +```sh +vtctlclient -- Backup --incremental_from_pos="MySQL56/0d7aaca6-1666-11ee-aeaf-0a43f95f28a3:1-53" zone1-0000000102 + +vtctlclient -- Backup --incremental_from_pos="auto" zone1-0000000102 +``` + +When `--incremental_from_pos="auto"`, Vitess chooses the position of the last successful backup as the starting point for the incremental backup. This is a convenient way to ensure a sequence of contiguous incremental backups. + +An incremental backup backs up one or more MySQL binary log files. 
These binary log files may begin with the requested position, or with an earlier position. They will necessarily include the requested position. When the incremental backup begins, Vitess rotates the MySQL binary logs on the tablet, so that it does not back up an active log file. + +An incremental backup fails in these scenarios: + +- It is unable to find binary log files that cover the requested position. This can happen if the binary logs were purged before the incremental backup was taken. It essentially means there's a gap in the changelog events. **Note** that while on one tablet the binary logs may be missing, another tablet may still have binary logs that cover the requested position. +- There has been no change to the database since the requested position, i.e. the GTID position has not changed. + +`v17` only supports `--incremental_from_pos` in the `Backup` command, not in `BackupShard`. Also, only `vtctlclient` supports the flag, whereas `vtctldclient` does not. `v18` is expected to support incremental backups for `BackupShard` and for `vtctldclient`. + ## Backing up Topology Server The Topology Server stores metadata (and not tablet data). It is recommended to create a backup using the method described by the underlying plugin: diff --git a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/overview.md b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/overview.md index 1f7e98996..322edb1fc 100644 --- a/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/overview.md +++ b/content/en/docs/18.0/user-guides/operating-vitess/backup-and-restore/overview.md @@ -26,6 +26,36 @@ The engine is the techology used for generating the backup.
Currently Vitess has * Builtin: Shutdown an instance and copy all the database files (default) * XtraBackup: An online backup using Percona's [XtraBackup](https://www.percona.com/software/mysql-database/percona-xtrabackup) +### Backup types + +Vitess supports full backups as well as incremental backups, and their respective counterparts, full restores and point-in-time restores. + +* A full backup contains the entire data in the database. The backup represents a consistent state of the data, i.e. it is a snapshot of the data at some point in time. +* An incremental backup contains a changelog, or a transition of data from one state to another. Vitess implements incremental backups by making a copy of MySQL binary logs. + +Generally speaking and on most workloads, the cost of a full backup is higher, and the cost of incremental backups is lower. The time it takes to create a full backup is significant, and it is therefore impractical to take full backups at very short intervals. Moreover, a full backup consumes the disk space needed for the entire dataset. Incremental backups, on the other hand, are quick to run, and have very little impact, if any, on the running servers. They only contain the changes in between two points in time, and on most workloads are more compact. + +Full and incremental backups are expected to be interleaved. For example: one would create a full backup once per day, and incremental backups once per hour. + +Full backups are simply states of the database. Incremental backups, however, need to start with some point and end with some point. The common practice is for an incremental backup to continue from the point of the last good backup, which can be a full or incremental backup. An incremental backup in Vitess ends at the point in time of execution. + +The identity of the tablet on which a full backup or an incremental backup is taken is immaterial. It is possible to take a full backup on one tablet and incremental backups on another.
It is possible to take full backups on two different tablets. It is also possible to take incremental backups, independently, on two different tablets, even though the contents of those incremental backups overlap. Vitess uses MySQL GTID sets to determine positioning and prune duplicates. + +### Restores + +Restores are the counterparts of backups. A restore uses the same engine that was used to create the backup. One may run a restore from a full backup, or a point-in-time restore (PITR) based on additional incremental backups. + +A Vitess restore operates on a tablet. The restore process completely wipes out the data in the tablet's MySQL server and repopulates the server with the backup(s) data. The MySQL server is shut down during the process. As a safety mechanism, Vitess by default prevents a restore onto a `PRIMARY` tablet. Any non-`PRIMARY` tablet is otherwise eligible to restore. + +### Restore Types + +Vitess supports full restores and incremental (AKA point-in-time) restores. The two serve different purposes. + +* A full restore loads the dataset from a full backup onto a non-`PRIMARY` tablet. Once the data is loaded, the restore process starts the MySQL service and makes it join the replication stream. It is expected that a freshly restored server will lag behind the shard's `PRIMARY` for a period of time. + The full restore flow is useful for seeding new replica tablets. It may also be used to fix replicas that have been corrupted. +* An incremental, or point-in-time, restore brings a tablet/MySQL up to a specific position or time. This is done by first loading a full backup dataset, followed by applying the changelog captured in zero or more incremental backups. Once that is complete, the tablet type is set to `DRAINED` and the tablet does _not_ join the replication stream. + The common purpose of point-in-time restore is to recover data from an accidental write/deletion.
If the database administrator knows approximately when the accidental write took place, they can restore a replica tablet to a point in time shortly before the accidental write. Since the server does not join the replication stream, its data then remains static, and the administrator may review or copy the data as they please. Finally, it is then possible to change the tablet type back to `REPLICA` and have it join the shard's replication. + ## Vtbackup, VTTablet and Vtctld Vtbackup, VTTablet, and Vtctld may all participate in backups and restores.