Online DDL: tracking issue #6926

shlomi-noach · 2020-10-22T08:51:10Z

This issue will be the tracking space for all things vitess Online DDL. Note that this issue is created after some substantial work is done:

code: Experimental: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change #6547, available in release-8.0
documentation: https://vitess.io/docs/user-guides/schema-changes/, Online DDL: managed online schema changed via gh-ost and pt-online-schema-change website#510

#6547 served as a long running tracking point; pasting some of #6547 content here for background, purpose and intentions.

TL;DR

Automate away all the complexity of schema migrations. Users issue:

alter with 'gh-ost' table example modify id bigint not null;

alter with 'pt-osc' table example modify id bigint not null

or

$ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
    ApplySchema -sql "alter with 'gh-ost' table example modify id bigint unsigned not null" commerce

$ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
    ApplySchema -sql "alter with 'pt-osc' table example modify id bigint unsigned not null" commerce

(syntax subject to change, see #6782 )

and vitess will schedule an online schema change operation to run on all relevant shards, then proceed to apply the change via gh-ost (or pt-osc) on all shards.

The ALTER TABLE problem

First, to reiterate the problem: schema changes have always been a problem with MySQL. A straight ALTER is a blocking operation; an ONLINE ALTER is only "online" on the master/primary, but is effectively blocking on replicas. Online schema change tools like pt-online-schema-change and gh-ost overcome these limitations by emulating an ALTER on a "ghost" table, which is populated from the original table, then swapped in its place.

Traditionally, online schema changes are considered to be "risky". Trigger based migrations add significant load onto the master server, and their cut-over phase is known to be a dangerous point. gh-ost was created at GitHub to address these concerns, and successfully eliminated concerns for operational risks: with gh-ost the load on the master is low, and well controlled, and the cut-over phase is known to cause no locking issues. gh-ost comes with different risks: it applies data changes programmatically, thus the issue of data integrity is of utmost importance. Another note of concern is data traffic: going out from MySQL into gh-ost and back into MySQL (as opposed to all-in MySQL in pt-online-schema-change).

This way or the other, running an online schema change is typically a manual operation. A human being will schedule the migration, kick it running, monitor it, possibly cut-over. In a sharded environment, a developer's request to ALTER TABLE explodes to n different migrations, each needs to be scheduled, kicked, monitored & tracked.

Sharded environments are obviously common for vitess users and so these users feel the pain more than others.

Schema migration cycle & steps

Schema management is a process that begins with the user designing a schema change, and ends with the schema being applied in production. This is a breakdown of schema management steps as I know them:

Design code
Publish changes (pull request)
Review
Formalize migration command (the specific ALTER TABLE or pt-online-schema-change or gh-ost command)
Locate: where in production should this migration run?
Schedule
Execute
Audit/monitor
Cut-over/complete
Cleanup
Notify user
Deploy & merge

What we propose to address

Vitess's architecture uniquely positions it to be able to automate away much of the process. Specifically:

Formalize migration command: turning an ALTER TABLE statement into a gh-ost/pt-osc invocation is super useful if done by vitess, since vitess can not only validate schema/params, but also can provide credentials, apply throttling logic, can instruct gh-ost on how to communicate progress via hooks, etc.
Locate: given schema/table, vitess just knows where the table is located. It knows if the schema is sharded. It knows what the shards are and who the shards' masters are. It knows where to run gh-ost. Last, vitess can tell us which replicas we can use for throttling.
Schedule: vitess is again in a unique position to schedule migrations. The fact someone asks for a migration to run does not mean the migration should start right away. For example, a shard may already be running an earlier migration. Running two migrations at a time is less than ideal, and it's best to wait out the first migration before beginning the second. A scheduling mechanism is both useful to running the migrations in optimal order/sequence, as well as providing feedback to the user ("your migration is on hold because this and that", or "your migration is 2nd in queue to run")
Execute: vttablet is the ideal entity to run a migration; it can read instructions from the topo server and can write progress to the topo server. vitess is aware of possible master failovers and can request a re-execute if a migration is so interrupted mid-process.
Audit/monitor: vtctld API can offer endpoints to track status of a migration (e.g. "in progress on -80, in queue on 80-"). It may offer progress pct and ETA.
Cut-over/complete: in my experience with gh-ost, the cut-over phase is safe to automate away. If running a migration during a resharding operation, then we may need to coordinate cut-over between upstream and downstream migrations.
Cleanup: the old table needs to be dropped; vttablet runs a table lifecycle service (aka garbage collector) to clean up those tables.
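
To make the Schedule item concrete: a rough sketch of a per-shard queue that runs at most one migration at a time and can tell a user where their migration stands. This is an illustration only, not Vitess code; `ShardScheduler` and its methods are hypothetical names.

```python
from collections import deque


class ShardScheduler:
    """Hypothetical per-shard queue: at most one migration runs at a time."""

    def __init__(self):
        self.running = None    # UUID of the migration currently running, if any
        self.queue = deque()   # UUIDs waiting for their turn, oldest first

    def submit(self, uuid):
        """Queue a migration; it starts right away only if the shard is idle."""
        self.queue.append(uuid)
        self._start_next()

    def complete(self, uuid):
        """Mark the running migration as done and start the next one in line."""
        if self.running != uuid:
            raise ValueError(f"{uuid} is not the running migration")
        self.running = None
        self._start_next()

    def status(self, uuid):
        """Feedback of the 'your migration is in queue' kind."""
        if uuid == self.running:
            return "running"
        if uuid in self.queue:
            return f"queued, position {list(self.queue).index(uuid) + 1}"
        return "unknown"

    def _start_next(self):
        if self.running is None and self.queue:
            self.running = self.queue.popleft()


scheduler = ShardScheduler()
scheduler.submit("migration-a")
scheduler.submit("migration-b")
print(scheduler.status("migration-a"))  # running
print(scheduler.status("migration-b"))  # queued, position 1
```

A real scheduler would also persist this state so that it survives a tablet restart; the sketch keeps everything in memory.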

As mentioned, substantial initial work was done in Experimental: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change #6547.
Discussion about syntax in Discussion/opinions: query comments or specialized SQL syntax? #6782
Bugfixes in OnlineDDL bugfix: make sure schema is applied on tablet #6910
Initial followups to Experimental: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change #6547 are in Online DDL: followups in multiple trajectories #6901, notably auto-retry migration after master failover


shlomi-noach · 2020-10-22T09:12:12Z

Logic for auto-retrying a migration: #6901

If migration is identified to have started in a different tablet, and
has failed - found to be stale
and hasn't been retried yet (temporary restriction)
then, it is automatically retried. The main use case is a failover scenario: if primary fails, tablet and gh-ost alike, then the newly promoted tablet will kick the migration back to life (starting a new migration).
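
The conditions above can be read as a single predicate. Below is a minimal sketch, assuming "has failed - found to be stale" means the migration was marked failed after being found stale; the `Migration` fields and `should_auto_retry` are hypothetical names, not the actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class Migration:
    # All field names are hypothetical, for illustration only.
    started_on_tablet: str  # alias of the tablet that started the migration
    status: str             # "failed" is assumed to include "found to be stale"
    retries: int            # how many times the migration was already retried


def should_auto_retry(migration: Migration, current_tablet: str) -> bool:
    """Return True when all three conditions from the list above hold."""
    started_elsewhere = migration.started_on_tablet != current_tablet
    has_failed = migration.status == "failed"
    not_yet_retried = migration.retries == 0  # the "temporary restriction"
    return started_elsewhere and has_failed and not_yet_retried


# Failover scenario: the old primary started the migration and died with it.
orphaned = Migration(started_on_tablet="zone1-100", status="failed", retries=0)
print(should_auto_retry(orphaned, current_tablet="zone1-101"))
```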

Docs in vitessio/website#571

shlomi-noach · 2020-12-02T10:15:02Z

#7083 supports CREATE and DROP statements in ApplySchema to run as online DDL.

need to add a column ddl_type (CREATE, ALTER, DROP), which should show up in vtctl OnlineDDL show
need context (see OnlineDDL: request_context/migration_context #7082)

shlomi-noach · 2020-12-09T11:26:47Z

Once #7083 and #7097 are merged:

when ddl_strategy is an online strategy, analyze a DROP TABLE statement to:

Explode into single-table statements (a single DROP TABLE statement can indicate multiple tables)
Turn each one into a RENAME TABLE statement, renaming into a HOLD gc state with 48 hour ETA
vttablet should transition away from HOLD even if ETA is unmet, if table_gc_lifecycle does not include HOLD
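
A sketch of the two rewrite steps above: splitting a multi-table DROP TABLE and turning each table into a RENAME TABLE with a 48 hour hold. The regex-based parsing and the `_vt_HOLD_<uuid>_<timestamp>` naming used here are assumptions for illustration; a real implementation would use the SQL parser and whatever naming the table lifecycle service expects.

```python
import re
import uuid
from datetime import datetime, timedelta, timezone
from typing import List


def explode_drop_table(statement: str) -> List[str]:
    """Split `DROP TABLE a, b` into a list of single table names."""
    match = re.fullmatch(
        r"\s*drop\s+table\s+(?:if\s+exists\s+)?(.+?)\s*;?\s*",
        statement,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError("not a DROP TABLE statement")
    return [name.strip().strip("`") for name in match.group(1).split(",")]


def hold_rename(table: str, now: datetime,
                hold: timedelta = timedelta(hours=48)) -> str:
    """Build a RENAME TABLE statement into a HOLD state (naming is assumed)."""
    eta = (now + hold).strftime("%Y%m%d%H%M%S")
    gc_name = f"_vt_HOLD_{uuid.uuid4().hex}_{eta}"
    return f"RENAME TABLE `{table}` TO `{gc_name}`"


now = datetime(2020, 12, 9, 11, 0, 0, tzinfo=timezone.utc)
tables = explode_drop_table("DROP TABLE t1, `t2`;")
statements = [hold_rename(table, now) for table in tables]
for statement in statements:
    print(statement)
```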

shlomi-noach · 2020-12-09T11:27:37Z

support vtctl OnlineDDL <keyspace> show <context>, to fetch status of migrations with a specific context, as added in OnlineDDL: request_context/migration_context #7082
Edit: see Support vtctl OnlineDDL <keyspace> show <context> #7145 for suggested solution

shlomi-noach · 2020-12-09T11:57:49Z

support CREATE INDEX statements as online DDL, and convert them to ALTER TABLE syntax -- required by gh-ost.
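
For illustration, the conversion could look roughly like this. It is a regex-based sketch that only handles the plain `CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX name ON table (columns)` form and rejects anything else (e.g. trailing index options); `create_index_to_alter` is a hypothetical name, and real code would go through the SQL parser.

```python
import re

_CREATE_INDEX = re.compile(
    r"^\s*create\s+(?P<kind>unique\s+|fulltext\s+|spatial\s+)?index\s+"
    r"(?P<index>\S+)\s+on\s+(?P<table>[^\s(]+)\s*(?P<columns>\(.*\))\s*;?\s*$",
    flags=re.IGNORECASE | re.DOTALL,
)


def create_index_to_alter(statement: str) -> str:
    """Rewrite a simple CREATE INDEX as the equivalent ALTER TABLE ... ADD INDEX."""
    match = _CREATE_INDEX.match(statement)
    if match is None:
        raise ValueError("unsupported CREATE INDEX statement")
    kind = (match.group("kind") or "").strip().upper()
    index_clause = f"{kind} INDEX".strip()  # "INDEX" or e.g. "UNIQUE INDEX"
    return (
        f"ALTER TABLE {match.group('table')} "
        f"ADD {index_clause} {match.group('index')} {match.group('columns')}"
    )


print(create_index_to_alter("CREATE INDEX idx_name ON example (name)"))
# ALTER TABLE example ADD INDEX idx_name (name)
```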

shlomi-noach · 2021-02-04T16:12:00Z

Online DDL via VReplication

At this time I have clarity as to what this will look like.

On one hand, we will go full native and reuse Vitess existing mechanisms. On the other hand, we break apart from the existing flow in multiple ways.

POC is in #7419 . It is just the beginning of what online DDL via vreplication will look like -- but already has the initial implementation running.

Some design bulletpoints:

We will use vreplication to replicate from a primary tablet onto itself (same keyspace, same shard, same tablet, both as source and target)
Our existing vreplication flows are owned by the user. User e.g. runs a MoveTables, user then runs SwitchReads, then runs SwitchWrites -- based on how the user perceives the process is going on.
However, we want to have schema migrations fully automated. Not only do we want to own the starting of the migration (remember that our tablets schedule migrations sequentially), but we also want to reliably and automatically cut over the migration.
To that effect, tabletserver will be the owner of schema migrations/vreplication. As with gh-ost and pt-osc, a shard's primary tablet has the independence to schedule the next migration, run that migration, potentially cancel or retry it, and follow it to completion. This means no vtctl, no wrangler.
Another important difference from normal VReplication flows is that in all other flows, there's source table(s) and target table(s), and they are distinct. In MoveTables we can go as far as set routing rules for new queries to route to the target tables. In a schema migration, we want to replace the original table (via RENAME TABLE). We will do this underneath the feet of VReplication.
As opposed to gh-ost or pt-osc, vreplication migrations can natively survive a failover (in fact, this will be one of the more important advantages of migrations/vreplication); this calls for some redesign in onlineddl.Executor
We import some initial-setup analysis and preparation code from gh-ost. Some of it is redundant, and will be cleaned up. Some of it overlaps with Vitess functionality (parsing) and will be replaced. But for now we know it's stable and working.

Starting a vreplication schema migration

The flow for starting a vreplication based schema migration from tabletserver is:

ensure there's no vreplication on same workflow
create an empty table LIKE the original table
apply the ALTER TABLE statement onto the empty table
analyze both tables, validate that some basic constraints are met (this is largely imported from gh-ost)
evaluate the vreplication source filter query that only selects relevant columns, and takes care of column renames and of generated columns.
reload schema (because a prev migration might have introduced e.g. a new column)
create a new vreplication entry
start vreplication
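
A sketch of the table setup and filter query steps from the flow above. The function names, the shadow table name and the way renames and generated columns are passed in are all assumptions made for illustration; the actual analysis is the code imported from gh-ost.

```python
def build_setup_statements(table, shadow_table, alter_options):
    """Create an empty copy of the table, then apply the ALTER to the copy."""
    return [
        f"CREATE TABLE `{shadow_table}` LIKE `{table}`",
        f"ALTER TABLE `{shadow_table}` {alter_options}",
    ]


def build_filter_query(table, source_columns, target_columns,
                       renames=None, generated=()):
    """Select only columns that exist on both sides, mapping renamed columns.

    renames maps a source column name to its new name on the target table;
    generated lists columns that are computed by MySQL and must not be copied.
    """
    renames = renames or {}
    select_exprs = []
    for column in source_columns:
        target = renames.get(column, column)
        if target not in target_columns:
            continue  # column was dropped by the ALTER
        if column in generated or target in generated:
            continue  # generated columns are never written explicitly
        if target == column:
            select_exprs.append(f"`{column}`")
        else:
            select_exprs.append(f"`{column}` AS `{target}`")
    return f"SELECT {', '.join(select_exprs)} FROM `{table}`"


setup = build_setup_statements("example", "_example_shadow", "modify id bigint not null")
query = build_filter_query(
    "example",
    source_columns=["id", "name", "total"],
    target_columns=["id", "full_name", "total"],
    renames={"name": "full_name"},
    generated={"total"},
)
print(query)  # SELECT `id`, `name` AS `full_name` FROM `example`
```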

Tracking a running migration

onlineddl.Executor to keep track of vreplication stream liveness by looking at _vt.vreplication entry.
identify stale migrations and remove them
determine that a migration is ready: copy phase is complete, pos is non empty, time updated and transaction time are both up-to-date (small or no lag)
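
The readiness check in the last item can be sketched as a predicate over the stream's state. The field names loosely follow the _vt.vreplication entry mentioned above, but `StreamState`, `copy_rows_pending` and the 10 second lag threshold are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    copy_rows_pending: int        # hypothetical: > 0 while the copy phase is running
    pos: str                      # replication position; empty until streaming began
    time_updated: float           # unix seconds, last time the stream reported in
    transaction_timestamp: float  # unix seconds, timestamp of last applied transaction


def is_ready_to_cut_over(state: StreamState, now: float,
                         max_lag_seconds: float = 10.0) -> bool:
    """True when the copy is done and the stream is caught up (small or no lag)."""
    if state.copy_rows_pending > 0:
        return False  # copy phase not complete
    if not state.pos:
        return False  # stream has not recorded a position yet
    if now - state.time_updated > max_lag_seconds:
        return False  # stream is not reporting: possibly stale
    if now - state.transaction_timestamp > max_lag_seconds:
        return False  # stream is alive but lagging behind
    return True


state = StreamState(copy_rows_pending=0, pos="MySQL56/uuid:1-10",
                    time_updated=100.0, transaction_timestamp=99.0)
print(is_ready_to_cut_over(state, now=101.0))
```

One caveat with this sketch: on an idle table the transaction timestamp stops advancing, so a real check needs some way to tell "no traffic" apart from "lagging".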

Cutting over a vreplication migration

The flow is:

We don't use wrangler. Create our own tablet manager client.
Get hold of Tablet
Get hold of ShardInfo
Get hold of topology server (already exists in onlineddl.Executor)
read vreplication information
lock keyspace
stop writes on source
read up-to-date stream (specifically, pos)
wait for pos
stop vreplication. it is now up to date with original table
swap source and target tables from beneath vreplication
all good
(in the future, this flow will actually continue; more writeup in the future)
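
The ordering of the cut-over steps above can be sketched as follows. `RecordingCluster` is a stand-in that only records step names; it makes no real tablet manager or topology calls. The "resume writes" and "unlock keyspace" steps are not in the list above; they are an assumption about how the lock and the write block would be released, including on failure.

```python
from contextlib import contextmanager


class RecordingCluster:
    """Stand-in for real tablet/topo calls; it only records the order of steps."""

    def __init__(self):
        self.log = []

    def step(self, name):
        self.log.append(name)

    @contextmanager
    def scoped(self, enter, leave):
        """Run `enter`, and always run `leave` afterwards, even on error."""
        self.step(enter)
        try:
            yield
        finally:
            self.step(leave)


def cut_over(cluster):
    with cluster.scoped("lock keyspace", "unlock keyspace"):
        with cluster.scoped("stop writes on source", "resume writes"):
            cluster.step("read stream pos")
            cluster.step("wait for pos")
            cluster.step("stop vreplication")  # now up to date with the original table
            cluster.step("swap tables")        # from beneath vreplication


cluster = RecordingCluster()
cut_over(cluster)
print("\n".join(cluster.log))
```

Scoping the lock and the write block this way keeps a failed cut-over from leaving the keyspace locked or the source table unwritable.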

More to come.

shlomi-noach · 2021-02-16T08:59:24Z

It should be possible to ReloadTable, like ReloadSchema but for a single table. This is desired for online DDL. Reloading an entire schema takes time and we cannot expect to accomplish it within the timeframe of a cut-over. But reloading a single table should be just fine.

support ReloadTable

shlomi-noach · 2021-03-02T12:27:46Z

Revert for Online DDL is now available (per PR review) via #7478

shlomi-noach self-assigned this Oct 22, 2020

shlomi-noach mentioned this issue Oct 22, 2020

Online DDL: followups in multiple trajectories #6901

Merged

shlomi-noach mentioned this issue Nov 9, 2020

OnlineDDL: migration should fail if parent vttablet dies #7005

Closed

sougou added Component: Cluster management P2 Type: Feature labels Nov 13, 2020

This was referenced Nov 15, 2020

Adding ddl_strategy session variable #7042

Merged

Online DDL: ddl_strategy session variable and vtctl command line argument #7045

Merged

shlomi-noach mentioned this issue Dec 2, 2020

Support CREATE, DROP statements in ApplySchema and online DDL #7083

Merged

This was referenced Dec 2, 2020

Online DDL: ddl_type column #7097

Merged

OnlineDDL: "cancel-all" command to cancel all pending migrations in keyspace #7099

Merged

shlomi-noach mentioned this issue Dec 9, 2020

Support vtctl OnlineDDL <keyspace> show <context> #7145

Merged

6 tasks

This was referenced Dec 10, 2020

Normalizing Online-DDL queries #7153

Merged

Online DDL endtoend tests to support MacOS #7168

Merged

Online DDL: ddl_strategy=direct #7172

Merged

shlomi-noach mentioned this issue Dec 23, 2020

Online DDL: DROP TABLE translated to RENAME TABLE statement #7221

Merged

7 tasks

shlomi-noach mentioned this issue Jan 7, 2021

Adding @@session_uuid to vtgate; used as 'context' by Online DDL #7263

Merged

7 tasks

This was referenced Jan 27, 2021

OnlineDDL: update gh-ost binary to v1.1.1 #7394

Merged

Online DDL via VReplication #7419

Merged

This was referenced Feb 10, 2021

OnlineDDL: Revert for VReplication based migrations #7478

Merged

VReplication based online DDL: mini stress test CI #7492

Merged

shlomi-noach mentioned this issue Mar 3, 2021

ApplySchema: -skip_preflight #7587

Merged

8 tasks

shlomi-noach mentioned this issue Dec 14, 2022

Deprecating VExec part1: removing client-side references #11955

Merged

3 tasks

shlomi-noach mentioned this issue Jan 2, 2023

OnlineDDL: 'mysql' strategy, managed by the scheduler, but executed via normal MySQL statements #12027

Merged

3 tasks

shlomi-noach mentioned this issue Jan 17, 2023

Feature Request: Online DDL, in-order completion of migrations #12112

Closed

This was referenced Feb 6, 2023

onlineddl_vrepl suite: fix auto_increment flakyness #12246

Merged

Online DDL: remove legacy "stowaway table" logic #12288

Merged

Fixing onlineddl_vrepl flakiness, and adding more tests #12325

Merged

This was referenced Feb 22, 2023

Backport to v16: onlineddl_vrepl flakiness and subsequent fixes #12426

Merged

OnlineDDL: mitigate scenario where a migration sees recurring cut-over timeouts #12451

Merged

shlomi-noach mentioned this issue Mar 9, 2023

Online DDL: remove artifact entry upon GC #12592

Merged

4 tasks

ajm188 removed the P2 label Mar 9, 2023

shlomi-noach mentioned this issue Mar 12, 2023

Online DDL: ready_to_complete race fix #12612

Merged

4 tasks

This was referenced Apr 13, 2023

gh-ost migrations: improved error log message #12882

Merged

OnlineDDL: reject partial key coverage in PRIMARY KEY for vitess migrations #12921

Merged

This was referenced Apr 25, 2023

vtctl OnlineDDL: complete command set #12963

Merged

OnlineDDL/vitess: only KILL 'RENAME' statement if not known to be successful #12989

Merged

shlomi-noach mentioned this issue May 24, 2023

Online DDL: better reporting of error message when RENAME fails #13143

Merged

4 tasks

This was referenced Jun 21, 2023

Online DDL: improved row estimation via ANALYE TABLE with --analyze-table strategy flag #13352

Merged

v15 backport: vitess Online DDL atomic cut-over #13376

Merged

This was referenced Aug 1, 2023

[vtctldclient] flags need to be defined to be deprecated #13681

Merged

[OnlineDDL] add label so break works as intended #13691

Merged

shlomi-noach mentioned this issue Aug 22, 2023

v15 backport: Onlineddl: formalize "immediate operations", respect --postpone-completion strategy flag #13832

Merged

3 tasks

shlomi-noach mentioned this issue Jun 30, 2024

Online DDL: remove legacy (and ancient) 'REVERT <uuid>' syntax #16301

Merged

5 tasks
