Online DDL: tracking issue #6926
Comments
Logic for auto-retrying a migration: #6901
Docs in vitessio/website#571 |
#7083 supports
|
Once #7083 and #7097 are merged: when
Online DDL via VReplication
At this time I have clarity on what this will look like. On one hand, we will go fully native and reuse existing Vitess mechanisms. On the other hand, we break away from the existing flow in multiple ways. The POC is in #7419. It is just the beginning of what online DDL via VReplication will look like, but it already has the initial implementation running. Some design bullet points:
Starting a vreplication schema migration
The flow for starting a vreplication based schema migration from tabletserver is:
Tracking a running migration
Cutting over a vreplication migration
The flow is:
More to come. |
It should be possible to
|
Revert for Online DDL is now available (per PR review) via #7478 |
This issue will be the tracking space for all things vitess Online DDL. Note that this issue is created after some substantial work is done:
release-8.0
#6547 served as a long running tracking point; pasting some of #6547 content here for background, purpose and intentions.
TL;DR
Automate away all the complexity of schema migrations. Users issue:
or
(syntax subject to change, see #6782)
and vitess will schedule an online schema change operation to run on all relevant shards, then proceed to apply the change via gh-ost on all shards.
The ALTER TABLE problem
First, to reiterate the problem: schema changes have always been a problem with MySQL. A straight ALTER is a blocking operation; an ONLINE ALTER is only "online" on the master/primary, but is effectively blocking on replicas. Online schema change tools like pt-online-schema-change and gh-ost overcome these limitations by emulating an ALTER on a "ghost" table, which is populated from the original table, then swapped in its place.
Traditionally, online schema changes are considered to be "risky". Trigger based migrations add significant load onto the master server, and their cut-over phase is known to be a dangerous point.
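The ghost-table idea described above (build a copy with the new schema, populate it from the original, swap it in) can be sketched in a few lines. This is only a toy illustration using Python's built-in sqlite3 module, not how gh-ost or pt-online-schema-change are implemented: real tools copy rows in chunks and also capture writes that arrive during the copy, which this sketch omits. The function name and the `_<table>_gho` / `_<table>_del` table names are assumptions made for the example.

```python
import sqlite3


def ghost_table_alter(conn, table, ghost_ddl, shared_columns):
    """Emulate an ALTER by building a ghost table and swapping it in.

    `ghost_ddl` must create a table named `_<table>_gho` with the desired
    new schema; `shared_columns` lists the columns to copy over.
    All names here are hypothetical, chosen for illustration only.
    """
    ghost = f"_{table}_gho"
    old = f"_{table}_del"
    cols = ", ".join(shared_columns)
    cur = conn.cursor()
    # 1. Create the ghost table with the new schema.
    cur.execute(ghost_ddl)
    # 2. Populate the ghost table from the original table.
    cur.execute(f"INSERT INTO {ghost} ({cols}) SELECT {cols} FROM {table}")
    # 3. Cut over: move the original aside and swap the ghost into its place.
    cur.execute(f"ALTER TABLE {table} RENAME TO {old}")
    cur.execute(f"ALTER TABLE {ghost} RENAME TO {table}")
    conn.commit()
```

After the call, the original name points at the table with the new schema, and the old table remains under the `_del` name until something cleans it up.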
gh-ost was created at GitHub to address the load and cut-over concerns of trigger based migrations, and successfully eliminated the operational risks: with gh-ost the load on the master is low and well controlled, and the cut-over phase is known to cause no locking issues. gh-ost comes with different risks: it applies data changes programmatically, thus the issue of data integrity is of utmost importance. Another point of concern is data traffic: going out from MySQL into gh-ost and back into MySQL (as opposed to all-in MySQL in pt-online-schema-change).
One way or the other, running an online schema change is typically a manual operation. A human being will schedule the migration, kick it off, monitor it, possibly cut over. In a sharded environment, a developer's request to ALTER TABLE explodes to n different migrations, each of which needs to be scheduled, kicked off, monitored & tracked.
Sharded environments are obviously common for vitess users, and so these users feel the pain more than others.
Schema migration cycle & steps
Schema management is a process that begins with the user designing a schema change, and ends with the schema being applied in production. This is a breakdown of schema management steps as I know them:
ALTER TABLE or pt-online-schema-change or gh-ost command)
What we propose to address
Vitess's architecture uniquely positions it to be able to automate away much of the process. Specifically:
ALTER TABLE statement into a gh-ost/pt-osc invocation is super useful if done by vitess, since vitess can not only validate schema/params, but can also provide credentials, apply throttling logic, instruct gh-ost on how to communicate progress via hooks, etc.
vitess just knows where the table is located. It knows if the schema is sharded. It knows what the shards are and who the shard masters are. It knows where to run gh-ost. Last, vitess can tell us which replicas we can use for throttling.
vttablet is the ideal entity to run a migration; it can read instructions from topo server and can write progress to topo server.
vitess is aware of possible master failovers and can request a re-execution if a migration is interrupted mid-process.
vtctld API can offer endpoints to track the status of a migration (e.g. "in progress on -80, in queue on 80-"). It may offer progress pct and ETA.
gh-ost, the cut-over phase is safe to automate away. If running a migration during a resharding operation, then we may need to coordinate cut-over between upstream and downstream migrations.
vttablet runs a table lifecycle service (aka garbage collector) to clean up those tables.
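As a rough illustration of the per-shard status tracking mentioned above (e.g. "in progress on -80, in queue on 80-"), the following sketch aggregates hypothetical per-shard migration records into a one-line summary. The class, field names, and status values are assumptions made for this example and do not reflect the actual vtctld API or Vitess data model; the progress figure is a plain unweighted average across shards, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class ShardMigration:
    """Hypothetical per-shard record of one schema migration."""
    shard: str                  # shard name, e.g. "-80" or "80-"
    status: str                 # assumed states: "queued", "running", "complete", "failed"
    progress_pct: float = 0.0   # 0..100, as reported by the shard


# Human-readable labels for the assumed states.
STATUS_LABELS = {
    "queued": "in queue",
    "running": "in progress",
    "complete": "complete",
    "failed": "failed",
}


def summarize(migrations):
    """Render a one-line status summary across all shards."""
    return ", ".join(
        f"{STATUS_LABELS.get(m.status, m.status)} on {m.shard}" for m in migrations
    )


def overall_progress(migrations):
    """Unweighted average progress across shards (a simplification)."""
    if not migrations:
        return 0.0
    return sum(m.progress_pct for m in migrations) / len(migrations)
```

A caller holding one record per shard could then report both the summary string and a single progress number for the whole migration.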