WIP: NativeDDL cut-over with MySQL 8.0 locking #8573
…while tables are locked Signed-off-by: Shlomi Noach <[email protected]>
…or/vcopier/vplayer Signed-off-by: Shlomi Noach <[email protected]>
…s MySQL 8.0 cut-over Signed-off-by: Shlomi Noach <[email protected]>
Very curiously, on GitHub CI the test passes. But I am able to reproduce test failures on a dev machine 😬
This does not show great promise. We're likely to solve NativeDDL cut-over with vtgate buffering, instead.
This is not working as one might expect. We've decided that the path forward is by caching queries. Closing this experimental (yet educational) branch.
Right now this is not working, but I may as well submit the changes as a Draft and explain what we're trying to achieve.
Description
What are we trying to achieve
This is an attempt to create an atomic cut-over for NativeDDL (OnlineDDL via VReplication). The current NativeDDL cut-over is not atomic and is done in several steps: writes to the original table are rejected, we wait for the final catch-up of binary logs, and we then swap the original and new tables.
During this wait time, queries on the original table are rejected. We want them to be blocked, instead. That is, we want the queries to wait, and then proceed to execute on the newly instated table. This is what we call a blocking/atomic cut-over. It means queries are not rejected, though they do pile up during the wait time.
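The difference between rejecting and blocking can be illustrated with a small in-process analogy in Go. This is only a sketch of the desired semantics, not how MySQL or Vitess implement it: `tableGate`, `query`, `cutOver`, and `blockedQuerySees` are hypothetical names, and a `sync.RWMutex` stands in for the table lock.

```go
package main

import (
	"fmt"
	"sync"
)

// tableGate is a hypothetical stand-in for a table lock:
// queries take the read lock, the cut-over takes the write lock.
type tableGate struct {
	mu     sync.RWMutex
	active string // the table that currently serves queries
}

// query waits while a cut-over is in progress, then runs against
// whichever table is active once the lock is released.
func (g *tableGate) query() string {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.active
}

// cutOver holds the write lock while catchUp runs, then swaps tables.
func (g *tableGate) cutOver(newTable string, catchUp func()) {
	g.mu.Lock()
	defer g.mu.Unlock()
	catchUp()
	g.active = newTable
}

// blockedQuerySees issues a query in the middle of a cut-over and
// reports which table that query ended up running against.
func blockedQuerySees() string {
	g := &tableGate{active: "old"}
	result := make(chan string, 1)
	g.cutOver("new", func() {
		// This query cannot acquire the read lock until cutOver returns.
		go func() { result <- g.query() }()
	})
	return <-result
}

func main() {
	fmt.Println(blockedQuerySees())
}
```

The query issued during the cut-over is neither rejected nor served by the old table: it waits, then runs against the new one.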
`gh-ost` uses an elaborate mechanism to achieve that. Recently, a few users found holes in the logic and were able to demonstrate how queries can be lost during the cut-over time (the queries would execute on the old table after it was renamed away, rather than on the new table once instated).
Also, the `gh-ost` mechanism involves multiple moving parts working in fine synchronization and is not viable for Vitess. Vitess NativeDDL runs in different planes that have one-way gRPC communication (the control plane calls vreplication, but not the other way around).
At the time, Oracle/MySQL developed a feature for `gh-ost` at my specific request, that would allow a cleaner logic for cut-over. Unfortunately, the feature was implemented in a way that only made it more complex to use with `gh-ost`. The feature was implemented in `8.0.13`, and this PR attempts to utilize this functionality for NativeDDL.
Right now without success.
What's the MySQL 8.0.13 locking feature and what's the implementation problem?
Before MySQL `8.0.13` it was impossible to `LOCK TABLES ...` and then `RENAME TABLE ...` from the same connection.
My intention was for `gh-ost`'s control plane to be able to `LOCK TABLES original_table WRITE`, then wait for catchup, and follow up with `RENAME TABLE original_table TO old_table, ghost_table TO original_table`. In the same connection.
However, the way the feature was implemented, you have to hold locks on all tables in your `RENAME` statement. So, if you're holding any locks at all during `RENAME TABLE original_table TO old_table, ghost_table TO original_table`, then you must specifically hold locks on `original_table` and on `ghost_table`.
This changes the logic and viability of the feature dramatically.
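The statement sequence this implies, all issued from one connection, can be sketched in Go as below. This is a minimal illustration rather than the code in this PR: `cutOverStatements` is a hypothetical helper, the table names are the placeholder names used above, and identifiers are assumed to need no quoting.

```go
package main

import (
	"fmt"
	"strings"
)

// cutOverStatements is a hypothetical helper returning, in order, the
// statements a single connection would run for the 8.0.13-style cut-over.
func cutOverStatements(original, ghost, old string) []string {
	return []string{
		// Both tables named in the RENAME must be WRITE-locked.
		fmt.Sprintf("LOCK TABLES %s WRITE, %s WRITE", original, ghost),
		// Between these two statements, the same connection applies the
		// final binary log events to the ghost table.
		fmt.Sprintf("RENAME TABLE %s TO %s, %s TO %s", original, old, ghost, original),
		"UNLOCK TABLES",
	}
}

func main() {
	stmts := cutOverStatements("original_table", "ghost_table", "old_table")
	fmt.Println(strings.Join(stmts, ";\n") + ";")
}
```

The point of the sketch is the first statement: the lock list has to include the ghost table, which is what forces the final writes to the ghost table onto the lock-holding connection.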
I should note that in general the reasoning is in line with the general MySQL `LOCK` theme: when you have `LOCK`s, you may only operate on locked tables (and this gets even more complicated as we'll see below). This is fine, it's just that this wasn't what I wanted to achieve for `gh-ost`.
So, why is this so dramatic? Because now we can't let the control plane run the `LOCK`. Why? Because it then locks `ghost_table`. And, the whole point is that we want to lock the original table, and then apply more changes to the ghost table, based on final binary log events.
But if the ghost table is locked, then the only entity that can write to the ghost table is the applier, which is on a different plane than the controller.
Specifically in Vitess, the control plane is `onlineddl.Executor`. The applier is `VReplication` itself. Normally, VReplication should be ignorant of the specifics of the operation, and the controller tells it how to run, when to step, etc.
But now, it seems VReplication has to be the one that takes the locks. Here's where it gets more complicated, in different trajectories:
- … `RENAME` and final `UNLOCK` tables. This means VReplication now needs to have substantial online DDL logic. Ideally, only the controller should have that logic, but there's no choice here.
- Because `LOCK`, `RENAME`, `UNLOCK` have to happen from within the same connection, it also has to be the same connection used by vreplication itself. Specifically, VPlayer must be able to write to the ghost table. It must use the same connection used for locking the table.
- `mysql.Conn` is not thread safe and we need to be able to sync queries invoked from gRPC with queries invoked by VPlayer.
- … `vreplication` and its controller and that also destroys the dbClient/connection. But we still need that connection open because we have yet to `RENAME` and `UNLOCK`.
- … `_vt.vreplication` table. It does so transactionally to `VCopier` and `VPlayer`. Recall that in MySQL, if you have any locks at all and write to some table, you must lock that table. This means we need to extend `LOCK TABLES` to include `_vt.vreplication`. This further messes up concurrency and synchronization as other moving parts access `_vt.vreplication`, and, if that table is `LOCK`ed, they can't operate.
- … `waitForPos` and stopping vreplication.
Some notable changes in the code:
- `dbClient` now has a mutex to make it thread safe to be used from various moving parts.
- `controller` now exposes `dbClient` so that it can be used from within vreplication's `Engine`.
- `Engine` subscribes as an "interested party" for the controller's `dbClient`. The controller will not destroy `dbClient` (hence, will not close the connection) before `Engine` completes the cut-over.
Tests
I revived an older Online DDL stress test, and adapted it for the new logic. Introducing:
- `.github/workflows/cluster_endtoend_onlineddl_vrepl_stress80.yml`
- `go/test/endtoend/onlineddl/vrepl_stress_80/onlineddl_vrepl_stress_mysql80_test.go`
Together, these:
- … `8.0.13`, which is an "old" version by now)
- … `insert` and `update` statements on the table, with a predictable effect, which is recorded. After the stress workload completes, we compare the table's data with what we've predicted. There must be a match.
- … `ALTER TABLE` just to validate the data/prediction logic.
- … `alter table... engine=innodb`. We validate that the migration utilizes the new 8.0 locking mechanism.
THIS SADLY FAILS
The tests can show inconsistencies. They're able to show some statements are wrongly executed on the old table but not on the new table. We find rows in the old table that do not exist in the new table (this shouldn't happen, since the workload only issues `insert` and `update`). We keep an increasing counter that's used for each `insert` and `update`, and we find rows with higher counter values on the old table.
This shows failure of the logic. I am yet unable to explain it. It is likely related to MySQL internal locking queue behavior, and I suspect some similar gotchas as presented by `gh-ost` users.
Right now not sure how to proceed
The code is here, it's not pretty, and it's not working. I'm not sure if we have a path to make this work. Meanwhile I hope the above clarifies what we are trying to do.
Related Issue(s)
Checklist
cc @rohit-nayak-ps @deepthi @sougou @rbranson