Describing safe, blocking, atomic, pure-mysql cut-over phase #82
Comments
Hello @shlomi-noach, first of all, thanks for open-sourcing gh-ost :) It looks very good :)
@baloo, thank you for your review!
There is nothing to ensure that, and there is no need to ensure that; the table creation is merely a step in the direction of beginning the cut-over. There is no problem with that. Moreover, if we reverse the order as you suggest, the implication is that we waste more time during the cut-over.
This doesn't work like that in MySQL, see: http://dev.mysql.com/doc/refman/5.7/en/lock-tables.html:
You cannot just grab more locks as you go along; you have to grab them all at once.
The relevant code in MySQL is https://github.com/mysql/mysql-server/blob/a533e2c786164af9bd276660b972d93649434297/sql/mdl.cc#L2312. You are correct that this relies on internal behavior. This behavior has existed for over 15 years and, having discussed it (informally, Safe Harbour etc.) with engineers, it has no intention of going away. I'm glad you pointed it out; I'll make it explicit in the fine print. This entire scheme came to be because of a limitation in MySQL, where you cannot issue a `RENAME TABLE` while holding `LOCK TABLES`.
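To illustrate that limitation, here is a small hedged sketch (not part of the original exchange): on MySQL versions without support for renaming under an active lock, the `RENAME` below is rejected while the same session holds the table lock. The table name `t1` and the connection string are placeholders.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	ctx := context.Background()
	// LOCK TABLES is per-session, so pin a single connection.
	conn, err := db.Conn(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	if _, err := conn.ExecContext(ctx, "LOCK TABLES t1 WRITE"); err != nil {
		log.Fatal(err)
	}
	// On MySQL versions that disallow RENAME under LOCK TABLES, this errors out.
	if _, err := conn.ExecContext(ctx, "RENAME TABLE t1 TO t1_new"); err != nil {
		fmt.Println("RENAME under LOCK TABLES rejected:", err)
	}
	if _, err := conn.ExecContext(ctx, "UNLOCK TABLES"); err != nil {
		log.Fatal(err)
	}
}
```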
Thanks for the clarifications :)
Noteworthy that as of MySQL 8.0.13, `RENAME TABLE` is allowed on tables locked with `LOCK TABLES`.
Why do we need to lock the origin table? As far as I know, the rename operation is in-place: there is no need to rebuild the table, and concurrent DML is permitted. Will that happen? Or are there some other reasons? mark @youzipi
True, but because
Is it necessary to lock the
It is necessary but for a different reason than you'd expect: in MySQL, if session A runs a
but
You're right. I wasn't paying enough attention.
I don't understand the question. Facebook did not use any of this; they used two consecutive `RENAME` statements, which implies a brief moment where the table does not exist. My solution avoids that and ensures the table exists at all times.
Hello, why can't we use `rename t1 to t1_del, t1_gho to t1` directly? Are there some problems?
It can't happen because gh-ost tails the binary log, either on the master or on a replica, and there may still be events to handle.
Thanks.
Hi, has gh-ost already utilized this new functionality?
It is unfortunate that the implementation actually does not meet gh-ost's requirements, and using the new implementation sadly complicates the logic rather than simplifying it. I therefore made no progress adopting the new implementation.
Can I ask why the new implementation can't simplify the logic? I don't understand it myself. o(╥﹏╥)o
It was a couple of years ago that I checked. It was about who has to be the owner of the lock. I honestly don't remember now what the complexity was, and I didn't write anything public. Sorry about that, and I'm not looking into it now.
This entry seems to be incorrect. In this case, the table should be cut over normally, because after the session is broken the lock held by C10 is released, and C20 will acquire the lock and cut over the table directly. Finally, gh-ost will fail with an error.
I don't think I have the bandwidth to solve this again. Unless someone else has a definitive solution, consider using
Final-finally-finalizationally, here's an asynchronous, safe, atomic cut-over phase.
This solution doesn't cause "table outage" (as in #65).
Here are the steps for a safe, atomic cut-over:
The solution we offer is now based on two connections only (as opposed to three, in the optimistic approach). "Our" connections will be C10, C20. The "normal" app connections are C1..C9, C11..C19, C21..C29.
1. Connections C1..C9 operate on `tbl` with normal DML: `INSERT`, `UPDATE`, `DELETE`.
2. Connection C10: `CREATE TABLE tbl_old (id int primary key) COMMENT='magic-be-here'`
3. Connection C10: `LOCK TABLES tbl WRITE, tbl_old WRITE`
4. Connections C11..C19, newly incoming, issue queries on `tbl` but are blocked due to the `LOCK`.
5. Connection C20: `RENAME TABLE tbl TO tbl_old, ghost TO tbl`; this is blocked due to the `LOCK`, but gets prioritized on top of connections C11..C19 and on top of C1..C9 or any other connection that attempts DML on `tbl`.
6. Connections C21..C29, newly incoming, issue queries on `tbl` but are blocked due to the `LOCK` and due to the `RENAME`, waiting in queue.
7. Connection C10: checks that C20's `RENAME` is applied (looks for the blocked `RENAME` in `show processlist`).
8. Connection C10: `DROP TABLE tbl_old`. Nothing happens yet; `tbl` is still locked. All other connections still blocked.
9. Connection C10: `UNLOCK TABLES`
10. BAM! The `RENAME` is first to execute, the ghost table is swapped in place of `tbl`, then C1..C9, C11..C19, C21..C29 all get to operate on the new and shiny `tbl` — see the code sketch after this list.
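To make the sequence concrete, here is a minimal Go sketch of the two-connection dance using `database/sql` (Go being what gh-ost itself is written in). This is an illustration of the steps above, not gh-ost's actual implementation: the DSN, the schema, and the table names `tbl`, `tbl_old` and `ghost` are placeholders, error handling collapses to `log.Fatal`, and the processlist check runs on a pooled connection for brevity.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	ctx := context.Background()

	// Connection C10: create the sentry table and lock both tables.
	c10, err := db.Conn(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer c10.Close()
	if _, err := c10.ExecContext(ctx, "CREATE TABLE tbl_old (id int primary key) COMMENT='magic-be-here'"); err != nil {
		log.Fatal(err) // failing here means we simply do not proceed
	}
	if _, err := c10.ExecContext(ctx, "LOCK TABLES tbl WRITE, tbl_old WRITE"); err != nil {
		log.Fatal(err)
	}

	// Connection C20: issue the RENAME; it blocks on C10's lock until UNLOCK TABLES.
	renameDone := make(chan error, 1)
	go func() {
		c20, err := db.Conn(ctx)
		if err != nil {
			renameDone <- err
			return
		}
		defer c20.Close()
		_, err = c20.ExecContext(ctx, "RENAME TABLE tbl TO tbl_old, ghost TO tbl")
		renameDone <- err
	}()

	// Wait until C20's RENAME shows up as a blocked statement in the processlist.
	for {
		var pending int
		err := db.QueryRowContext(ctx,
			`SELECT COUNT(*) FROM information_schema.processlist
			 WHERE info LIKE 'RENAME TABLE tbl TO tbl_old%'`).Scan(&pending)
		if err != nil {
			log.Fatal(err)
		}
		if pending > 0 {
			break
		}
		time.Sleep(100 * time.Millisecond)
	}

	// C10: drop the sentry (allowed under the WRITE lock), then release the locks.
	// The queued RENAME executes before any blocked DML, so the app sees an atomic swap.
	if _, err := c10.ExecContext(ctx, "DROP TABLE tbl_old"); err != nil {
		log.Fatal(err)
	}
	if _, err := c10.ExecContext(ctx, "UNLOCK TABLES"); err != nil {
		log.Fatal(err)
	}
	if err := <-renameDone; err != nil {
		log.Fatal(err)
	}
	fmt.Println("cut-over complete: ghost now serves as tbl")
}
```

A production implementation would add timeouts on each step and verify that the lock-holding session is still alive before unlocking; those safeguards are omitted here.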
Some notes

- We create `tbl_old` as a blocker for a premature swap.
- A connection is allowed to `DROP` a table it has under a `WRITE LOCK`.
- A blocked `RENAME` is always prioritized over a blocked `INSERT`/`UPDATE`/`DELETE`, no matter who came first.

What happens on failures?
Much fun. Just works; no rollback required.
- If C10 errors on the `CREATE`, we do not proceed.
- If C10 errors on the `LOCK` statement, we do not proceed. The table is not locked. The app continues to operate as normal.
- If C10 dies just as C20 issues the `RENAME`: the lock is released and queries resume operating on `tbl`. The `RENAME` immediately fails because `tbl_old` exists.
- If C10 dies while C20's `RENAME` is blocked: mostly similar to the above. The lock is released, then C20 fails the `RENAME` (because `tbl_old` exists), then all queries resume normal operation.
- If C20 dies before the `DROP` and `UNLOCK`: nothing terrible happens; some queries were blocked for some time. We will need to retry (see the retry sketch after this list).
- If C20 dies just after C10 `DROP`s the table but before the unlock: same as above.
- If both C10 and C20 die: the `LOCK` is cleared; the `RENAME` lock is cleared. C1..C9, C11..C19, C21..C29 are free to operate on `tbl`.
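Since every failure mode above leaves the original `tbl` in place, recovery is simply a retry. The sketch below is a hedged illustration of that, not gh-ost's code: `attemptCutOver` is a hypothetical callback standing in for the two-connection sequence sketched earlier, and dropping a leftover sentry before each attempt is optional (see the side note below).

```go
package cutover

import (
	"database/sql"
	"fmt"
)

// retryCutOver retries the two-connection cut-over. Every failure mode leaves
// the original tbl in place, so there is nothing to roll back between attempts.
// attemptCutOver is a hypothetical callback running the sequence described above.
func retryCutOver(db *sql.DB, attemptCutOver func(*sql.DB) error, maxAttempts int) error {
	for i := 1; i <= maxAttempts; i++ {
		// Clean up a sentry table left over from a previous failed attempt
		// (alternatively, keep it and skip the CREATE on the next attempt).
		if _, err := db.Exec("DROP TABLE IF EXISTS tbl_old"); err != nil {
			return err
		}
		if err := attemptCutOver(db); err != nil {
			fmt.Printf("cut-over attempt %d failed: %v\n", i, err)
			continue
		}
		return nil // ghost has been swapped in as tbl
	}
	return fmt.Errorf("cut-over did not succeed after %d attempts", maxAttempts)
}
```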
No matter what happens, at the end of the operation we look for the `ghost` table. Is it still there? Then we know the operation failed, "atomically". Is it not there? Then it has been renamed to `tbl`, and the operation worked atomically (see the check sketched below).

A side note on failure is the matter of cleaning up the magic `tbl_old`. Here this is a matter of taste. Maybe just let it live and avoid recreating it, or you can drop it if you like.
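A minimal sketch of that post-hoc check: ask the data dictionary whether the ghost table still exists. The schema name `test` and the ghost table name `ghost` are assumptions carried over from the steps above.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var n int
	err = db.QueryRow(
		`SELECT COUNT(*) FROM information_schema.tables
		 WHERE table_schema = 'test' AND table_name = 'ghost'`).Scan(&n)
	if err != nil {
		log.Fatal(err)
	}
	if n > 0 {
		// The ghost table is still there: the cut-over failed, atomically.
		fmt.Println("cut-over failed; original tbl still in place")
	} else {
		// The ghost table is gone: it has been renamed to tbl.
		fmt.Println("cut-over succeeded; ghost now serves as tbl")
	}
}
```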
Impact on app

App connections are guaranteed to be blocked, either until `ghost` is swapped in, or until the operation fails. In the former, they proceed to operate on the new table. In the latter, they proceed to operate on the original table.
Impact on replication

Replication only sees the `RENAME`. There is no `LOCK` in the binary logs. Thus, replication sees an atomic two-table swap. There is no table outage.