Describing safe, blocking, pure-mysql cut-over phase #65

Closed
shlomi-noach opened this issue Jun 14, 2016 · 5 comments


shlomi-noach commented Jun 14, 2016

UPDATE

#82 supersedes this. This is no longer in use.


Finally, here's a blocking cut-over phase that will, in the worst case (connections dying mid-cut-over), create a table outage (easily reversed).

Here are the steps to a safe solution:

We denote the different connections as C1, C2, ..., Cn.
We assume the original table is tbl and the ghost table is ghost.

In the flow below, C1, C18 and C19 are our own, controlling connections. We first assume no errors occur (a minimal SQL sketch of this flow follows the list):

  • C1: lock tables tbl write
  • C2, C3, ..., C17: normal app connections, issuing insert, delete, update on tbl. Because of the lock, they are naturally blocked.
  • We apply the last events we still need to apply onto ghost. No new events are coming our way because tbl is locked.
  • C18: checks that C1 is still alive, then issues rename table tbl to tbl_old. This gets blocked.
  • C19: checks (via show processlist) that C18's rename is in place and that C1 is still alive, then issues rename table ghost to tbl. This also gets blocked.
  • (meanwhile more queries approach tbl; it doesn't matter, they all get deprioritized, same as C3...C17)
  • C1: unlock tables
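
As promised above, here is a minimal SQL sketch of the happy path, using the tbl / tbl_old / ghost names from this issue. The liveness checks are only hinted at in comments here and are sketched further below:

```sql
-- C1: block all access to the original table
LOCK TABLES tbl WRITE;

-- (the migration's applier connection now flushes the remaining backlog of
--  events onto ghost; no new events can arrive because tbl is locked)

-- C18: after confirming C1 is still alive, issue the first rename.
-- It blocks, waiting on tbl's lock:
RENAME TABLE tbl TO tbl_old;

-- C19: after confirming that C18's rename is pending and that C1 is still
-- alive, issue the second rename. It also blocks:
RENAME TABLE ghost TO tbl;

-- C1: release the lock. C18's rename executes first, then C19's, and only
-- then do the queued app DMLs resume -- now against the new tbl:
UNLOCK TABLES;
```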

What just happened? Let's first explain some stuff:

  • C18's rename gets prioritized over the DMLs, even though it came later. That is how MySQL prioritizes queries on metadata-locked tables (a small demonstration follows this list).
  • C18 checks C1 is still alive, but as before, there's always the chance C1 will die just at the wrong time -- we're going to address that.
  • C19 wants to see that C18 began execution, but potentially C18 will crash by the time C19 actually issues its own rename -- we're going to address that, too.
  • C19's query sounds weird. At that time tbl still exists. You'd expect it to fail immediately -- but it does not. It's valid. This is because tbl's metadata lock is in use.
  • C19 gets prioritized over all the DMLs, but is known to be behind C18. The two stay in the same order of arrival, so C18 is known to execute before C19.
  • When C1 unlocks, C18 executes first.
  • The metadata lock on tbl is still in place, even though the table no longer exists under that name, because of C19's pending rename.
  • C19 operates next.
  • Finally all the DMLs execute.
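
To see that prioritization in isolation, here is a small, hypothetical three-session experiment (the session labels and the id column are illustrative, and separate from the C1/C18/C19 numbering above):

```sql
-- session A: take the write lock
LOCK TABLES tbl WRITE;

-- session B: an app DML; it blocks, waiting for the lock
-- (assumes tbl has an id column)
INSERT INTO tbl (id) VALUES (1);

-- session C: arrives after session B, yet will run before it
RENAME TABLE tbl TO tbl_old;

-- session A: release the lock
UNLOCK TABLES;
-- session C's RENAME executes first; session B's INSERT then resumes and
-- fails, since tbl no longer exists under that name.
```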

What happens on failures?

  • If C1 dies just as C18 is about to issue the rename, we get an outage: tbl is renamed to tbl_old, and the queries get released and complain the table is just not there.
    • In such a case C19 will not initiate, because it executes after C18 and checks that C1 is alive -- which turns out to be untrue. So no C19.
    • So we know we have an outage; we quickly rename tbl_old to tbl, go drink coffee, then give the entire process another try.
    • The outage is unfortunate, but does not put our data in danger.
  • If C1 happens to die just as C19 is about to issue its rename, there's no data integrity issue: at this point we've already asserted the tables are in sync. As C1 dies, C18 will immediately rename tbl to tbl_old. An outage will occur, but not for long, because C19 will next issue rename ghost to tbl, and close the gap. We suffered a minor outage, but no rollback. We roll forward.
  • If C18 happens to die just as C19 is about to issue its rename, nothing bad happens: C19 keeps blocking for as long as C1 is running. We find out C18 died and release C1. C19 attempts to rename ghost to tbl, but tbl still exists (C18's rename never ran), so the query fails. The metadata lock is released and all the queries resume operation on the original tbl. The queries suffered a short block, but resume automatically. The cut-over attempt failed, but caused no harm. We will need to try the entire cycle again.
  • If both C1 and C18 fail at the time C19 is about to begin its rename, same as above.
  • If C18 fails as C19 is already in place, same as above.
  • If C1 fails as C19 is already in place, it's as good as having it issue the unlock tables. We're happy.
  • If C19 fails at any given point, we suffer an outage. We revert by renaming tbl_old to tbl (see the sketch below).
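
For completeness, the liveness/ordering checks and the revert could look roughly like this. The connection id 101, the LIKE pattern, and the use of information_schema.PROCESSLIST (equivalent to show processlist) are illustrative assumptions, not the exact implementation:

```sql
-- C1 records its own connection id right after connecting:
SELECT CONNECTION_ID();   -- suppose this returns 101

-- C18, just before its rename, verifies C1 is still connected:
SELECT 1 FROM information_schema.PROCESSLIST WHERE ID = 101;

-- C19, just before its rename, verifies that C18's rename is pending
-- (the pattern must match C18's actual statement text) and that C1 is
-- still alive:
SELECT 1 FROM information_schema.PROCESSLIST
  WHERE INFO LIKE 'rename table tbl to tbl_old%';
SELECT 1 FROM information_schema.PROCESSLIST WHERE ID = 101;

-- If we end up in the outage scenario (tbl renamed away, no new tbl in
-- place), the revert is a single statement:
RENAME TABLE tbl_old TO tbl;
```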

I'd be grateful for reviews of this logic.

@shlomi-noach (Contributor, Author)

An illustrated walk-through:

[12 slides: ghost-cutover-safe 001 through ghost-cutover-safe 012]

@shlomi-noach (Contributor, Author)

Worth noting

On the replica, this is a non-atomic, two-step table swap. The lock does not propagate in the replication stream.
Two things to observe:

  • on the replica, there are no events on tbl between the two renames, because on the master the renames were written to the binlog only after we had processed the backlog (while tbl was locked)
  • However, there may well be events from other tables in between. This means some heavyweight operation just might sneak in. One way or the other, we're looking at a table-outage scenario on the replica (see the sketch below).
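
Roughly, this is the replica's view of the cut-over as it applies the binlog (a statement-based illustration; the interleaved events are hypothetical):

```sql
-- applied on the replica, in binlog order:
RENAME TABLE tbl TO tbl_old;
-- ...possibly events for other tables here; during this window tbl does
--    not exist on the replica (the table-outage scenario noted above)...
RENAME TABLE ghost TO tbl;
```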

@shlomi-noach (Contributor, Author)

This is implemented in code, so closing.

But there's now a new suggestion, #82, which I believe to be superior.
