Cutover not succeeding #687
Comments
Update: my coworker was able to run the migration with no problem after I posted this. Is it possible something wasn't cleaned up properly the first time, and that prevented subsequent attempts from succeeding? |
@JGulbronson When gh-ost is in the cut-over process, it attempts (among other things) to get a write lock on the table being migrated; if it cannot (because of other connections' locks on the table) it will time out on that lock wait (hence the I'd guess there might have been some activity on that table, like a long-running transaction holding locks, that blocked gh-ost from getting that exclusive table lock, though I can't say for sure. If that happens again I'd suggest running
And see if there are running transactions holding locks on the table. |
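For anyone debugging a similar lock wait, here is a minimal sketch of one way to look for transactions and metadata locks on the table. It is not necessarily the command meant above. It assumes standard MySQL system tables; `mydb` and `mytable` are placeholder names, and the second query assumes MySQL 5.7+ with the metadata lock instrument enabled (I believe it is on by default in 8.0).

```sql
-- Sketch only: list open InnoDB transactions, oldest first.
-- A long-lived entry here may be holding locks that block the cut-over.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started;

-- Metadata locks granted or pending on the migrated table.
-- `mydb` and `mytable` are placeholders, not names from this thread.
-- owner_thread_id is a performance_schema thread id, not a processlist id.
SELECT object_schema, object_name, lock_type, lock_status, owner_thread_id
FROM performance_schema.metadata_locks
WHERE object_schema = 'mydb' AND object_name = 'mytable';
```

If the first query shows an old transaction whose `trx_mysql_thread_id` matches an idle connection in the process list, that connection is a likely candidate for what is blocking the table lock.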
I had the same problem (looping tries on cut over), even on relatively unused tables (along with syncer errors during binlog streaming). All this was resolved by using the latest master version (at the time, commit e48844d) |
I'm seeing this same thing on a mysql 8 host. This is on a table which is very rarely used and also very small (intentional for testing):
|
Additionally, manually running the Also, using |
MySQL 8 introduced an atomic |
I'm seeing the same issue as peppy with 1.0.48, even when running on a test instance of mysql 8.0.11 where the only traffic is from gh-ost itself. |
If anyone feels like testing #715, please let me know. I do not have a test bed for this, yet. Please be advised this is experimental; if possible, please first |
I commented on the pull request. tl;dr: still fails, but I think it's because the 8.0.13 rename behavior requires either all or none of the involved tables to be write locked for our case. Thanks! |
In an attempt to reproduce and fix the problem, I tested today with MySQL I issued:
I got:
|
EDIT: this comment does not reflect the actual sequence of steps taken in gh-ost, so it's irrelevant. Keeping for posterity.
|
In my comment above (#687 (comment)) I did not reproduce the actual steps |
Thanks for looking into this again. I will test against the latest mysql and report back in the coming days. |
Also tried on |
I can still reproduce on master (
Trying using my previous command:
trying using your command above (modified to work with my setup, most important change to note is the addition of
Please let me know if I can provide any further details to help further the investigation. |
@peppy it seems like your migration worked after one retry? But, as I look into this issue, there are actually multiple cases reported by multiple people and this got me confused. I think I'm working on something other than the issue you presented 😬 |
Aha, right. And yes, the first one did work (with a weird error in the middle though), as mentioned earlier in this thread, because I used |
Actually on closer inspection it looks like both may have completed, if one is to ignore the error messages and stack traces. |
Yes, I did mean the 2nd one worked, sorry for not being clear. Seems like retries were successful. Now, this doesn't mean it will always work; perhaps some workload will cause an infinite number of retries -- not sure. |
As far as I know, it doesn't work on 8.0.15, but it works on versions after 8.0.15:
create database sbtest;
create /* gh-ost */ table `sbtest`.`_sbtest1_del` (
id int auto_increment primary key
) engine=InnoDB comment='ghost-cut-over-sentry';
create /* gh-ost */ table `sbtest`.`_sbtest1_ghc` (
id int auto_increment primary key
) engine=InnoDB comment='_sbtest1_ghc';
create /* gh-ost */ table `sbtest`.`_sbtest1_gho` (
id int auto_increment primary key
) engine=InnoDB comment='_sbtest1_ghc';
create /* gh-ost */ table `sbtest`.`sbtest1` (
id int auto_increment primary key
) engine=InnoDB comment='_sbtest1_ghc';
-- step1
select get_lock('gh-ost.11.lock', 0);
lock /* gh-ost */ tables `sbtest`.`sbtest1` write, `sbtest`.`_sbtest1_del` write;
-- step 3
drop /* gh-ost */ table if exists `sbtest`.`_sbtest1_del`;
-- blocked in step 3
-- step 2
rename /* gh-ost */ table `sbtest`.`sbtest1` to `sbtest`.`_sbtest1_del`, `sbtest`.`_sbtest1_gho` to `sbtest`.`sbtest1`; |
This could be related to #799, fixed by openark#14. Otherwise, (and I don't have the capacity to test) if as @jianhaiqing suggests this doesn't work up to |
Yes, I agree with you. Updating docs ok. |
When I try to run a migration to add an index on a relatively small table, the cutover never succeeds. I've run it with verbose output, and get the following:
It will repeat this ad infinitum, until I kill it after 3-5 minutes. We are using Aurora, but have RBR set and run-on-master. We're able to run other migrations, except for this (relatively) small one, so I'm trying to figure out what could possibly cause this so my debugging can be a bit more guided.
So far, love the tool and the fact I don't have to worry about cleanup after, it's been great for us. Just need to figure out this occasional issue!