Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --test-on-replica-manual-replication-control flag #174

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions go/base/context.go
Original file line number Diff line number Diff line change
Expand Up @@ -82,13 +82,14 @@ type MigrationContext struct {
ServeSocketFile string
ServeTCPPort int64

Noop bool
TestOnReplica bool
MigrateOnReplica bool
OkToDropTable bool
InitiallyDropOldTable bool
InitiallyDropGhostTable bool
CutOverType CutOver
Noop bool
TestOnReplica bool
MigrateOnReplica bool
TestOnReplicaSkipReplicaStop bool
OkToDropTable bool
InitiallyDropOldTable bool
InitiallyDropGhostTable bool
CutOverType CutOver

TableEngine string
RowsEstimate int64
Expand Down
8 changes: 8 additions & 0 deletions go/cmd/gh-ost/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ func main() {

executeFlag := flag.Bool("execute", false, "actually execute the alter & migrate the table. Default is noop: do some tests and exit")
flag.BoolVar(&migrationContext.TestOnReplica, "test-on-replica", false, "Have the migration run on a replica, not on the master. At the end of migration replication is stopped, and tables are swapped and immediately swap-revert. Replication remains stopped and you can compare the two tables for building trust")
flag.BoolVar(&migrationContext.TestOnReplicaSkipReplicaStop, "test-on-replica-skip-replica-stop", false, "When --test-on-replica is enabled, do not issue commands stop replication (requires --test-on-replica)")
flag.BoolVar(&migrationContext.MigrateOnReplica, "migrate-on-replica", false, "Have the migration run on a replica, not on the master. This will do the full migration on the replica including cut-over (as opposed to --test-on-replica)")

flag.BoolVar(&migrationContext.OkToDropTable, "ok-to-drop-table", false, "Shall the tool drop the old table at end of operation. DROPping tables can be a long locking operation, which is why I'm not doing it by default. I'm an online tool, yes?")
Expand Down Expand Up @@ -149,6 +150,13 @@ func main() {
if migrationContext.SwitchToRowBinlogFormat && migrationContext.AssumeRBR {
log.Fatalf("--switch-to-rbr and --assume-rbr are mutually exclusive")
}
if migrationContext.TestOnReplicaSkipReplicaStop {
if !migrationContext.TestOnReplica {
log.Fatalf("--test-on-replica-skip-replica-stop requires --test-on-replica to be enabled")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash the migration at its very end -- after potentially hours of working. Instead, we can verify this in main.go -- at the very beginning of the migration.
Otherwise looking good!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the above is happening before the migration. This is in already in main.go, and the migration starts on line 198. Unless I am missing something...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake!

}
log.Warning("--test-on-replica-skip-replica-stop enabled. We will not stop replication before cut-over. Ensure you have a plugin that does this.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made mention of plugins (prematurely) in the warning because I don't think this feature will really work without them.

}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shlomi-noach , in my opinion the code reads more complicated to me when there are two flags that have the same core functionality, with this one slight variation.

I think the semantics would be more clear if we still used the --test-on-replica flag and had a separate flag that just customized this one aspect of its behavior. So if one wants to have manual replication control, one would use the args:

--test-on-replica --manual-replication-control

Without the extra flag, --test-on-replica would behave as it always has.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my original intention. The flag is good to be named --test-on-replica-skip-stop-replica or whatever, but it would be an addon to --test-on-replica. Not only are the flags not mutually exclusive, the --test-on-replica-skip-stop-replica is only applicable when --test-on-replica exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understand now. My apologies for the confusion. I'll update the logic.


switch *cutOver {
case "atomic", "default", "":
migrationContext.CutOverType = base.CutOverAtomic
Expand Down
1 change: 1 addition & 0 deletions go/logic/applier.go
Original file line number Diff line number Diff line change
Expand Up @@ -574,6 +574,7 @@ func (this *Applier) StopReplication() error {
if err := this.StopSlaveSQLThread(); err != nil {
return err
}

readBinlogCoordinates, executeBinlogCoordinates, err := mysql.GetReplicationBinlogCoordinates(this.db)
if err != nil {
return err
Expand Down
11 changes: 8 additions & 3 deletions go/logic/migrator.go
Original file line number Diff line number Diff line change
Expand Up @@ -477,9 +477,14 @@ func (this *Migrator) cutOver() (err error) {
// the same cut-over phase as the master would use. That means we take locks
// and swap the tables.
// The difference is that we will later swap the tables back.
log.Debugf("testing on replica. Stopping replication IO thread")
if err := this.retryOperation(this.applier.StopReplication); err != nil {
return err

if this.migrationContext.TestOnReplicaSkipReplicaStop {
log.Warningf("--test-on-replica-skip-replica-stop enabled, we are not stopping replication.")
} else {
log.Debugf("testing on replica. Stopping replication IO thread")
if err := this.retryOperation(this.applier.StopReplication); err != nil {
return err
}
}
// We're merly testing, we don't want to keep this state. Rollback the renames as possible
defer this.applier.RenameTablesRollback()
Expand Down