Online DDL, CancelMigration: distinguish user-issued vs. internally-issued cancellation #11011
Merged
deepthi merged 1 commit into vitessio:main on Aug 15, 2022
…ssued cancellation Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Contributor (Author)
As mentioned in a private channel, tests are complex, and I'm not sure about them right now. The reason is that the scenarios where a migration gets cancelled by the scheduler are a bit extreme and will be difficult to reproduce synthetically.
Collaborator
You may need to build a unit test framework to simulate vreplication stream errors. We can let this through for now, because that will surely take time to do.
deepthi approved these changes Aug 15, 2022
systay pushed a commit to planetscale/vitess that referenced this pull request Aug 19, 2022
…ssued cancellation (vitessio#11011) (vitessio#962) Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
mattlord added a commit to planetscale/vitess that referenced this pull request Aug 19, 2022
This work makes the following changes:
- Replaces any time.Sleep calls with a call to wait for the appropriate status(es)
- Use WaitForStatus:Failed,Cancelled for cancel tests
- Use CheckForStatus:Failed,Cancelled for cancel tests (instead of just Cancelled)
- As we see both in the CI but we were requiring cancelled
- See: vitessio#11011
Signed-off-by: Matt Lord <mattalord@gmail.com>
mattlord added a commit that referenced this pull request Aug 23, 2022
* Address additional causes of OnlineDDL test flakiness
This work makes the following changes:
- Replaces any time.Sleep calls with a call to wait for the appropriate status(es)
- Use WaitForStatus:Failed,Cancelled for cancel tests
- Use CheckForStatus:Failed,Cancelled for cancel tests (instead of just Cancelled)
- As we see both in the CI but we were requiring cancelled
- See: #11011
Signed-off-by: Matt Lord <mattalord@gmail.com>
* Always print status from the wait
Signed-off-by: Matt Lord <mattalord@gmail.com>
* Require Cancelled state for Cancel issued by user
Signed-off-by: Matt Lord <mattalord@gmail.com>
* Remove unnecessary change
* return error if unable to set 'cancelled_timestamp'
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Description
#10900 introduced the `cancelled_timestamp` column, which is populated when a migration is cancelled. Then, based on the value of this column, and assuming a problem in the migration, a migration transitions to either `failed` (if the column is `NULL`) or to `cancelled` (if the column is `NOT NULL`).

As a quick recap, `gh-ost`, `pt-osc` and even `vreplication` itself, to some extent, act as 3rd party tools for Online DDL. To cancel a migration is to fail the migration; #10900 was made so that we can make a distinction between a "legitimately failed" migration and a "user cancelled" migration.

However, we left out a couple of use cases where the Online DDL executor itself may cancel a migration.

In these scenarios, the executor calls `CancelMigration`, but we expect the terminal state to be `failed`, not `cancelled`.

In this PR we make the distinction between a user-generated `CANCEL` (e.g. the user issued an `ALTER VITESS_MIGRATION ... CANCEL` command) and an internally generated cancellation. The latter now leads to the `failed` state.