Fix bug in SwitchTraffic that wasn't respecting --dry_run for readonly and replica tablets during a resharding event #12992
Merged
mattlord merged 2 commits into vitessio:main on Apr 28, 2023
rohit-nayak-ps
added a commit
to planetscale/vitess
that referenced
this pull request
Apr 28, 2023
…icSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Member
Thanks @austenLacy for the bug report and fix! Can you please fix the DCO for your commit so it passes CI? I also updated the e2e tests to reproduce this failure and confirm your fix, at 0567da2. Can you cherry-pick that commit into your PR as well? Previously, dry-run for switching read traffic was only tested for MoveTables; I added it for Reshards as well.
Signed-off-by: austenLacy <austenlacy@gmail.com>
…icSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <rohit@planetscale.com>
8b842bb to
99540f4
Contributor
Author
Thanks for the e2e test @rohit-nayak-ps. Just signed off my original commit and cherry-picked yours in.
rohit-nayak-ps
approved these changes
Apr 28, 2023
Contributor
I was unable to backport this Pull Request to the following branches:
maksimov
pushed a commit
to slackhq/vitess
that referenced
this pull request
Sep 9, 2024
…donly and replica tablets during a resharding event (vitessio#12992) * use switcher struct when switching shard reads during a reshard event Signed-off-by: austenLacy <austenlacy@gmail.com> * Create failing test for bug reported in vitessio#12992, where a TrafficSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <rohit@planetscale.com> --------- Signed-off-by: austenLacy <austenlacy@gmail.com> Signed-off-by: Rohit Nayak <rohit@planetscale.com> Co-authored-by: Rohit Nayak <rohit@planetscale.com>
This was referenced Sep 9, 2024
Description
Switching traffic during a reshard does not respect the --dry_run flag for rdonly and replica tablets: the command tries to actually switch reads and fails because the keyspace is not locked.

❌ does not respect --dry_run without the fix in traffic_switcher.go

    vtctlclient Reshard -- --tablet_types=rdonly,replica SwitchTraffic --dry_run customer.cust2cust
    I0426 14:42:58.221613 9828 main.go:96] I0426 14:42:58.221425 traffic_switcher.go:399] About to switchShardReads: [], [RDONLY REPLICA], 0
    E0426 14:42:58.223739 9828 main.go:96] E0426 14:42:58.223117 traffic_switcher.go:401] switchShardReads failed: Code: INVALID_ARGUMENT keyspace customer is not locked (no locksInfo)
    E0426 14:42:58.224562 9828 main.go:96] E0426 14:42:58.223567 vtctl.go:2264] keyspace customer is not locked (no locksInfo)
    I0426 14:42:58.236510 9828 main.go:96] I0426 14:42:58.236337 vtctl.go:2266] Workflow Status: Reads Not Switched. Writes Not Switched
    Following vreplication streams are running for workflow customer.cust2cust:
    id=1 on -80/zone1-0000000301: Status: Running. VStream Lag: 0s.
    id=1 on 80-/zone1-0000000400: Status: Running. VStream Lag: 0s.
    Reshard Error: rpc error: code = Unknown desc = keyspace customer is not locked (no locksInfo)
    E0426 14:42:58.251351 9828 main.go:105] remote error: rpc error: code = Unknown desc = keyspace customer is not locked (no locksInfo)

✅ does respect --dry_run with the fix in traffic_switcher.go

testing on the primary

On v15 I wasn't able to replicate the issue with the primary tablets going to NOT SERVING because SwitchTraffic did respect the dry run flag when dealing with the primary.

Related Issue(s)
Checklist
Deployment Notes
None
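For reviewers who want the shape of the fix in the Description without opening the diff, here is a minimal, self-contained Go sketch of the pattern: the read switch goes through a switcher interface that has a live and a dry-run implementation, rather than calling the traffic switcher directly. Every name below (trafficSwitcher, readSwitcher, liveSwitcher, dryRunSwitcher, switchReads) is hypothetical and chosen for illustration; this is not the code in traffic_switcher.go. It also assumes that a dry run never takes the keyspace lock, which would be consistent with the "not locked" error in the log output.

```go
package main

import (
	"errors"
	"fmt"
)

// trafficSwitcher is a hypothetical stand-in for the component that makes
// real topology changes. It is NOT the Vitess type.
type trafficSwitcher struct {
	keyspaceLocked bool
}

// switchShardReads mimics a real switch, which requires the keyspace lock.
func (ts *trafficSwitcher) switchShardReads(tabletTypes []string) error {
	if !ts.keyspaceLocked {
		return errors.New("keyspace is not locked")
	}
	return nil
}

// readSwitcher is the indirection that keeps callers unaware of dry runs.
type readSwitcher interface {
	switchShardReads(tabletTypes []string) error
}

// liveSwitcher delegates to the real traffic switcher.
type liveSwitcher struct{ ts *trafficSwitcher }

func (s *liveSwitcher) switchShardReads(tabletTypes []string) error {
	return s.ts.switchShardReads(tabletTypes)
}

// dryRunSwitcher only records what would have been done.
type dryRunSwitcher struct{ log []string }

func (s *dryRunSwitcher) switchShardReads(tabletTypes []string) error {
	s.log = append(s.log, fmt.Sprintf("Switch reads for tablet types %v", tabletTypes))
	return nil
}

// switchReads always goes through the interface, so a dry run cannot
// accidentally reach the real implementation.
func switchReads(sw readSwitcher, tabletTypes []string) error {
	return sw.switchShardReads(tabletTypes)
}

func main() {
	types := []string{"RDONLY", "REPLICA"}
	ts := &trafficSwitcher{} // assumed: a dry run holds no keyspace lock

	// Bug shape: calling the real switcher directly during a dry run.
	fmt.Println("direct call:", ts.switchShardReads(types))

	// Fix shape: the same request routed through the dry-run switcher.
	dry := &dryRunSwitcher{}
	fmt.Println("via switcher:", switchReads(dry, types), dry.log)
}
```

The point of the design is that the dry-run decision is made once, when the switcher is chosen, so individual steps such as switching shard reads do not each need their own dry-run check.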