docs: Document StaleTopoPrimary VTOrc analysis and recovery #2059

Merged
mattlord merged 2 commits into prod from promptless/vtorc-stale-topo-primary-docs
Feb 4, 2026
@promptless promptless bot commented Jan 18, 2026

Added a new row to the VTOrc recovery table documenting the StaleTopoPrimary analysis and recovery (PR #19173). This recovery detects tablets that still have type PRIMARY in the topology after a newer primary has been elected, which can occur if topology updates fail during emergency reparent operations. VTOrc automatically demotes these stale primaries to read-only replicas and updates the topology accordingly.

@mhamza15 mhamza15 self-assigned this Jan 18, 2026
@mhamza15 mhamza15 self-requested a review January 18, 2026 03:04
netlify bot commented Jan 18, 2026

Deploy Preview for vitess ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 2ad53ae |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vitess/deploys/69792b016fdea60007856011 |
| 😎 Deploy Preview | https://deploy-preview-2059--vitess.netlify.app |

@promptless promptless bot marked this pull request as ready for review January 27, 2026 21:14
promptless bot commented Jan 27, 2026

I've added new changes to address this GitHub PR in commit 2ad53ae

| `DeadPrimary` | VTOrc detects when the primary tablet is dead | VTOrc runs EmergencyReparentShard to elect a different primary |
| `PrimaryIsReadOnly`, `PrimarySemiSyncMustBeSet`, `PrimarySemiSyncMustNotBeSet` | VTOrc detects when the primary tablet has configuration issues like being read-only, semi-sync being set or not being set | VTOrc fixes the configurations on the primary. |
| `NotConnectedToPrimary`, `ConnectedToWrongPrimary`, `ReplicationStopped`, `ReplicaIsWritable`, `ReplicaSemiSyncMustBeSet`, `ReplicaSemiSyncMustNotBeSet` | VTOrc detects when a replica has configuration issues like not being connected to the primary, connected to the wrong primary, replication stopped, replica being writable, semi-sync being set or not being set | VTOrc fixes the configurations on the replica. |
| `StaleTopoPrimary` | VTOrc detects when a tablet still has type PRIMARY in the topology but a newer primary has already been elected. This can happen if a topology update fails during an emergency reparent operation. | VTOrc demotes the stale primary to a read-only replica and updates its type to REPLICA in the topology. |

Citation: Based on PR #19173 which adds the StaleTopoPrimary analysis code in go/vt/vtorc/inst/analysis.go and the demoteStaleTopoPrimary recovery function in go/vt/vtorc/logic/topology_recovery.go. The description and fix action are derived from the PR description and implementation code.

| `DeadPrimary` | VTOrc detects when the primary tablet is dead | VTOrc runs EmergencyReparentShard to elect a different primary |
| `PrimaryIsReadOnly`, `PrimarySemiSyncMustBeSet`, `PrimarySemiSyncMustNotBeSet` | VTOrc detects when the primary tablet has configuration issues like being read-only, semi-sync being set or not being set | VTOrc fixes the configurations on the primary. |
| `NotConnectedToPrimary`, `ConnectedToWrongPrimary`, `ReplicationStopped`, `ReplicaIsWritable`, `ReplicaSemiSyncMustBeSet`, `ReplicaSemiSyncMustNotBeSet` | VTOrc detects when a replica has configuration issues like not being connected to the primary, connected to the wrong primary, replication stopped, replica being writable, semi-sync being set or not being set | VTOrc fixes the configurations on the replica. |
| `StaleTopoPrimary` | VTOrc detects when a tablet still has type PRIMARY in the topology but a newer primary has already been elected. This can happen if a topology update fails during an emergency reparent operation. | VTOrc demotes the stale primary to a read-only replica and updates its type to REPLICA in the topology. |

Citation: Updated recovery description based on PR review feedback from @nickvanw. The demoteStaleTopoPrimary function in go/vt/vtorc/logic/topology_recovery.go was updated to call setReplicationSource() to configure the demoted tablet to replicate from the current primary.

promptless bot commented Jan 27, 2026

Just a reminder: if you'd like me to act on any feedback you leave in GitHub comments, just type @Promptless in your suggestion and I'll get right on it! (I won't show up in the user dropdown, but I'll process any request that has @Promptless in the comment body.)

@mattlord mattlord merged commit 367e8b6 into prod Feb 4, 2026
5 checks passed
@mattlord mattlord deleted the promptless/vtorc-stale-topo-primary-docs branch February 4, 2026 23:05