[slack-22.0] restore tablet type during the tabletmanager initialization #741

Merged
tanjinx merged 4 commits into slack-22.0 from sb-v22-restore-tablet-type on Nov 6, 2025

sbaker617 commented Nov 5, 2025

Description

This PR backports the tablet type restoration feature from upstream Vitess to the slack-22.0 branch. It introduces an experimental --init-tablet-type-lookup flag that allows vttablets to restore their previous tablet type from topology records on restart, rather than always reverting to the --init_tablet_type value.

Problem Statement

Currently, when a vttablet restarts, it always reverts to the type specified by the --init_tablet_type flag, regardless of any runtime changes made to the tablet type. This creates operational friction during:

  • Binary upgrades: Administrators must manually update service files to maintain tablet types during version updates
  • Dynamic deployments: Teams that deploy tablets as SPARE and dynamically change them to REPLICA/RDONLY lose these changes on restart

Solution

This change adds a new experimental startup flag --init-tablet-type-lookup that enables tablets to query the topology system for their existing record and restore the previous type. The implementation:

  1. Uses the tablet's alias (cell + UID) to look up the existing topology record
  2. Restores the tablet type from the topology if found
  3. Implements special handling for edge cases:
    • PRIMARY types: Converted to REPLICA and validated through existing checkpoint logic
    • Transient states (BACKUP, RESTORE): Skipped in favor of the init type since these are temporary operational states
    • No topology record: Falls back to the specified --init_tablet_type value
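
The lookup steps and edge cases above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `TabletType` constants, the `Alias` struct, the in-memory `topo` map, and the `resolveInitType` function are all hypothetical stand-ins for the real Vitess types and topology client. The follow-up validation applied to a former PRIMARY is not modeled here.

```go
package main

import "fmt"

// TabletType is a hypothetical local stand-in for the Vitess tablet type enum.
type TabletType string

const (
	Primary TabletType = "PRIMARY"
	Replica TabletType = "REPLICA"
	Rdonly  TabletType = "RDONLY"
	Spare   TabletType = "SPARE"
	Backup  TabletType = "BACKUP"
	Restore TabletType = "RESTORE"
	Drained TabletType = "DRAINED"
)

// Alias identifies a tablet by cell and UID, as described in step 1.
type Alias struct {
	Cell string
	UID  uint32
}

// resolveInitType picks the tablet type to start with.
// topo is an in-memory stand-in for the topology service (an assumption
// made for this sketch), keyed by tablet alias.
func resolveInitType(topo map[Alias]TabletType, alias Alias, initType TabletType, lookup bool) TabletType {
	if !lookup {
		// Flag disabled: behave as before and use the init type.
		return initType
	}
	existing, found := topo[alias]
	if !found {
		// No topology record: fall back to the init type.
		return initType
	}
	switch existing {
	case Primary:
		// A former PRIMARY comes back as REPLICA; later validation
		// of primaryship is outside the scope of this sketch.
		return Replica
	case Backup, Restore:
		// Transient operational states are not restored.
		return initType
	default:
		// Any other recorded type is restored as-is.
		return existing
	}
}

func main() {
	topo := map[Alias]TabletType{
		{Cell: "zone1", UID: 100}: Rdonly,
		{Cell: "zone1", UID: 101}: Primary,
		{Cell: "zone1", UID: 102}: Backup,
	}
	for _, uid := range []uint32{100, 101, 102, 103} {
		alias := Alias{Cell: "zone1", UID: uid}
		fmt.Println(uid, resolveInitType(topo, alias, Spare, true))
	}
}
```

In this sketch a tablet started with SPARE as its init type keeps RDONLY if that is what the record holds, while a missing record or a transient state yields SPARE again.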

Implementation Details

This backport includes the latest refinements from upstream:

  • Refactored tablet type lookup logic to use a cleaner switch statement for better readability
  • Updated flag documentation to explicitly mark the feature as "Experimental"
  • Clarified that the lookup uses the tablet alias to find the topology record
  • Simplified log messages to improve clarity
  • Enhanced --init_tablet_type help text to include valid values and default

Flag Naming Convention

Note that this v22 backport maintains consistency with existing v22 flag naming:

  • Existing flag: --init_tablet_type (with underscores)
  • New flag: --init-tablet-type-lookup (with dashes, matching upstream)
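
To make the mixed naming concrete, here is a small sketch that registers both spellings side by side. It uses Go's standard `flag` package purely for illustration; the `tabletFlags` struct, the `parseTabletFlags` helper, the empty default, and the help strings are assumptions made for this example and are not taken from the Vitess source.

```go
package main

import (
	"flag"
	"fmt"
)

// tabletFlags holds the two options discussed above (hypothetical struct).
type tabletFlags struct {
	InitTabletType string
	Lookup         bool
}

// parseTabletFlags registers one underscore-style flag and one dash-style
// flag, mirroring the naming described in this PR.
func parseTabletFlags(args []string) (tabletFlags, error) {
	var cfg tabletFlags
	fs := flag.NewFlagSet("vttablet-sketch", flag.ContinueOnError)
	// Existing flag keeps its underscore spelling. The empty default is a
	// placeholder for this sketch, not the real default.
	fs.StringVar(&cfg.InitTabletType, "init_tablet_type", "", "tablet type to use on startup")
	// New flag uses dashes, matching upstream.
	fs.BoolVar(&cfg.Lookup, "init-tablet-type-lookup", false, "(Experimental) restore the tablet type from the topology record")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseTabletFlags([]string{
		"--init_tablet_type=spare",
		"--init-tablet-type-lookup",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("init type: %s, lookup enabled: %t\n", cfg.InitTabletType, cfg.Lookup)
}
```

The standard `flag` package treats the two spellings as unrelated names and does no normalization between underscores and dashes, so in this sketch each flag must be passed exactly as registered.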

Related Issue(s)

Upstream Vitess:

Related Slack backports:

Checklist

  • Tests were added or are not required (comprehensive test suite included)
  • Did the new or modified tests pass consistently locally and on CI? (All tablet type lookup tests passing)
  • Documentation was added or is not required

Deployment Notes

This change is opt-in and requires explicitly enabling the --init-tablet-type-lookup flag. The feature is marked as experimental and should be tested thoroughly before production deployment.

When enabled, tablets will maintain their assigned types (RDONLY, DRAINED, etc.) across restarts without requiring service file modifications. This is particularly useful for:

  • Rolling upgrades where tablet types should persist
  • Dynamic environments where tablets are reassigned at runtime
  • Reducing manual operational overhead during maintenance windows

Recommendation: Test in staging environments first to validate behavior with your specific deployment patterns.

Signed-off-by: Stephen Baker <s.baker@slack-corp.com>
github-actions bot added this to the v22.0.1 milestone Nov 5, 2025
sbaker617 marked this pull request as ready for review November 5, 2025 20:59
sbaker617 requested a review from a team as a code owner November 5, 2025 20:59
sbaker617 requested a review from tanjinx November 5, 2025 20:59
tanjinx merged commit 6301472 into slack-22.0 Nov 6, 2025
186 of 189 checks passed
tanjinx deleted the sb-v22-restore-tablet-type branch November 6, 2025 20:01
tanjinx added a commit that referenced this pull request Nov 10, 2025
…ation (#741)

* backport tablet type lookup to v22

Signed-off-by: Stephen Baker <s.baker@slack-corp.com>

* more txt adjustments

---------

Signed-off-by: Stephen Baker <s.baker@slack-corp.com>
Co-authored-by: Tanjin Xu <109303790+tanjinx@users.noreply.github.com>
sbaker617 added a commit that referenced this pull request Feb 5, 2026
…ation (#741)

* backport tablet type lookup to v22

Signed-off-by: Stephen Baker <s.baker@slack-corp.com>

* more txt adjustments

---------

Signed-off-by: Stephen Baker <s.baker@slack-corp.com>
Co-authored-by: Tanjin Xu <109303790+tanjinx@users.noreply.github.com>