Skip to content

Fix vtgate_buffer test#5291

Merged
sougou merged 2 commits intovitessio:reparent-refactorfrom
planetscale:fix-reparent-refactor
Oct 11, 2019
Merged

Fix vtgate_buffer test#5291
sougou merged 2 commits intovitessio:reparent-refactorfrom
planetscale:fix-reparent-refactor

Conversation

@enisoc
Copy link
Copy Markdown
Member

@enisoc enisoc commented Oct 10, 2019

The main problem was that we now do a full DemoteMaster on the old master, even in the case of TER, as long as disable_active_reparent is false. This takes a lot longer (because shutting down the queryservice takes 30s before in-progress transactions are cancelled), so the test was timing out at a couple different levels (contexts and waiting for vtgate to find enough healthy tablets).

This fixes the test by fixing the timeouts, but I think we should also avoid doing a full DemoteMaster in the case of TER. I'll optimize that in a future PR though. For now, let's at least get the tests green.

Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
@enisoc enisoc requested a review from deepthi October 10, 2019 22:31
@enisoc enisoc requested a review from sougou as a code owner October 10, 2019 22:31
Copy link
Copy Markdown
Collaborator

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -187,25 +189,34 @@ func syncShardMaster(ctx context.Context, ts *topo.Server, tablet *topodatapb.Ta
// If active reparents are disabled, we don't touch our MySQL.
// We just directly update our tablet type to REPLICA.
func (agent *ActionAgent) abortMasterTerm(ctx context.Context, masterAlias *topodatapb.TabletAlias) error {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at some point we should rename this to endMasterTerm.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name was meant to convey that this is not the "graceful" way for a master term to come to an end. If we hit this code path, it means we should sort of "force demote" ourselves because another master has already taken over behind our backs. Do you think abort is a bad way to describe this?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds right, but it means that we are ending a master term in a not "graceful" way with the current implementation of PRS. Are we ok with that?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRS should have already called DemoteMaster to gracefully step down before we get to this code path. We will still technically execute this code path, but once we make these idempotent, there will be no net effect from this code path in the case of a successful PRS.

@sougou sougou merged commit d5eee3d into vitessio:reparent-refactor Oct 11, 2019
@enisoc enisoc deleted the fix-reparent-refactor branch October 11, 2019 03:43
@sougou
Copy link
Copy Markdown
Contributor

sougou commented Oct 11, 2019

Oops! Next time I'll make sure to check that it's a master PR before merging :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants