Skip to content

Pass in cell to EnsureVSchema and GetOrCreateShard#5930

Merged
rafael merged 7 commits intovitessio:masterfrom
tinyspeck:zmagg_rebuild_srv_schema_no_multicell
Apr 20, 2020
Merged

Pass in cell to EnsureVSchema and GetOrCreateShard#5930
rafael merged 7 commits intovitessio:masterfrom
tinyspeck:zmagg_rebuild_srv_schema_no_multicell

Conversation

@zmagg
Copy link
Copy Markdown
Contributor

@zmagg zmagg commented Mar 17, 2020

Description

Modifies EnsureVSchema so that we always pass in a cell and it rebuilds SrvVSchema only in the provided cell. This removes its auto-rebuild of all cells automatically when you're creating a new shard or keyspace.

Background

During a setup of a new region, we had a misconfigured topology and spinning up a new shard caused us to corrupt all cells in the topology, just as part of booting up a new tablet. For safety reasons, it seems better to decouple spinning up a new shard from rebuilding SrvVSchema on all cells. As talked about in person a few weeks ago--we'll keep the rebuild for the simple case where there's only one cell.

Signed-off-by: Maggie Zhou mzhou@slack-corp.com

@zmagg zmagg requested a review from sougou as a code owner March 17, 2020 06:18
@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch 2 times, most recently from 6109866 to 29b8e36 Compare March 19, 2020 17:56
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we change EnsureVSchema to accept a cells list to rebuild, and we have each tablet rebuild only its own cell? Would that still have caused the problem you had?

Copy link
Copy Markdown
Contributor Author

@zmagg zmagg Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I like that a lot more. It still does the nice thing for single cell setups, with the benefit of being easier to reason about overall. I'll switch to that approach!

That would've been fine. The problem was that the tablet in the init loop wasn't just rebuilding its own cell it was rebuilding all the other cells.

@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch 8 times, most recently from a70acbf to 97268cb Compare April 15, 2020 18:01
Copy link
Copy Markdown
Member

@rafael rafael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maggie this is looking great. I added minor comments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it should not be removed right?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - remove extra new line.

@zmagg zmagg changed the title Skip RebuildSrvVschema in InitTablet for multi-cell configurations Pass in cell to EnsureVSchema and GetOrCreateShard Apr 15, 2020
Copy link
Copy Markdown
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one optional suggestion.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could also just pass an empty cells list to mean "all cells", since RebuildSrvVSchema already knows how to expand that by doing this same thing.

@zmagg zmagg closed this Apr 15, 2020
@zmagg zmagg reopened this Apr 15, 2020
@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch from f927513 to c2deff9 Compare April 16, 2020 06:16
zmagg added 6 commits April 16, 2020 00:00
Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
Remove empty line.

Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch from 4f5ded2 to 72e75c1 Compare April 16, 2020 07:00
@deepthi
Copy link
Copy Markdown
Collaborator

deepthi commented Apr 16, 2020

local.sharded_recovery                  	FAIL (3 tries)

This test hasn't been flaky. Does it pass for you locally?

@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch from c0f1f4e to aadd900 Compare April 16, 2020 21:58
throw an error

Signed-off-by: Maggie Zhou <mzhou@slack-corp.com>
@zmagg zmagg force-pushed the zmagg_rebuild_srv_schema_no_multicell branch from aadd900 to f8edec4 Compare April 16, 2020 23:04
require.Nil(t, err)

replicaTabletArgs := commonTabletArg

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zmagg and I discussed this. It seems like this was a flaky test. The following change addresses that problem.

@rafael rafael merged commit 924824b into vitessio:master Apr 20, 2020
@rafael rafael deleted the zmagg_rebuild_srv_schema_no_multicell branch April 20, 2020 17:57
ajm188 pushed a commit to tinyspeck/vitess that referenced this pull request Apr 28, 2020
…ema_no_multicell

Pass in cell to EnsureVSchema and GetOrCreateShard
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants