region/cell feature draft #3460

Merged
sougou merged 7 commits into vitessio:master from tinyspeck:upstream-region
Dec 18, 2017
@inexplicable
Contributor

This is a PR addressing: #3355

The major changes include:

  1. added region field to cell_info, updated vtctld commands to support region
  2. added GetRegionByCell and some helpers to topo.Server to query the region of any given cell
     • NOTE that, if no region info is available, it returns the cell itself for backward compatibility
  3. modified TabletStatsCache#StatsUpdate to accept cross-cell slave tablets
  4. modified discoverygateway#shuffleTablets to favor same-cell tablets

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@inexplicable
Contributor Author

I signed it!

@sougou sougou requested a review from alainjobart December 10, 2017 22:51
Contributor

@sougou sougou left a comment

We should wait for @alainjobart to also take a look at this. I have some initial feedback. I'm still not sure about the shuffle function. I'm wondering if we need to keep 'other cells' in a separate data structure to avoid repeated sorting.

cell string

// cell to region mapping function
cellToRegion func(cell string) string
Contributor

I think this can just be region string. The region for a cell should never change. So, it can be precomputed and initialized at construction.

return newTabletStatsCache(nil, cell, cellToRegion, false /* setListener */)
}

// UpdateCellsToRegions is mainly for testing purpose, the `cellsToRegions` mapping should be provided in the constructor
Contributor

If it's a test-only function, it should be moved to a _test.go file. But I think you don't need this function. You should be able to directly update the region if you want to change it for testing purposes.

  func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) {
- if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell {
- // this is for a non-master tablet in a different cell, drop it
+ if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.cellToRegion(ts.Tablet.Alias.Cell) != tc.cellToRegion(tc.cell) {
Contributor

Here we have a couple of options:

  1. Extract the region from the cell info
  2. Denormalize the region into the tablet record

@alainjobart any preferences?

index = rand.Intn(i + 1)
tablets[i], tablets[index] = tablets[index], tablets[i]

//move all same cell tablets to the front, this is O(n)
Contributor

Go thing: need space between // and start of comment. Running gofmt should fix this. Some pedantic people may also insist on punctuation.


- func shuffleTablets(tablets []discovery.TabletStats) {
- index := 0
+ func shuffleTablets(cell string, tablets []discovery.TabletStats) {
Contributor

Need unit tests for this function, to test all code paths.

Contributor Author

👍 Unit tests added to discoverygateway_test.
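
For readers following the thread, here is a minimal sketch of the shuffle being reviewed: a random shuffle followed by a single O(n) pass that moves same-cell tablets to the front. The `tabletStats` type is a hypothetical stand-in for `discovery.TabletStats`, and the body is an illustration of the idea, not the code merged in this PR.

```go
package main

import (
	"fmt"
	"math/rand"
)

// tabletStats is a hypothetical, trimmed-down stand-in for discovery.TabletStats.
type tabletStats struct {
	Cell string
	Name string
}

// shuffleTablets randomizes the slice in place, then moves every tablet that
// lives in the given cell to the front of the slice.
func shuffleTablets(cell string, tablets []tabletStats) {
	// Fisher-Yates shuffle so that load is spread across tablets.
	for i := len(tablets) - 1; i > 0; i-- {
		j := rand.Intn(i + 1)
		tablets[i], tablets[j] = tablets[j], tablets[i]
	}
	// Single O(n) pass: swap each same-cell tablet into the next front slot.
	front := 0
	for i := range tablets {
		if tablets[i].Cell == cell {
			tablets[front], tablets[i] = tablets[i], tablets[front]
			front++
		}
	}
}

func main() {
	tablets := []tabletStats{
		{Cell: "cell2", Name: "t1"},
		{Cell: "cell1", Name: "t2"},
		{Cell: "cell2", Name: "t3"},
		{Cell: "cell1", Name: "t4"},
	}
	shuffleTablets("cell1", tablets)
	// The two cell1 tablets come first; order inside each group is random.
	fmt.Println(tablets)
}
```

A unit test for a function like this would typically assert the partition property (all same-cell tablets precede all others) rather than an exact order, since the order within each group is random.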

  func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) {
- if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell {
- // this is for a non-master tablet in a different cell, drop it
+ if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.cellToRegion(ts.Tablet.Alias.Cell) != tc.cellToRegion(tc.cell) {
Contributor

You also need to exclude empty regions.

Contributor

@alainjobart alainjobart left a comment

I like the feature.

The implementation can be a lot simpler though, just put a global map and a global function in go/vt/topo, and use that where you need to.

// to server_address.
string root = 2;

// Region is a group this cell belongs to
Contributor

Please be a bit more specific in this comment (and end it with a period). Something like:

// Region is a group this cell belongs to. Used by vtgate to route traffic to
// other cells when there is no available tablet in the current region.

return conn, nil
}

// CellToRegionMapper function is a wrapper around topo.Server#GetRegionByCell with caching and error handling
Contributor

I don't see the benefit of using this pattern, vs just having a global map and a function to get the value from the map directly. Also, the global function backed by a global map would be easy to unit test. And then you wouldn't need to pass it everywhere. Overall, the number of lines of code for this change would go down dramatically.

Also, this needs to be thread-safe, so please protect the map with a mutex.

Contributor Author

Agreed that a global map would be simpler. I actually attempted that in the first round, but had difficulty collecting all the cells upfront to initialize the map. I thought I might use GetKnownCells for that purpose, but I'm not sure how to capture a new-cell-added event (though that is rare in practice).
For that reason, I chose the lambda to make the lookup lazy and able to handle new cells uniformly.

Contributor

Why do you need to collect anything upfront? You start with an empty map and a mutex:

var cellToRegionMap map[string]string
var cellToRegionMapMu sync.Mutex

and export a function that does the usual Mutex lock, get from the cache, if not there get from topo service, and populate the cache.

The population only happens the first time you call the function, there is nothing upfront needed... You can also use a RW Mutex and do the lookup holding just the Read lock.

It's not different from what you're doing, except it's a global cache?
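
A minimal sketch of the lazily populated global cache described in this comment, using a read-write mutex for the lookup. The `lookupRegion` function is a hypothetical stand-in for the topo service query, and `getRegionByCell` is written here as a free function for illustration; neither is the real Vitess API.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lookupCalls counts stub invocations so the caching effect is observable.
var lookupCalls int64

// lookupRegion is a hypothetical stand-in for the topo service query.
// It returns "" when the cell has no region configured.
var lookupRegion = func(cell string) (string, error) {
	atomic.AddInt64(&lookupCalls, 1)
	regions := map[string]string{"cell1": "us-east", "cell2": "us-east"}
	return regions[cell], nil
}

var (
	cellToRegionMu  sync.RWMutex
	cellToRegionMap = map[string]string{}
)

// getRegionByCell returns the cached region for a cell, querying the topo
// stand-in only on a cache miss. Nothing has to be collected upfront.
func getRegionByCell(cell string) string {
	// Fast path: read lock only.
	cellToRegionMu.RLock()
	region, ok := cellToRegionMap[cell]
	cellToRegionMu.RUnlock()
	if ok {
		return region
	}

	region, err := lookupRegion(cell)
	if err != nil {
		// Failures are not cached, so a later call can retry.
		return cell
	}
	if region == "" {
		// No region configured: fall back to the cell name.
		region = cell
	}

	cellToRegionMu.Lock()
	cellToRegionMap[cell] = region
	cellToRegionMu.Unlock()
	return region
}

func main() {
	fmt.Println(getRegionByCell("cell1"))
	fmt.Println(getRegionByCell("cell3"))
	fmt.Println(getRegionByCell("cell1"), "lookups:", atomic.LoadInt64(&lookupCalls))
}
```

Two goroutines that miss at the same time may both query the topo stand-in; in this sketch they would store the same value, so the duplicate lookup is harmless.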

Contributor Author

👍

@sougou
Contributor

sougou commented Dec 15, 2017

@alainjobart is this good to go?

@alainjobart
Contributor

Looking at this, I don't like how all the objects fit together, with topo.Open() remembering the topo.Server, and then the other methods using that later. I think the following would work better:

  • create a RegionCache object, with the mutex, the map, and a *topo.Server. Use the *topo.Server to query the CellInfo when needed.
  • when creating the TabletStatsCache, give it the *topo.Server, and have it create a RegionCache. Use it as appropriate.

I think that would organize the code better, and limit global variables.

Also, the context.Background calls should be avoided, if possible, by passing in the context when necessary. I am not sure how to plumb this one though.

This is a bunch of changes, and in other parts of the code too, let me know if you think that's too much for this PR, and we can do it in another.
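
One way the RegionCache suggestion above could look, sketched under assumptions: the `regionSource` interface and its `RegionForCell` method are hypothetical placeholders for whatever part of `*topo.Server` the cache would call, and the context is passed in by the caller rather than created inside the cache.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// regionSource is a hypothetical interface standing in for the part of
// *topo.Server that the cache needs; it is not the real topo API.
type regionSource interface {
	// RegionForCell returns the configured region, or "" if none is set.
	RegionForCell(ctx context.Context, cell string) (string, error)
}

// RegionCache bundles the mutex, the map, and the topo handle in one object,
// which avoids package-level globals.
type RegionCache struct {
	mu      sync.Mutex
	regions map[string]string
	src     regionSource
}

// NewRegionCache creates an empty cache backed by the given source.
func NewRegionCache(src regionSource) *RegionCache {
	return &RegionCache{regions: make(map[string]string), src: src}
}

// GetRegionByCell uses the caller's context instead of context.Background.
func (rc *RegionCache) GetRegionByCell(ctx context.Context, cell string) string {
	rc.mu.Lock()
	defer rc.mu.Unlock()
	if region, ok := rc.regions[cell]; ok {
		return region
	}
	region, err := rc.src.RegionForCell(ctx, cell)
	if err != nil {
		return cell // not cached, so a later call can retry
	}
	if region == "" {
		region = cell // backward-compatible fallback
	}
	rc.regions[cell] = region
	return region
}

// staticSource is a demo-only regionSource backed by a fixed map.
type staticSource map[string]string

func (s staticSource) RegionForCell(_ context.Context, cell string) (string, error) {
	return s[cell], nil
}

func main() {
	rc := NewRegionCache(staticSource{"cell1": "us-east"})
	ctx := context.Background() // acceptable at the top of a demo program
	fmt.Println(rc.GetRegionByCell(ctx, "cell1"))
	fmt.Println(rc.GetRegionByCell(ctx, "cell9"))
}
```

In this arrangement the stats cache would own a `*RegionCache` built from the topo handle it receives at construction, so tests can inject a fake source instead of overwriting a global map.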

Contributor

@alainjobart alainjobart left a comment

Looks better, thanks for making these changes!

  // to maintain the integrity of our cache.
- func NewTabletStatsCache(hc HealthCheck, cell string) *TabletStatsCache {
- return newTabletStatsCache(hc, cell, true /* setListener */)
+ func NewTabletStatsCache(hc HealthCheck, cell string, ts *topo.Server) *TabletStatsCache {
Contributor

One minor comment here: the parameter order should probably be hc, ts, cell (as topo.Server is more general, and cell a specialization). Would match the order of the other method below too.

Contributor Author

👍 updating PR

}

- func newTabletStatsCache(hc HealthCheck, cell string, setListener bool) *TabletStatsCache {
+ func newTabletStatsCache(hc HealthCheck, cell string, ts *topo.Server, setListener bool) *TabletStatsCache {
Contributor

same thing here, ts then cell.

}

// UpdateCellsToRegions overwrites the global map built by topo server init, and is meant for testing purpose only.
func UpdateCellsToRegions(cellsToRegions map[string]string) {
Contributor

If it's only for tests, should be named UpdateCellsToRegionsForTests

…egions` as `UpdateCellsToRegionsForTests`
  func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) {
- if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell {
- // this is for a non-master tablet in a different cell, drop it
+ if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.getRegionByCell(ts.Tablet.Alias.Cell) != tc.getRegionByCell(tc.cell) {
Contributor

Shouldn't you check for blank region here? Otherwise, if region is not specified, it will end up being false always, and end up including all cells in the stats.

Contributor Author

Actually not: getRegionByCell never returns an empty region. When no region is set, it returns the cell value, as agreed for backward compatibility.

Contributor

of course :)
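
To make the exchange above concrete, here is a small sketch of the drop condition with the cell-name fallback. `regionOf` and `shouldDrop` are hypothetical helpers written for illustration, with a hard-coded mapping in place of the topo lookup; only the shape of the boolean condition is taken from the diff.

```go
package main

import "fmt"

// configuredRegions is demo data: cells "b" and "c" have no region set.
var configuredRegions = map[string]string{"a1": "a", "a2": "a"}

// regionOf is a hypothetical stand-in for getRegionByCell: it never returns
// an empty string, falling back to the cell name when no region is set.
func regionOf(cell string) string {
	if region := configuredRegions[cell]; region != "" {
		return region
	}
	return cell
}

// shouldDrop mirrors the shape of the condition in StatsUpdate: drop a
// non-master tablet only if it is in a different cell and a different region.
func shouldDrop(isMaster bool, tabletCell, localCell string) bool {
	return !isMaster && tabletCell != localCell && regionOf(tabletCell) != regionOf(localCell)
}

func main() {
	fmt.Println(shouldDrop(false, "a2", "a1")) // same region: kept
	fmt.Println(shouldDrop(false, "b", "a1"))  // different region: dropped
	fmt.Println(shouldDrop(false, "b", "c"))   // neither has a region: dropped
	fmt.Println(shouldDrop(true, "b", "a1"))   // master: always kept
}
```

Because two cells without a region fall back to two different cell names, their "regions" never compare equal, so the pre-region behavior (drop all non-master tablets from other cells) is preserved without an explicit blank check.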

@inexplicable
Contributor Author

I signed it!

@googlebot

CLAs look good, thanks!

@sougou sougou merged commit 82ab856 into vitessio:master Dec 18, 2017
@rafael rafael deleted the upstream-region branch October 17, 2018 17:18