Conversation
|
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
|
|
I signed it! |
sougou
left a comment
There was a problem hiding this comment.
We should wait for @alainjobart to also take a look at this. I've have some initial feedback. I'm still not sure about the shuffle function. I'm wondering if we need to keep 'other cells' in a separate data structure to avoid repeated sorting.
| cell string | ||
|
|
||
| // cell to region mapping function | ||
| cellToRegion func(cell string) string |
There was a problem hiding this comment.
I think this can just be region string. The region for a cell should never change. So, it can be precomputed and initialized at construction.
| return newTabletStatsCache(nil, cell, cellToRegion, false /* setListener */) | ||
| } | ||
|
|
||
| // UpdateCellsToRegions is mainly for testing purpose, the `cellsToRegions` mapping should be provided in the constructor |
There was a problem hiding this comment.
If it's a test-only function, it should be moved to a _test.go file. But I think you don't need this function. You should be able directly update the region if you want to change it for testing purposes.
| func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) { | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell { | ||
| // this is for a non-master tablet in a different cell, drop it | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.cellToRegion(ts.Tablet.Alias.Cell) != tc.cellToRegion(tc.cell) { |
There was a problem hiding this comment.
Here we have a couple of options:
- Extract the region from the cell info
- Denormalize the region into the tablet record
@alainjobart any preferences?
| index = rand.Intn(i + 1) | ||
| tablets[i], tablets[index] = tablets[index], tablets[i] | ||
|
|
||
| //move all same cell tablets to the front, this is O(n) |
There was a problem hiding this comment.
Go thing: need space between // and start of comment. Running gofmt should fix this. Some pedantic people may also insist on punctuation.
|
|
||
| func shuffleTablets(tablets []discovery.TabletStats) { | ||
| index := 0 | ||
| func shuffleTablets(cell string, tablets []discovery.TabletStats) { |
There was a problem hiding this comment.
Need unit tests for this function, to test all code paths.
There was a problem hiding this comment.
👍 ut added to discoverygateway_test
| func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) { | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell { | ||
| // this is for a non-master tablet in a different cell, drop it | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.cellToRegion(ts.Tablet.Alias.Cell) != tc.cellToRegion(tc.cell) { |
There was a problem hiding this comment.
You also need to exclude empty regions.
alainjobart
left a comment
There was a problem hiding this comment.
I like the feature.
The implementation can be a lot simpler though, just put a global map and a global function in go/vt/topo, and use that where you need to.
proto/topodata.proto
Outdated
| // to server_address. | ||
| string root = 2; | ||
|
|
||
| // Region is a group this cell belongs to |
There was a problem hiding this comment.
Please be a bit more specific in this comment (and end it with a period). Something like:
// Region is a group this cell belongs to. Used by vtgate to route traffic to
// other cells when there is no available tablet in the current region.
go/vt/topo/server.go
Outdated
| return conn, nil | ||
| } | ||
|
|
||
| // CellToRegionMapper function is a wrapper around topo.Server#GetRegionByCell with caching and error handling |
There was a problem hiding this comment.
I don't see the benefit of using this pattern, vs just having a global map and a function to get the value from the map directly. Also, the global function backed by a global map would be easy to unit test. And then you wouldn't need to pass it everywhere. Overall, the number of lines of code for this change would go down dramatically.
Also, this needs to be thread-safe, so please protect the map with a mutex.
There was a problem hiding this comment.
agree that a global map would be simpler, i actually attempted that in the 1st round, but had difficulty collecting all the cells upfront to init the map. i thought i might use GetKnownCells for that purpose, but not sure how to capture new cell add event (though rare enough in practice).
for the reason above, i chose the lambda to make it lazy, and able to handle new cell event uniformly.
There was a problem hiding this comment.
Why do you need to collect anything upfront? You start with an empty map and a mutex:
var cellToRegionMap map[string]string
var cellToRegionMapMu sync.Mutex
and export a function that does the usual Mutex lock, get from the cache, if not there get from topo service, and populate the cache.
The population only happens the first time you call the function, there is nothing upfront needed... You can also use a RW Mutex and do the lookup holding just the Read lock.
It's not different from what you're doing, except it's a global cache?
|
@alainjobart is this good to go? |
|
Looking at this, I don't like how all the objects fit together, with topo.Open() remembering the topo.Server, and then the other methods using that later. I think the following would work better:
Also, the context.Background calls should be avoided, if possible, by passing in the context when necessary. I am not sure how to plumb this one though. This is a bunch of changes, and in other parts of the code too, let me know if you think that's too much for this PR, and we can do it in another. |
alainjobart
left a comment
There was a problem hiding this comment.
Looks better, thanks for making these changes!
| // to maintain the integrity of our cache. | ||
| func NewTabletStatsCache(hc HealthCheck, cell string) *TabletStatsCache { | ||
| return newTabletStatsCache(hc, cell, true /* setListener */) | ||
| func NewTabletStatsCache(hc HealthCheck, cell string, ts *topo.Server) *TabletStatsCache { |
There was a problem hiding this comment.
One minor comment here: the parameter order should probably be hc, ts, cell (as topo.Server is more general, and cell a specialization). Would match the order of the other method below too.
| } | ||
|
|
||
| func newTabletStatsCache(hc HealthCheck, cell string, setListener bool) *TabletStatsCache { | ||
| func newTabletStatsCache(hc HealthCheck, cell string, ts *topo.Server, setListener bool) *TabletStatsCache { |
There was a problem hiding this comment.
same thing here, ts then cell.
go/vt/topo/server.go
Outdated
| } | ||
|
|
||
| // UpdateCellsToRegions overwrites the global map built by topo server init, and is meant for testing purpose only. | ||
| func UpdateCellsToRegions(cellsToRegions map[string]string) { |
There was a problem hiding this comment.
If it's only for tests, should be named UpdateCellsToRegionsForTests
…egions` as `UpdateCellsToRegionsForTests`
| func (tc *TabletStatsCache) StatsUpdate(ts *TabletStats) { | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell { | ||
| // this is for a non-master tablet in a different cell, drop it | ||
| if ts.Target.TabletType != topodatapb.TabletType_MASTER && ts.Tablet.Alias.Cell != tc.cell && tc.getRegionByCell(ts.Tablet.Alias.Cell) != tc.getRegionByCell(tc.cell) { |
There was a problem hiding this comment.
Shouldn't you check for blank region here? Otherwise, if region is not specified, it will end up being false always, and end up including all cells in the stats.
There was a problem hiding this comment.
actually not, as the getRegionByCell won't ever return empty region, it returns the cell value in this case. as agreed for backward compatibility
|
I signed it! |
|
CLAs look good, thanks! |
This is a PR addressing: #3355
The major changes include:
regionfield tocell_info, updatedvtctldcommands to supportregionGetRegionByCelland some helpers totopo.Serverto queryregionof any givencellregioninfo is available, it usescellto return for backward compatibilityTabletStatsCache#StatsUpdateto accept crosscellslavetabletsdiscoverygateway#shuffleTabletsto favor samecelltablets