Healthcheck - limit healthchecks by keyspace#5815
rafael
left a comment
Serry, this is looking good to me. Added some minor comments.
go/vt/discovery/topology_watcher.go
Outdated
The check for: fbk.isIncluded(new) might be redundant. A tablet shouldn't be able to change keyspaces. So if it was included in the old it should be included in new.
In that case, maybe we should check for those failure conditions - isIncluded(new) && !isIncluded(old) and the reverse - and return a FAILED_PRECONDITION error.
"if it was included in the old it should be included in new."
☝️ I'm probably missing some important context here -- can you expand more about what you mean here? My understanding is that given that tablets can't change keyspaces, the fbk.isIncluded(old) is the unnecessary check here (as opposed to fbk.isIncluded(new)).
Looks like I posted the above comment on a non-refreshed page - just now seeing @deepthi's comment.
@deepthi it looks like the callers of ReplaceTablet do not have mechanisms for handling these types of errors. Do we know what the expected behavior for FAILED_PRECONDITION errors during tablet replacement is?
Given the low likelihood of us hitting this case, would logging be sufficient here?
Oh, I see... the interface won't let you return an error.
Yes, logging it would be fine.
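For readers following the thread, here is a rough sketch of the "log instead of return an error" outcome agreed on above. It is not the code from this PR: `Tablet`, `recorder`, and `keyspaceFilter` are hypothetical stand-ins, and skipping the replacement on a mismatch is an assumption about the desired behavior. The only detail taken from the discussion is that `ReplaceTablet` has no error return, so an unexpected keyspace change can only be logged.

```go
package main

import (
	"fmt"
	"log"
)

// Tablet is a hypothetical, pared-down stand-in for a topology tablet record.
type Tablet struct {
	Alias    string
	Keyspace string
}

// recorder is a hypothetical downstream consumer of tablet changes.
// Its ReplaceTablet returns nothing, mirroring the constraint from the thread.
type recorder struct {
	tablets map[string]Tablet
}

func (r *recorder) ReplaceTablet(oldTablet, newTablet Tablet) {
	delete(r.tablets, oldTablet.Alias)
	r.tablets[newTablet.Alias] = newTablet
}

// keyspaceFilter (hypothetical) forwards changes only for watched keyspaces.
type keyspaceFilter struct {
	watched map[string]bool
	next    *recorder
}

func (f *keyspaceFilter) isIncluded(t Tablet) bool {
	return f.watched[t.Keyspace]
}

// ReplaceTablet forwards the replacement when both tablets pass the filter.
// If exactly one of them passes, the tablet appears to have changed keyspaces,
// which should not happen; with no error return available, log and skip.
func (f *keyspaceFilter) ReplaceTablet(oldTablet, newTablet Tablet) {
	oldIn, newIn := f.isIncluded(oldTablet), f.isIncluded(newTablet)
	if oldIn != newIn {
		log.Printf("skipping replacement of %s (keyspace %s) with %s (keyspace %s): filter mismatch",
			oldTablet.Alias, oldTablet.Keyspace, newTablet.Alias, newTablet.Keyspace)
		return
	}
	if newIn {
		f.next.ReplaceTablet(oldTablet, newTablet)
	}
}

func main() {
	rec := &recorder{tablets: map[string]Tablet{"t1": {Alias: "t1", Keyspace: "commerce"}}}
	f := &keyspaceFilter{watched: map[string]bool{"commerce": true}, next: rec}

	// Same keyspace on both sides: the replacement is forwarded.
	f.ReplaceTablet(Tablet{Alias: "t1", Keyspace: "commerce"}, Tablet{Alias: "t2", Keyspace: "commerce"})
	// Keyspace changes across the filter boundary: logged and skipped.
	f.ReplaceTablet(Tablet{Alias: "t2", Keyspace: "commerce"}, Tablet{Alias: "t3", Keyspace: "customer"})

	fmt.Println(len(rec.tablets))
}
```

Comparing the two inclusion results covers both failure directions mentioned earlier (included in old but not new, and the reverse) with a single check.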
Nicely written PR description!
These two look similar. Maybe the second case was for vtgates that don't use the
Hey @deepthi - good catch. Yes, the second one was meant to say "do not use". I will update the PR description now!
4f58466 to 55f869c
Signed-off-by: Serry Park <serrypark@slack-corp.com>
Signed-off-by: Serry Park <me@serry.co>
87a63c4 to a4055f1
Healthcheck - limit healthchecks by keyspace
Overview
Currently, vtgates support a keyspaces_to_watch flag which can enable data isolation by keyspace. However, vtgates still perform healthchecks against all tablets regardless of the value provided to this flag.
This PR introduces changes to healthchecks such that when the keyspaces_to_watch flag is utilized, vtgates will only maintain healthchecks on tablets in keyspaces that the vtgate has access to.
This closes #5387.
Relevant PR: #4420
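To make the intended behavior concrete, here is a minimal sketch of keyspace-based filtering. It is an illustration only, not the implementation in this PR: `Tablet`, `KeyspaceFilter`, `NewKeyspaceFilter`, `IsIncluded`, and `Select` are hypothetical names, and the tablet aliases and keyspace names are made up. The idea is that only tablets whose keyspace is in the watched set are handed on for healthchecking.

```go
package main

import "fmt"

// Tablet is a hypothetical, pared-down stand-in for a topology tablet record.
type Tablet struct {
	Alias    string
	Keyspace string
}

// KeyspaceFilter (hypothetical) admits only tablets in watched keyspaces.
type KeyspaceFilter struct {
	watched map[string]struct{}
}

// NewKeyspaceFilter builds a filter from the list of watched keyspaces,
// e.g. the values that would be supplied via keyspaces_to_watch.
func NewKeyspaceFilter(keyspaces []string) *KeyspaceFilter {
	w := make(map[string]struct{}, len(keyspaces))
	for _, ks := range keyspaces {
		w[ks] = struct{}{}
	}
	return &KeyspaceFilter{watched: w}
}

// IsIncluded reports whether the tablet belongs to a watched keyspace.
func (f *KeyspaceFilter) IsIncluded(t Tablet) bool {
	_, ok := f.watched[t.Keyspace]
	return ok
}

// Select returns the subset of tablets that should be healthchecked.
func (f *KeyspaceFilter) Select(tablets []Tablet) []Tablet {
	var out []Tablet
	for _, t := range tablets {
		if f.IsIncluded(t) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	f := NewKeyspaceFilter([]string{"commerce"})
	tablets := []Tablet{
		{Alias: "zone1-100", Keyspace: "commerce"},
		{Alias: "zone1-200", Keyspace: "customer"},
	}
	// Only the tablet in the watched keyspace is selected.
	for _, t := range f.Select(tablets) {
		fmt.Println(t.Alias, t.Keyspace)
	}
}
```

A set lookup keeps the per-tablet check constant-time, which matters when the filter runs on every topology refresh.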
Implementation
FilterByKeyspace, which will only add tablets from keyspaces that the vtgate has access to
keyspaces_to_watch is passed in
Testing
In addition to unit testing, I performed the following manual tests:
keyspaces_to_watch flag were only performing healthchecks on tablets from the desired keyspace
keyspaces_to_watch flag performed health checks on the expected tablets