helm: reparent on vttablet preStop hook #4103
e78d0d3 to 2eebbdb
Currently, if a stop hook fails, Kubernetes will still allow the resource to be terminated. With this in mind, ideally you would have some retry logic in your stop hook so that it keeps retrying until it actually reparents (or maybe some other form of error handling). For example, on reparent failure you could ...

There have also been a few scenarios where we've seen reparents cause vttablet to crash, leaving the cluster without a master (luckily Orchestrator saves us in this case).

I wrote our internal stop hook logic and had to think about all this stuff at one point, so I'm very interested to get others' thoughts on the right course of action for both of the above cases.
This has been lingering. Can we work on moving this forward?
Yeah, I'm tackling this after backups.
0da00e1 to dd45a38
@hmcgonig Here are more detailed thoughts on those two scenarios.

1 - Should we try to detect if the current tablet is healthy? If not, we may want to do an EmergencyReparent

There needs to be some kind of default retry count because the termination grace period starts right away, with a max of 600 seconds.

"The behavior is similar for a PreStop hook. If the hook hangs during execution, the Pod phase stays in a Terminating state and is killed after terminationGracePeriodSeconds of pod ends. If a PostStart or PreStop hook fails, it kills the Container."

"If the preStop hook is still running after the grace period expires, processes in the Pod are sent the TERM signal with a small (2 second) extended grace period."

Maybe retry 3 times with a 10 second delay between each, and if all 3 attempts fail, then run ...

2 - hook could possibly get called more than once

Going through the implementation, we really don't need to be worried about it. In the rare instance that it did get called twice, then either: ...
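The retry idea from scenario 1 (three attempts, ten seconds apart, then a fallback) could look roughly like the sketch below. It is only an illustration under stated assumptions: `retry_with_fallback`, `planned_reparent`, and `fallback_action` are hypothetical names that do not come from this PR, and the actual reparent commands are intentionally left as placeholders.

```shell
#!/bin/sh
# Sketch only: bounded retry with a fallback for a preStop hook.
# All function names are hypothetical and not taken from the PR.

# retry_with_fallback ATTEMPTS DELAY_SECONDS PRIMARY_FN FALLBACK_FN
# Runs PRIMARY_FN up to ATTEMPTS times, sleeping DELAY_SECONDS between
# failed attempts. If every attempt fails, runs FALLBACK_FN once and
# returns its exit status.
# Variables carry an rwf_ prefix because POSIX sh has no "local".
retry_with_fallback() {
  rwf_attempts=$1 rwf_delay=$2 rwf_primary=$3 rwf_fallback=$4
  rwf_i=1
  while [ "$rwf_i" -le "$rwf_attempts" ]; do
    if "$rwf_primary"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$rwf_i" -lt "$rwf_attempts" ]; then
      sleep "$rwf_delay"
    fi
    rwf_i=$((rwf_i + 1))
  done
  "$rwf_fallback"
}

# Placeholder actions (assumptions). A real hook would invoke the
# cluster's reparent tooling here instead of echoing.
planned_reparent() { echo "planned reparent attempted"; return 1; }
fallback_action() { echo "fallback ran"; }

# Example invocation with the numbers from the comment above:
# retry_with_fallback 3 10 planned_reparent fallback_action
```

With 3 attempts and a 10 second delay, the sleeps alone add at most 20 seconds, so the total has to fit inside the termination grace period together with the time each attempt takes.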
dd45a38 to 7d1f4ba
@hmcgonig Which container do you have the ...
We run it in vttablet. And you're definitely right, there is a race there. You'll want to add a stop hook to the mysql container that polls the vttablet ...
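One way to picture the polling stop hook for the mysql container is a bounded wait loop like the sketch below. This is an assumption-laden illustration, not the hook used in this PR: `wait_until_gone` and `vttablet_alive` are hypothetical names, and what the check actually probes is left open.

```shell
#!/bin/sh
# Sketch only: block until a check command starts failing, or time out.

# wait_until_gone TIMEOUT_SECONDS INTERVAL_SECONDS CHECK_CMD [ARGS...]
# Polls CHECK_CMD until it fails (the watched process is gone) or until
# TIMEOUT_SECONDS have elapsed. Returns 0 if it went away, 1 on timeout.
# Variables carry a wug_ prefix because POSIX sh has no "local".
wait_until_gone() {
  wug_timeout=$1 wug_interval=$2
  shift 2
  wug_deadline=$(( $(date +%s) + wug_timeout ))
  while "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$wug_deadline" ]; then
      return 1
    fi
    sleep "$wug_interval"
  done
  return 0
}

# Hypothetical usage in the mysql container's stop hook: hold off mysql
# shutdown while an assumed vttablet liveness check still succeeds.
# wait_until_gone 120 1 vttablet_alive
```

Bounding the wait matters because the hook shares the pod's termination grace period; an unbounded poll would simply be cut off when that period expires.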
7d1f4ba to c509613
@hmcgonig @acharis @leoxlin I've implemented the changes we discussed on our call and they really work great.
I cleaned up and rebased my commits, so if you think it looks good, I think it's ready to get merged. Thanks so much for your help and suggestions! It would have taken me a lot longer to figure out how to handle the race condition, and I probably wouldn't have gotten to deleting the tablet record.
c509613 to f2a7d88
Signed-off-by: Derek Perkins <derek@derekperkins.com>
hmcgonig left a comment
I don't know a ton about the current Vitess Docker image composition, but I only really had a couple of minor comments that could potentially speed up build times (assuming you're building in a way that allows layer caching). Otherwise, LGTM!
this should allow for more build layer caching Signed-off-by: Derek Perkins <derek@derekperkins.com>
f2a7d88 to 448413c
Attempt to reparent when vttablet stops. I have a couple questions:
... EmergencyReparent or just let Orchestrator take over

cc @enisoc @acharis