helm: add InitShardMaster Job #3612
force-pushed from f45ce67 to dc5b82b
This ended up being surprisingly difficult, and Go templates don't provide any way for me to add up the replica counts on behalf of the user. That's not great, since a wrong total will cause problems. Otherwise I think this will be super useful to anyone booting up a Vitess cluster for the first time.
@enisoc this is ready for review
Here's a sample of the logs, and everything looks good, including the error retry logic.
My only hesitation with this is that it could cause an infinite loop if the total number of tablets is set up incorrectly or if the tablets are never healthy. I'm not sure of the best way to handle a timeout during each of the loops. (wishes he had Go context)
force-pushed from 4326045 to 5b0366a
I just added a 10-minute timeout that will cause the job to fail so it doesn't run indefinitely.
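The wait-with-timeout pattern being discussed might look roughly like this in shell. This is a sketch, not the chart's actual Job script: `tablets_ready` is a hypothetical placeholder for the real health check, and the interval/timeout values are illustrative.

```shell
#!/bin/sh
# Sketch of the Job's wait loop with a hard timeout so it can't run
# indefinitely. tablets_ready is a hypothetical stand-in for the real
# check (which would count healthy tablets, e.g. via vtctld).
TIMEOUT_SECONDS=600   # fail the Job after 10 minutes

tablets_ready() {
  # Placeholder: always succeeds here; the real check differs.
  return 0
}

wait_for_tablets() {
  start=$(date +%s)
  until tablets_ready; do
    # The timeout check lives in the retry branch, before the sleep,
    # so an already-ready cluster never waits an extra interval.
    now=$(date +%s)
    if [ $((now - start)) -ge "$TIMEOUT_SECONDS" ]; then
      echo "timed out waiting for tablets" >&2
      return 1
    fi
    sleep 5
  done
  echo "all tablets ready"
}

wait_for_tablets
```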
force-pushed from d2ea052 to 8592fec
Nothing is impossible if you believe hard enough. Also, Sprig is available in Helm.
I'm astounded by your ingenuity with that. I spent a good 30+ minutes trying to figure out how to append or set inside a map or something to get around that. I'm super impressed, while still being baffled that assigning to variables in Go templates isn't a thing.
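The workaround under discussion might look roughly like this (a sketch, not the chart's exact code; the value paths `$shard.cells` and `$cell.vttablet.replicas` are illustrative). Go templates of that era didn't allow reassigning a `$variable`, but Sprig's `dict`/`set` give you a mutable map to accumulate into:

```yaml
{{- /* Sketch: accumulate the total tablet count in a Sprig scratch dict,
     since $variables themselves can't be reassigned. Field names are
     assumptions, not the chart's actual values layout. */ -}}
{{- $scratch := dict "total" 0 -}}
{{- range $cell := $shard.cells -}}
  {{- $_ := set $scratch "total" (add $scratch.total (int $cell.vttablet.replicas)) -}}
{{- end -}}
{{- $totalTabletCount := $scratch.total -}}
```

Sprig's `set` mutates the dict in place (and returns it, hence the throwaway `$_` assignment), which is what makes the accumulation possible.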
Necessary for conditionally enabling semi-sync if greater than one. Also used when waiting to run InitShardMaster to make sure we don't orphan any slaves.
force-pushed from eb73a0c to 224a6bd
helm/vitess/templates/NOTES.txt
In the Operator, I have a few Jobs that stick around as a record that "this was already done". If you delete them, they will run again on the next helm upgrade anyway.
If the Jobs were truly one-off, it would be important to delete them since the number of finished Jobs could grow without bound. However, since we are deterministically creating only one Job per shard, and uninstalling the chart should delete any Jobs we create, I don't think it's harmful to keep the Jobs around.
Note that the Pod GC will clean up the terminated Pods eventually, yet the Job will remember in its status that it already completed.
Ok, I'll change the comment to reflect that.
helm/vitess/templates/_vttablet.tpl
It's possible that there could temporarily be no tablet of type master, even post-ISM. We need to be careful not to run ISM again in this case (especially with -force), since it can cause data loss.
I think a somewhat more reliable signal would be an empty master_alias entry in the result of GetShard. I think even if we transiently don't have a running master, the shard record should still contain the alias of the last known master. I'm not 100% on that though, so we should still try to think if there's a better way to be sure the shard has never been initialized.
I can do whatever you think is the most reliable. Should I implement your GetShard suggestion or are you looking for a better way?
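The GetShard idea might look roughly like this in the Job's shell script. This is a sketch only: the JSON field name `master_alias` and the shape of `vtctlclient GetShard` output are assumptions to verify against the Vitess version in use, and the grep is deliberately naive.

```shell
#!/bin/sh
# Sketch: treat the shard as already initialized if its shard record
# contains a non-empty master_alias entry. The field name and JSON shape
# are assumptions about GetShard output; a real check might use jq.
has_master_alias() {
  # $1 is the JSON shard record, e.g. from:
  #   vtctlclient -server "$VTCTLD_ADDR" GetShard "$KEYSPACE/$SHARD"
  echo "$1" | grep -q '"master_alias":[[:space:]]*{[^}]*"uid"'
}

# Illustrative usage with a hand-written sample record:
if has_master_alias '{"master_alias": {"cell": "zone1", "uid": 100}}'; then
  echo "shard already initialized; skipping InitShardMaster"
fi
```

Because the shard record keeps the last known master's alias even when no master tablet is currently running, this avoids re-running InitShardMaster (and risking data loss) during a transient master outage.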
helm/vitess/templates/_vttablet.tpl
Should this go inside the else, before the sleep? If the tablets are ready, we don't need to check for timeout.
helm/vitess/templates/_vttablet.tpl
-db-config-filtered-uname "vt_filtered"
-db-config-filtered-dbname "vt_{{$keyspace.name}}"
-db-config-filtered-charset "utf8"
{{ if gt (int $shard.tabletCount) 1 }}
Can this use the computed $totalTabletCount?
Yeah, that was an oversight
Also use calculated totalTabletCount
I made all the changes you requested except for the master tablet check, plus I added the semi-sync and heartbeat options.
I added a second master tablet check.

@enisoc This is ready for review
enisoc left a comment
Looks good other than one comment.
helm/vitess/templates/_vttablet.tpl
-enable_replication_reporter
{{ if $orc.enabled }}
{{ if $defaultVttablet.enableSemisync }}
{{ if gt $totalTabletCount 1 }}
I forgot until you brought it up in another context that rdonly tablets don't ACK semi-sync. So we should actually count only the replica-type tablets to avoid getting stuck during ISM.
What would you think about eliminating that check altogether? It felt somewhat weird to not enable semisync when the user explicitly enabled it.
Yeah, that sounds good to me. But maybe add a comment above enableSemiSync in values.yaml noting that you need at least 2 replica-type (master-eligible) tablets.
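The suggested values.yaml note might look something like this (a sketch; the key name follows the `enableSemisync` flag seen in the diff above, but the surrounding layout is an assumption, not the chart verbatim):

```yaml
vttablet:
  # Semi-sync replication requires at least 2 replica-type
  # (master-eligible) tablets per shard. rdonly tablets do not ACK
  # semi-sync, so they don't count toward this minimum; with fewer than
  # 2 replicas, writes (and InitShardMaster) can block waiting for an
  # ACK that never arrives.
  enableSemisync: false
```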
To make the helm install process more seamless, this will automatically initialize the shard master. One Job is created per shard.
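A per-shard Job of this kind might be structured roughly as follows. This is a sketch under stated assumptions, not the chart's actual template: the metadata name scheme, values keys, image tag, and vtctld address are all illustrative, and the placeholder master tablet alias is left unfilled.

```yaml
# Sketch of a deterministic one-Job-per-shard InitShardMaster Job.
# Names and values keys are illustrative, not the chart's exact code.
apiVersion: batch/v1
kind: Job
metadata:
  # Deterministic name: exactly one Job per keyspace/shard, so repeated
  # helm upgrades do not pile up duplicate Jobs.
  name: "{{ $keyspace.name }}-{{ $shard.name | replace "/" "-" }}-init-shard-master"
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: init-shard-master
          image: "vitess/vtctlclient:{{ .Values.vitessTag }}"
          command: ["sh", "-c"]
          args:
            - |
              # Wait for all tablets, then initialize the master
              # (wait loop and master-alias selection omitted here).
              vtctlclient -server vtctld:15999 InitShardMaster \
                {{ $keyspace.name }}/{{ $shard.name }} <master-tablet-alias>
```

Keeping the completed Job around (rather than deleting it) serves as the "this was already done" record discussed above; uninstalling the chart removes it along with everything else.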