Define health check strategy for MachineSet #632
Comments
Could you explain a bit what "health checking" means in this case?
The overall discussion would be similar to pod health checking:
Once a machine becomes unhealthy, its status should be reported, and a system then needs to replace it if the user so chooses (an auto-repair functionality). There are different ways to pursue this, and the item is vague on purpose: we need to list the architecture options here to decide what should be incorporated into the controller and what should not. Let me know your thoughts on this, especially based on your experience managing machines on other platforms.
I would propose that we rely on the
I agree with the suggestion to build on node health. It might not be enough, though. Quoting https://kubernetes.io/docs/concepts/architecture/nodes/:
I'd like to understand what "asks the cloud provider if the VM for that node is still available" means and how that could be related to our work.
When the node is unhealthy (unreachable), the scheduler is smart enough to evict pods. That also means that the machine set will have one fewer machine than the user intended. Do we do anything automatically at that point? Or do we have user settings to determine when to do something? Thoughts?
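To make the question above concrete, here is a minimal sketch of one possible policy, not a proposal for the actual controller. It assumes machine health is derived from the node's `Ready` condition (status `"True"`, `"False"`, or `"Unknown"`, with a last transition time, as Kubernetes reports it). The names `NodeCondition`, `is_unhealthy`, `should_replace`, the `grace` period, and the `auto_repair` flag are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class NodeCondition:
    """Hypothetical, simplified mirror of a Kubernetes node condition."""
    type: str                  # e.g. "Ready"
    status: str                # "True", "False", or "Unknown"
    last_transition: datetime  # when the status last changed


def is_unhealthy(conditions: List[NodeCondition], now: datetime,
                 grace: timedelta) -> bool:
    """Return True if the node has been not-Ready for at least `grace`."""
    ready = next((c for c in conditions if c.type == "Ready"), None)
    if ready is None:
        # No Ready condition reported: this sketch does not act without a signal.
        return False
    if ready.status == "True":
        return False
    # Status is "False" or "Unknown": only count it once the grace period passed.
    return now - ready.last_transition >= grace


def should_replace(conditions: List[NodeCondition], now: datetime,
                   grace: timedelta, auto_repair: bool) -> bool:
    """Replace the machine only if the user opted in (assumed setting)."""
    return auto_repair and is_unhealthy(conditions, now, grace)
```

With this shape, "do we do anything automatically" becomes the `auto_repair` opt-in, and "when to do something" becomes the `grace` period. Whether either belongs in the MachineSet controller is exactly the open question.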
This issue was moved to kubernetes-sigs/cluster-api#47
This is to track discussion and documentation on how health checking will be done for machines in a set.