fix: wait for overprovisioned nodes to be removed from MachinePool #1367
chewong wants to merge 1 commit
Signed-off-by: Ernest Wong <chuwon@microsoft.com>
kubectl wait --for=condition=Ready node --all --timeout=5m
if [[ "${EXP_MACHINE_POOL:-}" == "true" ]]; then
  echo "Waiting for nodes overprovisioned by the MachinePool to be removed"
  while kubectl get nodes | grep 'NotReady' > /dev/null; do
How come `kubectl wait --for=condition=Ready node --all` doesn't cover this?
In https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/capz-azure-file-machinepool-1-19/1388742406299455488, I suspect that the overprovisioned node was spawned after `kubectl wait --for=condition=Ready node --all` had finished running.
Wouldn't the same thing happen here if that's the case? The loop would find no NotReady nodes (all nodes are Ready, as checked above) and proceed to the next step, so a NotReady node could still appear afterwards.
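To make the race described above concrete, here is a minimal sketch of one possible mitigation: require the "no NotReady nodes" check to pass several polls in a row before moving on. This only narrows the window and does not close it, since a node can still appear after the last poll. The helper names `wait_until_stable` and `no_not_ready_nodes`, and the `REQUIRED_STREAK`, `INTERVAL`, and `TIMEOUT` variables, are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from this PR): succeed only after the command
# given as arguments has succeeded REQUIRED_STREAK times in a row.
# Any failed poll resets the streak. Gives up after TIMEOUT seconds.
wait_until_stable() {
  local required="${REQUIRED_STREAK:-6}"
  local interval="${INTERVAL:-10}"
  local timeout="${TIMEOUT:-600}"
  local streak=0
  local deadline=$((SECONDS + timeout))

  while (( streak < required )); do
    if (( SECONDS >= deadline )); then
      echo "Timed out waiting for a stable state" >&2
      return 1
    fi
    if "$@"; then
      streak=$((streak + 1))
    else
      streak=0
    fi
    if (( streak < required )); then
      sleep "${interval}"
    fi
  done
  return 0
}

# Assumed check: true when no node reports NotReady.
# A failing kubectl call counts as "not stable" rather than as success.
no_not_ready_nodes() {
  local out
  out="$(kubectl get nodes --no-headers)" || return 1
  ! grep -q 'NotReady' <<<"${out}"
}
```

A call such as `REQUIRED_STREAK=6 INTERVAL=10 wait_until_stable no_not_ready_nodes` would then need roughly a minute of uninterrupted Ready state before the script continues.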
Perhaps we should add a condition to the Azure machine pool to indicate when it reaches a stable state. In #1332, we introduce a ScaleSetDesiredReplicasCondition, which is true when the desired number of ready nodes is reached. We could add another condition that describes a stable state: only ready machines, with their count matching the desired replica count.
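Until such a condition exists, the stable state described above could be approximated from the test script. The sketch below is a client-side stand-in, not the proposed condition itself: it reads `kubectl get nodes --no-headers` output on stdin and passes only when every node is exactly `Ready` and the node count equals an expected total. The function name `nodes_are_stable` is hypothetical, and the expected total (control plane nodes plus MachinePool replicas) is an assumption the caller has to supply.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from this PR or #1332).
# Usage: kubectl get nodes --no-headers | nodes_are_stable EXPECTED_TOTAL
# Exits 0 only if the number of nodes equals EXPECTED_TOTAL and every
# node's STATUS column is exactly "Ready" (so "NotReady" and
# "Ready,SchedulingDisabled" both count as unstable).
nodes_are_stable() {
  local expected="$1"
  awk -v expected="${expected}" '
    NF > 0        { total++ }
    $2 == "Ready" { ready++ }
    END           { exit !((total + 0) == (expected + 0) && (ready + 0) == (expected + 0)) }
  '
}
```

An overprovisioned node makes the check fail even when it is Ready, because the total then exceeds the expected count; a grep for `NotReady` alone would miss that case.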
Closing this PR for now since I don't have bandwidth to work on it.
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #1361
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: