Testing k8s ha configuration by shutting down the first k8s master node #6
I haven't tested version 1.8.x yet. If the failover works, I suspect the problem is keepalived; check keepalived's log. If it does not work, check the kubelet's log. Then please share the logs.
It turns out the re-election process for the controller-manager and scheduler running on the k8s master nodes worked just fine, and keepalived was working correctly as well. The root cause of the problem was the 'cluster-info' configmap in the kube-public namespace. So, in addition to the 'kube-proxy' configmap in the kube-system namespace, I had to edit the 'cluster-info' configmap, replacing the host IP address (port 6443) with the virtual IP address (port 8443). This is extremely important so that any new worker node bootstraps with the correct configuration when joining the cluster via kubeadm join. For my two existing k8s nodes, I just manually updated /etc/kubernetes/kubelet.conf, restarted the docker and kubelet services on those nodes, and everything works as expected :) Thank you so much for your prompt response.
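For anyone hitting the same issue, the steps above can be sketched roughly as follows. The IP addresses, port numbers, and interface are placeholders for illustration (192.168.1.10 as the first master's host IP, 192.168.1.100 as the keepalived VIP); substitute your own.

```shell
# Hypothetical addresses: adjust to your environment.
HOST_API="https://192.168.1.10:6443"   # first master's advertise address
VIP_API="https://192.168.1.100:8443"   # keepalived virtual IP + load-balanced port

# 1. Point the bootstrap kubeconfig in the cluster-info configmap
#    (kube-public namespace) at the VIP, so new nodes joining via
#    'kubeadm join' talk to the virtual address instead of one master:
kubectl -n kube-public get configmap cluster-info -o yaml \
  | sed "s|${HOST_API}|${VIP_API}|g" \
  | kubectl apply -f -

# 2. Same substitution for the kube-proxy configmap in kube-system:
kubectl -n kube-system get configmap kube-proxy -o yaml \
  | sed "s|${HOST_API}|${VIP_API}|g" \
  | kubectl apply -f -

# 3. On each existing node, update the kubelet's kubeconfig in place
#    and restart the services so the kubelet reconnects through the VIP:
sed -i "s|${HOST_API}|${VIP_API}|g" /etc/kubernetes/kubelet.conf
systemctl restart docker kubelet
```

This is only a sketch of the substitution; editing the configmaps interactively with `kubectl edit` achieves the same result.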
My instructions mention the "kube-proxy configuration"; is that the config you meant?

Keep in mind that on next
@cookeen, I followed your instructions and was able to deploy an HA Kubernetes cluster (with 3 k8s master nodes and 2 k8s worker nodes) using Kubernetes version 1.8.1.
Everything seemed to work just as you described in the instructions.
Next, I focused on testing the high-availability configuration. To do so, I shut down the first k8s master. Once it was brought down, the keepalived service on that node stopped and the virtual IP address transferred to the second k8s master. However, things then started falling apart :(
Specifically, on the second (or third) master, running 'kubectl get nodes' shows output like the following:

```
NAME          STATUS     ROLES    ...
k8s-master1   NotReady   master   ...
k8s-master2   Ready               ...
k8s-master3   Ready               ...
k8s-node1     Ready               ...
k8s-node2     Ready               ...
```
Also, on k8s-master2 or k8s-master3, when I ran 'kubectl logs' to check the controller-manager and scheduler, it appeared they did NOT re-elect a new leader. As a result, all of the Kubernetes services that were exposed before were no longer accessible.
Do you have any idea why the re-election process did NOT occur for the controller-manager and scheduler on the remaining k8s master nodes?
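One way to debug this symptom is to look at who currently holds the leader lease. On kubeadm clusters of the 1.8 era, leader election for the controller-manager and scheduler is recorded as an annotation on an Endpoints object in kube-system (this is an assumption about the cluster layout; newer versions use Lease objects instead):

```shell
# Show the raw leader-election annotation for the controller-manager.
# The dots in the annotation key must be escaped in jsonpath:
kubectl -n kube-system get endpoints kube-controller-manager \
  -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'

# The annotation value is JSON; holderIdentity names the current leader.
# Extract just that field from the fetched lease:
LEASE=$(kubectl -n kube-system get endpoints kube-scheduler \
  -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}')
echo "$LEASE" | grep -o '"holderIdentity":"[^"]*"'
```

If the holderIdentity still points at the downed master long after its lease duration has expired, the remaining components are failing to acquire the lease, which usually means they cannot reach the API server (for example because their kubeconfigs still point at the dead master's address rather than the VIP).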