Fix a bug when vm hosting master mongodb is down #80

Open: wants to merge 1 commit into base: master

xinsfang

Pod mongodb-0 (master) was on node master1. After master1 was shut down, mongodb-0's STATUS showed "Unknown", while pod.status.phase was still "Running". As a result, mongodb-0 was repeatedly selected as the cluster's master and then removed from the cluster because it was unreachable, which prevented the other cluster members from becoming master. The sidecar needs to check the correct field for pod status.
----logs---
kube@xinsfang-worker-1:~$ kubectl logs -n=maglev-system mongodb-2 -c mongo-sidecar
...
 Pod has been elected as a secondary to do primary work
 Addresses to add:     [ 'mongodb-0.mongodb.maglev-system.svc.cluster.local:27017' ]
...

kube@xinsfang-worker-1:~$ kubectl get po mongodb-0 -n=maglev-system
NAME        READY     STATUS    RESTARTS   AGE
mongodb-0   3/3       Unknown   0          20h
kube@xinsfang-worker-1:~$ kubectl get po mongodb-0 -n=maglev-system -oyaml | grep phase
  phase: Running
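
The mismatch between the two outputs above can be reproduced offline. The following Python sketch approximates how the STATUS column is derived from a pod object. It is a simplified illustration, not kubectl's or the sidecar's actual code. The function name `displayed_status`, the sample pod, and its timestamp are hypothetical, and it assumes that a pod on a lost node carries `status.reason: NodeLost` and a deletion timestamp while `status.phase` stays `Running`.

```python
def displayed_status(pod: dict) -> str:
    """Approximate the STATUS column shown by `kubectl get po` (simplified)."""
    status = pod.get("status", {})
    metadata = pod.get("metadata", {})
    # Start from the phase; an explicit reason overrides it.
    shown = status.get("reason") or status.get("phase", "Unknown")
    if metadata.get("deletionTimestamp"):
        # Assumption: pods on an unreachable node are marked for deletion
        # with reason "NodeLost", which is displayed as "Unknown".
        shown = "Unknown" if status.get("reason") == "NodeLost" else "Terminating"
    return shown


# Hypothetical pod object resembling mongodb-0 after master1 was shut down.
lost_pod = {
    "metadata": {"name": "mongodb-0", "deletionTimestamp": "2018-06-01T00:00:00Z"},
    "status": {"phase": "Running", "reason": "NodeLost"},
}

print(lost_pod["status"]["phase"])  # prints "Running"
print(displayed_status(lost_pod))   # prints "Unknown"
```

A check that looks only at `status.phase` therefore still treats the unreachable pod as a healthy member.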

bbbmj commented Jun 4, 2018

LGTM
We ran into the same problem.
cc @cvallance

@abuzhynsky

@xinsfang @bbbmj Thanks for your PR!
But maybe it's better to filter out the pods in the NodeLost state at the beginning of the workLoop? Also, the existing master can be on the lost node, in which case the sidecar may be unable to watch pod changes.
I've created a new PR #98 with these changes, please take a look at it.
Thanks.
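
The filtering idea from the comment above could look roughly like this. It is a minimal Python sketch under stated assumptions, not the code from PR #98 or the sidecar itself: the helper name `filter_reachable_pods` is hypothetical, and it assumes that lost pods are identifiable by `status.reason == "NodeLost"` in the pod objects returned by the Kubernetes API.

```python
def filter_reachable_pods(pods: list) -> list:
    """Drop pods whose node was lost before any election logic runs.

    Assumption: a pod on a lost node has status.reason == "NodeLost",
    even though status.phase may still report "Running".
    """
    return [
        pod for pod in pods
        if pod.get("status", {}).get("reason") != "NodeLost"
    ]


# Hypothetical pod list as a work loop might receive it.
pods = [
    {"metadata": {"name": "mongodb-0"}, "status": {"phase": "Running", "reason": "NodeLost"}},
    {"metadata": {"name": "mongodb-1"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "mongodb-2"}, "status": {"phase": "Running"}},
]

candidates = filter_reachable_pods(pods)
print([p["metadata"]["name"] for p in candidates])  # prints ['mongodb-1', 'mongodb-2']
```

Filtering once at the start of the loop means every later step (master election, address list) sees only reachable pods, instead of each step repeating the check.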
