SDN-662: OVN raft followups #410
alexanderConstantinescu wants to merge 1 commit into openshift:master
8bb5097 to 02e3072
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: alexanderConstantinescu.
|
/test e2e-gcp-upgrade |
a0b5d0a to b1943ab
|
@squeed: I added a livenessProbe. This one checks that the node itself is up and able to respond, as opposed to the readinessProbe, which checks the general cluster member status.
|
/test e2e-gcp-ovn-upgrade |
b1943ab to ee14aed
|
/test e2e-gcp |
|
/test e2e-gcp |
|
/test e2e-ovn-aws |
|
/test e2e-aws-ovn |
    AVAILABLE_NODES=0
    for node in "${OVN_NODES_ARRAY[@]}"; do
      node_ip=$(getent ahostsv4 "${node}" | grep RAW | awk '{print $1}')
      if ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl cluster/status OVN_Northbound | grep -q $node_ip; then
I just tested this, and it turns out we don't ever remove servers if we're not available. So we need to parse the "connections" line. I'll give you a sample in a bit.
Yeah, all we need to do is check that the DB port is open. It turns out ovsdb takes care of this for us and only opens the port when consensus is achieved.
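For illustration, a port check along these lines could look roughly like the sketch below. This is not the code from the PR: the helper name `db_port_open`, the `DB_HOST`/`DB_PORT` variables and the default port are assumptions (6641 is OVN's conventional northbound DB port; the deployment may listen elsewhere). It relies on bash's built-in `/dev/tcp` redirection, so it needs bash rather than plain `sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): succeeds only if a TCP
# connection to host:port can be opened.
db_port_open() {
  local host="$1" port="$2"
  # Try to open the connection on fd 3 inside a subshell. If the connect
  # fails, the subshell exits non-zero; the error message is discarded.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Assumed defaults for illustration only.
DB_HOST="${DB_HOST:-127.0.0.1}"
DB_PORT="${DB_PORT:-6641}"

if db_port_open "${DB_HOST}" "${DB_PORT}"; then
  echo "ready: ${DB_HOST}:${DB_PORT} is accepting connections"
else
  echo "not ready: ${DB_HOST}:${DB_PORT} is closed"
fi
```

In an actual probe the two branches would presumably `exit 0` and `exit 1` instead of printing. A connect attempt to an unreachable (rather than refusing) host can block, so a real probe would likely also rely on the probe's own timeout.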
|
@alexanderConstantinescu: The following tests failed, say
|
Alexander is looking into something else; he asked me to get this over the line.
This PR adds a PodDisruptionBudget for maintaining consensus in the raft cluster, plus a readinessProbe for the sbdb and nbdb. The probe checks that all members have joined the cluster and, if they have, waits 30 seconds, according to what has been mentioned here: https://jira.coreos.com/browse/SDN-662

/assign @squeed