openstack: remove the service VM #1959
|
/label platform/openstack |
|
Right now, I'm just pushing this out there so we can discuss & collaborate on it in public. Most of the code has been written by @trown, @mandre and @iamemilio. I consolidated it and rebased on master. |
|
/retitle openstack: remove the service VM |
|
Note regarding the e2e-openstack job: it will keep failing until openshift/machine-config-operator#740 is merged. |
cc @crawford
Alternatively, we could just require floating IP support for this first iteration and make this non-optional.
In general, VMs without a floating IP attached are not accessible from the outside (their IP addresses are private). So without a floating IP, the installer or anyone else can't reach the OpenShift API running there.
Some OpenStack environments do not allow floating IPs at all, or limit the number a user can allocate.
|
@abhinavdahiya thanks for your comments! They should all be addressed now. Please let me know if there's anything else. |
openshift#740 and openshift/installer#1959 depend on each other. We need to break this dependency to allow them to merge while keeping the e2e-openstack job green. This patch disables the static pods added for the openstack platform in openshift#740 unless the installer provided the needed info set via openshift/installer#1959. It can be safely reverted once openshift/installer#1959 merges.
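The guard described in that commit message can be pictured roughly as follows. This is only a minimal sketch in Python, assuming the installer-provided info arrives as a plain mapping; the function name `static_pods_enabled` and the keys `api_vip` and `dns_vip` are hypothetical and do not come from the machine-config-operator code.

```python
def static_pods_enabled(platform_status):
    """Return True only when every required VIP was supplied.

    `platform_status` is a hypothetical dict standing in for whatever the
    installer passes along; missing or empty values disable the pods.
    """
    required = ("api_vip", "dns_vip")  # assumed key names, for illustration
    return all(platform_status.get(key) for key in required)


if __name__ == "__main__":
    print(static_pods_enabled({"api_vip": "10.0.0.5", "dns_vip": "10.0.0.6"}))
    print(static_pods_enabled({"api_vip": "10.0.0.5"}))
```

With a check of this shape, a cluster installed without the new fields keeps its previous behaviour, which is what lets the two pull requests merge in either order.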
|
/retest |
|
/test e2e-openstack |
|
/test e2e-openstack |
|
Looks like the current CI failures were caused by: #2086 We're investigating now. This should hopefully fix it: |
|
Okay, I've reproduced the issue locally (by rebasing this PR on top of master) and verified that openshift/machine-config-operator#1050 fixes it. |
|
Rebased. The release image now has openshift/machine-config-operator#1050 so this should pass e2e-openstack (unless I've messed the rebase up or there's another breakage). |
|
/hold cancel The OpenStack job passed, I've validated it locally and this PR is rebased on top of master. I think we're ready to merge. |
|
/lgtm |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/test e2e-aws-scaleup-rhel7 |
|
"/test e2e-aws-scaleup-rhel7" is broken, no need to retest it |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: abhinavdahiya, iamemilio, mandre, tomassedovic |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
openstack: Remove the Service VM
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).
This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.
The Baremetal platform faced the same issues and solved them with a few
virtual IP addresses managed by keepalived, combined with a coredns
static pod running on every node (using the mDNS protocol to update
records as nodes are added or removed) and a similar haproxy static pod
that load balances the control plane internally.
The VIPs are defined here in the installer and passed to the
machine-config-operator through the PlatformStatus field:
openshift/api#374
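As a rough illustration of that data flow, the sketch below models a status object carrying the two VIPs and renders a keepalived-style stanza from it. It is written in Python for brevity. The class name `PlatformStatusSketch`, its field names, the interface name `ens3` and the `virtual_router_id` values are all assumptions made for the example; they are not the openshift/api types or the templates shipped by the machine-config-operator.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from string import Template


@dataclass(frozen=True)
class PlatformStatusSketch:
    """Hypothetical stand-in for the status object that carries the VIPs."""
    api_vip: str
    dns_vip: str

    def __post_init__(self):
        # Fail early on malformed addresses; ip_address raises ValueError.
        ip_address(self.api_vip)
        ip_address(self.dns_vip)


# Illustrative keepalived-style stanza, not the template used by the MCO.
VRRP_TEMPLATE = Template("""\
vrrp_instance ${name} {
    state BACKUP
    interface ${interface}
    virtual_router_id ${router_id}
    virtual_ipaddress {
        ${vip}
    }
}
""")


def render_keepalived(status, interface="ens3"):
    """Render one VRRP instance per VIP found in the status object."""
    # Router IDs 51 and 52 are arbitrary values chosen for the example.
    instances = [("API", status.api_vip, 51), ("DNS", status.dns_vip, 52)]
    return "\n".join(
        VRRP_TEMPLATE.substitute(
            name=name, interface=interface, router_id=router_id, vip=vip
        )
        for name, vip, router_id in instances
    )


if __name__ == "__main__":
    print(render_keepalived(PlatformStatusSketch("10.0.0.5", "10.0.0.6")))
```

The point of the sketch is only that the VIPs are chosen once, at install time, and every node renders its configuration from the same values.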
The Bare Metal IPI Networking Infrastructure document is broadly
applicable here as well:
https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md
Notable differences in OpenStack:
- […] routers) our haproxy static pods balance ports 80 & 443 to the
  worker nodes
- […] uses one of the masters for DNS.
These differences are not fundamental to OpenStack and we will be
looking at aligning more closely with the Baremetal provider in the
future.
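To make the haproxy difference concrete, here is a small Python sketch that generates a TCP load-balancing section per port for a list of worker addresses. The function name `render_ingress_haproxy`, the section and server names, and the sample IPs are hypothetical; the configuration the static pods actually use lives in the machine-config-operator and may look quite different.

```python
def render_ingress_haproxy(worker_ips, ports=(80, 443)):
    """Build haproxy `listen` sections that spread each port over the workers.

    Illustrative only: one TCP section per port, round-robin across all
    worker addresses, with basic health checks enabled.
    """
    lines = []
    for port in ports:
        lines.append(f"listen ingress-{port}")
        lines.append(f"    bind *:{port}")
        lines.append("    mode tcp")
        lines.append("    balance roundrobin")
        for index, ip in enumerate(worker_ips):
            lines.append(f"    server worker-{index} {ip}:{port} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample worker addresses, invented for the example.
    print(render_ingress_haproxy(["10.0.1.10", "10.0.1.11"]))
```

Because the worker list changes as nodes come and go, a real deployment has to regenerate this configuration whenever the set of workers changes.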
There is also a great opportunity to share some of the configuration
files and scripts here.
This change needs several other pull requests:
Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator/pull/740
Passing the API and DNS VIPs through the installer:
#1998
Vendoring the OpenStack PlatformStatus changes in the MCO:
openshift/machine-config-operator#978
Allowing to use PlatformStatus in the MCO templates:
openshift/machine-config-operator#943
Co-authored-by: Emilio Garcia
Co-authored-by: John Trowbridge
Co-authored-by: Martin Andre
Co-authored-by: Tomas Sedovic
Massive thanks to the Bare Metal and oVirt people!