config/v1: add platform status for OpenStack #374
Conversation
/cc @coreydaley @ironcladlou @deads2k I would really appreciate a quick review. We need this for OpenStack and since it means updating the installer & MCO deps, I'd rather have this merged sooner if at all possible.

I am not that familiar with OpenStack, so I can't speak to whether or not those are all of the needed fields, but the syntax of what you have there, to me, looks correct.
config/v1/types_infrastructure.go
	APIVIP string `json:"apiVip,omitempty"`

	// DNSVIP is the VRRP address for the internal cluster DNS.
	DNSVIP string `json:"dnsVip,omitempty"`
@tomassedovic can you point to what needs to consume these fields?
I didn't have time to put up proper PRs for this yet (plan to do that tomorrow), but this is how it's intended to be used (from my in-development branches I tested today, the IPs are hardcoded, etc. for now):
openshift/machine-config-operator@0490406#diff-401a0f8b1d8edd607fbaf7c85beba53a
This is more or less analogous to what @rgolangh is doing for oVirt in #369, and I believe the baremetal IPI folks will want to do the same in their case. We need these fields to pass the VIPs from the installer to static pods running on the nodes.
Since we've got full control over the field names, we can change those if they don't fit the project's style.
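To make the intended consumption concrete, here is a minimal, self-contained sketch of how a consumer such as the MCO could read these fields off the Infrastructure status. The struct definitions are local stand-ins mirroring the proposal at this point in the review (not the actual openshift/api or MCO code), and the nil checks reflect the fields being optional:

```go
package main

import "fmt"

// Local stand-ins mirroring the fields proposed in this PR at this point in
// the review; not the actual openshift/api or MCO code.
type OpenStackPlatformStatus struct {
	APIVIP string `json:"apiVip,omitempty"`
	DNSVIP string `json:"dnsVip,omitempty"`
}

type PlatformStatus struct {
	OpenStack *OpenStackPlatformStatus `json:"openstack,omitempty"`
}

type InfrastructureStatus struct {
	PlatformStatus *PlatformStatus `json:"platformStatus,omitempty"`
}

// openStackVIPs extracts the API and DNS VIPs when the cluster runs on
// OpenStack, guarding against the optional pointers being unset.
func openStackVIPs(status InfrastructureStatus) (apiVIP, dnsVIP string, ok bool) {
	if status.PlatformStatus == nil || status.PlatformStatus.OpenStack == nil {
		return "", "", false
	}
	return status.PlatformStatus.OpenStack.APIVIP, status.PlatformStatus.OpenStack.DNSVIP, true
}

func main() {
	// Example values only; in the MCO these would come from the cluster's
	// Infrastructure object populated by the installer.
	status := InfrastructureStatus{PlatformStatus: &PlatformStatus{
		OpenStack: &OpenStackPlatformStatus{APIVIP: "192.0.2.5", DNSVIP: "192.0.2.6"},
	}}
	if api, dns, ok := openStackVIPs(status); ok {
		fmt.Println("API VIP:", api, "DNS VIP:", dns)
	}
}
```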
config/v1/types_infrastructure.go
	APIVIP string `json:"apiVip,omitempty"`

	// DNSVIP is the VRRP address for the internal cluster DNS.
	DNSVIP string `json:"dnsVip,omitempty"`
See #348 (comment)
I have the same question here about what DNS service this IP refers to, who provides and consumes the value, etc.
@ironcladlou the answers are pretty much the same (we're facing similar issues and solving them in a similar manner as baremetal and ovirt).
Unlike AWS, none of these platforms can expect a load balancer or DNS-as-a-service, so they have to provide the underlying DNS (e.g. the api & api-int endpoints and the SRV etcd records) some other way. The solution we've all converged on is that each node runs (and relies on) its own coredns static pod, with mDNS providing dynamic updates and keepalived handling IP failover.
You make good points about differentiating between this and the DNS service running inside the cluster as well as the VIP/VRRP implementation details leaking in.
Would something like this work?
// ApiIntIp is an IP address managed by the OpenStack provider backing
// the internal `api-int` record
ApiIntIp string `json:"apiIntIp,omitempty"`

// ProviderDnsIp is the IP address for the internal DNS used by the nodes.
// Unlike the one managed by the DNS operator, `ProviderDnsIp` provides
// name resolution for the nodes themselves.
ProviderDnsIp string `json:"providerDnsIp,omitempty"`
(I agree with the importance of choosing the right names and descriptions, but I'd rather not spend too long on this as the 4.2 feature freeze is looming over us)
Oh, and the values are consumed by the MCO, which provides the configuration for the static pods we need to run (and it's those extra static pods that need these IPs).
+1, understood, and I think your suggestion does a good job disambiguating. I just want to make sure I understood what these things mean and whether they implied some missed requirements in other components which might need to consume them.
(I agree with the importance of choosing the right names and descriptions, but I'd rather not spend too long on this as the 4.2 feature freeze is looming over us)
Sure, I don't want to get in the weeds rehashing the same points here. I personally don't want to block either PR on the subjective naming stuff.
@ironcladlou thanks, that makes perfect sense.
I've updated the PR, please let me know if there's anything else you need.
config/v1/types_infrastructure.go
type OpenStackPlatformStatus struct {
	// apiIntIP is an IP address managed by the OpenStack provider
	// backing the internal `api-int` record.
	APIIntIP string `json:"apiIntIP,omitempty"`
We don't use abbreviations for "internal"; see https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L65. Speaking of which, does this field correspond to this URL in some way?
Ah yes. This is an IP address that OpenStack generates, and that URL will resolve to it.
Platforms without a DNSaaS (OpenStack, baremetal, oVirt) need to do that. I'll update the name.
I've updated the field name & comment. Hope this is clearer.
config/v1/types_infrastructure.go
// OpenStackPlatformStatus holds the current status of the OpenStack infrastructure provider.
type OpenStackPlatformStatus struct {
	// apiIntIP is an IP address managed by the OpenStack provider
	// backing the internal `api-int` record.
can you describe what this api-int record is used for? kube-apiserver? Some openstack informational URL?
Yeah, it's the kube API endpoint for the internal services.
The PR is rebased and the field names are identical to the recently-merged baremetal provider: https://github.com/openshift/api/pull/348/files

/retest
The CI failures look unrelated to the changes here.

/retest

@ironcladlou @deads2k what are the next steps? Anything else you'd like to see here? This PR now mirrors the already-merged baremetal patch and adds the CloudName field.

/lgtm
Looks good for the registry.
	// InfrastructureStatus' apiServerInternalURL field. OpenStack and
	// other platforms without a readily available DNS-as-a-service
	// use it before the DNS for the nodes themselves is set up.
	APIServerInternalIP string `json:"apiServerInternalIP,omitempty"`
Does this field parallel the one for baremetal? If so, make the comments consistent.
	// InfrastructureStatus' apiServerInternalURL field. OpenStack and
	// other platforms without a readily available DNS-as-a-service
	// use it before the DNS for the nodes themselves is set up.
	APIServerInternalIP string `json:"apiServerInternalIP,omitempty"`
optional?
config/v1/types_infrastructure.go
	// nodeDNSIP is the IP address for the internal DNS used by the
	// nodes. Unlike the one managed by the DNS operator, nodeDnsIp
	// provides name resolution for the nodes themselves.
similar to baremetal, who manages this?
Minor comments for comment updates; hold for the doc updates.
This adds the `CloudName` field for specifying which OpenStack cloud to use as well as the fields necessary for the coredns and loadbalancer static pods running on the OpenStack nodes. Fixes: openshift/installer#1986
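For reference, a sketch of the OpenStack platform status type as it stands after the updates discussed in this thread. The field set is assembled from the snippets quoted above (apiServerInternalIP, nodeDNSIP) plus the CloudName field mentioned in this description, and the `+optional` markers reflect the reviewer's note; the merged definition in config/v1/types_infrastructure.go is authoritative and may differ (e.g. it may also carry an ingress IP, as the baremetal type does):

```go
// OpenStackPlatformStatus holds the current status of the OpenStack
// infrastructure provider. Sketch only; see config/v1/types_infrastructure.go
// for the merged definition.
type OpenStackPlatformStatus struct {
	// apiServerInternalIP is an IP address backing the
	// InfrastructureStatus' apiServerInternalURL field. OpenStack and
	// other platforms without a readily available DNS-as-a-service
	// use it before the DNS for the nodes themselves is set up.
	// +optional
	APIServerInternalIP string `json:"apiServerInternalIP,omitempty"`

	// cloudName is the name of the OpenStack cloud to use, as specified
	// in the client configuration file (`clouds.yaml`).
	// +optional
	CloudName string `json:"cloudName,omitempty"`

	// nodeDNSIP is the IP address for the internal DNS used by the
	// nodes. Unlike the one managed by the DNS operator, nodeDNSIP
	// provides name resolution for the nodes themselves.
	// +optional
	NodeDNSIP string `json:"nodeDNSIP,omitempty"`
}
```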
@deads2k The situation with OpenStack is identical to baremetal so I've used the same wording. Is that okay?

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: adambkaplan, deads2k, tomassedovic.
openstack: point master nodes to themselves for DNS
openstack: render coredns Corefile directly
openstack: add test_data for openstack mdns files
openstack: add A records for api(-int)
hacks: forward all non-cluster traffic to 8.8.8.8
Add *.apps wildcard entry
don't use /tmp for clustervars
hacks: send *.apps to all masters
hacks: don't add bootstrap to round robin api-int
run mdns on workers?
openshift-metalkube -> openshift-metal3
Runs HaProxy in static pods on master nodes
haproxy static pod almost working
Move haproxy-watcher to standalone script and escape go templating
More good stuff for haproxy
Use clustervars instead of a function in
`pkg/controller/template/render.go` that only works when MCO is running
on the bootstrap node.
Only add workers to haproxy config when there are workers in the
cluster.
Reinitialize the variables for each iteration of the haproxy-watcher.sh
script.
Fix worker hostnames in mdns config
Hostnames for workers should be like: host-10-0-128-22
WIP Current status for keepalived
Add VIP for DNS
Fix keepalived config
vrrp_instance keyword for ${CLUSTER_NAME}_API was duplicated.
Make the infra object available for template rendering
oVirt and possibly other platforms need access to the InfrastructureStatus
to properly render config files with values like the API VIP, DNS VIP, etc.
By embedding the infra object it's possible to get it directly in
templates and render, for example, a coredns config file like this:
```
{{ .Infra.Status.PlatformStatus.Ovirt.DnsVIP }} api-int.{{.EtcdDiscoveryDomain}} api.{{.EtcdDiscoveryDomain}} ns1.{{.EtcdDiscoveryDomain}}
```
Signed-off-by: Roy Golan <[email protected]>
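A minimal, runnable sketch of the rendering pattern this commit describes, using stand-in types for the infra object and render config (names here are simplified and illustrative, not the actual MCO types):

```go
package main

import (
	"os"
	"text/template"
)

// Illustrative stand-ins for the MCO render config and the embedded infra
// object; field names are simplified and do not match the real types exactly.
type OvirtPlatformStatus struct{ DNSVIP string }
type PlatformStatus struct{ Ovirt *OvirtPlatformStatus }
type InfrastructureStatus struct{ PlatformStatus *PlatformStatus }
type Infrastructure struct{ Status InfrastructureStatus }

type RenderConfig struct {
	EtcdDiscoveryDomain string
	Infra               Infrastructure // embedding the infra object makes it reachable from templates
}

func main() {
	cfg := RenderConfig{
		EtcdDiscoveryDomain: "example.test",
		Infra: Infrastructure{Status: InfrastructureStatus{
			PlatformStatus: &PlatformStatus{Ovirt: &OvirtPlatformStatus{DNSVIP: "192.0.2.10"}},
		}},
	}
	// Same shape of lookup as the Corefile line quoted in the commit message.
	corefileLine := template.Must(template.New("corefile").Parse(
		"{{ .Infra.Status.PlatformStatus.Ovirt.DNSVIP }} api-int.{{ .EtcdDiscoveryDomain }} api.{{ .EtcdDiscoveryDomain }}\n"))
	_ = corefileLine.Execute(os.Stdout, cfg)
}
```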
HACK: vendor: vendor openshift API PlatformStatus changes
This is a manual workaround for testing until this PR merges:
openshift/api#374
WIP: Add worker DNS static pod and read VIPs from PlatformStatus
vendor: update openshift/api to the latest changes
Use openshift namespaced images where possible
Update the latest fields from PlatformStatus
WIP replace 127.0.0.1 in DHCP config with the node's fixed IP
This uses the runtimecfg tool:
github.com/openshift/baremetal-runtimecfg
Eventually, we'll want to replace all our other scripts that gather
IPs for keepalived etc. with it too.
escape
Render /etc/dhcp/dhclient.conf and /etc/resolv.conf with a static pod
This doesn't seem to work, because the discovery pods running on the
masters keep failing because they use DNS from before the resolv.conf
change.
Resolv.conf is now updated at runtime once kubelet launches static
pods.
Also, the render config pods keep restarting -- they succeed but
kubernetes thinks they should be kept running.
Use the DNS VIP for DNS for all nodes
This should provide DNS for every node, relying on the master DNS VIP.
It might cause delays, because there will be a time when the VIP is
not active because coredns did not come up yet on any master.
We should really run coredns on the bootstrap node too.
Fix the copy-pasted filename for keepalived
The keepalived manifest was being created as `haproxy.yaml` which
kept overwriting the actual haproxy static pod, resulting in the LB
never being run.
Disable workers' coredns & mdns-publisher temporarily
The scripts are currently not working and I think they may be blocking
the worker kubelet readiness.
We're already setting the DNS VIP so they should be able to resolve the cluster.
The only thing that won't happen is the workers' own hostnames won't
be available. It's still not clear to me whether that's an issue or
not. Going to disable this for now to get a stable
deployment (hopefully) and then see.
Add initial haproxy configuration
This means haproxy should always start with this config as opposed to
relying on haproxy-watcher to create it.
Since this uses the API VIP, it should always work (so long as the VIP
is active).
Make the haproxy-watcher script more stable
First, it uses API VIP unconditionally. That means the API access
should always work (as long as the VIP is assigned to a node that
responds to it -- which is handled by keepalived).
Second, it runs with `set -e` and `set -o pipefail` so as not to mess
up known-good configuration. If anything fails, the script won't break
API access. I suspect this has been causing a lot of the weird racy
issues and just general unpleasantness.
Third, the master control plane no longer depends on this script
running successfully even once. There is a separate file with the
initial haproxy config, so even if this keeps failing, the control
plane should be unaffected.
Of course this script failing would mean the worker endpoints never
being added so we don't want that either.
And finally, it prints some messages to show what's going on as well
as setting `-x` for now to ease debugging. I expect that to go away
but it's helpful now.
Fix haproxy-watcher bash indent syntax error
Fix the worker server haproxy section as well
More indent fixes because yaml and bash heredoc interop
Enable mdns-publisher on the workers again
This time bringing in the necessary mdns config too.
The experimental OpenStack backend used to create an extra server running the DNS and load balancer services that the cluster needed. OpenStack does not always come with DNSaaS or LBaaS, so we had to provide the functionality the OpenShift cluster depends on (e.g. the etcd SRV records, the api-int records & load balancing, etc.) ourselves. This approach is undesirable for two reasons: first, it adds an extra node that the other IPI platforms do not need; second, this node is a single point of failure.
The Baremetal platform has faced the same issues and has solved them with a few virtual IP addresses managed by keepalived, a coredns static pod running on every node that uses the mDNS protocol to update records as nodes are added or removed, and a similar haproxy static pod to load balance the control plane internally.
The VIPs are defined here in the installer and use the PlatformStatus field to be passed to the necessary machine-config-operator fields: openshift/api#374
The Bare Metal IPI Networking Infrastructure document is broadly applicable here as well: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md
Notable differences in OpenStack:
* We only use the API and DNS VIPs right now.
* Instead of Baremetal's Ingress VIP (which is attached to the OpenShift routers), our haproxy static pods balance ports 80 & 443 to the worker nodes.
* We do not run coredns on the bootstrap node. Instead, bootstrap itself uses one of the masters for DNS.
These differences are not fundamental to OpenStack and we will be looking at aligning more closely with the Baremetal provider in the future. There is also a great opportunity to share some of the configuration files and scripts here.
This change needs several other pull requests:
* Keepalived plus the coredns & haproxy static pods in the MCO: openshift/machine-config-operator/pull/740
* Passing the API and DNS VIPs through the installer: openshift#1998
* Vendoring the OpenStack PlatformStatus changes in the MCO: openshift/machine-config-operator#978
* Allowing to use PlatformStatus in the MCO templates: openshift/machine-config-operator#943
Co-authored-by: Emilio Garcia <[email protected]>
Co-authored-by: John Trowbridge <[email protected]>
Co-authored-by: Martin Andre <[email protected]>
Co-authored-by: Tomas Sedovic <[email protected]>
Massive thanks to the Bare Metal and oVirt people!
The two fields OpenStack needs are the API and DNS VIPs, which are
consumed by the coredns and load balancer static pods running on the
nodes.