Skip to content

Conversation

@trown
Copy link

@trown trown commented Jul 16, 2019

For platforms without a DNS as a Service available, there is a
chicken/egg scenario. This is because we cannot setup our own DNS
solution on the nodes before they are booted via ignition.

Instead, we get the ignition config via a virtual IP that is
conifgured and maintained by keepalived. The setup of keepalived
will be done via machine-config-operator, so it is not part of
this patch.

This patch only sets up the plumbing so that we can merge the
keepalived work while keeping CI green. Then we can put up a patch
that will switch to actually using this functionality.

This patch also adds an interface for other platforms that need this
functionality to work from. Specifically, these PRs can be rebased
on this work:
#1873
#1948

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 16, 2019
@trown trown force-pushed the ignition-via-ip branch from b8f39ac to f5f4125 Compare July 16, 2019 12:57
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI the values that the (merged) Baremetal PlatfomStatus support uses APIServerInternalIP and NodeDNSIP names:

https://github.com/openshift/api/pull/348/files

And so does the current (unmerged) OpenStack equivalent: openshift/api#374

(not sure how much sense does it make for these to align)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I needed to rebase this on what the baremetal platform landed I named the vars similar to theirs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@tomassedovic tomassedovic Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that PR is linked from the commit message here (as is the oVirt's equivalent: #1948).

The intention here is to merge this smallish patch so that all three platforms (baremetal, ovirt, openstack) can validate this is an acceptable solution and that they can build on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

baremetal does similar here too

@trown
Copy link
Author

trown commented Jul 16, 2019

/retest

Copy link
Contributor

@abhinavdahiya abhinavdahiya Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be calculated when these fields exist in InstallConfig.
f5f4125#diff-26eeddc74b12e6afb9e2932066a9b49fR33

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@abhinavdahiya
Copy link
Contributor

based on f5f4125#diff-d7037682a68150f4dbbbf5cc0d6e8f54R21

please move the PR to WIP...

@trown
Copy link
Author

trown commented Jul 17, 2019

/hold
WIP

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 17, 2019
@trown trown force-pushed the ignition-via-ip branch from f5f4125 to 602bf6f Compare July 17, 2019 18:43
@trown
Copy link
Author

trown commented Jul 17, 2019

/hold cancel
I addressed the comment which moved this to WIP.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 17, 2019
@trown trown changed the title openstack: get ignition via IP WIP openstack: get ignition via IP Jul 17, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 17, 2019
@trown trown force-pushed the ignition-via-ip branch from 602bf6f to f2c9cf6 Compare July 18, 2019 00:12
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 18, 2019
@trown trown force-pushed the ignition-via-ip branch 3 times, most recently from 22b71db to 1bc6dba Compare July 18, 2019 02:21
@openshift-ci-robot openshift-ci-robot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 18, 2019
For platforms without a DNS as a Service available, there is a
chicken/egg scenario. This is because we cannot setup our own DNS
solution on the nodes before they are booted via ignition.

Instead, we get the ignition config via a virtual IP that is
conifgured and maintained by keepalived. The setup of keepalived
will be done via machine-config-operator, so it is not part of
this patch.

This patch only sets up the plumbing so that we can merge the
keepalived work while keeping CI green. Then we can put up a patch
that will switch to actually using this functionality.

This patch also adds an interface for other platforms that need this
functionality to work from. Specifically, these PRs can be rebased
on this work:
openshift#1873
openshift#1948
@trown trown force-pushed the ignition-via-ip branch from 1bc6dba to deb3a94 Compare July 18, 2019 02:34
@trown trown changed the title WIP openstack: get ignition via IP openstack: get ignition via IP Jul 18, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 18, 2019
@tomassedovic
Copy link
Contributor

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 18, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 18, 2019
@iamemilio
Copy link

/retest

1 similar comment
@tomassedovic
Copy link
Contributor

/retest

@tomassedovic
Copy link
Contributor

/lgtm
/approve

From the OpenStack side, this works great and we need it. Bare Metal broke the ground with this approach so I'm all in favour of merging it.

As it touches files not owned by OpenStack, it needs someone else to approve though.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 19, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tomassedovic, trown
To complete the pull request process, please assign wking
You can assign the PR to them by writing /assign @wking in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mandre pushed a commit to mandre/installer that referenced this pull request Jul 26, 2019
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.

The Baremetal platform has faced the same issues and they have solved
them with a few virtual IP addresses managed by keepalived in
combination with coredns static pod running on every node using the mDNS
protocol to update records as new nodes are added or removed and a
similar static pod haproxy to load balance the control plane internally.

The VIPs are defined here in the installer and they use the
PlatformStatus field to be passed to the necessary
machine-config-operator fields:

openshift/api#374

The Bare Metal IPI Networking Infrastructure document is broadly
applicable here as well:

https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

Notable differences in OpenStack:

* We only use the API and DNS VIPs right now
* Instead of Baremetal's Ingress VIP (which is attached to the OpenShift
  routers) our haproxy static pods balance the 80 & 443 pods to the
  worker nodes
* We do not run coredns on the bootstrap node. Instead, bootstrap itself
  uses one of the masters for DNS.

These differences are not fundamental to OpenStack and we will be
looking at aligning more closely with the Baremetal provider in the
future.

There is also a great oportunity to share some of the configuration
files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator/pull/740

Passing the API and DNS VIPs through the installer:
openshift#1998

Vendoring the OpenStack PlatformStatus changes in the MCO:
openshift/machine-config-operator#978

Allowing to use PlatformStatus in the MCO templates:
openshift/machine-config-operator#943

Co-authored-by: Emilio Garcia <[email protected]>
Co-authored-by: John Trowbridge <[email protected]>
Co-authored-by: Martin Andre <[email protected]>
Co-authored-by: Tomas Sedovic <[email protected]>

Massive thanks to the Bare Metal and oVirt people!
@tomassedovic
Copy link
Contributor

/retest

iamemilio pushed a commit to iamemilio/installer that referenced this pull request Jul 26, 2019
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.

The Baremetal platform has faced the same issues and they have solved
them with a few virtual IP addresses managed by keepalived in
combination with coredns static pod running on every node using the mDNS
protocol to update records as new nodes are added or removed and a
similar static pod haproxy to load balance the control plane internally.

The VIPs are defined here in the installer and they use the
PlatformStatus field to be passed to the necessary
machine-config-operator fields:

openshift/api#374

The Bare Metal IPI Networking Infrastructure document is applicable here as
well:

https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

There is also a great opportunity to share some of the configuration
files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator#740

Passing the API and DNS VIPs through the installer:
openshift#1998

Co-authored-by: Emilio Garcia <[email protected]>
Co-authored-by: John Trowbridge <[email protected]>
Co-authored-by: Martin Andre <[email protected]>
Co-authored-by: Tomas Sedovic <[email protected]>

Massive thanks to the Bare Metal and oVirt people!
@tomassedovic
Copy link
Contributor

/retest

2 similar comments
@tomassedovic
Copy link
Contributor

/retest

@tomassedovic
Copy link
Contributor

/retest

@tomassedovic
Copy link
Contributor

/test e2e-openstack

@tomassedovic
Copy link
Contributor

/retest

@openshift-ci-robot
Copy link
Contributor

@trown: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-openstack e7d3761 link /test e2e-openstack
ci/prow/e2e-aws-scaleup-rhel7 e7d3761 link /test e2e-aws-scaleup-rhel7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

mandre pushed a commit to mandre/installer that referenced this pull request Aug 5, 2019
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.

The Baremetal platform has faced the same issues and they have solved
them with a few virtual IP addresses managed by keepalived in
combination with coredns static pod running on every node using the mDNS
protocol to update records as new nodes are added or removed and a
similar static pod haproxy to load balance the control plane internally.

The VIPs are defined here in the installer and they use the
PlatformStatus field to be passed to the necessary
machine-config-operator fields:

openshift/api#374

The Bare Metal IPI Networking Infrastructure document is applicable here as
well:

https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

There is also a great opportunity to share some of the configuration
files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator#740

Passing the API and DNS VIPs through the installer:
openshift#1998

Co-authored-by: Emilio Garcia <[email protected]>
Co-authored-by: John Trowbridge <[email protected]>
Co-authored-by: Martin Andre <[email protected]>
Co-authored-by: Tomas Sedovic <[email protected]>

Massive thanks to the Bare Metal and oVirt people!
@tomassedovic
Copy link
Contributor

This change is now out of date and it is carried in #1959 which we expected to merge soon.

I'm going to close this PR but its spirit will live on! Thanks, John!

/close

@openshift-ci-robot
Copy link
Contributor

@tomassedovic: Closed this PR.

Details

In response to this:

This change is now out of date and it is carried in #1959 which we expected to merge soon.

I'm going to close this PR but its spirit will live on! Thanks, John!

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants