[DNM] [test only] Test open sgs #2180

tomassedovic · 2019-08-08T10:38:48Z

Another investigative PR for the bizarre CI stuff we're seeing for #1959

This opens all SGs (maybe we've missed some?) and adds SSH keys so we can log in to the bootstrap node and investigate what's wrong.

For platforms without DNS as a Service available, there is a chicken/egg scenario. This is because we cannot set up our own DNS solution on the nodes before they are booted via ignition. Instead, we get the ignition config via a virtual IP that is configured and maintained by keepalived. The setup of keepalived will be done via machine-config-operator, so it is not part of this patch. This patch only sets up the plumbing so that we can merge the keepalived work while keeping CI green. Then we can put up a patch that will switch to actually using this functionality. This patch also adds an interface for other platforms that need this functionality to work from. Specifically, these PRs can be rebased on this work: #1873 #1948

The experimental OpenStack backend used to create an extra server running the DNS and load balancer services that the cluster needed. OpenStack does not always come with DNSaaS or LBaaS, so we had to provide the functionality the OpenShift cluster depends on (e.g. the etcd SRV records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra node that the other IPI platforms do not need. Second, this node is a single point of failure.

The Baremetal platform has faced the same issues and solved them with a few virtual IP addresses managed by keepalived, in combination with a coredns static pod running on every node that uses the mDNS protocol to update records as nodes are added or removed, and a similar haproxy static pod to load balance the control plane internally.

The VIPs are defined here in the installer and are passed to the necessary machine-config-operator fields via the PlatformStatus field: openshift/api#374

The Bare Metal IPI Networking Infrastructure document is applicable here as well: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

There is also a great opportunity to share some of the configuration files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO: openshift/machine-config-operator#740

Passing the API and DNS VIPs through the installer: #1998

Co-authored-by: Emilio Garcia <[email protected]>
Co-authored-by: John Trowbridge <[email protected]>
Co-authored-by: Martin Andre <[email protected]>
Co-authored-by: Tomas Sedovic <[email protected]>

Massive thanks to the Bare Metal and oVirt people!

This adds validations that verify the OpenStack VIPs have the expected values. These are currently not expected to be configured by the user so they're just compared to the expected values. Since the MachineCIDR is not part of the Platform struct, we need to pass it to OpenStack's ValidatePlatform.

Previously the service VM would proxy all traffic to the masters and workers. With this new architecture the worker nodes are addressed via an ingress VIP local to the cluster, and need to be reachable from the outside in order to serve the *.apps. We're attaching a floating IP to the port, that maps to the ingress VIP shared among the workers.

The `APIVIP`, `DNSVIP` and `IngressVIP` values are not supposed to be user-configurable. They're set by the installer, and setting them to a value other than the expected one will likely result in an error. Therefore, this removes them from `InstallConfig.Platform` in favour of free functions that return the expected values.

We'd removed the VIPs from the install-config, but not the test data itself.

This should let us log into the bootstrap node to see what's going on. This is a workaround until we've got must-gather implemented.

openshift-ci-robot · 2019-08-08T10:39:00Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tomassedovic
To complete the pull request process, please assign smarterclayton
You can assign the PR to them by writing /assign @smarterclayton in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tomassedovic · 2019-08-08T10:46:08Z

/label platform/openstack

tomassedovic · 2019-08-08T10:46:19Z

/hold

tomassedovic · 2019-08-08T11:18:40Z

/close

openshift-ci-robot · 2019-08-08T11:18:41Z

@tomassedovic: Closed this PR.

Details

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot · 2019-08-08T11:34:07Z

@tomassedovic: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-openstack | `f1c4756` | link | `/test e2e-openstack` |
| ci/prow/e2e-aws-scaleup-rhel7 | `f1c4756` | link | `/test e2e-aws-scaleup-rhel7` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

trown and others added 9 commits August 5, 2019 16:19

terraform style

55f3b77

openstack: fix the failing tests

461fe70

We'd removed the VIPs from the install-config, but not the test data itself.

Open all security groups

ec665d4

Hardcode shadower and mandre SSH keys into the bootstrap node

f1c4756

This should let us log into the bootstrap node to see what's going on. This is a workaround until we've got must-gather implemented.

openshift-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 8, 2019

openshift-ci-robot requested review from jcpowermac and jstuever August 8, 2019 10:39

openshift-ci-robot added the platform/openstack label Aug 8, 2019

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 8, 2019

openshift-ci-robot closed this Aug 8, 2019
