@cybertron
Member

Initially this was only enabled for baremetal in the interest of
limiting the impacts of any new bugs introduced by the change. We've
now been running with unicast on baremetal since 4.6 and should have
worked out the kinks. Additionally, other on-prem platforms (notably
vsphere) are running into similar problems to what prompted the
change on baremetal.

In the interest of simplifying the support matrix for on-prem
platforms, this enables unicast on all of them. For the moment that
entails only hard-coding the ENABLE_UNICAST env var and moving the
flip-mode file template from baremetal-specific to all of on-prem.
I'm leaving the render functions for the moment in the interest of
avoiding issues on upgrade if an old template gets rendered by a
newer MCO. They can be cleaned up in 4.12.
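
As a rough illustration of the gating change described above (a sketch only, not the actual MCO templates or render functions), the before/after behavior could be modeled like this. The platform set, the function names, and the "yes" value for ENABLE_UNICAST are assumptions made for the example.

```python
# Illustrative sketch only: names and values below are assumptions,
# not taken from the machine-config-operator source.

# Hypothetical set of on-prem platform names. The conversation only
# names baremetal, vsphere and OpenStack explicitly.
ON_PREM_PLATFORMS = {"baremetal", "vsphere", "openstack", "ovirt"}


def unicast_enabled_before(platform: str) -> bool:
    """Before this change: unicast keepalived only on baremetal."""
    return platform == "baremetal"


def unicast_enabled_after(platform: str) -> bool:
    """After this change: unicast keepalived on every on-prem platform."""
    return platform in ON_PREM_PLATFORMS


def render_env(platform: str) -> dict:
    """Return the env vars a keepalived pod would get (hypothetical helper)."""
    env = {}
    if unicast_enabled_after(platform):
        # Hard-coded for all on-prem platforms; the "yes" value is assumed.
        env["ENABLE_UNICAST"] = "yes"
    return env


if __name__ == "__main__":
    for name in ("baremetal", "vsphere", "aws"):
        print(name, unicast_enabled_before(name), render_env(name))
```

With this model, vsphere flips from multicast to unicast for newly rendered configs, while a non-on-prem platform such as aws is unaffected.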

- What I did

- How to verify it

- Description for the changelog
Enable unicast keepalived for all on-prem platforms.

openshift-ci bot requested review from cgwalters and mandre March 14, 2022 20:23
@cybertron
Member Author

fatal: unable to access 'https://github.com/openshift/machine-config-operator.git/': Could not resolve host: github.com

seems to be the reason all the jobs insta-failed. I am confused why make update made no changes here. Was there a change to how the manifests directory is handled?

@cybertron
Member Author

/retest
/cc @jcpowermac @rvanderp3

openshift-ci bot requested review from jcpowermac and rvanderp3 March 14, 2022 20:39
@cgwalters
Member

/approve

openshift-ci bot added the "approved" label (indicates a PR has been approved by an approver from all required OWNERS files) Mar 14, 2022
@jcpowermac
Contributor

@cybertron what is the plan for upgrade? Is that an issue?

@cybertron
Member Author

> @cybertron what is the plan for upgrade? Is that an issue?

There is code for handling the transition to unicast on upgrade, but we discovered that it has a bug[0] as part of the discussion that led to this change. However, all that means is existing clusters stay multicast and new clusters get unicast. Since clusters that were installed multicast were obviously working in that mode, I don't see it as a blocker to switching new clusters (although we do want to fix it for consistency, of course).

We wanted to push ahead with this change first so it gets plenty of soak time in the 4.11 cycle, just in case any platform-specific issues come up. If we can fix the mode change bug in 4.11 too then that's a bonus, but either way this will fix the VRID (virtual router ID) collision problem for new clusters.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=2053309

@cybertron
Member Author

/test e2e-metal-ipi
/test e2e-vsphere-upgrade

@jcpowermac
Contributor

/test e2e-vsphere

@mandre
Member

LGTM as far as OpenStack is concerned.

@jcpowermac
Contributor

The vSphere install and upgrade failures are unrelated.

/lgtm

openshift-ci bot added the "lgtm" label (indicates that a PR is ready to be merged) Mar 15, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, cybertron, jcpowermac

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@kikisdeliveryservice
Contributor

/test e2e-vsphere-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Mar 16, 2022

@cybertron: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-upgrade-single-node | 84d0bae | link | false | /test e2e-aws-upgrade-single-node |
| ci/prow/e2e-aws-disruptive | 84d0bae | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-ovn-step-registry | 84d0bae | link | false | /test e2e-ovn-step-registry |
| ci/prow/e2e-vsphere | 84d0bae | link | false | /test e2e-vsphere |
| ci/prow/e2e-vsphere-upgrade | 84d0bae | link | false | /test e2e-vsphere-upgrade |

