[WIP] Two stage update hypershift IC #1857

Closed
JacobTanenbaum wants to merge 11 commits into openshift:master from JacobTanenbaum:two-stage-update

JacobTanenbaum commented Jun 30, 2023

This combines ricky-rav's PR #1846 with my PR #1832, which just deploys hypershift with ovn-IC, and tries to do the two-phase update to ovn-IC in order to minimize downtime.

action items moving forward:

  • run ovn-e2e tests to ensure this works (I verified that PR #1832 seems to work with some basic e2e testing on a deployed cluster)
  • look into using a third directory for yaml files
    • set one is single-zone ovn-IC on hypershift
    • set two is the intermediate multizone node rollout, with the master and the routes required for single-zone
    • set three is the final step, with only the multizone master and the multizone node yamls, so that the routes can be deleted safely
  • look into changing the statefulset of the multizone master into a deployment
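
The three yaml sets listed above can be sketched as a small lookup, which makes it easy to see what each transition would remove. This is only an illustration of one reading of the list: the component labels are hypothetical, and it assumes that "the master" in set two means the single-zone master.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    SINGLE_ZONE = 1   # set one: single-zone ovn-IC on hypershift
    INTERMEDIATE = 2  # set two: multizone nodes, single-zone master and routes kept
    MULTIZONE = 3     # set three: multizone master and multizone nodes only


# Hypothetical component labels; the real yaml file names are not given in this PR text.
MANIFEST_SETS = {
    Stage.SINGLE_ZONE: {"single-zone-master", "single-zone-node", "routes"},
    Stage.INTERMEDIATE: {"single-zone-master", "multizone-node", "routes"},
    Stage.MULTIZONE: {"multizone-master", "multizone-node"},
}


def removed_between(current: Stage, target: Stage) -> List[str]:
    """Components rendered in `current` that `target` no longer renders."""
    return sorted(MANIFEST_SETS[current] - MANIFEST_SETS[target])


def added_between(current: Stage, target: Stage) -> List[str]:
    """Components that `target` renders and `current` does not."""
    return sorted(MANIFEST_SETS[target] - MANIFEST_SETS[current])
```

Under these assumptions, `removed_between(Stage.INTERMEDIATE, Stage.MULTIZONE)` lists the routes and the single-zone master, which is the deletion the third set is meant to make safe.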

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 30, 2023
@openshift-ci openshift-ci bot requested review from danwinship and tssurya June 30, 2023 15:35
openshift-ci bot commented Jun 30, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JacobTanenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 30, 2023
@JacobTanenbaum JacobTanenbaum force-pushed the two-stage-update branch 2 times, most recently from 78ce3bf to f6b8080 Compare June 30, 2023 16:48
ricky-rav and others added 11 commits June 30, 2023 12:48
[work in progress!]

- Determine OVN interconnect zone mode by inspecting an (optional) configMap; apply the desired zone mode.
- Upgrade from non-IC to IC OVN-K by going through an intermediate single-zone step
- Switch from IC single zone to IC multizone (as in upgrades) and back (not fully supported yet, for internal use only)
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
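
The configMap-driven zone mode described in the commit message above could look roughly like the following. It is a sketch, not the operator's code: the key name `zone-mode`, the two mode strings, and the choice of multizone as the default when the configMap is absent are all assumptions made for illustration.

```python
from typing import Mapping, Optional

ZONE_MODE_KEY = "zone-mode"                 # hypothetical configMap key
VALID_MODES = ("singlezone", "multizone")   # hypothetical mode names
DEFAULT_MODE = "multizone"                  # assumed mode when no configMap exists


def desired_zone_mode(configmap_data: Optional[Mapping[str, str]]) -> str:
    """Return the zone mode to apply, given the optional configMap's data.

    `None` means the configMap does not exist, so the default applies.
    """
    if configmap_data is None:
        return DEFAULT_MODE
    mode = configmap_data.get(ZONE_MODE_KEY, DEFAULT_MODE)
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported zone mode: {mode!r}")
    return mode
```

Treating a missing configMap as the default keeps the configMap optional, so it is only needed while an upgrade is in progress.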
In the very last step of the 2-phase upgrade to OVN interconnect, we remove the IC configmap.
At this point, SetFromPods from pod_status.go won't be called any more, because all changes to the daemonsets have been processed. Patch the ovnk master daemonset with a dummy annotation to trigger status recalculation.

TODO: find a better way to run SetFromPods instead of updating ovnk master annotations

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
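
The dummy-annotation workaround described above amounts to sending a small merge patch against the daemonset metadata. A minimal sketch of building such a patch body follows; the annotation key is hypothetical, and using a timestamp as the value is an assumption (any value that changes between patches would do).

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical annotation key, used here only for illustration.
ANNOTATION_KEY = "example.invalid/status-recalc-trigger"


def dummy_annotation_patch(now: Optional[datetime] = None) -> str:
    """Build a JSON merge patch that sets a throwaway annotation.

    The value changes on every call, so applying the patch always
    modifies the object and produces an update event.
    """
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps({"metadata": {"annotations": {ANNOTATION_KEY: stamp}}})
```

The patch only touches the object's own metadata, not the pod template, so in this sketch it would not by itself roll out new pods.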
Avoid clashes between single-zone ovnkube-master (using ports 9102, 9641, 9642, 29102) and multizone ovnkube-node (initially using ports 9103, 9105, 9102, 29102, 29103) during upgrade from 4.13 and avoid using ports reserved for the storage components.

In a previous commit I added 100 to all ports in multizone ovnkube-node, but the ports in the 9200-9219 range are reserved for CSI drivers (storage team), as described in https://github.com/openshift/enhancements/blob/master/dev-guide/host-port-registry.md. This caused the storage operator to never become available after installation of, or upgrade to, 4.14.

So in multizone ovnkube-node let's now have:
- 9103, 9105, 29103 (which don't collide with single-zone ovnkube-master)
- 9112, 9112 9113, 29113 so as to not collide with single-zone ovnkube-master

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
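
The port reasoning in the commit message above can be checked with a few set operations. The port numbers and the 9200-9219 range are taken from that message; the helper names are made up for this sketch, and only the ports the message states unambiguously are used.

```python
# Ports quoted in the commit message.
SINGLE_ZONE_MASTER = {9102, 9641, 9642, 29102}
MULTIZONE_NODE_INITIAL = {9103, 9105, 9102, 29102, 29103}
CSI_RESERVED = range(9200, 9220)  # 9200-9219 inclusive


def collisions(a, b):
    """Ports used by both components, sorted."""
    return sorted(set(a) & set(b))


def in_reserved(ports, reserved=CSI_RESERVED):
    """Ports that fall inside the reserved range, sorted."""
    return sorted(p for p in ports if p in reserved)


# The initial multizone node ports clash with the single-zone master.
print(collisions(MULTIZONE_NODE_INITIAL, SINGLE_ZONE_MASTER))  # [9102, 29102]

# Adding 100 to every port removes the clash but lands in the CSI range.
shifted = {p + 100 for p in MULTIZONE_NODE_INITIAL}
print(in_reserved(shifted))  # [9202, 9203, 9205]

# The ports that were kept unchanged are clear of both problems.
kept = {9103, 9105, 29103}
print(collisions(kept, SINGLE_ZONE_MASTER), in_reserved(kept))  # [] []
```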
Rename the yaml files and change them to work for hypershift-IC.
The major difference between the managed and the self-hosted IC upgrade
is the statefulset on the master side. Patch the update so that the
commit works for this case too.

HACK: to get the statefulset updated, remove it when it is safe to do
so, which allows the CNO to recreate it with the updates we need in
phase 2.
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2023
@openshift-merge-robot

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 23, 2024
openshift-ci bot commented Apr 25, 2024

@JacobTanenbaum: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-sdn | 2f12a1d | link | true | /test e2e-gcp-sdn |
| ci/prow/e2e-ovn-ipsec-step-registry | 2f12a1d | link | false | /test e2e-ovn-ipsec-step-registry |
| ci/prow/e2e-vsphere-ovn-dualstack | 2f12a1d | link | false | /test e2e-vsphere-ovn-dualstack |
| ci/prow/e2e-openstack-ovn | 2f12a1d | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-metal-ipi-ovn-ipv6-ipsec | 2f12a1d | link | false | /test e2e-metal-ipi-ovn-ipv6-ipsec |
| ci/prow/e2e-aws-ovn-windows | 2f12a1d | link | true | /test e2e-aws-ovn-windows |
| ci/prow/e2e-network-mtu-migration-ovn-ipv4 | 2f12a1d | link | false | /test e2e-network-mtu-migration-ovn-ipv4 |
| ci/prow/e2e-ovn-hybrid-step-registry | 2f12a1d | link | false | /test e2e-ovn-hybrid-step-registry |
| ci/prow/e2e-gcp-ovn | 2f12a1d | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-network-mtu-migration-ovn-ipv6 | 2f12a1d | link | false | /test e2e-network-mtu-migration-ovn-ipv6 |
| ci/prow/e2e-aws-ovn-serial | 2f12a1d | link | false | /test e2e-aws-ovn-serial |
| ci/prow/e2e-aws-sdn-network-reverse-migration | 2f12a1d | link | true | /test e2e-aws-sdn-network-reverse-migration |
| ci/prow/e2e-hypershift-ovn | 2f12a1d | link | true | /test e2e-hypershift-ovn |
| ci/prow/e2e-azure-ovn | 2f12a1d | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-aws-ovn-single-node | 2f12a1d | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/e2e-aws-sdn-network-migration-rollback | 2f12a1d | link | true | /test e2e-aws-sdn-network-migration-rollback |
| ci/prow/e2e-aws-ovn-network-migration | 2f12a1d | link | true | /test e2e-aws-ovn-network-migration |
| ci/prow/e2e-vsphere-ovn | 2f12a1d | link | false | /test e2e-vsphere-ovn |
| ci/prow/e2e-vsphere-ovn-windows | 2f12a1d | link | true | /test e2e-vsphere-ovn-windows |
| ci/prow/e2e-gcp-ovn-upgrade | 2f12a1d | link | false | /test e2e-gcp-ovn-upgrade |
| ci/prow/unit | 2f12a1d | link | true | /test unit |
| ci/prow/e2e-ovn-step-registry | 2f12a1d | link | false | /test e2e-ovn-step-registry |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 2f12a1d | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/4.15-upgrade-from-stable-4.14-images | 2f12a1d | link | true | /test 4.15-upgrade-from-stable-4.14-images |
| ci/prow/4.16-upgrade-from-stable-4.15-images | 2f12a1d | link | true | /test 4.16-upgrade-from-stable-4.15-images |
| ci/prow/e2e-aws-live-migration-sdn-ovn | 2f12a1d | link | true | /test e2e-aws-live-migration-sdn-ovn |
| ci/prow/e2e-aws-ovn-hypershift-conformance | 2f12a1d | link | true | /test e2e-aws-ovn-hypershift-conformance |
| ci/prow/e2e-aws-ovn-upgrade | 2f12a1d | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/4.16-upgrade-from-stable-4.15-e2e-gcp-ovn-rt-upgrade | 2f12a1d | link | true | /test 4.16-upgrade-from-stable-4.15-e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-azure-ovn-upgrade | 2f12a1d | link | true | /test e2e-azure-ovn-upgrade |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 26, 2024