@wking wking commented May 3, 2023

Discussion in OPNET-296 has some details, but trying to unpack "all on-prem IPI except baremetal IPI" into specifics, this template is in an on-prem directory configuring keepalived, and it switches on onPremPlatformAPIServerInternalIP for enabled vs. disabled. onPremPlatformAPIServerInternalIP is true (enabling the keepalived configuration) for:

* BareMetal (4.10 and 4.11)
* oVirt (4.10 and 4.11)
* OpenStack (4.10 and 4.11)
* VSphere (4.10 and 4.11)
* KubeVirt (4.10, dropped in 4.11)
* Nutanix (new in 4.11)

Before 4.11, ENABLE_UNICAST was conditional on onPremPlatformKeepalivedEnableUnicast, but since 4.11, it has always been yes. The platforms that were unicast on 4.10's onPremPlatformKeepalivedEnableUnicast were BareMetal and KubeVirt.

Putting this all together, AWS and other platforms that don't match the onPremPlatformAPIServerInternalIP logic aren't impacted, because they don't enable the keepalived configuration. BareMetal is not impacted by 4.10-to-4.11 updates, because any to-unicast transition issues will already have been resolved by 4.10. The remaining onPremPlatformAPIServerInternalIP platforms which occur in both 4.10 and 4.11 (oVirt, OpenStack, and VSphere) are the ones of interest, and I match them here.
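
The platform reasoning above can be sketched as a small lookup. This is only an illustration of the decision, not code from the PR: the function name `keepalived_skew_impact` and its output strings are hypothetical, and the KubeVirt and Nutanix branches rely on the commit message's notes that KubeVirt was dropped in 4.11 and Nutanix is new in 4.11.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): classify an infrastructure
# platform for the 4.10-to-4.11 keepalived multicast-to-unicast skew risk.
keepalived_skew_impact() {
  case "$1" in
    OpenStack|oVirt|VSphere)
      # keepalived is enabled in both 4.10 and 4.11, and 4.10 was not unicast
      echo impacted ;;
    BareMetal)
      # already unicast in 4.10, so nothing changes on the way to 4.11
      echo not-impacted ;;
    KubeVirt|Nutanix)
      # does not occur in both 4.10 and 4.11
      echo not-impacted ;;
    *)
      # AWS and others: the keepalived configuration is never enabled
      echo not-impacted ;;
  esac
}

for platform in AWS BareMetal KubeVirt Nutanix OpenStack oVirt VSphere; do
  printf '%s: %s\n' "$platform" "$(keepalived_skew_impact "$platform")"
done
```

Only the three platforms in the first branch are reported as impacted, which is the same set the risk declaration matches on.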

Generated by writing the 4.11.0 declaration by hand, and then copying out to other 4.11 releases with:

$ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=candidate-4.11' | jq -r '.nodes[].version' | grep '^4[.]11[.]' | grep -v '^4[.]11[.]0$' | while read V; do sed "s/4[.]11[.]0/${V}/g" blocked-edges/4.11.0-KeepalivedMulticastSkew.yaml > "blocked-edges/${V}-KeepalivedMulticastSkew.yaml"; done
$ git add blocked-edges/4.11.*KeepalivedMulticastSkew.yaml
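
To try the copy step without network access, the same sed substitution can be exercised offline, roughly as below. This is a sketch under stated assumptions: the two-line template content is a placeholder rather than the real declaration, and the fixed version list (4.11.1, 4.11.2) stands in for the output of the curl and jq pipeline.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so the real blocked-edges/ tree is untouched.
workdir=$(mktemp -d)
mkdir "$workdir/blocked-edges"

# Placeholder template content (an assumption, not the real declaration).
printf 'to: 4.11.0\nname: KeepalivedMulticastSkew\n' \
  > "$workdir/blocked-edges/4.11.0-KeepalivedMulticastSkew.yaml"

# Stand-in for the version list that the PR fetches with curl and jq.
for V in 4.11.1 4.11.2; do
  sed "s/4[.]11[.]0/${V}/g" \
    "$workdir/blocked-edges/4.11.0-KeepalivedMulticastSkew.yaml" \
    > "$workdir/blocked-edges/${V}-KeepalivedMulticastSkew.yaml"
done

# Check that every file names the same version as its own filename.
for f in "$workdir"/blocked-edges/*-KeepalivedMulticastSkew.yaml; do
  v=${f##*/}
  v=${v%-KeepalivedMulticastSkew.yaml}
  grep -qxF "to: ${v}" "$f" || { echo "mismatch in $f" >&2; exit 1; }
done
echo "all copies consistent"
```

The final loop is a cheap sanity check that can also be pointed at the real blocked-edges/ directory, provided the real declarations carry a matching `to:` line.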

@openshift-ci-robot added the jira/valid-reference label ("Indicates that this PR references a valid Jira ticket of any type.") on May 3, 2023

openshift-ci-robot commented May 3, 2023

@wking: This pull request references OPNET-296 which is a valid jira issue.

@openshift-ci bot added the approved label ("Indicates a PR has been approved by an approver from all required OWNERS files.") on May 3, 2023

wking commented May 4, 2023

Testing various PromQL in a "launch 4.10.58 ovirt" cluster-bot cluster:

Impacted infrastructure platform, has at least one machine-config-controlled compute node

Real query

max(
  cluster_infrastructure_provider{type=~"OpenStack|oVirt|VSphere"}
  or
  0 * cluster_infrastructure_provider
)
*
(
  group(topk by (node) (1, mcd_update_state{config!~"|rendered-master-.*"}))
  or
  0 * group(mcd_update_state)
)

shows the risk matching with a 1:

[image: screenshot of the query result]

Other platforms

With oVirt replaced by oVirtNot in the type regular expression, simulating an AWS or other non-impacted cluster, the query drops to zero as a non-match:

[image: screenshot of the query result]

No MachineConfig compute

And with the config exclusion changed to .* so that it matches every MachineConfig, simulating a cluster with no MachineConfig-managed compute, the query drops to zero as a non-match:

[image: screenshot of the query result]
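
The three cases tested above follow from the query being a product of two 0/1 factors: one for the platform and one for having at least one non-master MachineConfig-managed node. A rough emulation of that logic, with hypothetical function names and made-up config names such as rendered-worker-def, and ignoring PromQL details like label matching and aggregation:

```shell
#!/bin/sh
# Hypothetical emulation of the query's two 0/1 factors; this is not
# PromQL and does not talk to a cluster.

# 1 when the platform type matches OpenStack|oVirt|VSphere, else 0.
platform_factor() {
  case "$1" in
    OpenStack|oVirt|VSphere) echo 1 ;;
    *) echo 0 ;;
  esac
}

# Arguments: the mcd_update_state config label value for each node.
# 1 when any value survives config!~"|rendered-master-.*", else 0.
compute_factor() {
  for config in "$@"; do
    case "$config" in
      ''|rendered-master-*) ;;   # excluded by the regular expression
      *) echo 1; return ;;
    esac
  done
  echo 0
}

# First argument: platform type; remaining arguments: per-node configs.
risk() {
  platform=$1
  shift
  echo $(( $(platform_factor "$platform") * $(compute_factor "$@") ))
}

risk oVirt rendered-master-abc rendered-worker-def   # prints 1
risk AWS rendered-master-abc rendered-worker-def     # prints 0
risk oVirt rendered-master-abc                       # prints 0
```

The risk only matches when both factors are 1, which is why changing either the platform or the compute selector on its own is enough to drop the result to zero.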

Discussion in [1] has some details, but trying to unpack "all on-prem
IPI except baremetal IPI" into specifics, [2] is in an on-prem
directory configuring keepalived, and it switches on
onPremPlatformAPIServerInternalIP for enabled vs. disabled.
onPremPlatformAPIServerInternalIP is true (enabling the keepalived
configuration) for:

* BareMetal (4.10 [3] and 4.11 [4])
* oVirt (4.10 [3] and 4.11 [4])
* OpenStack (4.10 [3] and 4.11 [4])
* VSphere (4.10 [3] and 4.11 [4])
* KubeVirt (4.10 [3], dropped in 4.11 [4,5])
* Nutanix (new in 4.11 [4,6,7])

Before 4.11, ENABLE_UNICAST was conditional on
onPremPlatformKeepalivedEnableUnicast [8], but since 4.11, it has
always been 'yes' [9].  The platforms that were unicast on 4.10's
onPremPlatformKeepalivedEnableUnicast were BareMetal and KubeVirt
[10].

Putting this all together, AWS and other platforms that don't match
the onPremPlatformAPIServerInternalIP logic aren't impacted, because
they don't enable the keepalived configuration.  BareMetal is not
impacted by 4.10-to-4.11 updates, because any to-unicast transition
issues will already have been resolved by 4.10.  The remaining
onPremPlatformAPIServerInternalIP platforms which occur in both 4.10
and 4.11 (oVirt, OpenStack, and VSphere) are the ones of interest, and
I match them here.

Generated by writing the 4.11.0 declaration by hand, and then copying
out to other 4.11 releases with:

  $ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=candidate-4.11' | jq -r '.nodes[].version' | grep '^4[.]11[.]' | grep -v '^4[.]11[.]0$' | while read V; do sed "s/4[.]11[.]0/${V}/g" blocked-edges/4.11.0-KeepalivedMulticastSkew.yaml > "blocked-edges/${V}-KeepalivedMulticastSkew.yaml"; done
  $ git add blocked-edges/4.11.*KeepalivedMulticastSkew.yaml

[1]: https://issues.redhat.com/browse/OPNET-296
[2]: https://github.com/openshift/machine-config-operator/blame/8fa0b7e8903226b3cfb76e6c6f49409cfc0dd0e7/templates/common/on-prem/files/keepalived.yaml#L2
[3]: https://github.com/openshift/machine-config-operator/blob/afb47c916680dd5870e48e5c9cf819f59e12ff4d/pkg/operator/render.go#L282-L294
[4]: https://github.com/openshift/machine-config-operator/blob/8fa0b7e8903226b3cfb76e6c6f49409cfc0dd0e7/pkg/operator/render.go#L282-L294
[5]: openshift/machine-config-operator#3084
[6]: openshift/machine-config-operator#2942
[7]: https://docs.openshift.com/container-platform/4.11/release_notes/ocp-4-11-release-notes.html#ocp-4-11-nutanix
[8]: https://github.com/openshift/machine-config-operator/blob/afb47c916680dd5870e48e5c9cf819f59e12ff4d/templates/common/on-prem/files/keepalived.yaml#L155-L156
[9]: openshift/machine-config-operator@84d0bae#diff-c4a27bc4c14847dd581f495e992f67cf49b430644e8f113aabfa879de076564dL156
[10]: https://github.com/openshift/machine-config-operator/blob/afb47c916680dd5870e48e5c9cf819f59e12ff4d/pkg/operator/render.go#L249-L250

@wking force-pushed the 4.11-KeepalivedMulticastSkew branch from 1949c00 to bbc4fb9 on May 4, 2023 00:10

@LalatenduMohanty LalatenduMohanty left a comment

/lgtm

@openshift-ci bot added the lgtm label ("Indicates that a PR is ready to be merged.") on May 4, 2023

openshift-ci bot commented May 4, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, wking

Needs approval from an approver in each of these files:
  • OWNERS [LalatenduMohanty,wking]

openshift-ci bot commented May 4, 2023

@wking: all tests passed!
