Skip to content

Conversation

@wking
Copy link
Member

@wking wking commented Nov 12, 2024

Checking 4.13 OVN CI in PromeCIeus:

group by (__name__) ({__name__=~".*ipsec.*"})

only turns up ovnkube_master_ipsec_enabled, which we'd used previously, e.g. in 2797989 (#5334). But checking 4.14 OVN CI, that same __name__ search turns up:

  • openshift:openshift_network_operator_ipsec_state:info,
  • openshift_network_operator_ipsec_state, and
  • ovnkube_controller_ipsec_enabled,

but not 4.13's ovnkube_master_ipsec_enabled. The PromQL I'm adding here looks for the 4.14 ovnkube_controller_ipsec_enabled, if it can't find that it falls back to the 4.13 ovnkube_master_ipsec_enabled, and if it can't find that it falls back to the Kube-API standard apiserver_storage_objects we'd been using before for "am I OVN or not?".

Checking 4.13 OVN CI [1] in [2]:

  group by (__name__) ({__name__=~".*ipsec.*"})

only turns up ovnkube_master_ipsec_enabled, which we'd used
previously, e.g. in 2797989
(blocked-edges/4.14.*-OVNInterConnectTransitionIPsec: Declare risk,
2024-05-31, openshift#5334).  But checking 4.14 OVN CI [3], that same __name__
search turns up:

* openshift:openshift_network_operator_ipsec_state:info,
* openshift_network_operator_ipsec_state, and
* ovnkube_controller_ipsec_enabled,

but not 4.13's ovnkube_master_ipsec_enabled.  The PromQL I'm adding
here looks for the 4.14 ovnkube_controller_ipsec_enabled, if it can't
find that it falls back to the 4.13 ovnkube_master_ipsec_enabled, and
if it can't find that it falls back to the Kube-API standard
apiserver_storage_objects we'd been using before for "am I OVN or
not?".

[1]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.13-e2e-azure-ovn-upgrade/1851915878388469760
[2]: https://promecieus.dptools.openshift.org/
[3]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-serial/1851940515621113856
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 12, 2024
@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 12, 2024

@wking: This pull request references SDN-5477 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "4.18.0" version, but no target version was set.

Details

In response to this:

Checking 4.13 OVN CI in PromeCIeus:

group by (__name__) ({__name__=~".*ipsec.*"})

only turns up ovnkube_master_ipsec_enabled, which we'd used previously, e.g. in 2797989 (#5334). But checking 4.14 OVN CI, that same __name__ search turns up:

  • openshift:openshift_network_operator_ipsec_state:info,
  • openshift_network_operator_ipsec_state, and
  • ovnkube_controller_ipsec_enabled,

but not 4.13's ovnkube_master_ipsec_enabled. The PromQL I'm adding here looks for the 4.14 ovnkube_controller_ipsec_enabled, if it can't find that it falls back to the 4.13 ovnkube_master_ipsec_enabled, and if it can't find that it falls back to the Kube-API standard apiserver_storage_objects we'd been using before for "am I OVN or not?".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2024
@wking
Copy link
Member Author

wking commented Nov 12, 2024

I'm not aware of IPsec-enabled 4.13 or 4.14 CI for testing this PromQL against, but here are:

4.13.0-0.nightly-2024-11-11-123738 's aws-sdn-serial, which knows it isn't exposed:

Screenshot 2024-11-11 9 07 23 PM

4.14.0-0.nightly-2024-11-11-222143's aws-sdn, which knows it isn't exposed:

Screenshot 2024-11-11 9 04 37 PM

4.13.53's azure-ovn-upgrade-4.13-micro, which knows it's OVN and not exposed:

Screenshot 2024-11-11 8 59 10 PM

4.14.40's aws-ovn-serial, which knows it's OVN and not exposed:

Screenshot 2024-11-11 8 56 55 PM

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 12, 2024

@wking: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@zshi-redhat
Copy link

LGTM @pperiyasamy @huiran0826 PTAL

@huiran0826
Copy link

lgtm

promql: |
group by (resource) (max_over_time(apiserver_storage_objects{_id="",resource="egressips.k8s.ovn.org"}[1h]))
or on ()
group by (ipsec) (label_replace(max_over_time(ovnkube_controller_ipsec_enabled{_id=""}[1h]), "ipsec", "enabled (4.14)", "", "") == 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ovnkube_controller_ipsec_enabled gauge metric is at zone/node level in 4.14, so this is set from every node. are we ok with this ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

group by (ipsec) (...) will collapse matching ovnkube_controller_ipsec_enabled per ipsec label, and the label_replace ensures only a single enabled (4.14) value for that label. So that means that a single node reporting a value of 1 for that metric will get the cluster treated as IPsec-enabled as far as this risk is concerned. Which sounds like the semantics we want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok @wking , that's great.

@pperiyasamy
Copy link
Member

LGTM

Copy link
Contributor

@PratikMahajan PratikMahajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 12, 2024
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 12, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: PratikMahajan, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [PratikMahajan,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit c9b0647 into openshift:master Nov 12, 2024
@wking wking deleted the OVNlibreswan-IPsec-PromQL branch November 12, 2024 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants