Skip to content

Conversation

@alebedev87
Copy link
Contributor

@alebedev87 alebedev87 commented Jul 7, 2023

The recently added Ingress capability has to remain enabled at all times for standalone OpenShift installations. As a solution, a new flag (--always-enable-capabilities) has been introduced to allow the OpenShift engineers to set a list of the capabilities which have to be always enabled. This flag with Ingress capability enabled is added to the CVO's deployment manifest by default. This doesn't impact HyperShift cluster installations because the CVO is not managed via the manifest on HyperShift.
For more info, read Standalone OpenShift section of Make ingress operator optional on HyperShift EP.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 7, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jul 7, 2023

@alebedev87: This pull request references NE-1318 which is a valid jira issue.

Details

In response to this:

EP: openshift/enhancements#1415

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 7, 2023
@openshift-ci openshift-ci bot requested review from DavidHurta and wking July 7, 2023 09:57
@alebedev87 alebedev87 changed the title [WIP] NE-1318: Add required capabilities flag and set Ingress as required [WIP] NE-1318: Add always-enable-capabilities flag and set Ingress as always enabled Jul 13, 2023
@alebedev87 alebedev87 force-pushed the required-capabilities branch 2 times, most recently from 03c72b3 to 85ad503 Compare July 25, 2023 08:37
@alebedev87 alebedev87 changed the title [WIP] NE-1318: Add always-enable-capabilities flag and set Ingress as always enabled NE-1318: Add always-enable-capabilities flag and set Ingress as always enabled Jul 25, 2023
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 25, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jul 25, 2023

@alebedev87: This pull request references NE-1318 which is a valid jira issue.

Details

In response to this:

This PR adds Ingress capability which has to be always enabled for the standalone OpenShift. Therefore the new flag (--always-enable-capabilities) which allows the OpenShift engineers to set a list of the capabilities which have to be always enabled. The new flag with Ingress capability is added to the CVO deployment manifest by default. This doesn't impact HyperShift cluster deployments because the CVO is not managed via the manifest on HyperShift.
For more info, read Standalone OpenShift section of Make ingress operator optional on HyperShift EP.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@petr-muller
Copy link
Member

[sig-node] should override timeoutGracePeriodSeconds when annotation is set [Suite:openshift/conformance/parallel]

Does not look related
/test e2e-agnostic-avn

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 25, 2023

@petr-muller: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test e2e-agnostic-operator
  • /test e2e-agnostic-ovn
  • /test e2e-agnostic-ovn-upgrade-into-change
  • /test e2e-agnostic-ovn-upgrade-out-of-change
  • /test gofmt
  • /test images
  • /test lint
  • /test unit

Use /test all to run all jobs.

Details

In response to this:

[sig-node] should override timeoutGracePeriodSeconds when annotation is set [Suite:openshift/conformance/parallel]

Does not look related
/test e2e-agnostic-avn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alebedev87
Copy link
Contributor Author

alebedev87 commented Jul 25, 2023

/retest e2e-agnostic-ovn

Failures around unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds annotation behavior (sig-node). Let's retry to see if it's constant.

UPD: didn't see Petr's comment above.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 25, 2023

@alebedev87: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test e2e-agnostic-operator
  • /test e2e-agnostic-ovn
  • /test e2e-agnostic-ovn-upgrade-into-change
  • /test e2e-agnostic-ovn-upgrade-out-of-change
  • /test gofmt
  • /test images
  • /test lint
  • /test unit

Use /test all to run all jobs.

Details

In response to this:

/retest e2e-agnostic-ovn

Failures around unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds annotation behavior (sig-node). Let's retry to see if it's constant.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alebedev87
Copy link
Contributor Author

/test e2e-agnostic-ovn

@alebedev87
Copy link
Contributor Author

alebedev87 commented Jul 25, 2023

e2e-agnostic-ovn-upgrade-out-of-change fails because Ingress capability implicitly enabled by this change stays in the status of ClusterVersion and is unknown to the CVO version without this change:

I0725 12:24:29.339077 1 cvo.go:601] Error handling openshift-cluster-version/version: ClusterVersion.config.openshift.io "version" is invalid: status.capabilities.enabledCapabilities[2]: Unsupported value: "Ingress": supported values: "openshift-samples", "baremetal", "marketplace", "Console", "Insights", "Storage", "CSISnapshot", "NodeTuning", "MachineAPI"

I think it's the same case when any new capability is implicitly enabled (e.g. while upgrading from 4.12 to 4.13 with None capability set) and then the cluster is downgraded.

@alebedev87
Copy link
Contributor Author

/test e2e-agnostic-ovn

@alebedev87
Copy link
Contributor Author

/assign @wking @petr-muller

@alebedev87
Copy link
Contributor Author

/hold

The ingress capability feature has been deferred to OCP 4.15. Reviews are still welcome though.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 31, 2023
@LalatenduMohanty
Copy link
Member

@alebedev87 Can you please link the jira for this feature work? It will help with understanding the context behind the change.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 24, 2023
@openshift-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2023
@alebedev87
Copy link
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2024
@lihongan
Copy link

cc @melvinjoseph86

@candita
Copy link

candita commented Apr 1, 2024

@alebedev87 this needs a rebase. Otherwise
/lgtm

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 1, 2024
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 1, 2024
@alebedev87 alebedev87 force-pushed the required-capabilities branch from 4b7ba61 to ade334a Compare April 2, 2024 10:12
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 2, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 2, 2024
Copy link
Member

@wking wking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 19, 2024
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alebedev87, candita, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 19, 2024
@jiajliu
Copy link

jiajliu commented Apr 19, 2024

Hi @alebedev87

Thank you for notifying us for a test from cvo side. I just had a pre-merge test to check the new flag in cvo deployment. Following is my test:

  1. a regression test with default caps config, cvo works well with new deployment including always-enable-capabilities flag.
// launch 4.16,openshift/cluster-version-operator#946 aws
# ./oc -n openshift-cluster-version get deployment cluster-version-operator -oyaml|grep Ingress
        - --always-enable-capabilities=Ingress
# ./oc get clusterversion version -ojson| jq .status.capabilities
{
  "enabledCapabilities": [
    "Build",
    "CSISnapshot",
    "CloudControllerManager",
    "CloudCredential",
    "Console",
    "DeploymentConfig",
    "ImageRegistry",
    "Ingress",
    "Insights",
    "MachineAPI",
    "NodeTuning",
    "OperatorLifecycleManager",
    "Storage",
    "baremetal",
    "marketplace",
    "openshift-samples"
  ],
  "knownCapabilities": [
    "Build",
    "CSISnapshot",
    "CloudControllerManager",
    "CloudCredential",
    "Console",
    "DeploymentConfig",
    "ImageRegistry",
    "Ingress",
    "Insights",
    "MachineAPI",
    "NodeTuning",
    "OperatorLifecycleManager",
    "Storage",
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
  1. a test with none cap config, cvo also works well with the flag, and ingress operator is enabled as an implicitly enabled cap
//launch 4.16,openshift/cluster-version-operator#946 aws,no-capabilities
# ./oc get clusterversion version -ojson| jq -r '.status.conditions[]|select(.type=="ImplicitlyEnabledCapabilities")'
{
  "lastTransitionTime": "2024-04-19T09:58:36Z",
  "message": "The following capabilities could not be disabled: Ingress",
  "reason": "CapabilitiesImplicitlyEnabled",
  "status": "True",
  "type": "ImplicitlyEnabledCapabilities"
}

# ./oc get clusterversion version -ojson| jq .status.capabilities
{
  "enabledCapabilities": [
    "CloudControllerManager",
    "CloudCredential",
    "Ingress",
    "MachineAPI"
  ],
  "knownCapabilities": [
    "Build",
    "CSISnapshot",
    "CloudControllerManager",
    "CloudCredential",
    "Console",
    "DeploymentConfig",
    "ImageRegistry",
    "Ingress",
    "Insights",
    "MachineAPI",
    "NodeTuning",
    "OperatorLifecycleManager",
    "Storage",
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
# ./oc get co ingress
NAME      VERSION                                                AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
ingress   4.16.0-0.test-2024-04-19-093919-ci-ln-w4l7bm2-latest   True        False         False      23m

cc @evakhoni for some upgrade test from cvo side.
cc @jinyunma @shellyyang1989 for awareness

@evakhoni
Copy link
Contributor

@jiajliu as far as upgrade is concerned, I expect no much difference between Ingress, and any other previously introduced capability, since the standard behavior for any previously non-optional cluster resource becoming an optional capability on the version we're upgrading to, is ImplicitlyEnabled in any case where it is not explicitly enabled via either baseline, additional, or defaulted to enabled by the lack of thereof. in this case a presence of --always-enable-capabilities is expected to manifest itself with the same effect, thus being identical in terms of expected observed behavior.

the only thing comes to my mind, is a basic sanity test verifying that the behavior explained above experienced no difference compared to standard capability introduction during upgrade, in which case any deviation from the standard behavior could be considered a regression. perhaps as well as to verify a presence of the aforementioned container argument following an upgrade. correct me if you think I missed something.

@jiajliu
Copy link

jiajliu commented Apr 22, 2024

@evakhoni SGTM, thank you!

@shellyyang1989
Copy link
Contributor

cc @sunzhaohua2 for removing the workaround in upgrade ci after it's merged.

@jiajliu
Copy link

jiajliu commented Apr 24, 2024

pre-merge upgrade test pass.

before upgrade, ingress installed by default in v4.15 as expected.
# ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.15.10   True        False         6m15s   Cluster version is 4.15.10

# ./oc get clusterversion version -ojson|jq .spec
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "openshift-samples",
      "CloudCredential"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.15",
  "clusterID": "403cbc5c-4eee-4530-a264-3c7820d1b9c2"
}
# ./oc get clusterversion version -ojson| jq -r '.status.conditions[]|select(.type=="ImplicitlyEnabledCapabilities")'
{
  "lastTransitionTime": "2024-04-24T00:08:52Z",
  "message": "Capabilities match configured spec",
  "reason": "AsExpected",
  "status": "False",
  "type": "ImplicitlyEnabledCapabilities"
}
# ./oc -n openshift-cluster-version get deployment -oyaml|grep Ingress
#
# ./oc get co
NAME                            VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
...
ingress                         4.15.10   True        False         False      14m     
...

build an image with the pr and trigger an upgrade against it. upgrade succeed with ingress still enabled as implicitly enabled cap, cvo deployment is updated to include always-enable-capabilities flag.

# ./oc get co ingress
NAME      VERSION                                                   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
ingress   4.16.0-0.ci.test-2024-04-24-002608-ci-ln-jgmx62k-latest   True        False         False      21m

# ./oc -n openshift-cluster-version get deployment -oyaml|grep Ingress
          - --always-enable-capabilities=Ingress
# ./oc get clusterversion version -ojson|jq .spec
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "openshift-samples",
      "CloudCredential"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.15",
  "clusterID": "403cbc5c-4eee-4530-a264-3c7820d1b9c2",
  "desiredUpdate": {
    "architecture": "",
    "force": true,
    "image": "registry.build03.ci.openshift.org/ci-ln-jgmx62k/release:latest",
    "version": ""
  }
}
# ./oc get clusterversion version -ojson| jq -r '.status.conditions[]|select(.type=="ImplicitlyEnabledCapabilities")'
{
  "lastTransitionTime": "2024-04-24T00:53:52Z",
  "message": "The following capabilities could not be disabled: CloudControllerManager, Ingress",
  "reason": "CapabilitiesImplicitlyEnabled",
  "status": "True",
  "type": "ImplicitlyEnabledCapabilities"
}

@jiajliu
Copy link

jiajliu commented Apr 24, 2024

pre-merge test pass.

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Apr 24, 2024
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Apr 24, 2024

@alebedev87: This pull request references NE-1318 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

Details

In response to this:

The recently added Ingress capability has to remain enabled at all times for standalone OpenShift installations. As a solution, a new flag (--always-enable-capabilities) has been introduced to allow the OpenShift engineers to set a list of the capabilities which have to be always enabled. This flag with Ingress capability enabled is added to the CVO's deployment manifest by default. This doesn't impact HyperShift cluster installations because the CVO is not managed via the manifest on HyperShift.
For more info, read Standalone OpenShift section of Make ingress operator optional on HyperShift EP.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD 5e73deb and 2 for PR HEAD ade334a in total

@alebedev87
Copy link
Contributor Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 24, 2024

@alebedev87: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 1acac06 into openshift:master Apr 24, 2024
@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-version-operator-container-v4.17.0-202404240511.p0.g1acac06.assembly.stream.el9 for distgit cluster-version-operator.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.