
Conversation

Contributor

@DavidHurta DavidHurta commented Apr 17, 2023

This pull request will add new flags and some minor logic to enable the CVO to evaluate conditional updates in HyperShift.

This pull request references https://issues.redhat.com/browse/OTA-854

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 17, 2023
Contributor

openshift-ci-robot commented Apr 17, 2023

@Davoska: This pull request references OTA-854 which is a valid jira issue.


In response to this:

This pull request references https://issues.redhat.com/browse/OTA-854

Text copied from the commit message explaining the pull request:

This commit will introduce a new flag to the CVO regarding its PromQL target for risk evaluation of conditional updates.

For the CVO to successfully access a service that provides metrics (in the case of the CVO it's the thanos-querier service), it needs three things.

It needs the service's address, a CA bundle to verify the certificate provided by the service to allow secure communication using TLS between the actors [1], and the authorization credentials of the CVO. Currently, the CVO hardcodes the address, the path to the CA bundle, and the path to the credentials file.

This is not ideal, as CVO is starting to be used in other repositories such as HyperShift [2]. This forces other developers to look into the depths of the CVO to find these paths and forces them to put the respective files in these hardcoded paths.

A flag for the path to the service CA bundle was added because HyperShift uses many CAs, and the location of the corresponding CA bundle files may vary.

A flag for the address of the service was not added because it is not needed for HyperShift to function properly at the moment. The CVO in standalone OpenShift accesses thanos-querier for metrics. More precisely, the CVO connects to the service called thanos-querier in the openshift-monitoring namespace. The same service is also present in HyperShift and is accessible to the CVOs in the hosted control planes.

A flag to specify the path to the credentials file was not added because the CVO hardcodes the same credentials file path that Kubernetes uses for service accounts in general [3].

[1] https://docs.openshift.com/container-platform/4.12/security/certificate_types_descriptions/service-ca-certificates.html
[2] https://github.com/openshift/hypershift
[3] https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#serviceaccount-admission-controller


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 17, 2023
@DavidHurta
Contributor Author

It's a simple change but I still need to test it with a CVO that has some conditional updates and evaluates them.

Member

@petr-muller petr-muller left a comment


LGTM with a nit

/hold
Holding so you can perform the testing you mentioned

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 18, 2023
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2023
@DavidHurta DavidHurta changed the title OTA-854: Add a new flag to specify the path to the service CA bundle [WIP] OTA-854: Add a new flag to specify the path to the service CA bundle Jun 12, 2023
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 12, 2023
@DavidHurta
Contributor Author

Making multiple changes. Converting back to a draft.

@DavidHurta DavidHurta marked this pull request as draft June 12, 2023 09:54
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2023
@DavidHurta DavidHurta force-pushed the ota-854-configurable-cvo-knobs-for-promql branch from 7c20218 to 8dfcbf8 Compare July 3, 2023 09:57
@DavidHurta DavidHurta force-pushed the ota-854-configurable-cvo-knobs-for-promql branch from 420cd9e to 8cc2f51 Compare July 10, 2023 15:56
@DavidHurta DavidHurta changed the title [WIP] OTA-854: Add a new flag to specify the path to the service CA bundle [WIP] OTA-854: Add risk evaluation of conditional updates in HyperShift Jul 10, 2023
Contributor

openshift-ci-robot commented Jul 11, 2023

@Davoska: This pull request references OTA-854 which is a valid jira issue.


In response to this:

This pull request will add new flags and some minor logic to enable the CVO to evaluate conditional updates in HyperShift.

This pull request references https://issues.redhat.com/browse/OTA-854


@DavidHurta DavidHurta force-pushed the ota-854-configurable-cvo-knobs-for-promql branch 2 times, most recently from 6e743f4 to 93a6e5d Compare July 12, 2023 12:34
@DavidHurta DavidHurta changed the title [WIP] OTA-854: Add risk evaluation of conditional updates in HyperShift OTA-854: Add risk evaluation of conditional updates in HyperShift Jul 12, 2023
Contributor Author

DavidHurta commented Jul 12, 2023

Steps I have taken to test this:

  1. Follow prerequisite steps on https://hypershift-docs.netlify.app/getting-started/. Some of my notes:
  • Have AWS credentials available. When developing I have used the openshift-dev account.

  • The pull secret file will depend on the image repositories being used.

  • The Route53 public zone step may be skipped as we will use the devcluster.openshift.com base domain.

  • Note the S3 bucket step. In my case, the envsubst command did not work as expected and the variable was not substituted. Make sure the final policy.json file has the correct format with the environment variable substituted.

  • I have used a locally built and pushed HyperShift image (by running the following command in the hypershift repository directory: RUNTIME=podman IMG=quay.io/dhurta/hypershift:latest make docker-build docker-push, with the OTA-855: Enable CVO to evaluate conditional updates on self-managed HyperShift deployed on OpenShift hypershift#2807 changes checked out locally) and a release image built by the Cluster Bot (by running build https://github.com/openshift/cluster-version-operator/pull/926). I am not sure whether running build https://github.com/openshift/cluster-version-operator/pull/926, https://github.com/openshift/hypershift/pull/2807 would work to provide a single image, as HyperShift is not part of the release image (I could not see any mentions of hypershift in the build logs).

  2. Prepare some environment variables appropriately, for example:
#BUCKET_NAME=dhurta-hypershift-test #BUCKET_NAME is already set as part of the prerequisite steps
REGION="us-east-1"
AWS_CREDS="$HOME/.aws/credentials"
PULL_SECRET="$HOME/.docker/config.json"
BASE_DOMAIN="devcluster.openshift.com"
HYPERSHIFT_IMAGE="quay.io/dhurta/hypershift:latest"
RELEASE_IMAGE="registry.build05.ci.openshift.org/ci-ln-bjxgnck/release:latest"
  3. Log in to an OpenShift cluster:
oc login <url> --username <username> --password <password>

To test the managed HyperShift:

  1. Follow steps on https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-57234 for Observability Operator installation

  2. Follow steps on https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-57236 to create MonitoringStack CR to collect HyperShift hosted-control-plane metrics. The secret and the spec.prometheusConfig.remoteWrite field can be omitted.

  3. Install the HyperShift operator with RHOBS monitoring enabled. In the following examples, I am running the commands using a local binary.

RHOBS_MONITORING="1" ./bin/hypershift install \
  --oidc-storage-provider-s3-bucket-name $BUCKET_NAME \
  --oidc-storage-provider-s3-credentials $AWS_CREDS \
  --oidc-storage-provider-s3-region $REGION \
  --enable-uwm-telemetry-remote-write \
  --platform-monitoring=OperatorOnly \
  --metrics-set=All \
  --hypershift-image "$HYPERSHIFT_IMAGE" \
  --rhobs-monitoring 
  4. Create a hosted cluster.
HOSTED_CLUSTER_NAME=dhurta-test-aws

./bin/hypershift create cluster aws  \
  --name="$HOSTED_CLUSTER_NAME" \
  --pull-secret=$PULL_SECRET \
  --node-pool-replicas=1 \
  --release-image="$RELEASE_IMAGE" \
  --aws-creds=$AWS_CREDS \
  --region=$REGION \
  --base-domain="$BASE_DOMAIN" \
  --control-plane-operator-image "$HYPERSHIFT_IMAGE"
  5. Wait for the hosted cluster to finish installation. The scraping manifests (such as ServiceMonitor) are applied after the hosted cluster has completed its installation. Wait for the PROGRESS to be COMPLETED. This may take a while.
oc get hostedcluster -n clusters --watch
  6. Scale the control-plane-operator (CPO) to zero. We will be modifying some resources that would otherwise be overwritten by the CPO. We are using oc annotate to tell the HyperShift operator to scale down the CPO and not reconcile it, neat!
oc annotate -n clusters hostedcluster "$HOSTED_CLUSTER_NAME" hypershift.openshift.io/debug-deployments="control-plane-operator" --overwrite
  7. Scale down the hosted-cluster-config-operator. The operator reconciles the hosted cluster version and would overwrite our following changes.
oc scale -n "clusters-$HOSTED_CLUSTER_NAME" deployments/hosted-cluster-config-operator --replicas=0
  8. Extract the admin kubeconfig file for the hosted cluster. Make sure to specify the file appropriately!
KUBECONFIG_HOSTED_CLUSTER=kubeconfig
./bin/hypershift create kubeconfig --name "$HOSTED_CLUSTER_NAME" > "$KUBECONFIG_HOSTED_CLUSTER"
  9. View the status of the hosted cluster (note the --kubeconfig flag).
oc adm upgrade --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
  10. Set a custom upstream. Note that the version used in the specified JSON file needs to match the hosted cluster's version. For example:
oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/upstream", "value": "https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json"}]' --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
  11. Set a custom channel for the hosted cluster to start fetching the updates.
oc adm upgrade channel test --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
  12. Wait for the evaluation of conditional updates (as of this moment, one PromQL query per ~10 minutes):
$ oc adm upgrade --include-not-recommended  --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
Cluster version is 4.14.0-0.ci.test-2023-07-12-150458-ci-ln-bjxgnck-latest

Upstream: https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json
Channel: test
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.

Supported but not recommended updates:

  Version: 4.12.23
  Image: quay.io/openshift-release-dev/ocp-release@sha256:3333333333333333333333333333333333333333333333333333333333333333
  Recommended: False
  Reason: Youngest
  Message: Risk to 4.12.23 - Hosted OpenShift clusters will explode! https://example.com/youngest

  Version: 4.12.22
  Image: quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111
  Recommended: Unknown
  Reason: EvaluationFailed
  Message: Exposure to Oldest is unknown due to an evaluation failure: client-side throttling: only 10.765µs has elapsed since the last match call completed for this cluster condition backend; this cached cluster condition request has been queued for later execution
  Risk to 4.12.22 - Non-Hosted OpenShift clusters will explode! https://example.com/oldest
  13. Wait for the next evaluation:
$  oc adm upgrade --include-not-recommended  --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
Cluster version is 4.14.0-0.ci.test-2023-07-12-150458-ci-ln-bjxgnck-latest

Upstream: https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json
Channel: test

Recommended updates:

  VERSION     IMAGE
  4.12.22     quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111

Supported but not recommended updates:

  Version: 4.12.23
  Image: quay.io/openshift-release-dev/ocp-release@sha256:3333333333333333333333333333333333333333333333333333333333333333
  Recommended: False
  Reason: Youngest
  Message: Risk to 4.12.23 - Hosted OpenShift clusters will explode! https://example.com/youngest

Clean up

  1. Scale up the CPO:
oc annotate -n clusters hostedcluster "$HOSTED_CLUSTER_NAME" hypershift.openshift.io/debug-deployments="" --overwrite
  2. Destroy the hosted cluster:
./bin/hypershift destroy cluster aws --name "$HOSTED_CLUSTER_NAME" --aws-creds $AWS_CREDS 
  3. Uninstall the HyperShift operator:
./bin/hypershift install render --format=yaml | oc delete -f -

Testing the self-managed HyperShift

  • Just like testing the managed HyperShift with a few modifications.

  • Omit the Observability Operator installation and the creation of the MonitoringStack.

  • Install the HyperShift operator without RHOBS monitoring enabled.

./bin/hypershift install \
  --oidc-storage-provider-s3-bucket-name $BUCKET_NAME \
  --oidc-storage-provider-s3-credentials $AWS_CREDS \
  --oidc-storage-provider-s3-region $REGION \
  --enable-uwm-telemetry-remote-write \
  --platform-monitoring=OperatorOnly \
  --metrics-set=All \
  --hypershift-image "$HYPERSHIFT_IMAGE"
  • Repeat the steps...

  • View not recommended updates

$  oc adm upgrade --include-not-recommended  --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
Cluster version is 4.14.0-0.ci.test-2023-07-12-150458-ci-ln-bjxgnck-latest

Upstream: https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json
Channel: test
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.

Supported but not recommended updates:

  Version: 4.12.23
  Image: quay.io/openshift-release-dev/ocp-release@sha256:3333333333333333333333333333333333333333333333333333333333333333
  Recommended: False
  Reason: Youngest
  Message: Risk to 4.12.23 - Hosted OpenShift clusters will explode! https://example.com/youngest

  Version: 4.12.22
  Image: quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111
  Recommended: Unknown
  Reason: EvaluationFailed
  Message: Exposure to Oldest is unknown due to an evaluation failure: client-side throttling: only 14.852µs has elapsed since the last match call completed for this cluster condition backend; this cached cluster condition request has been queued for later execution
  Risk to 4.12.22 - Non-Hosted OpenShift clusters will explode! https://example.com/oldest
  • Wait for the next evaluation:
$  oc adm upgrade --include-not-recommended  --kubeconfig="$KUBECONFIG_HOSTED_CLUSTER"
Cluster version is 4.14.0-0.ci.test-2023-07-12-150458-ci-ln-bjxgnck-latest

Upstream: https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json
Channel: test

Recommended updates:

  VERSION     IMAGE
  4.12.22     quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111

Supported but not recommended updates:

  Version: 4.12.23
  Image: quay.io/openshift-release-dev/ocp-release@sha256:3333333333333333333333333333333333333333333333333333333333333333
  Recommended: False
  Reason: Youngest
  Message: Risk to 4.12.23 - Hosted OpenShift clusters will explode! https://example.com/youngest
  • Repeat the cleanup steps

@DavidHurta DavidHurta marked this pull request as ready for review July 12, 2023 22:17
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 12, 2023
@openshift-ci openshift-ci bot requested a review from wking July 12, 2023 22:18
pkg/cvo/cvo.go Outdated
requiredFeatureSet: requiredFeatureSet,
clusterProfile: clusterProfile,
conditionRegistry: standard.NewConditionRegistry(kubeClient),
conditionRegistry: standard.NewConditionRegistry(kubeClientMgmtCluster, promqlTarget),
Member


I think the naming would be a little clearer if we avoided the hypershift lingo and called kubeClientMgmtCluster something that indicates "cluster that we use to evaluate upgrade conditions" instead, because at this place it is not necessarily a "hypershift management" cluster client.

Member


Maybe this kubeclient should be part of promqlTarget... They are only used together.
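For illustration, a minimal sketch of what bundling the client into the PromQL target could look like, assuming a hypothetical PromQLTarget struct; the field names are illustrative, not the CVO's actual types:

package clusterconditions

import "k8s.io/client-go/kubernetes"

// PromQLTarget is a hypothetical bundle of everything the PromQL cluster
// condition needs in order to reach the metrics service: the query endpoint
// details and the client used to resolve and authenticate against it.
type PromQLTarget struct {
	// KubeClient talks to the cluster used to evaluate upgrade conditions
	// (in HyperShift this happens to be the management cluster).
	KubeClient kubernetes.Interface

	// QueryNamespace and QueryService identify the in-cluster Service that
	// exposes the Prometheus-compatible query API.
	QueryNamespace string
	QueryService   string
}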

Member


We should change the name of the variable from conditionRegistry to remoteMetricserver or just metricServer

Contributor Author


We should change the name of the variable from conditionRegistry to remoteMetricserver or just metricServer

That depends on the level of abstraction being used. The whole package clusterconditions uses the word condition to convey the meaning. There are conditionRegistries where you can Register a conditionType. You can Match a clusterCondition against a conditionRegistry.

We could change the name of the Operator struct's field conditionRegistry to metricServer.

But using metricServer as the name of a ConditionRegistry variable would imply that we register something at the metric server or prune something at the server, IMO.

type ConditionRegistry interface {
	// Register registers a condition type, and panics on any name collisions.
	Register(conditionType string, condition Condition)

	// PruneInvalid returns a new slice with recognized, valid conditions.
	// The error complains about any unrecognized or invalid conditions.
	PruneInvalid(ctx context.Context, matchingRules []configv1.ClusterCondition) ([]configv1.ClusterCondition, error)

	// Match returns whether the cluster matches the given rules (true),
	// does not match (false), or the rules fail to evaluate (error).
	Match(ctx context.Context, matchingRules []configv1.ClusterCondition) (bool, error)
}
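For context, a hedged sketch of how a caller might use such a registry; the helper name and the logging are illustrative, not the CVO's actual call sites:

import (
	"context"

	configv1 "github.com/openshift/api/config/v1"
	"k8s.io/klog/v2"
)

// riskApplies is a hypothetical helper showing how a caller might consult a
// ConditionRegistry when deciding whether a conditional update's risk applies
// to this cluster.
func riskApplies(ctx context.Context, registry ConditionRegistry, rules []configv1.ClusterCondition) (bool, error) {
	// Drop unrecognized or invalid rules; the error only describes what was pruned.
	valid, err := registry.PruneInvalid(ctx, rules)
	if err != nil {
		klog.V(2).Infof("some cluster conditions were pruned: %v", err)
	}
	// Match reports whether this cluster matches the remaining rules.
	return registry.Match(ctx, valid)
}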

upstream = ""
}

if optr.isHCPModeEnabled {
Member


we may want to test for misconfigured empty clusterID?

Also, are the hosted cluster admins able to spoof their cluster's ClusterID and read other hosted clusters' metrics this way? If yes, then that seems to be somewhat security-sensitive...

Contributor Author


we may want to test for misconfigured empty clusterID?

Is the question regarding whether the substitution of _id works for an empty clusterId? Or is it to see whether something breaks, and the potential side effects?

Also, are the hosted cluster admins able to spoof their cluster's ClusterID and read other hosted clusters' metrics this way? If yes, then that seems to be somewhat security-sensitive...

Good question. The clusterId and similar things in the cluster version are being reconciled from the HCP (hosted control plane) and should be overwritten.

https://github.com/openshift/hypershift/blob/main/control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go#L956-L960

func (r *reconciler) reconcileClusterVersion(ctx context.Context, hcp *hyperv1.HostedControlPlane) error {
	clusterVersion := &configv1.ClusterVersion{ObjectMeta: metav1.ObjectMeta{Name: "version"}}
	if _, err := r.CreateOrUpdate(ctx, r.client, clusterVersion, func() error {
		clusterVersion.Spec.ClusterID = configv1.ClusterID(hcp.Spec.ClusterID)
		clusterVersion.Spec.Capabilities = nil
		clusterVersion.Spec.Upstream = ""
		clusterVersion.Spec.Channel = hcp.Spec.Channel
		clusterVersion.Spec.DesiredUpdate = nil
		return nil
	}); err != nil {
		return fmt.Errorf("failed to reconcile clusterVersion: %w", err)
	}

	return nil
}

But I am not sure which roles are bound to the hosted cluster admins at the moment. The admin-kubeconfig secret that is available in the HCP does provide the capability to change the cluster version. Although I need to check whether this kind of permission is bound to a normal hosted cluster admin...

Contributor Author


For example, when I deploy a hosted cluster using a Cluster Bot running rosa create 4.12.22 6h, I am not able to modify the cluster's cluster version. But that is a managed cluster...

oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/upstream", "value": "https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data.json"}]'
Error from server (Prevented from accessing Red Hat managed resources. This is in an effort to prevent harmful actions that may cause unintended consequences or affect the stability of the cluster. If you have any questions about this, please reach out to Red Hat support at https://access.redhat.com/support): admission webhook "regular-user-validation.managed.openshift.io" denied the request: Prevented from accessing Red Hat managed resources. This is in an effort to prevent harmful actions that may cause unintended consequences or affect the stability of the cluster. If you have any questions about this, please reach out to Red Hat support at https://access.redhat.com/support

Member


The question I think Petr was asking is: if the clusterID is empty for some reason (because of a bug in the code), do you feel confident that it is handled properly, with the right information in the log?
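As a purely illustrative aside, the _id substitution being discussed could conceptually look like the sketch below. This is an assumption about the mechanism, not the actual CVO code, and it includes a hypothetical guard for the empty-clusterID case raised above:

package main

import (
	"fmt"
	"strings"
)

// injectClusterID is a hypothetical sketch: it scopes a PromQL query to one
// hosted cluster by filling an empty _id label matcher with the cluster's ID.
// The real CVO behaviour may differ; this only illustrates the idea.
func injectClusterID(query, clusterID string) (string, error) {
	if clusterID == "" {
		// An empty ID would leave the matcher empty and effectively match
		// every hosted cluster's series, so refuse to evaluate instead.
		return "", fmt.Errorf("cluster ID is empty; refusing to evaluate PromQL query")
	}
	return strings.ReplaceAll(query, `_id=""`, fmt.Sprintf(`_id=%q`, clusterID)), nil
}

func main() {
	q, err := injectClusterID(`group(cluster_operator_conditions{_id=""})`, "d0fd7054-example")
	if err != nil {
		panic(err)
	}
	fmt.Println(q) // group(cluster_operator_conditions{_id="d0fd7054-example"})
}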

}
scheme := "https"
if p.QueryNamespace == "openshift-observability-operator" && p.QueryService == "hypershift-monitoring-stack-prometheus" {
	scheme = "http"
}
Member


🧐 why don't we have TLS in hypershift?

Contributor Author


Yes, I am not sure. The monitoring stack deployed in the managed OpenShift exposes the Prometheus server's port that serves the HTTP API via the service hypershift-monitoring-stack-prometheus. But it seems it's not configured to use the https scheme.

Contributor Author


Either I am using a wrong configuration, or the service is not expected to be queried. Investigating...

Was initially going from the comment:

Follow steps on https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-57234 for Observability Operator installation

Follow steps on https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-57236 to create MonitoringStack CR to collect HyperShift hosted-control-plane metrics. The secret and the spec.prometheusConfig.remoteWrite field can be omitted.

Member


HTTP looks odd. @wking Do you know if we have SSL/TLS auth for the monitoring stack?

Contributor Author


This was discussed a little bit in https://redhat-internal.slack.com/archives/C0VMT03S5/p1689696199941319

...we only create a http service...

This code itself was removed. The user now configures the scheme via the --metrics-url flag. Using HTTPS in managed HyperShift would need to be discussed with the appropriate folks.
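To illustrate why the scheme matters for credentials, here is a hedged sketch of deciding whether to attach a bearer token based on the configured metrics URL; the helper and the example ports are illustrative assumptions, not the CVO's actual logic:

package main

import (
	"fmt"
	"net/url"
)

// sendToken is a hypothetical helper: only attach the service-account bearer
// token when the metrics URL uses HTTPS, so the token is never sent in clear
// text. The actual CVO logic may differ.
func sendToken(metricsURL string) (bool, error) {
	u, err := url.Parse(metricsURL)
	if err != nil {
		return false, fmt.Errorf("invalid metrics URL %q: %w", metricsURL, err)
	}
	return u.Scheme == "https", nil
}

func main() {
	// The service hosts come from this thread; the ports are illustrative.
	for _, raw := range []string{
		"https://thanos-querier.openshift-monitoring.svc:9091",
		"http://hypershift-monitoring-stack-prometheus.openshift-observability-operator.svc:9090",
	} {
		ok, err := sendToken(raw)
		fmt.Println(raw, "-> send token:", ok, "err:", err)
	}
}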

@DavidHurta
Contributor Author

/hold
Modifying after the comments from Petr.

@DavidHurta
Contributor Author

I have tried to address @petr-muller's comments, resulting in the bac4941 commit. I haven't tested the new changes so far, as they require building a release image and modifying code in the HyperShift repository, so I have marked the commit as WIP for now.

@DavidHurta
Contributor Author

/hold
Addressing feedback and working on new changes due to the feedback on the HyperShift pull request.

@DavidHurta DavidHurta force-pushed the ota-854-configurable-cvo-knobs-for-promql branch 7 times, most recently from e3fbb24 to 2d62d75 Compare August 2, 2023 19:33
Member

@petr-muller petr-muller left a comment


This is shaping up really nicely! Some comments inline.

This commit introduces new flags and logic to the CVO regarding its
PromQL target for risk evaluation of conditional updates, designed with
a CVO running in a hosted control plane in mind.

For the CVO to successfully access a service that provides metrics
(in the case of the CVO in a standalone OpenShift cluster, the
thanos-querier service), it needs three things: the service's address,
a CA bundle to verify the certificate provided by the service to allow
secure communication using TLS between the actors [1], and the
authorization credentials of the CVO. Currently, the CVO hardcodes the
address, the path to the CA bundle, and the path to the credentials
file.

This is not ideal, as CVO is starting to be used in other repositories
such as HyperShift [2].

A flag for the path to the service CA bundle file is added to allow
explicitly setting the CA bundle of the given query service.

Currently, the CVO uses the kube-apiserver to resolve the IP address
of a specific service in the cluster [3]. Add new flags that allow
configuring the CVO to resolve the IP address via DNS when DNS is
available to the CVO. This is the case for hosted CVOs in HyperShift.
The alternative in HyperShift would be to use the management cluster
kube-apiserver and give the hosted CVOs additional permissions.

Add flags to specify the PromQL target URL. This URL contains the
scheme, the port, and the server name used for TLS configuration. When
DNS resolution is enabled, the URL is also used to query the Prometheus
server. When it is disabled, the server can be specified via the
Kubernetes service in the cluster that exposes the server.

A flag to specify the path to the credentials file was added for more
customizability. This flag also enables the CVO to omit the token when
it is not needed. A CVO can communicate with a Prometheus server over
HTTP; in that case, the token is not needed, and it would be
undesirable for the CVO to send its token over HTTP without reason.

This commit also adds a new flag to specify whether the CVO resides in
a hosted control plane. In this case, the CVO will inject its cluster
ID into PromQL queries to differentiate between multiple time series
belonging to different hosted clusters [4].

	[1] https://docs.openshift.com/container-platform/4.12/security/certificate_types_descriptions/service-ca-certificates.html
	[2] https://github.com/openshift/hypershift
	[3] openshift#920
	[4] openshift/cincinnati-graph-data#3591
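As a rough illustration of the knobs this commit message describes, here is a hedged sketch using Go's standard flag package; apart from --metrics-url, which is mentioned elsewhere in this thread, the flag names and defaults are illustrative assumptions and do not claim to match the CVO's actual flag set:

package main

import (
	"flag"
	"fmt"
)

// Hypothetical flag set mirroring the knobs described above; names and
// defaults are illustrative only, consult the CVO's own help output for the
// real flags.
var (
	metricsURL    = flag.String("metrics-url", "", "URL of the PromQL query endpoint (scheme, host, port)")
	metricsCAFile = flag.String("metrics-ca-bundle-file", "", "path to the CA bundle used to verify the metrics service certificate")
	tokenFile     = flag.String("metrics-token-file", "", "path to the bearer token file; empty means no token is sent")
	useDNS        = flag.Bool("use-dns", false, "resolve the metrics service via DNS instead of the kube-apiserver")
	hcpMode       = flag.Bool("hypershift", false, "run as a hosted control plane CVO and inject the cluster ID into PromQL queries")
)

func main() {
	flag.Parse()
	fmt.Printf("metrics-url=%q ca=%q token=%q dns=%v hcp=%v\n",
		*metricsURL, *metricsCAFile, *tokenFile, *useDNS, *hcpMode)
}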
@DavidHurta DavidHurta force-pushed the ota-854-configurable-cvo-knobs-for-promql branch from 2d62d75 to b1e69af Compare August 7, 2023 13:56
@DavidHurta
Contributor Author

The new commit utilizes DNS in HyperShift and adds more configurability to the CVO.

I still need to verify that no regression in this code has happened for managed HyperShift.

Member

@petr-muller petr-muller left a comment


/hold

LGTM. I am setting a hold to allow reviews by people more knowledgeable about the current state of the HyperShift effort. I think the generalization added in this PR is not that costly in terms of code complexity and should be safe to merge even if we do not have all the details on the HyperShift side sorted out.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 8, 2023
Contributor

openshift-ci bot commented Aug 8, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Davoska, petr-muller


Needs approval from an approver in each of these files:
  • OWNERS [Davoska,petr-muller]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor Author

DavidHurta commented Aug 9, 2023

Oh, I would like to test this against a standalone OCP cluster. I want to make sure there is no regression. I don't think we have a test for the evaluation of conditional updates, and I can't remember whether I have tested these new changes this way. Although the PR just propagates flags.

@petr-muller
Member

I don't think we have a test for the evaluation of conditional updates

Can we invent one?

Contributor Author

DavidHurta commented Aug 15, 2023

Can we invent one?

Since we are planning to have an e2e test for the evaluation of conditional updates for HyperShift (https://issues.redhat.com/browse/OTA-986), we should at least discuss creating one for standalone OCP; a rough sketch of such a check follows after this comment. The code is pretty stable, but the frequency of changes is increasing a little and testing this manually takes a bit of time.


I have tested this PR against a standalone OCP cluster and no regression seems to be present.

The PromQL queries got evaluated

[dhurta@fedora ~]$ oc adm upgrade --include-not-recommended
Cluster version is 4.14.0-0.ci.test-2023-08-15-130559-ci-ln-35lnhkt-latest

Upstream: https://raw.githubusercontent.com/Davoska/cincinnati-graph-data/test-promql/test/cincinnati-graph-data-new.json
Channel: test

Recommended updates:

  VERSION     IMAGE
  4.12.23     quay.io/openshift-release-dev/ocp-release@sha256:3333333333333333333333333333333333333333333333333333333333333333

Supported but not recommended updates:

  Version: 4.12.22
  Image: quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111
  Recommended: False
  Reason: Oldest
  Message: Risk to 4.12.22 - Non-Hosted OpenShift clusters will explode! https://example.com/oldest

@wking, @LalatenduMohanty, feel free to have a quick look after the new changes. I'll wait a little bit and then I'll unhold the PR.
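For what it's worth, a rough sketch of what an automated check along these lines could look like, driving oc the same way the manual steps above do; the binary name, timeout, and string matching are illustrative assumptions:

package main

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// waitForConditionalUpdates is a hypothetical check: poll
// `oc adm upgrade --include-not-recommended` until a "Supported but not
// recommended updates" section appears, mirroring the manual steps above.
func waitForConditionalUpdates(kubeconfig string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		out, err := exec.Command("oc", "adm", "upgrade", "--include-not-recommended",
			"--kubeconfig="+kubeconfig).CombinedOutput()
		if err == nil && strings.Contains(string(out), "Supported but not recommended updates:") {
			return nil
		}
		time.Sleep(30 * time.Second)
	}
	return fmt.Errorf("conditional updates were not evaluated within %s", timeout)
}

func main() {
	if err := waitForConditionalUpdates("kubeconfig", 20*time.Minute); err != nil {
		panic(err)
	}
	fmt.Println("conditional updates evaluated")
}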

@petr-muller
Member

We may want to unhold this only after the branching though, no reason to increase the overall risk before that

Member

petr-muller commented Aug 22, 2023

/retitle OTA-854: Add configurable CVO knobs for risk-evaluation PromQL target

@openshift-ci openshift-ci bot changed the title OTA-854: Add risk evaluation of conditional updates in HyperShift OTA-854: Add configurable CVO knobs for risk-evaluation PromQL target Aug 22, 2023
@petr-muller
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 19, 2023
@petr-muller
Member

The merge gate was lifted; we can now merge this one.

@petr-muller
Member

/test e2e-agnostic-ovn-upgrade-into-change

Contributor

openshift-ci bot commented Sep 20, 2023

@Davoska: all tests passed!



