@petr-muller petr-muller commented Jun 27, 2023

This PR is a part of the same effort as #40703.

OTA maintains staging (identical to production) and integration (running on engineering candidate OCP clusters) Cincinnati instances. We are looking for traffic that we could route to them, especially to the integration instance, so that we can find possible problems with engineering candidates early.

All the instances serve identical data (up to some minimal skew coming from when each instance scrapes its source data), so we should be able to easily use the integration instance in CI clusters that are running a released OCP version. Most CI clusters are not running such versions, and these must not query OSUS; otherwise they would trip an alert, causing noise in CI jobs. CI clusters have not been querying OSUS since #8631.

We can enhance the logic to validate whether the version we are going to install is known to OSUS. If it is, we know we are installing a published OCP version and it is safe to let the cluster query OSUS. We still clear the channel to prevent the cluster from querying OSUS if the version we install is not known to OSUS.

This changes ipi-install-install, so it has a huge blast radius, with these three potential failure cases:

  1. A non-zero exit code from my bash will cause steps to fail. I ran a ton of rehearsals.
  2. Integration Cincinnati breaks and clusters start to alert about failing to query OSUS. That's the point of this change: integration OSUS runs on an Engineering Candidate OCP cluster, and if ECs break real-world workloads, we want to know that.
  3. Some clusters will query for updates when they should not. There can be corner cases that I missed, like some rogue update jobs that do not use ipi-install-install-stableinitial steps. I plan to monitor search.ci for an uptick of jobs where the CannotRetrieveUpdates alert fires.
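
Failure case 1 can be contained by making the check fail safe. A minimal sketch, assuming a hypothetical `osus_knows_version` helper (not the actual step code) that returns 0 only when OSUS lists the version: any failure of the lookup is treated the same as "unknown", so the step falls back to the existing behavior of clearing the channel instead of failing.

```shell
#!/bin/bash
set -o errexit -o nounset

# Hypothetical stand-in for the real OSUS lookup; here it only "knows" 4.10.62.
osus_knows_version() {
  [ "$1" = "4.10.62" ]
}

# Prints "keep" when the cluster may query OSUS, "clear" otherwise.
# A command tested by `if` does not trigger errexit, so a failing lookup
# selects the safe branch instead of aborting the step.
channel_decision() {
  local version="$1"
  if osus_knows_version "${version}"; then
    echo "keep"
  else
    echo "clear"
  fi
}

channel_decision "4.10.62"
# Made-up unreleased version string, for illustration only.
channel_decision "4.14.0-0.nightly-example"
```

The design choice here is that the unknown and error paths are the same path, so a broken lookup costs coverage of the integration instance rather than a failed install step.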

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 27, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jun 27, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-operator

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from b37c898 to 3056b83 Compare June 27, 2023 16:51
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-operator

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from 3056b83 to e8ef6f9 Compare June 27, 2023 17:04
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-operator

@petr-muller
Member Author

/test all

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from e8ef6f9 to 77f0c0a Compare June 27, 2023 17:19
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-operator

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from 77f0c0a to b6719a0 Compare June 27, 2023 17:34
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-operator

@petr-muller
Member Author

This is what I wanted to see (from https://console-openshift-console.apps.build05.l9oh.p1.openshiftapps.com/k8s/ns/ci-op-zvddhzch/pods/maistra-envoy-unit-2-4-ipi-install-install/logs)

Release payload version: 4.10.62
Original channel from CVO manifest: stable-4.10
Matching candidate channel: candidate-4.10
Version 4.10.62 is available in candidate-4.10: cluster can query OSUS
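
The decision logged above can be sketched in a few lines of shell. This is an illustration, not the code from this PR: the function names are hypothetical, and the sample JSON only assumes the `nodes[].version` layout of a Cincinnati graph response. A real step would fetch the graph for the candidate channel from OSUS instead of using an inline string.

```shell
#!/bin/bash

# Map any channel such as stable-4.10 or fast-4.10 to candidate-4.10.
candidate_channel_for() {
  local channel="$1"
  echo "candidate-${channel#*-}"
}

# Succeed if the graph JSON on stdin has a node with exactly this version.
# Whitespace is stripped first so the match does not depend on formatting.
version_in_graph() {
  local version="$1"
  tr -d '[:space:]' | grep -qF "\"version\":\"${version}\""
}

# Trimmed-down sample of a graph response (assumed shape, made-up payloads).
graph='{"nodes": [{"version": "4.10.61", "payload": "example/a"},
                  {"version": "4.10.62", "payload": "example/b"}],
        "edges": [[0, 1]]}'

version="4.10.62"
channel="$(candidate_channel_for "stable-4.10")"
if echo "${graph}" | version_in_graph "${version}"; then
  echo "Version ${version} is available in ${channel}: cluster can query OSUS"
else
  echo "Version ${version} is not available in ${channel}: clearing channel"
fi
```

Matching on the quoted string keeps `4.10.6` from matching `4.10.62`; a JSON-aware tool would be more robust if one is available in the step image.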

Member

I dunno why folks have copy/pasted this around so much, but:

$ git --no-pager grep cvo-overrides.yaml ci-operator/step-registry/
ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh:sed -i '/^  channel:/d' "${dir}/manifests/cvo-overrides.yaml"
ci-operator/step-registry/ipi/install/install/ipi-install-install-commands.sh:sed -i '/^  channel:/d' "${dir}/manifests/cvo-overrides.yaml"
ci-operator/step-registry/ipi/install/libvirt/install/ipi-install-libvirt-install-commands.sh:sed -i '/^  channel:/d' ${dir}/manifests/cvo-overrides.yaml
ci-operator/step-registry/ipi/install/powervs/install/ipi-install-powervs-install-commands.sh:sed -i '/^  channel:/d' "${dir}/manifests/cvo-overrides.yaml"
ci-operator/step-registry/upi/conf/vsphere/platform-external/upi-conf-vsphere-platform-external-commands.sh:sed -i '/^  channel:/d' "manifests/cvo-overrides.yaml"
ci-operator/step-registry/upi/conf/vsphere/upi-conf-vsphere-commands.sh:sed -i '/^  channel:/d' "manifests/cvo-overrides.yaml"
ci-operator/step-registry/upi/conf/vsphere/zones/upi-conf-vsphere-zones-commands.sh:sed -i '/^  channel:/d' "manifests/cvo-overrides.yaml"
ci-operator/step-registry/upi/install/aws/cluster/upi-install-aws-cluster-commands.sh:sed -i '/^  channel:/d' ${ARTIFACT_DIR}/installer/manifests/cvo-overrides.yaml
ci-operator/step-registry/upi/install/azure/upi-install-azure-commands.sh:sed -i '/^  channel:/d' manifests/cvo-overrides.yaml

I'm completely fine leaving those other steps alone, and letting their maintainers continue to copy/paste or find some new way to DRY things up, but wanted to point out the issue in case other folks wanted to do something more proactive.
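
One way to DRY this up would be a shared helper that every install step calls. A minimal sketch, assuming a hypothetical function name `clear_cvo_channel` and a made-up sample manifest; it wraps the same `sed` expression the steps above already use (GNU `sed -i`, as in those scripts):

```shell
#!/bin/bash

# Delete the two-space-indented "channel:" line from a cvo-overrides.yaml,
# so the installed cluster has no channel and does not query OSUS.
clear_cvo_channel() {
  local manifest="$1"
  sed -i '/^  channel:/d' "${manifest}"
}

# Demo on a throwaway manifest (contents are illustrative, not a real one).
demo="$(mktemp)"
cat >"${demo}" <<'EOF'
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
spec:
  channel: stable-4.10
  clusterID: 00000000-0000-0000-0000-000000000000
EOF

clear_cvo_channel "${demo}"
cat "${demo}"
```

Quoting the manifest path inside the helper would also cover the unquoted `${dir}` and `${ARTIFACT_DIR}` variants in the list above.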

Member Author

I think I will try to update them all, but I will do that in separate PRs; otherwise the rehearsals will be pretty hard to track.

@petr-muller
Member Author

/test build04-dry

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from b6719a0 to 5b5ca77 Compare July 20, 2023 18:13
@petr-muller petr-muller changed the title from "ipi install: use integration cincinnati in MORE CI clusters" to "ipi install: use integration cincinnati in CI clusters, where possible" Jul 20, 2023
@petr-muller petr-muller marked this pull request as ready for review July 20, 2023 18:14
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 20, 2023
@openshift-ci openshift-ci bot requested review from dgoodwin and neisw July 20, 2023 18:15
@petr-muller
Member Author

I cleaned up the change, added a proper commit message and PR description, and I want to see some rehearsals.

/hold
I still need to address Trevor's comment about querying OSUS with the arch parameter (#40711 (comment))
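
For the arch question, a rough sketch of how a graph query could carry the architecture. The `channel` and `arch` query parameters are part of the Cincinnati graph API; everything else here is an assumption: the host is a placeholder (the integration instance URL is not given in this thread), the path is the one used by the public instance and may differ elsewhere, and the function names are hypothetical.

```shell
#!/bin/bash

# Translate `uname -m` style names to the names used in graph queries;
# anything else (s390x, ppc64le, multi, ...) is passed through unchanged.
graph_arch() {
  case "$1" in
    x86_64)  echo "amd64" ;;
    aarch64) echo "arm64" ;;
    *)       echo "$1" ;;
  esac
}

# Build a graph URL for a host, channel and architecture.
# The path is an assumption copied from the public instance.
graph_url() {
  local host="$1"
  local channel="$2"
  local arch="$3"
  echo "https://${host}/api/upgrades_info/v1/graph?channel=${channel}&arch=$(graph_arch "${arch}")"
}

# Placeholder host, for illustration only.
graph_url "osus.example.com" "candidate-4.14" "aarch64"
```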

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 20, 2023
@petr-muller
Member Author

petr-muller commented Jul 20, 2023

A total of 11049 jobs have been affected by this change.

I fear nothing!

/pj-rehearse

@openshift-ci-robot
Contributor

@petr-muller, pj-rehearse: failed to create rehearsal jobs ERROR:

failed to ensure imagestreamtags in cluster build01: failed waiting for imagestreamtag openshift/knative-v0.17.1:knative-eventing-contrib-src to appear: timed out waiting for the condition

If the problem persists, please contact Test Platform.

@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from b12d197 to 232f7dc Compare August 9, 2023 16:54
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 periodic-ci-dora-metrics-pelorus-master-4.13-e2e-openshift-test-scenario-1-periodic pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-e2e pull-ci-3scale-3scale-operator-master-test-e2e pull-ci-openshift-coredns-master-e2e-aws-ovn pull-ci-medik8s-machine-deletion-remediation-main-4.14-openshift-e2e periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-aws-ovn-upgrade periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-upgrade periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-ibmcloud-ovn-multi-s390x periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-aws-sdn-arm64

This PR is a part of the same effort as openshift#40703.

OTA maintains staging (identical to production) and integration (running
on engineering candidate OCP clusters) Cincinnati instances. We are
looking for traffic that we could route to them, especially to the
integration instance, so that we can find possible problems with
engineering candidates early.

All the instances serve identical data (up to some minimal skew coming
from when each instance scrapes its source data), so we should be able
to easily use the integration instance in CI clusters that are running
a released OCP version. Most CI clusters are not running such versions,
and these must *not* query OSUS; otherwise they would trip an alert,
causing noise in CI jobs. CI clusters have not been querying OSUS since
openshift#8631.

We can enhance the logic to validate whether the version we are going
to install is known to OSUS. If it is, we know we are installing a
published OCP version and it is safe to let the cluster query OSUS. We
still clear the channel to prevent the cluster from querying OSUS if
the version we install is not known to OSUS, or if this job involves
upgrading the cluster to another version.
@petr-muller petr-muller force-pushed the use-integration-cincinnati-in-more-ci-clusters branch from 232f7dc to 157b4b1 Compare August 10, 2023 14:07
@petr-muller
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 10, 2023
@petr-muller
Member Author

/pj-rehearse pull-ci-maistra-envoy-maistra-2.4-maistra-envoy-unit-2-4 periodic-ci-dora-metrics-pelorus-master-4.13-e2e-openshift-test-scenario-1-periodic pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-e2e pull-ci-3scale-3scale-operator-master-test-e2e pull-ci-openshift-coredns-master-e2e-aws-ovn pull-ci-medik8s-machine-deletion-remediation-main-4.14-openshift-e2e periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-aws-ovn-upgrade periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-upgrade periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-ibmcloud-ovn-multi-s390x periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-aws-sdn-arm64

@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@petr-muller: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-lvm-operator-release-4.13-lvm-operator-bundle-e2e-aws | openshift/lvm-operator | presubmit | Registry content changed |
| pull-ci-openshift-lvm-operator-release-4.12-lvm-operator-bundle-e2e-aws | openshift/lvm-operator | presubmit | Registry content changed |
| pull-ci-openshift-lvm-operator-release-4.11-lvm-operator-bundle-e2e-aws | openshift/lvm-operator | presubmit | Registry content changed |
| pull-ci-openshift-lvm-operator-main-lvm-operator-e2e-aws | openshift/lvm-operator | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-main-e2e-aws | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.15-e2e-aws | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.14-e2e-aws | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.13-e2e-aws | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.12-e2e-aws | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.12-e2e-aws-serial | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-main-e2e-ibmcloud | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.15-e2e-ibmcloud | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.14-e2e-ibmcloud | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.13-e2e-ibmcloud | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-machine-api-provider-ibmcloud-release-4.12-e2e-ibmcloud | openshift/machine-api-provider-ibmcloud | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-master-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.15-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.14-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.13-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.12-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.11-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.10-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.9-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.8-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.7-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.6-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.5-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.4-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.3-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.2-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.1-e2e-aws-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-master-e2e-azure-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.15-e2e-azure-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.14-e2e-azure-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |
| pull-ci-openshift-kubernetes-autoscaler-release-4.13-e2e-azure-operator | openshift/kubernetes-autoscaler | presubmit | Registry content changed |

A total of 12772 jobs have been affected by this change. The above listing is non-exhaustive and limited to 35 jobs.

A full list of affected jobs can be found here

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 10 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 20 rehearsals
Comment: /pj-rehearse max to run up to 35 rehearsals
Comment: /pj-rehearse auto-ack to run up to 10 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse abort to abort all active rehearsals

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Aug 10, 2023

@petr-muller: This pull request references OTA-905 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Member

@wking wking left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 10, 2023
@dgoodwin
Contributor

/approve

@openshift-ci
Contributor

openshift-ci bot commented Aug 10, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, petr-muller, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 10, 2023
@openshift-ci
Contributor

openshift-ci bot commented Aug 10, 2023

@petr-muller: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/stolostron/rbac-query-proxy/release-2.3/test-e2e | 5b5ca776e9d93f77b3cf7a96480c902d87a8b8e1 | link | unknown | /pj-rehearse pull-ci-stolostron-rbac-query-proxy-release-2.3-test-e2e |
| ci/rehearse/openshift/coredns/release-4.15/e2e-aws-ovn | 5b5ca776e9d93f77b3cf7a96480c902d87a8b8e1 | link | unknown | /pj-rehearse pull-ci-openshift-coredns-release-4.15-e2e-aws-ovn |
| ci/rehearse/periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-ibmcloud-ovn-multi-s390x | 157b4b1 | link | unknown | /pj-rehearse periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-ibmcloud-ovn-multi-s390x |
| ci/rehearse/medik8s/machine-deletion-remediation/main/4.14-openshift-e2e | 157b4b1 | link | unknown | /pj-rehearse pull-ci-medik8s-machine-deletion-remediation-main-4.14-openshift-e2e |
| ci/rehearse/periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-aws-ovn-upgrade | 157b4b1 | link | unknown | /pj-rehearse periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-aws-ovn-upgrade |
| ci/rehearse/opendatahub-io/kubeflow/master/odh-notebook-controller-e2e | 157b4b1 | link | unknown | /pj-rehearse pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-e2e |

Full PR test history. Your PR dashboard.

@petr-muller
Member Author

/pj-rehearse periodic-ci-openshift-multiarch-master-nightly-4.14-ocp-e2e-ibmcloud-ovn-multi-s390x

@petr-muller
Member Author

/pj-rehearse ack

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Aug 10, 2023
@openshift-merge-robot openshift-merge-robot merged commit c05d849 into openshift:master Aug 10, 2023
petr-muller added a commit to petr-muller/release that referenced this pull request Aug 14, 2023
…s, where possible

This PR is a part of the same effort as openshift#40703 and implements the same change as openshift#40711. This change is a separate step to avoid mixing up rehearsals.

OTA maintains staging (identical to production) and integration (running on engineering candidate OCP clusters) Cincinnati instances. We are looking for traffic that we could route to them, especially to the integration instance, so that we can find possible problems with engineering candidates early.

All the instances serve identical data (up to some minimal skew coming from when each instance scrapes its source data), so we should be able to easily use the integration instance in CI clusters that are running a released OCP version. Most CI clusters are not running such versions, and these must not query OSUS; otherwise they would trip an alert, causing noise in CI jobs. CI clusters have not been querying OSUS since openshift#8631.

We can enhance the logic to validate whether the version we are going to install is known to OSUS. If it is, we know we are installing a published OCP version and it is safe to let the cluster query OSUS. We still clear the channel to prevent the cluster from querying OSUS if the version we install is not known to OSUS.
petr-muller added a commit to petr-muller/release that referenced this pull request Aug 17, 2023
…sible

@petr-muller petr-muller deleted the use-integration-cincinnati-in-more-ci-clusters branch August 23, 2023 14:35
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
rehearsals-ack: Signifies that rehearsal jobs have been acknowledged
