@openshift-bot
Contributor

Please merge as soon as https://errata.devel.redhat.com/advisory/56099 is shipped live OR if a Cincinnati-first release is approved.

This should provide adequate soak time for candidate channel PR #297.

This PR will also enable upgrades from 4.3.27 to releases in fast-4.4.

@openshift-bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Jun 24, 2020
@wking
Member

wking commented Jun 29, 2020

Reviewing CI jobs:

  • 4.2.29 -> 4.3.27 failed "Application behind service load balancer with PDB is not disrupted" with "Service was unreachable during disruption for at least 15m10s of 41m30s (37%):", which is being tracked in rhbz#1828858.
  • 4.2.33 -> 4.3.27 failed with "Frontends were unreachable during disruption for at least 11m42s of 47m20s (25%):", which is being tracked in rhbz#1850074.
  • 4.3.9 -> 4.3.27 failed with "Cluster did not complete upgrade: timed out waiting for the condition: Cluster operator openshift-controller-manager is still updating". Needs more digging.
  • 4.3.10 -> 4.3.27 failed with "error simulating policy: Throttling: Rate exceeded". Fixed in 4.5+; not backported. Unlikely to affect folks that aren't launching zounds of clusters in the same AWS account each day.
  • 4.3.12 -> 4.3.27 failed with "Kubernetes API was unreachable during disruption for at least 5m21s of 47m49s (11%):". Sounds like rhbz#1850057, although that's technically about 4.4 -> 4.5. rhbz#1852056 is POST for master/4.6 to help in this space.
  • 4.3.23 -> 4.3.27 failed in setup with "failed to initialize the cluster: Cluster operator image-registry has not yet reported success". That's a problem, but it's a 4.3.23 problem, not a 4.3.27 problem.

@wking
Member

wking commented Jun 29, 2020

Digging into the 4.3.9 -> 4.3.27 failure:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1275691585048154112/artifacts/launch/clusteroperators.json | jq -r '.items[] | select(.metadata.name == "openshift-controller-manager").status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
2020-06-24T07:43:26Z Degraded=False AsExpected: -
2020-06-24T08:22:19Z Progressing=False AsExpected: -
2020-06-24T07:45:57Z Available=True AsExpected: -
2020-06-24T07:43:26Z Upgradeable=Unknown NoData: -

Huh. That looks happy. Checking the CVO logs:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1275691585048154112/artifacts/launch/pods/openshift-cluster-version_cluster-version-operator-65986df549-96s4s_cluster-version-operator.log | grep 'Running sync.*in state\|Result of work'
...
I0624 09:15:02.771043       1 task_graph.go:611] Result of work: [Cluster operator openshift-controller-manager is still updating]
I0624 09:18:00.536361       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ocp/release@sha256:a2bdd3b4516e05760d01e2589fc0866f7386c1c10c866b29fea137067e76f2ae (force=true) on generation 2 in state Updating at attempt 8
I0624 09:23:45.588106       1 task_graph.go:611] Result of work: [Cluster operator openshift-controller-manager is still updating]
I0624 09:26:50.850345       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ocp/release@sha256:a2bdd3b4516e05760d01e2589fc0866f7386c1c10c866b29fea137067e76f2ae (force=true) on generation 2 in state Updating at attempt 9
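
For longer logs, the same filtering can be done programmatically. The following is a minimal Python sketch, not part of the original investigation: the helper name `summarize_cvo_log` and both regular expressions are assumptions that mirror the grep expression above, and the sample lines are copied from the output shown here.

```python
import re

# Assumed patterns, modelled on the two kinds of line the grep above selects.
SYNC_RE = re.compile(
    r"Running sync (?P<image>\S+) .*in state (?P<state>\w+) at attempt (?P<attempt>\d+)"
)
RESULT_RE = re.compile(r"Result of work: \[(?P<result>.*)\]")


def summarize_cvo_log(lines):
    """Pair each 'Running sync' line with the 'Result of work' line that follows it.

    Result lines seen before the first sync line are ignored, because the
    attempt they belong to is not known.
    """
    rounds = []
    current = None
    for line in lines:
        sync = SYNC_RE.search(line)
        if sync:
            current = {
                "attempt": int(sync["attempt"]),
                "state": sync["state"],
                "result": None,  # filled in when the matching result line shows up
            }
            rounds.append(current)
            continue
        result = RESULT_RE.search(line)
        if result and current is not None:
            current["result"] = result["result"]
    return rounds


SAMPLE = [
    "I0624 09:15:02.771043       1 task_graph.go:611] Result of work: [Cluster operator openshift-controller-manager is still updating]",
    "I0624 09:18:00.536361       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ocp/release@sha256:a2bdd3b4516e05760d01e2589fc0866f7386c1c10c866b29fea137067e76f2ae (force=true) on generation 2 in state Updating at attempt 8",
    "I0624 09:23:45.588106       1 task_graph.go:611] Result of work: [Cluster operator openshift-controller-manager is still updating]",
    "I0624 09:26:50.850345       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ocp/release@sha256:a2bdd3b4516e05760d01e2589fc0866f7386c1c10c866b29fea137067e76f2ae (force=true) on generation 2 in state Updating at attempt 9",
]

for entry in summarize_cvo_log(SAMPLE):
    print(entry["attempt"], entry["state"], entry["result"])
```

A result of `None` means the log ended before that attempt reported anything, as with the last attempt in the sample.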

So still stuck after the ClusterOperator conditions had settled. Checking the version:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1275691585048154112/artifacts/launch/clusteroperators.json | jq -r '.items[] | select(.metadata.name == "openshift-controller-manager").status.versions'
[
  {
    "name": "operator",
    "version": "4.3.9"
  }
]

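Well, there you go: the operator is still reporting version 4.3.9 rather than 4.3.27, so the CVO keeps waiting on it even though the conditions look happy. Looks like that's fixed in 4.5+, but not (yet?) backported to 4.3.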
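
The same check can be run across every ClusterOperator at once. This is a rough Python sketch, assuming the CVO waits for each ClusterOperator to report the target version under the `operator` entry of `status.versions`, which is what the output above suggests. The function name `stale_operators` is hypothetical, and the second item in the sample document is made up for contrast; only the `openshift-controller-manager` entry comes from the artifact queried above.

```python
def stale_operators(cluster_operators, target_version):
    """Yield (name, reported_version) for operators not yet at target_version.

    `cluster_operators` is the parsed clusteroperators.json document.
    An operator that reports no 'operator' version at all is yielded with None.
    """
    for item in cluster_operators.get("items", []):
        versions = {
            v["name"]: v["version"]
            for v in item.get("status", {}).get("versions") or []
        }
        reported = versions.get("operator")
        if reported != target_version:
            yield item["metadata"]["name"], reported


# Trimmed-down sample; real artifacts carry many more fields per item.
SAMPLE_DOC = {
    "items": [
        {
            "metadata": {"name": "openshift-controller-manager"},
            "status": {"versions": [{"name": "operator", "version": "4.3.9"}]},
        },
        {
            # Hypothetical entry, included only to show an up-to-date operator.
            "metadata": {"name": "example-operator"},
            "status": {"versions": [{"name": "operator", "version": "4.3.27"}]},
        },
    ]
}

for name, reported in stale_operators(SAMPLE_DOC, "4.3.27"):
    print(f"{name} reports {reported}, expected 4.3.27")
```

To use it on a real artifact, load the downloaded clusteroperators.json with `json.load` and pass the resulting dictionary in place of `SAMPLE_DOC`.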

Anyhow, looks like none of these failures are regressions (and 4.3.27 Insights/Telemetry has been quiet), so I'm fine with this fast promotion going out once we have public errata.

@wking
Member

wking commented Jul 1, 2020

Errata is public.

/lgtm
/hold cancel
/retest

@openshift-ci-robot removed the do-not-merge/hold label Jul 1, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-bot, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot added the lgtm (indicates that a PR is ready to be merged) and approved (indicates a PR has been approved by an approver from all required OWNERS files) labels Jul 1, 2020
@openshift-merge-robot merged commit d3d9d12 into master Jul 1, 2020
@sdodson deleted the pr-fast-4.3.27 branch October 12, 2020 16:51