@stbenjam
Member

This reverts commit 067c05a. This
version of RHCOS is shipping with a broken podman that does not
correctly return exit codes. The fix from
https://bugzilla.redhat.com/show_bug.cgi?id=1741157 needs to be
backported into a new podman RPM for EL8 and the fix brought into RHCOS.

If the etcd health check fails, bootkube continues on regardless instead
of retrying. This manifests itself most catastrophically for the
baremetal IPI platform, where the first etcd health check fails very
quickly as DNS records aren't available yet.
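To illustrate why a wrong exit code defeats the retry, here is a minimal Python sketch. It is not the actual bootkube logic: `retry_until_success` and both commands are hypothetical stand-ins, assuming only that the retry decision is driven by the exit code of the health-check command.

```python
import subprocess
import sys
import time


def retry_until_success(cmd, attempts=5, delay=0.0):
    """Run cmd until it exits 0.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        time.sleep(delay)
    return None


# Hypothetical health check that always fails (stands in for the etcd check).
failing_check = [sys.executable, "-c", "import sys; sys.exit(1)"]

# Hypothetical wrapper that runs the same failing check but always exits 0,
# mimicking a container runtime that does not propagate exit codes.
swallowing_wrapper = [
    sys.executable,
    "-c",
    "import subprocess, sys; "
    "subprocess.run([sys.executable, '-c', 'import sys; sys.exit(1)']); "
    "sys.exit(0)",
]

if __name__ == "__main__":
    print(retry_until_success(failing_check, attempts=3))
    print(retry_until_success(swallowing_wrapper, attempts=3))
```

With the exit code propagated, the loop keeps retrying and reports failure (`None`) after three attempts. With the exit code swallowed, the loop returns after the first attempt as if the check had passed, which is the "continues on regardless" behaviour described above.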

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: stbenjam
To complete the pull request process, please assign abhinavdahiya
You can assign the PR to them by writing /assign @abhinavdahiya in a comment when ready.


@openshift-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Aug 27, 2019

@cgwalters
Member

Hmm but that's going to back out other important fixes...

@stbenjam
Member Author

We could also ask for a new RHCOS build with an older podman that doesn't have this problem.

@stbenjam
Member Author

Baremetal IPI is hit most obviously by this, but a podman that doesn't return exit codes correctly is pretty catastrophic and will probably manifest itself in other odd ways. We should get a newer podman soon, but that could take a few days, which is a long time to be in this situation.

@miabbott
Member

Older podman from the 42.80.20190725.1 build would be podman-1.4.2-1.module+el8.1.0+3423+f0eda5e0

@abhinavdahiya
Contributor

I'm more inclined towards not backing out important fixes for our GA platforms like Azure and vSphere.

@stbenjam
Member Author

We may get a new RHCOS relatively quickly; the backport PR is open: containers/podman#3895

@yuqi-zhang
Contributor

We're working on both getting the new one and a build that reverts ONLY podman to the old version for RHCOS. I think this can be closed in favour of either one.

@openshift-ci-robot
Contributor

@stbenjam: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-openstack | 4fee45b | link | /test e2e-openstack |
| ci/prow/e2e-libvirt | 4fee45b | link | /test e2e-libvirt |


@stbenjam
Member Author

Sounds good to me, thanks so much

/close

@openshift-ci-robot
Contributor

@stbenjam: Closed this PR.
