[release-4.4] Bug 1830102: templates: Add a special machine-config-daemon-firstboot-v42.service #1707
This aims to fix https://bugzilla.redhat.com/show_bug.cgi?id=1829642, also known as openshift#1215 (comment). Basically, our systemd units dynamically differentiate between "4.2" and "4.3 or above" by looking at the aleph version.
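As a rough illustration of the version check described above, here is a minimal Python sketch. It is not the PR's implementation (the PR does this inside systemd units). It assumes the aleph version is a JSON document with a `build` field, that it lives at `/sysroot/.coreos-aleph-version.json`, and that 4.2 boot images carry a build string starting with `42.`. The helper names and the non-v42 unit name are hypothetical.

```python
import json

# Assumed location of the aleph version file on the node (not stated in this PR).
ALEPH_PATH = "/sysroot/.coreos-aleph-version.json"


def is_v42_bootimage(aleph_json: str) -> bool:
    """Return True if the aleph JSON looks like a 4.2 boot image.

    Assumption: the JSON has a "build" field such as "42.80.20191002.0",
    where the first dot-separated component identifies the release stream.
    """
    build = json.loads(aleph_json).get("build", "")
    return build.split(".", 1)[0] == "42"


def pick_firstboot_unit(aleph_json: str) -> str:
    """Pick which firstboot unit should run (hypothetical helper)."""
    if is_v42_bootimage(aleph_json):
        # Unit name taken from the PR title.
        return "machine-config-daemon-firstboot-v42.service"
    # Assumed name of the regular unit for "4.3 or above".
    return "machine-config-daemon-firstboot.service"
```

On a node, one might call `pick_firstboot_unit(open(ALEPH_PATH).read())`; a missing `build` field falls through to the regular unit.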
|
@ashcrow: No Bugzilla bug is referenced in the title of this pull request.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@ashcrow: This pull request references Bugzilla bug 1830102, which is invalid. |
|
/bugzilla refresh |
|
@kikisdeliveryservice: This pull request references Bugzilla bug 1830102, which is invalid. |
|
Fixed the underlying BZ to 4.4.0 and re-added bugzilla valid. |
|
Failures seem a bit flaky: timeouts, EBS can't be deleted. Looking, but also: /test e2e-aws |
|
/retest |
|
This e2e-aws test is definitely flaking. |
|
Seems to be regularly hitting: https://bugzilla.redhat.com/show_bug.cgi?id=1829241 |
|
Tested this PR with the following steps, and the scaled-up node came up:
Reading through the journal log from one of the scaled-up worker nodes:
|
Is it actually working fine after reboot? The logs say it started, so I’m assuming it works after reboot. Can we check if the node is healthy? |
Yes, crio is running fine after reboot. My concern was whether the failure during firstboot impacts anything. crio status looks good after reboot, and the nodes seem healthy to me, as they are all in Ready state. MCO reported all 5 worker nodes in the same pool, which matches the current rendered config for worker. Did I miss something to check for a node to be considered healthy? |
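The health criteria mentioned above (nodes Ready, and every node on the pool's rendered config) could be checked mechanically. The following is a minimal sketch that works on the JSON from a node listing such as `oc get nodes -o json`. It assumes the MCO records its per-node state in the `machineconfiguration.openshift.io/currentConfig`, `desiredConfig` and `state` annotations; the function names are hypothetical, and this is not a definitive health check.

```python
import json

# Assumed annotation prefix used by the machine-config daemon on each node.
MCD_PREFIX = "machineconfiguration.openshift.io/"


def node_is_healthy(node: dict) -> bool:
    """True if the node is Ready and its current config matches the desired one."""
    conditions = node.get("status", {}).get("conditions", [])
    ready = any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )
    annotations = node.get("metadata", {}).get("annotations", {})
    current = annotations.get(MCD_PREFIX + "currentConfig")
    desired = annotations.get(MCD_PREFIX + "desiredConfig")
    state = annotations.get(MCD_PREFIX + "state")
    return ready and current is not None and current == desired and state == "Done"


def unhealthy_nodes(node_list_json: str) -> list:
    """Return the names of nodes that fail the check above."""
    items = json.loads(node_list_json).get("items", [])
    return [n["metadata"]["name"] for n in items if not node_is_healthy(n)]
```

An empty list from `unhealthy_nodes` would match the situation described here: all workers Ready and on the current rendered worker config. It says nothing about whether crio failed during firstboot, which still needs the journal.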
|
Alrighty, so I think crio is started no matter what before pivot and it correctly fails. After reboot it’s all fine. I think it’s ok, the only thing we could do is prevent crio from starting as we do for kubelet but I don’t see it as a big deal cc @cgwalters |
|
Regarding crio: yeah, that came up on the call, and I also noted it here: #1706 (comment). My thinking is that this is a bug in all branches, one we should fix; I was just trying to do the most minimal/obvious fix. Since this gets us nodes joining the cluster, let's do other things as a followup? |
|
Yup, if it doesn't impact the cluster, this fix should be sufficient to move further! |
|
Not sure what's going on with e2e-aws, though. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ashcrow, runcom |
|
/retest |
|
@ashcrow: Overrode contexts on behalf of ashcrow: ci/prow/e2e-aws |
|
@ashcrow: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1707. Bugzilla bug 1830102 has been moved to the MODIFIED state. |
Backport of #1706