@sinnykumari
Contributor

Also add machine-config-daemon-firstboot-v42.service so that new
nodes created through machine-api come up as expected on a cluster
upgraded from 4.2 to 4.3.

Backport of PRs:
- openshift#1366
- openshift#1706
@openshift-ci-robot
Contributor

@sinnykumari: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

Details

In response to this:

templates: Don't enable machine-config-daemon-host.service by default

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sinnykumari sinnykumari requested a review from cgwalters May 21, 2020 07:15
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 21, 2020
@sinnykumari sinnykumari requested review from ashcrow and runcom and removed request for ashcrow May 21, 2020 07:16
@sinnykumari
Contributor Author

This PR is the result of some karg-related testing that created a new node using machine-api. The existing cluster was upgraded from 4.2 to 4.3, and the machineset was manually updated with the 4.3 bootimage.
In the m-c-d firstboot log, I noticed that the karg foo is not applied as expected, possibly due to a race with the m-c-d-host service.

sh-4.4# journalctl -u machine-config-daemon-firstboot.service
...
update.go:235] Checking Reconcilable for config mco-empty-mc to rendered-worker-785abb27e6b7edc988f37d09a97b86c7
May 20 05:24:46 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:24:46.076361    1601 update.go:1051] Starting update from mco-empty-mc to rendered-worker-785abb27e6b7edc988f37d09a97b86c7: &{osUpdate:true kargs:true fips:false passwd>
May 20 05:24:46 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:24:46.102063    1601 update.go:578] Updating files
May 20 05:24:46 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:24:46.102080    1601 update.go:597] Deleting stale data
May 20 05:24:46 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:24:46.104141    1601 update.go:1051] Running rpm-ostree [kargs --append=foo]
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:11.899278    1601 update.go:929] Updating OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e0334fa2112f0c89d690062baeb98d7e7d9a0aaedb3066507778d618ca7aaaf8
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:11.542760    1603 rpm-ostree.go:366] Running captured: podman create --net=none --annotation=org.openshift.machineconfigoperator.pivot=true --name ostree-container-p>
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: 2020-05-20 05:25:11.66433868 +0000 UTC m=+0.102035747 container create 44255b02eac339569d01da3d130282b27665b9def76760dd54ac3198449aad02 (image=quay.io/openshift-release-dev/ocp->
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:11.679619    1603 rpm-ostree.go:366] Running captured: podman mount 44255b02eac339569d01da3d130282b27665b9def76760dd54ac3198449aad02
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: 2020-05-20 05:25:11.77611808 +0000 UTC m=+0.077943798 container mount 44255b02eac339569d01da3d130282b27665b9def76760dd54ac3198449aad02 (image=quay.io/openshift-release-dev/ocp-v>
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:11.781623    1603 rpm-ostree.go:246] Pivoting to: 43.81.202005131953.0 (eedd107b1cafc4003a59a20546d92485f8509b55a75611f2d3e1a0f4ca2345c7)
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: client(id:cli dbus:1.15 unit:machine-config-daemon-host.service uid:0) added; new total=2
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: client(id:cli dbus:1.15 unit:machine-config-daemon-host.service uid:0) vanished; remaining=1
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: Txn KernelArgs on /org/projectatomic/rpmostree1/rhcos successful
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: client(id:cli dbus:1.14 unit:machine-config-daemon-firstboot.service uid:0) vanished; remaining=0
May 20 05:25:11 ip-10-0-154-174 machine-config-daemon[1601]: In idle state; will auto-exit in 63 seconds
May 20 05:25:13 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:13.106053    1601 update.go:1051] Running rpm-ostree [kargs --delete=foo]
May 20 05:25:15 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:15.418878    1601 update.go:578] Updating files
May 20 05:25:15 ip-10-0-154-174 machine-config-daemon[1601]: I0520 05:25:15.419098    1601 update.go:597] Deleting stale data
May 20 05:25:15 ip-10-0-154-174 machine-config-daemon[1601]: error: failed to run pivot: failed to start machine-config-daemon-host.service: exit status 1
May 20 05:25:15 ip-10-0-154-174 systemd[1]: machine-config-daemon-firstboot.service: Main process exited, code=exited, status=1/FAILURE
May 20 05:25:15 ip-10-0-154-174 systemd[1]: machine-config-daemon-firstboot.service: Failed with result 'exit-code'.
May 20 05:25:15 ip-10-0-154-174 systemd[1]: Failed to start Machine Config Daemon Firstboot.
May 20 05:25:15 ip-10-0-154-174 systemd[1]: machine-config-daemon-firstboot.service: Consumed 194ms CPU time
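
A quick way to spot this pattern in a longer journal is to scan for the daemon's `Running rpm-ostree [kargs ...]` lines and report any karg that was appended and later deleted. The following is only a sketch: the helper name `reverted_kargs` is hypothetical, the sample lines are shortened copies of the log above, and reading the delete as a revert of the earlier append is an interpretation of the log, not something the daemon states.

```python
import re

# Matches the "Running rpm-ostree [kargs ...]" lines seen in the log above.
KARGS_RE = re.compile(r"Running rpm-ostree \[kargs (?P<args>[^\]]*)\]")


def reverted_kargs(journal_text):
    """Return kargs that were appended and later deleted, in deletion order."""
    appended = set()
    reverted = []
    for line in journal_text.splitlines():
        match = KARGS_RE.search(line)
        if not match:
            continue
        for arg in match.group("args").split():
            if arg.startswith("--append="):
                appended.add(arg[len("--append="):])
            elif arg.startswith("--delete="):
                karg = arg[len("--delete="):]
                if karg in appended:
                    appended.discard(karg)
                    reverted.append(karg)
    return reverted


# Shortened copies of the relevant lines from the log above.
sample = """\
I0520 05:24:46.104141    1601 update.go:1051] Running rpm-ostree [kargs --append=foo]
Txn KernelArgs on /org/projectatomic/rpmostree1/rhcos successful
I0520 05:25:13.106053    1601 update.go:1051] Running rpm-ostree [kargs --delete=foo]
error: failed to run pivot: failed to start machine-config-daemon-host.service: exit status 1
"""

print(reverted_kargs(sample))
```

On a node, the input could come from the output of `journalctl -u machine-config-daemon-firstboot.service` saved to a file.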

@sinnykumari
Contributor Author

Not sure which existing bug would work best, partially related one is https://bugzilla.redhat.com/show_bug.cgi?id=1830102 , perhaps we can clone this for 4.3?

@kikisdeliveryservice kikisdeliveryservice changed the title templates: Don't enable machine-config-daemon-host.service by default [release-4.3] templates: Don't enable machine-config-daemon-host.service by default May 21, 2020
@openshift-ci-robot
Contributor

@sinnykumari: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

Details

In response to this:

[release-4.3] templates: Don't enable machine-config-daemon-host.service by default


@kikisdeliveryservice
Contributor

Not sure which existing bug would work best, partially related one is https://bugzilla.redhat.com/show_bug.cgi?id=1830102 , perhaps we can clone this for 4.3?

this makes sense to me!!

@kikisdeliveryservice
Contributor

/retest

@sinnykumari sinnykumari changed the title [release-4.3] templates: Don't enable machine-config-daemon-host.service by default Bug 1838984: [release-4.3] templates: Don't enable machine-config-daemon-host.service by default May 22, 2020
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels May 22, 2020
@openshift-ci-robot
Contributor

@sinnykumari: This pull request references Bugzilla bug 1838984, which is invalid:

  • expected dependent Bugzilla bug 1829642 to target a release in 4.4.0, 4.4.z, but it targets "4.5.0" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1838984: [release-4.3] templates: Don't enable machine-config-daemon-host.service by default


@sinnykumari
Contributor Author

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label May 22, 2020
@openshift-ci-robot
Contributor

@sinnykumari: This pull request references Bugzilla bug 1838984, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.3.z) matches configured target release for branch (4.3.z)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 1830102 is in the state CLOSED (ERRATA), which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA))
  • dependent Bugzilla bug 1830102 targets the "4.4.0" release, which is one of the valid target releases: 4.4.0, 4.4.z
  • bug has dependents
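
The checks listed above can be pictured as a small set of rules applied to the bug and to the bugs it depends on. The sketch below is not the bot's actual implementation: the `Bug` class and `validate` function are hypothetical, only a subset of the six checks is modelled, "bug has dependents" is read here as "the bug depends on at least one other bug", and the state given to bug 1829642 in the failing example is an assumption because the conversation does not state it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bug:
    """Hypothetical, minimal stand-in for a Bugzilla bug record."""
    id: int
    state: str
    target_release: str
    depends_on: List["Bug"] = field(default_factory=list)


def validate(bug, branch_release, valid_states, dep_states, dep_releases):
    """Return a list of problems; an empty list means the bug is valid."""
    problems = []
    if bug.target_release != branch_release:
        problems.append(f"bug {bug.id} targets {bug.target_release}, "
                        f"expected {branch_release}")
    if bug.state not in valid_states:
        problems.append(f"bug {bug.id} is in state {bug.state}")
    if not bug.depends_on:
        problems.append(f"bug {bug.id} does not depend on another bug")
    allowed = ", ".join(dep_releases)
    for dep in bug.depends_on:
        if dep.state not in dep_states:
            problems.append(f"dependent bug {dep.id} is in state {dep.state}")
        if dep.target_release not in dep_releases:
            problems.append(
                f"expected dependent Bugzilla bug {dep.id} to target a release "
                f"in {allowed}, but it targets \"{dep.target_release}\" instead")
    return problems


# Values taken from the bot output above.
VALID_STATES = ("NEW", "ASSIGNED", "ON_DEV", "POST")
DEP_STATES = ("VERIFIED", "RELEASE_PENDING", "CLOSED (ERRATA)")
DEP_RELEASES = ("4.4.0", "4.4.z")

# The passing case: bug 1838984 depending on the 4.4 bug 1830102.
ok = validate(
    Bug(1838984, "NEW", "4.3.z", [Bug(1830102, "CLOSED (ERRATA)", "4.4.0")]),
    "4.3.z", VALID_STATES, DEP_STATES, DEP_RELEASES)

# The earlier failing case: depending on the 4.5 bug 1829642 instead.
# The "VERIFIED" state is an assumption; only the target release is known.
bad = validate(
    Bug(1838984, "NEW", "4.3.z", [Bug(1829642, "VERIFIED", "4.5.0")]),
    "4.3.z", VALID_STATES, DEP_STATES, DEP_RELEASES)

print(ok)
print(bad)
```

Under these assumptions, the first call reports no problems and the second reports the single target-release mismatch that the bot flagged before the dependency was pointed at the 4.4 bug.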
Details

In response to this:

/bugzilla refresh


@openshift-ci-robot openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label May 22, 2020
@cgwalters
Member

Looks like a clean backport of the two changes. That said...we do need to weigh risk/reward here. This needs manual testing - CI on this repository isn't covering this. If we somehow e.g. broke 4.1 bootimage clusters that'd only be apparent later in the periodics.

I would hope that most people are upgrading to 4.4, which already has these fixes.

Anyways
/approve
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 27, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [cgwalters,sinnykumari]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ashcrow
Member

ashcrow commented May 28, 2020

The backports look good though I'm having a hard time tracking these from master->4.4->4.3. Can you provide the paths for each and the bugs associated?

@sinnykumari
Contributor Author

This is not a direct backport of a single bug fix. It comprises two fixes:

  1. Don't enable machine-config-daemon-host.service by default #1366 - It was merged when master was the 4.4 branch and was never backported, so there is no associated Bugzilla bug for 4.4 or 4.5.
  2. Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service #1706 - If we backport only #1366, we may hit the problem described in #1706 (comment). The associated 4.4 bug is https://bugzilla.redhat.com/show_bug.cgi?id=1830102 and the 4.5 bug is https://bugzilla.redhat.com/show_bug.cgi?id=1829642.

@ashcrow ashcrow added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label May 28, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 2b17a50 into openshift:release-4.3 May 29, 2020
@openshift-ci-robot
Contributor

@sinnykumari: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1747. Bugzilla bug 1838984 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1838984: [release-4.3] templates: Don't enable machine-config-daemon-host.service by default

