Skip to content

Conversation

@bgilbert
Copy link
Contributor

#1729 introduced code to obtain the version of the Ignition binary shipped in the OS root filesystem. That PR contemplated that the version would be used more widely in the MCD, but that hasn't occurred, so its only current use is to log the Ignition version at MCD startup. On balance, I think that code path introduces more risk than value, so this PR backs it out. My reasoning is:

  1. By the time this code is invoked, Ignition will never run again, so its version is useful for forensics and not much else. In particular, the version of the Ignition binary in machine-os-content probably doesn't match the one in the bootimage.
  2. MCD is invoking the Ignition binary via a private (non-contractual) path in the root filesystem, but Ignition is designed and tested only for use in the initramfs. In practice, ignition --version should be harmless since it exits very early in Ignition startup, but see below.
  3. If the MCD fails to run the binary, say because the path to Ignition has changed, MCD considers that a fatal error.
  4. And indeed, we now have a bug where an ignition --version segfault is apparently blocking an upgrade from 4.5 to 4.6, on exactly one customer node. From the stack trace, the problem could be in Ignition itself, in the initialization code of a vendored library, or in the Go runtime. As a practical matter, the crash is unlikely to be root-caused.

cc @LorbusChris @sinnykumari @ashcrow from the original PR.

b823087 introduced code to obtain the version of the Ignition binary
shipped in the OS root filesystem.  That commit contemplated that the
version would be used more widely in the MCD, but that hasn't occurred,
so its only current use is to log the Ignition version at MCD startup.
On balance, I think that code path introduces more risk than value, so
this PR backs it out.  My reasoning is:

1. By the time this code is invoked, Ignition will never run again, so
its version is useful for forensics and not much else.  In particular,
the version of the Ignition binary in machine-os-content probably doesn't
match the one in the bootimage.

2. MCD is invoking the Ignition binary via a private (non-contractual)
path in the root filesystem, but Ignition is designed and tested only for
use in the initramfs.  In practice, `ignition --version` _should_ be
harmless since it exits very early in Ignition startup, but see below.

3. If the MCD fails to run the binary, say because the path to Ignition
has changed, MCD considers that a fatal error.

4. And indeed, we now have a bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1927731) where an
`ignition --version` segfault is apparently blocking an upgrade from 4.5
to 4.6, on exactly one customer node.  From the stack trace
(https://bugzilla.redhat.com/show_bug.cgi?id=1927731#c7), the problem
could be in Ignition itself, in the initialization code of a vendored
library, or in the Go runtime.  As a practical matter, the crash is
unlikely to be root-caused.

This reverts commit b823087.
@darkmuggle
Copy link

@bgilbert and I have spoken at length on this.

However, I support the removal of this check based solely on the reason 3 -- the check is a fatal error for purely informational purposes at this point. The MCD team spoke quite a bit about simplifying and making the code safe; this PR accomplishes that goal.

@kikisdeliveryservice
Copy link
Contributor

Would love for ci to finish running but on an initial pass, it seems like this isn't providing much value but causing problems (via reason 3 above).

@bgilbert
Copy link
Contributor Author

/retest

@kikisdeliveryservice
Copy link
Contributor

Tests are a mess today... will check back in later.

/retest

@bgilbert
Copy link
Contributor Author

/retest

Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with reasoning provided here. In future if we need to know and utilize ignition version, we can revisit and discuss with ignition team to do it in safer way.

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 25, 2021
@bgilbert
Copy link
Contributor Author

/retest

Copy link
Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we have agreement this is good so

/lgtm

@kikisdeliveryservice
Copy link
Contributor

That umm didnt work..

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 25, 2021
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bgilbert, kikisdeliveryservice, LorbusChris, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice,sinnykumari]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kikisdeliveryservice
Copy link
Contributor

@bgilbert do we want to/how far should we backport this?

@bgilbert
Copy link
Contributor Author

It looks like the code was introduced in 4.6. Since RHBZ 1927731 reports a failure in a 4.6 upgrade, I'd be inclined to backport to both 4.6 and 4.7.

@bgilbert bgilbert changed the title Revert "pkg/daemon: Add IgnitionVersion to Daemon" Bug 1927731: Revert "pkg/daemon: Add IgnitionVersion to Daemon" Feb 25, 2021
@openshift-ci-robot
Copy link
Contributor

@bgilbert: This pull request references Bugzilla bug 1927731, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1927731: Revert "pkg/daemon: Add IgnitionVersion to Daemon"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Feb 25, 2021
@bgilbert
Copy link
Contributor Author

/bugzilla refresh

@openshift-ci-robot
Copy link
Contributor

@bgilbert: This pull request references Bugzilla bug 1927731, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bgilbert
Copy link
Contributor Author

/cherrypick release-4.6
/cherrypick release-4.7

@openshift-cherrypick-robot

@bgilbert: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you.

Details

In response to this:

/cherrypick release-4.6
/cherrypick release-4.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kikisdeliveryservice
Copy link
Contributor

/cherrypick release-4.7

@openshift-cherrypick-robot

@kikisdeliveryservice: once the present PR merges, I will cherry-pick it on top of release-4.7 in a new PR and assign it to you.

Details

In response to this:

/cherrypick release-4.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bgilbert
Copy link
Contributor Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 25, 2021

@bgilbert: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/okd-e2e-aws 2203d88 link /test okd-e2e-aws
ci/prow/e2e-aws-workers-rhel7 2203d88 link /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 058c02d into openshift:master Feb 25, 2021
@openshift-ci-robot
Copy link
Contributor

@bgilbert: All pull requests linked via external trackers have merged:

Bugzilla bug 1927731 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1927731: Revert "pkg/daemon: Add IgnitionVersion to Daemon"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@bgilbert: #2431 failed to apply on top of branch "release-4.6":

Applying: Revert "pkg/daemon: Add IgnitionVersion to Daemon"
Using index info to reconstruct a base tree...
M	pkg/daemon/daemon.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/daemon/daemon.go
CONFLICT (content): Merge conflict in pkg/daemon/daemon.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Revert "pkg/daemon: Add IgnitionVersion to Daemon"
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Details

In response to this:

/cherrypick release-4.6
/cherrypick release-4.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bgilbert bgilbert deleted the ignition branch February 26, 2021 00:12
@bgilbert
Copy link
Contributor Author

/cherrypick release-4.7

@openshift-cherrypick-robot

@bgilbert: new pull request created: #2438

Details

In response to this:

/cherrypick release-4.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bgilbert
Copy link
Contributor Author

4.6 cherry-pick in #2439.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants