@stbenjam
Member

@stbenjam stbenjam commented Jan 16, 2020

For a variety of reasons, it may be useful to keep the bootstrap VM
around for debugging. My general approach has been to interrupt the
installer before it has a chance to delete the bootstrap VM.

If the environment variable OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP is set
to any value, the installer will not delete the bootstrap.
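
The check described above can be sketched in Go roughly as follows. This is a minimal illustration, not the installer's actual code: the helper name `shouldPreserveBootstrap`, the injected lookup function, the choice to treat an empty value as unset, and the warning wording are all assumptions made for the example. Only the variable name comes from this PR.

```go
package main

import (
	"fmt"
	"os"
)

// preserveBootstrapEnv is the environment variable named in this PR.
const preserveBootstrapEnv = "OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP"

// shouldPreserveBootstrap is a hypothetical helper. It reports whether the
// variable is set to a non-empty value. Treating an empty value as "unset"
// is an assumption of this sketch. The lookup function is injected so the
// check can be exercised without touching the real environment.
func shouldPreserveBootstrap(lookup func(string) (string, bool)) bool {
	v, ok := lookup(preserveBootstrapEnv)
	return ok && v != ""
}

func main() {
	if shouldPreserveBootstrap(os.LookupEnv) {
		// Illustrative warning text only.
		fmt.Fprintf(os.Stderr,
			"WARNING: %s is set; the bootstrap will not be deleted. This is intended only for debugging.\n",
			preserveBootstrapEnv)
		return
	}
	// Placeholder for the normal path, where bootstrap resources are removed.
	fmt.Println("destroying bootstrap resources")
}
```

Running the sketch with `OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP=1` in the environment takes the warning branch; running it without the variable takes the placeholder destroy branch.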

@openshift-ci-robot openshift-ci-robot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) Jan 16, 2020
@williamcaban

We need this option for troubleshooting failures in disconnected environments, where we cannot investigate a failed deployment any further because the bootstrap is no longer there.

@abhinavdahiya
Contributor

Hmm, if the cluster bootstrap succeeds, I don't see how keeping the bootstrap is useful.
If the cluster bootstrap doesn't complete, the bootstrap is not destroyed.

@stbenjam
Member Author

I have had many occasions where bootstrap succeeded but installation failed, and it would have been useful to examine the bootstrap node. Yesterday we had an issue with machine-config-operator rendering different manifests on the bootstrap vs. the cluster, but the bootstrap had already been removed, so we couldn’t examine why.

@sdodson
Member

sdodson commented Jan 20, 2020

#1692 is a previously closed PR in the same direction.

I think #2936 is a more generally applicable solution because it narrows the gap between bootstrap and running cluster state.

@sdodson
Member

sdodson commented Jan 24, 2020

@abhinavdahiya would you support making this available as an installer build-time option (never to be represented in product docs)?

@sdodson
Member

sdodson commented Jan 29, 2020

@stbenjam

We've discussed the need for this and have concluded that we'd accept an environment variable that preserves the bootstrap host and emits warnings advising that this is intended only for debugging purposes and poses a risk to cluster stability, similar to the warning emitted when the release image is overridden via an env var.

Would you be interested in updating your work for that, or would you prefer that someone from the Installer team take that on?

@stbenjam
Member Author

> @stbenjam
>
> We've discussed the need for this and have concluded that we'd accept an environment variable that preserves the bootstrap host and emits warnings advising that this is intended only for debugging purposes and poses a risk to cluster stability, similar to the warning emitted when the release image is overridden via an env var.
>
> Would you be interested in updating your work for that, or would you prefer that someone from the Installer team take that on?

Sure, thanks - I'll update the PR this afternoon.

@stbenjam stbenjam changed the title from "cmd/openshift-install/create: add option to preserve bootstrap" to "cmd/openshift-install/create: add env var to preserve bootstrap" Jan 29, 2020
@stbenjam
Member Author

@sdodson PTAL

@sdodson
Member

sdodson commented Jan 29, 2020

@stbenjam Thanks. While we don't intend to allow this into CI jobs, because we don't want CI to diverge from normal customer workflows, we will likely enable this via clusterbot. Does `openshift-install destroy cluster` destroy the bootstrap host if it's left intact by this change? If not, we need to at least file an issue to track adding that.

@sdodson
Member

sdodson commented Jan 29, 2020

/approve
@patrickdillon PTAL

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sdodson

@openshift-ci-robot openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Jan 29, 2020
@stbenjam
Member Author

> Does `openshift-install destroy cluster` destroy the bootstrap host if it's left intact by this change?

I believe so: any Terraform resources that still exist should get destroyed, but I haven't tested that yet.

@sdodson
Member

sdodson commented Jan 30, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Jan 30, 2020
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

21 similar comments

@openshift-merge-robot openshift-merge-robot merged commit 4922752 into openshift:master Jan 31, 2020
@openshift-ci-robot
Contributor

@stbenjam: All pull requests linked via external trackers have merged. Bugzilla bug 1796627 has been moved to the MODIFIED state.

In response to this:

Bug 1796627: cmd/openshift-install/create: add env var to preserve bootstrap


@openshift-ci-robot
Contributor

@stbenjam: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-libvirt | e3b65eb | link | /test e2e-libvirt |
| ci/prow/e2e-aws-scaleup-rhel7 | e3b65eb | link | /test e2e-aws-scaleup-rhel7 |

@sdodson
Member

sdodson commented Feb 4, 2020

/cherry-pick release-4.3

@openshift-cherrypick-robot

@sdodson: new pull request created: #3053

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
lgtm: Indicates that a PR is ready to be merged.
size/S: Denotes a PR that changes 10-29 lines, ignoring generated files.
