@hexfusion commented Mar 5, 2020

Backups might eventually be managed by a static pod controller. This proposal sets up a directory structure to make that easier to implement.

The idea is that each master is prepared to restore the cluster, possibly autonomously.

cc @deads2k @alaypatel07 @retroflexer

@openshift-ci-robot added the size/S label Mar 5, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hexfusion
To complete the pull request process, please assign ecordell
You can assign the PR to them by writing /assign @ecordell in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot added the size/M label and removed the size/S label Mar 5, 2020
4. On all masters copy directory `/etc/kubernetes/static-pod-manifests/kube-apiserver-pod-${APISERVER_REVISION}`
to `/etc/kubernetes/static-pod-manifests/backup-N/kube-apiserver-pod`.
5. On all masters replace directory `/etc/kubernetes/static-pod-manifests/backups` with a copy of
`/etc/kubernetes/static-pod-manifests/backup-N` directory.
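
Step 4 above could be sketched roughly as follows. This is only an illustration under assumptions: the function name `copy_apiserver_revision` and its parameters are hypothetical, and the proposal does not say which tool performs the copy. The base directory is a parameter so the sketch can be tried against a scratch directory rather than `/etc/kubernetes/static-pod-manifests`.

```python
import os
import shutil


def copy_apiserver_revision(base_dir: str, apiserver_revision: int, n: int) -> str:
    """Copy kube-apiserver-pod-<revision> into backup-<n>/kube-apiserver-pod.

    Hypothetical helper: the name and arguments are assumptions made
    for this sketch, not part of the proposal.
    """
    src = os.path.join(base_dir, f"kube-apiserver-pod-{apiserver_revision}")
    dst = os.path.join(base_dir, f"backup-{n}", "kube-apiserver-pod")
    # copytree creates missing parent directories and raises
    # FileExistsError if dst already exists, so an existing backup
    # revision is never silently overwritten.
    shutil.copytree(src, dst)
    return dst
```

Failing when the destination already exists is a deliberate choice here: a backup revision is treated as write-once.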

@retroflexer commented Mar 5, 2020

On step 5, would a symbolic link from `/etc/kubernetes/static-pod-manifests/backups` to `/etc/kubernetes/static-pod-manifests/backup-N` be better, perhaps?

Contributor Author

Probably. The size of the etcd snapshot would make a copy unreasonable, and managing a link would be simpler.
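
For illustration, the link-based variant of step 5 might look like the sketch below. It assumes a POSIX filesystem, where renaming a new symlink over an old one replaces it atomically. The helper name `point_backups_at` and the `.tmp` suffix are hypothetical and not taken from the proposal.

```python
import os


def point_backups_at(base_dir: str, revision: str, link_name: str = "backups") -> str:
    """Repoint base_dir/link_name at base_dir/revision via an atomic rename.

    Hypothetical helper; names are assumptions made for this sketch.
    """
    target = os.path.join(base_dir, revision)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)

    link_path = os.path.join(base_dir, link_name)
    tmp_link = link_path + ".tmp"

    # Remove a leftover temporary link from an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)

    # Use a relative target so the link stays valid if base_dir is
    # mounted at a different path.
    os.symlink(revision, tmp_link)

    # rename(2) swaps the link in one step, so readers see either the
    # old or the new target, never a missing path.
    os.replace(tmp_link, link_path)
    return link_path
```

One caveat: if `backups` already exists as a real directory rather than a link, the rename fails, so a one-time migration would be needed first.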


1. On all masters create a new backup revision `/etc/kubernetes/static-pod-manifests/backup-N`.
2. On all masters write a `/etc/kubernetes/static-pod-manifests/backup-N/backup.env` file containing 3 environment
variables `CREATED`, `ETCD_REVISION` and `APISERVER_REVISION`.
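
Steps 1 and 2 above could be sketched as follows. The proposal names the three variables but not their format, so the `KEY=value` layout and the UTC timestamp format are assumptions, as is the helper name `write_backup_env`.

```python
import os
from datetime import datetime, timezone


def write_backup_env(base_dir, n, etcd_revision, apiserver_revision, created=None):
    """Create backup-<n> and write its backup.env file.

    Hypothetical helper. The KEY=value layout and the timestamp
    format are assumptions; the proposal only names the variables.
    """
    backup_dir = os.path.join(base_dir, f"backup-{n}")
    # Step 1: create the new backup revision; fail if it already exists.
    os.makedirs(backup_dir, exist_ok=False)

    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Step 2: record the metadata needed to match this backup to the
    # static pod revisions it was taken with.
    lines = [
        f"CREATED={created}",
        f"ETCD_REVISION={etcd_revision}",
        f"APISERVER_REVISION={apiserver_revision}",
    ]
    env_path = os.path.join(backup_dir, "backup.env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return env_path
```

A plain `KEY=value` file is used here because it can be sourced by a shell script or parsed line by line during a restore.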
Contributor Author

This needs to be extended now, as we are managing more static pod resources. I think we should also include the OCP version.

Contributor

Agree.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Oct 17, 2020
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 16, 2020
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closed this PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
