Migrate bootkube checkpointer image. #180
This patch adds a new binary: "checkpoint". The checkpoint program ensures that the latest apiserver manifest is checkpointed in case of a system or apiserver crash. It is implemented via static manifests: the checkpoint program stores a pod manifest on disk. When it detects that an apiserver is not running, it moves that file into the directory specified as the kubelet's config dir. From there, the kubelet sees the pod manifest and runs it. Once the program detects that both our temporary apiserver and the self-hosted apiserver are running, it removes the manifest from the config dir, causing the kubelet to kill the temporary apiserver and allowing the self-hosted apiserver to take over.
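The decision described in this commit message can be sketched as a small pure function. This is only an illustration under stated assumptions, not the checkpointer's actual code: the names `Action`, `decide`, `ActivateCheckpoint`, and `RemoveCheckpoint` are hypothetical, and the two booleans stand in for whatever detection the real program performs.

```go
package main

import "fmt"

// Action is a hypothetical type naming what the checkpointer does next.
type Action int

const (
	NoOp               Action = iota // leave the kubelet config dir untouched
	ActivateCheckpoint               // move the saved manifest into the kubelet config dir
	RemoveCheckpoint                 // delete the manifest so the kubelet stops the temporary apiserver
)

func (a Action) String() string {
	switch a {
	case ActivateCheckpoint:
		return "ActivateCheckpoint"
	case RemoveCheckpoint:
		return "RemoveCheckpoint"
	default:
		return "NoOp"
	}
}

// decide maps the observed state to an action (hypothetical helper):
//   - no apiserver running at all: activate the checkpointed manifest
//   - temporary and self-hosted both running: remove the manifest
//   - anything else: wait
func decide(selfHostedRunning, tempRunning bool) Action {
	switch {
	case !selfHostedRunning && !tempRunning:
		return ActivateCheckpoint
	case selfHostedRunning && tempRunning:
		return RemoveCheckpoint
	default:
		return NoOp
	}
}

func main() {
	fmt.Println(decide(false, false)) // nothing is running
	fmt.Println(decide(false, true))  // only the temporary apiserver is running
	fmt.Println(decide(true, true))   // both are running
}
```

Keeping the decision separate from the file operations makes the state machine easy to test without touching the kubelet's config dir.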
Adds a check to ensure the API is actually running and responding to requests when the checkpoint service notices the pod manifest. I came across an issue when upgrading a cluster where the API server daemonset was removed and the API server brought down, but the kubelet still returned that pod from its read-only API. Now, if an API server pod is returned from the kubelet (the actual, self-hosted one, not a temp one), we also check that it is up and accepting requests; otherwise we assume it is dead.
This is no longer necessary since the self-hosted apiserver will keep trying to rebind to the insecure port.
This reverts commit 17e8cbc939172eafeafca1daed3903ea5d90a9e4.
This reverts commit 3bc3e82b220e757016c42ff18af3eef0c6681e89.
This reverts commit db0d7316d3352d184b38a3c56c49820e12136313.
1.5.1 updates
I'm OK with this as a starting point. LGTM. cc @mikedanese @jbeda -- feel the same?
On second thought, maybe we should wait and bring it in as one PR; I'm not sure we're going to like how the build process looks. Can anyone suggest what we want the build/make/vendoring to look like to get this to build? I'm not very familiar with Go and its dependencies, so I'm currently a bit lost on how to make this build the binary and then the image, but I'm doing some digging. Please let me know if anyone has ideas.
I might second that - if we don't have a way to build the image, we're just going to end up maintaining code in two places. There are also some near-term changes I wanted to get into the checkpointer (which we would have to double-commit if no images are being built here): kubernetes-retired/bootkube#124 Just as another option - is anything blocking us from just using the images currently hosted on quay.io? If it's mostly an issue of annotations, that's something where we can decide on a convention and just change it.
Also, something seems a bit odd with this PR - the actual history isn't associated with any of the code (click on one of the commits).
@aaronlevy The direct blocker for us using the quay.io image is multi-platform support. @dgoodwin You could/should copy the scripts/Dockerfiles in
The process for preserving commit history involves a lot of black magic, so I would suggest we forget about it. @aaronlevy, if you or anyone in your group would like to submit this file as is to get it into git, we can go from there; that's OK with me. If not, I propose we forget the history, and I can redo this as a flat file copy.
Closing now. The current plan is to have bootkube build for the required arches until we have a build process in place for images in external repos like this one.
This is just a raw copy of the checkpointer from https://github.com/kubernetes-incubator/bootkube/tree/master/cmd/checkpoint, with preserved git history.
I am proposing we first merge this in as is to make sure we get the history into the repo cleanly, then go to work on setting up the image build process.
CC @luxas @aaronlevy