Skip to content
This repository has been archived by the owner on Jul 30, 2021. It is now read-only.

pkg/checkpoint: Try kubelet secureClient, fallback to read-only #1027

Merged
merged 1 commit into from
Nov 26, 2018
Merged

pkg/checkpoint: Try kubelet secureClient, fallback to read-only #1027

merged 1 commit into from
Nov 26, 2018

Conversation

dghubble
Copy link
Contributor

@dghubble dghubble commented Nov 12, 2018

Change pod-checkpointer to first try to list pods via the Kubeletsecure (10250) endpoint, then fallback to the Kubelet read-only (10255) endpoint, before defaulting to an empty list of parent pods

Related: #1025

@coreosbot
Copy link

Can one of the admins verify this patch?

2 similar comments
@coreosbot
Copy link

Can one of the admins verify this patch?

@coreosbot
Copy link

Can one of the admins verify this patch?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 12, 2018
@dghubble dghubble changed the title pkg/checkpoint: Try kubelet secureClient, fallback to insecureClient pkg/checkpoint: Try kubelet secureClient, fallback to read-only Nov 12, 2018
@dghubble
Copy link
Contributor Author

/assign @rphillips

@rphillips
Copy link
Contributor

Is there a default timeout on the secure kubeclient?

@dghubble
Copy link
Contributor Author

dghubble commented Nov 12, 2018

I don't see one set in client-go rest by default. I could add one for say 15 seconds? More/less? k.secureClient.Get().AbsPath("/pods/").Timeout(15*time.Second).Do().Into(podList)

* Change pod-checkpointer to first try to list pods via the Kubelet
secure (10250) endpoint, then fallback to the Kubelet read-only (10255)
endpoint, before defaulting to an empty list of parent pods
@rphillips
Copy link
Contributor

/ok-to-test

dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Nov 26, 2018
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Nov 26, 2018
dghubble added a commit to poseidon/typhoon that referenced this pull request Nov 26, 2018
dghubble added a commit to poseidon/typhoon that referenced this pull request Nov 26, 2018
@dghubble
Copy link
Contributor Author

dghubble commented Nov 26, 2018

I've been testing with an image quay.io/dghubble/pod-checkpointer:55f0b95cf1471d8179b083c031f43a3a3bcabd0a with this.

On a typical bootkube cluster today (allows kubelet read-only, no ClusterRoleBinding for pod-checkpointer), the checkpointer falls back to the insecure client as usual.

E1126 07:21:31.817761       1 kubelet.go:55] failed to secure list local parent pods, fallback to insecure: Forbidden (user=system:serviceaccount:kube-system:pod-checkpointer, verb=get, resource=nodes, subresource=proxy)

On a cluster with --read-only-port=0, pod-checkpointer previously failed:

# previous 
failed to list local parent pods, assuming none are running: Get http://127.0.0.1:10255/pods/: dial tcp 127.0.0.1:10255: connect: connection refused

With this pod-checkpointer change:

# with ClusterRoleBinding, no error logs
NA
# without ClusterRolebinding, fallback to insecure, insecure is disabled
E1126 07:21:31.817761       1 kubelet.go:55] failed to secure list local parent pods, fallback to insecure: Forbidden (user=system:serviceaccount:kube-system:pod-checkpointer, verb=get, resource=nodes, subresource=proxy)
E1126 07:21:31.818357       1 kubelet.go:58] failed to insecure list local parent pods, assuming none are running: Get http://127.0.0.1:10255/pods/?timeout=15s: dial tcp 127.0.0.1:10255: connect: connection refused

I think this shows the change doesn't regess any current behavior, but adds a way to support disabling the Kubelet read-only port. I can have #1025 be a followup when a pod-checkpointer image is ready.

dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Nov 26, 2018
dghubble added a commit to poseidon/typhoon that referenced this pull request Nov 26, 2018
@rphillips
Copy link
Contributor

coreosbot run e2e checkpointer

@rphillips
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rphillips

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 26, 2018
@rphillips
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 26, 2018
@k8s-ci-robot k8s-ci-robot merged commit 83e25e5 into kubernetes-retired:master Nov 26, 2018
@rphillips
Copy link
Contributor

rphillips commented Nov 26, 2018

Pushed the new checkpoint image to quay (tag: 83e25e5968391b9eb342042c435d1b3eeddb2be1)

@dghubble dghubble deleted the checkpointer-try-secure branch November 26, 2018 17:52
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble added a commit to poseidon/typhoon that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble-robot pushed a commit to poseidon/terraform-onprem-kubernetes that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble-robot pushed a commit to poseidon/terraform-digitalocean-kubernetes that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble-robot pushed a commit to poseidon/terraform-aws-kubernetes that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble-robot pushed a commit to poseidon/terraform-google-kubernetes that referenced this pull request Nov 27, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
sphw pushed a commit to m10io/tokenx-typhoon that referenced this pull request Dec 17, 2018
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
dghubble-robot pushed a commit to poseidon/terraform-azure-kubernetes that referenced this pull request May 25, 2020
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
poseidon/typhoon#324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* kubernetes-retired/bootkube#1027
* kubernetes-retired/bootkube#1025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants