
latest storage-provisioner: cannot get resource "endpoints" in API group "" in the namespace "kube-system" #6523

Closed
tolusha opened this issue Feb 6, 2020 · 20 comments
Assignees
Labels
addon/storage-provisioner Issues relating to storage provisioner addon
kind/bug Categorizes issue or PR as related to a bug.
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

tolusha commented Feb 6, 2020

Not sure, but it seems related to https://github.com/kubernetes/minikube/pull/6511/files

The exact command to reproduce the issue:

minikube start
kubectl create namespace test
kubectl apply -f pvc.yaml

The full output of the command that failed:


storage provisioner logs:

E0206 10:46:39.353949       1 leaderelection.go:331] error retrieving resource lock kube-system/k8s.io-minikube-hostpath: endpoints "k8s.io-minikube-hostpath" is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot get resource "endpoints" in API group "" in the namespace "kube-system

The output of the minikube logs command:

The operating system version:

minikube version
minikube version: v1.5.2
commit: 792dbf92a1de583fcee76f8791cff12e0c9440ad-dirty
cat /etc/os-release 
NAME="Linux Mint"
VERSION="19.3 (Tricia)"

pvc.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test
  namespace: test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
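
For what it's worth, the forbidden error from the log can be reproduced directly with kubectl's permission check, which makes it easy to tell whether a given cluster is affected (a sketch; the service account name is taken from the error message above):

$ kubectl auth can-i get endpoints -n kube-system \
    --as=system:serviceaccount:kube-system:storage-provisioner
# prints "no" on an affected cluster and "yes" once the provisioner's RBAC is sufficient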
dimara (Contributor) commented Feb 6, 2020

We, Arrikto, also bumped into this with MiniKF, which uses Minikube v1.2.0.

It seems that the storage-provisioner image was retagged yesterday. Specifically:
OLD:

gcr.io/k8s-minikube/storage-provisioner                            v1.8.1                                     sha256:088daa9fcbccf04c3f415d77d5a6360d2803922190b675cb7fc88a9d2d91985a   4689081edb10        2 years ago         80.8MB

NEW:

gcr.io/k8s-minikube/storage-provisioner                            v1.8.1                                     sha256:341d2667af45c57709befaf7524dc1649a3b025d63a03ee6d48013574043c93d   c74c2f8485c3        6 hours ago         30.7MB

This is very likely to cause problems and break existing software/deployments, as it already has. Pushing different images under the same tag should be avoided, right?

dimara (Contributor) commented Feb 6, 2020

I see that PR #6156 changes the storage-provisioner code. @tstromberg @nanikjava shouldn't the STORAGE_PROVISIONER_TAG variable be changed after this?

marlowl commented Feb 6, 2020

Got a related problem, I guess: kubectl describe pvc gives the following message: waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator. The PVC is stuck at Pending. Running minikube v1.7.1.
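
If you see the same symptom, the provisioner's own logs usually reveal whether it is the same RBAC failure (a sketch; the pod name storage-provisioner assumes a default single-node minikube):

$ kubectl -n kube-system get pods
$ kubectl -n kube-system logs storage-provisioner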

@tstromberg (Contributor)

Likely related to #6496

Since the bug is affecting older versions, it sounds like we rolled out a buggy provisioner image. I'll see if I can back it out. In the future we need to make sure not to repush images to tags that have already been released.

tstromberg self-assigned this Feb 6, 2020
@tstromberg (Contributor)

@dimara thanks for the hashes!

I ran:

docker tag gcr.io/k8s-minikube/storage-provisioner@sha256:088daa9fcbccf04c3f415d77d5a6360d2803922190b675cb7fc88a9d2d91985a gcr.io/k8s-minikube/storage-provisioner:v1.8.1
docker push gcr.io/k8s-minikube/storage-provisioner:v1.8.1

Please let me know if this has helped in your environment.
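
To confirm which binary you now have locally, comparing the image digest against the OLD one posted above should be enough (a sketch; with a VM driver, point the check at the Docker daemon inside the minikube VM):

$ minikube ssh -- docker images --digests gcr.io/k8s-minikube/storage-provisioner
# expect the sha256:088daa9f... digest from the OLD listing above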

tstromberg changed the title from "It is not possible to create PVC anymore" to "storage-provisioner: cannot get resource "endpoints" in API group "" in the namespace "kube-system"" (Feb 6, 2020)
tstromberg added the kind/bug, priority/important-soon, and addon/storage-provisioner labels (Feb 6, 2020)
VlasLive commented Feb 6, 2020

> @dimara thanks for the hashes!
>
> I ran:
>
> docker tag gcr.io/k8s-minikube/storage-provisioner@sha256:088daa9fcbccf04c3f415d77d5a6360d2803922190b675cb7fc88a9d2d91985a gcr.io/k8s-minikube/storage-provisioner:v1.8.1
> docker push gcr.io/k8s-minikube/storage-provisioner:v1.8.1
>
> Please let me know if this has helped in your environment.

Helped in our env.

tolusha (Author) commented Feb 6, 2020

It works on my side.

dimara (Contributor) commented Feb 6, 2020

@tstromberg It works now. Thanks!

@tstromberg (Contributor)

Thanks!

tstromberg reopened this Feb 6, 2020
@tstromberg (Contributor)

This should no longer affect users, but I'm leaving this open until we address the root cause.

Process fix: #6526
Root cause fix TBD

tstromberg changed the title from "storage-provisioner: cannot get resource "endpoints" in API group "" in the namespace "kube-system"" to "latest storage-provisioner: cannot get resource "endpoints" in API group "" in the namespace "kube-system"" (Feb 6, 2020)
@djmarcin

I was broken by this for a while before I figured out that, in order to fix it, I had to completely purge my ~/.minikube directory so that it would fetch the corrected v1.8.1. Would it be good to publish a v1.8.2 so that anyone who already has the broken v1.8.1 image gets the update?
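
For anyone else stuck on the cached image, the purge-and-recreate sequence looks roughly like this (a sketch; it throws away all local minikube state, so only run it if the cluster is disposable):

$ minikube delete
$ rm -rf ~/.minikube
$ minikube start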

@ChenRoth

Is there any plan to publish a newer tag for the fix? The alternative is to hack minikube into working :(
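
One such hack, if repulling the image is not an option, is to widen the RBAC of the provisioner's service account yourself (a sketch only, not an official fix; cluster-admin is far broader than needed and a dedicated Role/RoleBinding would be cleaner):

$ kubectl create clusterrolebinding storage-provisioner-workaround \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:storage-provisioner
$ kubectl -n kube-system delete pod storage-provisioner   # the addon manager should recreate it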

bodom0015 commented Mar 9, 2020

Possibly a duplicate of #3129, which appears to still be present in v1.8.1

@afbjorklund (Collaborator)

Pushing a v1.8.2 sounds like a good fix for this (the bad binary).

ianfoo (Contributor) commented Mar 10, 2020

Another user report in case it helps anyone:

I'd read this thread a number of times and still couldn't figure out how to get past having PVCs stuck in Pending. I had the right hash on the provisioner image, based on what's shown above. However, when I checked the pods in the kube-system namespace, I saw storage-provisioner stuck in the Terminating state for some reason. I force-deleted this pod (with --force --grace-period 0) and restarted minikube (v1.8.1); the storage-provisioner pod came up successfully, and my PVCs were bound to newly created PVs. Unblocked! 🙌🏼
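
In command form, that recovery is roughly (a sketch; pod name as in the kube-system listings above):

$ kubectl -n kube-system delete pod storage-provisioner --force --grace-period 0
$ minikube stop
$ minikube start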

limbuu commented Apr 24, 2020

@tstromberg
Thanks, it works now.

I was having the same issue with minikube v1.8.1, using the VirtualBox VM driver.

$ minikube version
minikube version: v1.8.1
commit: cbda04cf6bbe65e987ae52bb393c10099ab62014

$ kubectl apply -f mysql-pvc.yaml 
persistentvolumeclaim/mysql-data-disk created

$ kubectl get pvc
NAME              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql-data-disk   Pending                                      standard       14m

$ kubectl describe pvc mysql-data-disk
Name:          mysql-data-disk
Namespace:     default
StorageClass:  standard
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration:
                 {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"mysql-data-disk","namespace":"default"},"spec":{"ac...
               volume.beta.kubernetes.io/storage-provisioner: k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Mounted By:    mysql-deployment-655cdd8d98-rqbrf
Events:
  Type    Reason                Age                  From                         Message
  ----    ------                ----                 ----                         -------
  Normal  ExternalProvisioning  2m (x26 over 8m15s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator

When I dug deeper, I found that the storage-provisioner addon was enabled but its pod had been evicted.

$ minikube addons list
|-----------------------------|----------|--------------|
|         ADDON NAME          | PROFILE  |    STATUS    |
|-----------------------------|----------|--------------|
| dashboard                   | minikube | disabled     |
| default-storageclass        | minikube | enabled ✅   |
| efk                         | minikube | disabled     |
| freshpod                    | minikube | disabled     |
| gvisor                      | minikube | disabled     |
| helm-tiller                 | minikube | disabled     |
| ingress                     | minikube | disabled     |
| ingress-dns                 | minikube | disabled     |
| istio                       | minikube | disabled     |
| istio-provisioner           | minikube | disabled     |
| logviewer                   | minikube | disabled     |
| metrics-server              | minikube | disabled     |
| nvidia-driver-installer     | minikube | disabled     |
| nvidia-gpu-device-plugin    | minikube | disabled     |
| registry                    | minikube | disabled     |
| registry-creds              | minikube | disabled     |
| storage-provisioner         | minikube | enabled ✅   |
| storage-provisioner-gluster | minikube | disabled     |
|-----------------------------|----------|--------------|

$ kubectl get pods -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
coredns-6955765f44-fcnsw      1/1     Running   26         28d
coredns-6955765f44-tjlgt      1/1     Running   25         28d
etcd-m01                      1/1     Running   17         24d
kube-apiserver-m01            1/1     Running   2864       24d
kube-controller-manager-m01   1/1     Running   1481       28d
kube-proxy-wwm2q              1/1     Running   25         28d
kube-scheduler-m01            1/1     Running   1481       28d
storage-provisioner           0/1     Evicted   0          28d

Now it's working:

$ docker pull gcr.io/k8s-minikube/storage-provisioner:v1.8.1
v1.8.1: Pulling from k8s-minikube/storage-provisioner
Digest: sha256:088daa9fcbccf04c3f415d77d5a6360d2803922190b675cb7fc88a9d2d91985a
Status: Image is up to date for gcr.io/k8s-minikube/storage-provisioner:v1.8.1
gcr.io/k8s-minikube/storage-provisioner:v1.8.1

$ docker images
REPOSITORY                                TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-proxy                     v1.17.3             ae853e93800d        2 months ago        116MB
k8s.gcr.io/kube-apiserver                 v1.17.3             90d27391b780        2 months ago        171MB
k8s.gcr.io/kube-controller-manager        v1.17.3             b0f1517c1f4b        2 months ago        161MB
k8s.gcr.io/kube-scheduler                 v1.17.3             d109c0821a2b        2 months ago        94.4MB
<none>                                    <none>              af341ccd2df8        3 months ago        5.56MB
kubernetesui/dashboard                    v2.0.0-beta8        eb51a3597525        4 months ago        90.8MB
k8s.gcr.io/coredns                        1.6.5               70f311871ae1        5 months ago        41.6MB
k8s.gcr.io/etcd                           3.4.3-0             303ce5db0e90        6 months ago        288MB
kubernetesui/metrics-scraper              v1.0.2              3b08661dc379        6 months ago        40.1MB
gluster/glusterfile-provisioner           latest              1fd2d1999179        23 months ago       233MB
k8s.gcr.io/pause                          3.1                 da86e6ba6ca1        2 years ago         742kB
gcr.io/k8s-minikube/storage-provisioner   v1.8.1              4689081edb10        2 years ago         80.8MB

$ kubectl get pods -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
coredns-6955765f44-fcnsw      1/1     Running   26         28d
coredns-6955765f44-tjlgt      1/1     Running   25         28d
etcd-m01                      1/1     Running   17         24d
kube-apiserver-m01            1/1     Running   2864       24d
kube-controller-manager-m01   1/1     Running   1481       28d
kube-proxy-wwm2q              1/1     Running   25         28d
kube-scheduler-m01            1/1     Running   1481       28d
storage-provisioner           1/1     Running   0          114s

$ kubectl apply -f mysql-pvc.yaml 
persistentvolumeclaim/mysql-data-disk unchanged

$ kubectl get pvc
NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql-data-disk   Bound    pvc-4e8b1ff1-9080-449c-afbd-deca623d4b0c   100Mi      RWO            standard       42m

limbuu commented Apr 24, 2020

@tstromberg
Although the PVC and PV are created, another problem keeps the pod in the Pending state.

$ kubectl logs -f storage-provisioner -n kube-system
E0424 08:40:25.014427       1 controller.go:682] Error watching for provisioning success, can't provision for claim "default/mysql-data-disk": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list resource "events" in API group "" in the namespace "default"
E0424 08:59:56.106645       1 controller.go:682] Error watching for provisioning success, can't provision for claim "default/mysql-data-disk": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list resource "events" in API group "" in the namespace "default"

Looking at the mysql pod:

$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
mysql-deployment-655cdd8d98-8cfv9   0/1     Pending   0          10s

$ kubectl describe pod mysql-deployment-655cdd8d98-8cfv9
Name:           mysql-deployment-655cdd8d98-8cfv9
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=mysql
                pod-template-hash=655cdd8d98
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/mysql-deployment-655cdd8d98
Containers:
  mysql:
    Image:      mysql:5.7
    Port:       3306/TCP
    Host Port:  0/TCP
    Environment:
      MYSQL_ROOT_PASSWORD:  <set to the key 'MYSQL_ROOT_PASSWORD' in secret 'mysql-secret'>  Optional: false
      MYSQL_DATABASE:       <set to the key 'MYSQL_DATABASE' in secret 'mysql-secret'>       Optional: false
      MYSQL_USER:           <set to the key 'MYSQL_USER' in secret 'mysql-secret'>           Optional: false
      MYSQL_PASSWORD:       <set to the key 'MYSQL_PASSWORD' in secret 'mysql-secret'>       Optional: false
    Mounts:
      /var/lib/mysql from mysql-data (rw,path="mysql")
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bcgp4 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  mysql-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mysql-data-disk
    ReadOnly:   false
  default-token-bcgp4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bcgp4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  persistentvolumeclaim "mysql-data-disk" not found
  Warning  FailedScheduling  <unknown>  default-scheduler  persistentvolumeclaim "mysql-data-disk" not found
  Warning  FailedScheduling  <unknown>  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

FYI, I have not added any taints, node selectors, or node affinity/anti-affinity rules.
And after a couple of hours of running, the storage-provisioner pod gets evicted again.

$ kubectl get pods -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
coredns-6955765f44-fcnsw      1/1     Running   26         28d
coredns-6955765f44-tjlgt      1/1     Running   25         28d
etcd-m01                      1/1     Running   17         24d
kube-apiserver-m01            1/1     Running   2864       24d
kube-controller-manager-m01   1/1     Running   1481       28d
kube-proxy-wwm2q              1/1     Running   25         28d
kube-scheduler-m01            1/1     Running   1481       28d
storage-provisioner           0/1     Evicted   0          163m

limbuu commented Apr 29, 2020

The pod was evicted because of low disk space; removing stopped containers, unused images, and intermediate images (the <none> tags) fixed the issue for me.

$ docker rm $(docker ps -a -q)

$ docker rmi $(docker images -f dangling=true -q )
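
If it is not obvious whether disk pressure is the culprit, checking free space inside the minikube VM first can save a round of cleanup (a sketch; with a VM driver the Docker daemon that matters is the one inside the VM, and docker system prune is more aggressive than the two commands above):

$ minikube ssh -- df -h /var
$ minikube ssh -- docker system prune -f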

priyawadhwa added the priority/important-longterm label and removed the priority/important-soon label (Jun 17, 2020)
@mostafa-eltaher

This worked for me and fixed the issue on minikube 1.12.1
#8414 (comment)

@sharifelgamal (Collaborator)

This should be fixed with the newest version of storage-provisioner. I'm going to close this issue for now but if someone continues to see issues, open a new bug or comment here.
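
For anyone who wants to verify which provisioner they are actually running before reporting back, inspecting the image on the pod should answer it (a sketch; the pod name assumes a default single-node setup):

$ kubectl -n kube-system get pod storage-provisioner \
    -o jsonpath='{.spec.containers[0].image}'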
