Pods can't find cached images in minikube with containerd #4985

Closed
priyawadhwa opened this issue Aug 5, 2019 · 5 comments
Labels
cmd/cache Issues with the "cache" command co/runtime/containerd priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

@priyawadhwa

Caching images doesn't work with the containerd backend. Pods always try to pull images from a remote registry, even if the images exist locally.

To reproduce, copy these files into a directory:

Dockerfile

FROM busybox
CMD sleep 6000

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: cache-bug
  namespace: default
spec:
  containers:
  - name: cache-bug
    image: gcr.io/k8s-minikube/cache-bug
    imagePullPolicy: IfNotPresent

Then follow these steps:

  1. Build minikube from my branch (Rebuild gvisor image for integration tests #4717), which adds support for caching local images with minikube cache add.
  2. Run docker build -t gcr.io/k8s-minikube/cache-bug .
  3. Run minikube start --container-runtime=containerd --docker-opt containerd=/var/run/containerd/containerd.sock --logtostderr
  4. Run minikube cache add gcr.io/k8s-minikube/cache-bug
  5. Run minikube ssh -- sudo ctr images ls | grep cache-bug, which should show that the image exists within minikube.
  6. Run kubectl apply -f pod.yaml
  7. Run kubectl get pod cache-bug, which shows status ImagePullBackOff instead of Running: the pod tries to pull the remote image, which doesn't exist, even though the image exists locally.

The above steps with the docker runtime work as expected, with the pod having status Running after a few seconds.

@tstromberg tstromberg added this to the v1.4.0 Candidate milestone Aug 5, 2019
@tstromberg tstromberg added cmd/cache Issues with the "cache" command priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 5, 2019
@tstromberg tstromberg removed their assignment Aug 6, 2019
@afbjorklund
Collaborator

afbjorklund commented Aug 7, 2019

This looks like a bug in the containerd CRI plugin that is present neither in dockershim nor in other plugins.

Fails with containerd:

  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  9s    default-scheduler  Successfully assigned default/cache-bug to minikube
  Normal   Pulling    6s    kubelet, minikube  Pulling image "gcr.io/k8s-minikube/cache-bug"
  Warning  Failed     5s    kubelet, minikube  Failed to pull image "gcr.io/k8s-minikube/cache-bug": rpc error: code = Unknown desc = failed to resolve image "gcr.io/k8s-minikube/cache-bug:latest": no available registry endpoint: gcr.io/k8s-minikube/cache-bug:latest not found
  Warning  Failed     5s    kubelet, minikube  Error: ErrImagePull
  Normal   BackOff    5s    kubelet, minikube  Back-off pulling image "gcr.io/k8s-minikube/cache-bug"
  Warning  Failed     5s    kubelet, minikube  Error: ImagePullBackOff

Works fine with CRI-O:

  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  7s    default-scheduler  Successfully assigned default/cache-bug to minikube
  Normal  Pulled     5s    kubelet, minikube  Container image "gcr.io/k8s-minikube/cache-bug" already present on machine
  Normal  Created    5s    kubelet, minikube  Created container cache-bug

If I understood correctly, this could be an issue with GetImageRef as used by shouldPullImage:

	spec := kubecontainer.ImageSpec{Image: image}
	imageRef, err := m.imageService.GetImageRef(spec)
	if err != nil {
		msg := fmt.Sprintf("Failed to inspect image %q: %v", container.Image, err)
		m.logIt(ref, v1.EventTypeWarning, events.FailedToInspectImage, logPrefix, msg, klog.Warning)
		return "", msg, ErrImageInspect
	}

	present := imageRef != ""
	if !shouldPullImage(container, present) {
		if present {
			msg := fmt.Sprintf("Container image %q already present on machine", container.Image)
			m.logIt(ref, v1.EventTypeNormal, events.PulledImage, logPrefix, msg, klog.Info)
			return imageRef, "", nil
		}
		msg := fmt.Sprintf("Container image %q is not present with pull policy of Never", container.Image)
		m.logIt(ref, v1.EventTypeWarning, events.ErrImageNeverPullPolicy, logPrefix, msg, klog.Warning)
		return "", msg, ErrImageNeverPull
	}

But nothing indicates that this is caused by minikube; it looks like an upstream containerd issue.

@afbjorklund
Collaborator

Changing to Never gives a more distinct problem statement:

Warning ErrImageNeverPull 2s (x2 over 2s) kubelet, minikube Container image "gcr.io/k8s-minikube/cache-bug" is not present with pull policy of Never

Even though the image is successfully imported and listed:

unpacking gcr.io/k8s-minikube/cache-bug:latest (sha256:ec1de3156f10594891bec7321851727593665df38eaa1f26af5dda9dc532c125)...done

$ sudo ctr images list 
REF                                            TYPE                                       DIGEST                                                                  SIZE      PLATFORMS   LABELS 
gcr.io/k8s-minikube/cache-bug:latest           application/vnd.oci.image.manifest.v1+json sha256:ec1de3156f10594891bec7321851727593665df38eaa1f26af5dda9dc532c125 1.4 MiB   linux/amd64 -      
gcr.io/k8s-minikube/storage-provisioner:v1.8.1 application/vnd.oci.image.manifest.v1+json sha256:9b4c1942733f9e4209ad4764691f72099b2d6e3967708d441cd30386de29b8f1 19.7 MiB  linux/amd64 -      
...

The shouldPullImage code in Kubernetes is quite trivial, so the problem is with imageRef:

// shouldPullImage returns whether we should pull an image according to
// the presence and pull policy of the image.
func shouldPullImage(container *v1.Container, imagePresent bool) bool {
	if container.ImagePullPolicy == v1.PullNever {
		return false
	}

	if container.ImagePullPolicy == v1.PullAlways ||
		(container.ImagePullPolicy == v1.PullIfNotPresent && (!imagePresent)) {
		return true
	}

	return false
}
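
To make the two failure modes above easier to compare, here is a minimal, dependency-free sketch of the same decision logic. The `PullPolicy` type, `shouldPull`, and `outcome` are hypothetical stand-ins written for this illustration (they only mirror the branches quoted above and are not the kubelet's code). With the image reported as not present, IfNotPresent leads to a pull attempt and Never leads to ErrImageNeverPull, which matches the events seen with containerd.

```go
package main

import "fmt"

// PullPolicy mirrors the string values of Kubernetes' v1.PullPolicy.
// It is redefined here only to keep the sketch free of dependencies.
type PullPolicy string

const (
	PullAlways       PullPolicy = "Always"
	PullNever        PullPolicy = "Never"
	PullIfNotPresent PullPolicy = "IfNotPresent"
)

// shouldPull follows the same branches as the shouldPullImage function
// quoted above, taking the policy directly instead of a *v1.Container.
func shouldPull(policy PullPolicy, imagePresent bool) bool {
	if policy == PullNever {
		return false
	}
	if policy == PullAlways || (policy == PullIfNotPresent && !imagePresent) {
		return true
	}
	return false
}

// outcome summarizes what happens for a given policy and presence flag,
// following the quoted kubelet snippet: pull, reuse the local image, or
// fail with ErrImageNeverPull.
func outcome(policy PullPolicy, imagePresent bool) string {
	switch {
	case shouldPull(policy, imagePresent):
		return "pull"
	case imagePresent:
		return "use local image"
	default:
		return "ErrImageNeverPull"
	}
}

func main() {
	// Print the full decision table for every policy and presence value.
	for _, policy := range []PullPolicy{PullAlways, PullIfNotPresent, PullNever} {
		for _, present := range []bool{true, false} {
			fmt.Printf("%-13s present=%-5t -> %s\n", policy, present, outcome(policy, present))
		}
	}
}
```

Since the function itself behaves as expected for every combination, the only input left to suspect is the presence flag, that is, the empty imageRef returned by GetImageRef.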

@afbjorklund
Collaborator

afbjorklund commented Aug 8, 2019

Apparently we are calling the wrong command: ctr images import rather than ctr cri load.
This causes the image to show up in ctr images list but, unfortunately, not in crictl images.

This will be fixed in containerd 1.3: containerd/cri#909

But for now (1.2.6), we need to use the older command and add some version handling later on.
That is, check the containerd version and use the appropriate command (for containerd 1.3).

@priyawadhwa: can you verify whether running sudo ctr cri load on the tarball fixes the issue?


@tstromberg: do you know why it was changed earlier? (e091338 in #3767)

'ctr cri load' doesn't work, use 'ctr image import' instead.

Was that when we had an older (broken) containerd, perhaps?

UPDATE: Apparently we could also use ctr --namespace k8s.io
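
The namespace remark explains the mismatch between ctr images list and crictl images: ctr works in the "default" namespace unless told otherwise, while the CRI plugin (and so the kubelet and crictl) looks in "k8s.io". The toy model below illustrates that idea only. The `imageStore` type and its methods are hypothetical names made up for this sketch and are not containerd's API.

```go
package main

import "fmt"

// imageStore is a toy model of namespaced image metadata:
// namespace -> set of image references. It is not containerd's API.
type imageStore map[string]map[string]bool

// importImage records ref as present in the given namespace.
func (s imageStore) importImage(namespace, ref string) {
	if s[namespace] == nil {
		s[namespace] = make(map[string]bool)
	}
	s[namespace][ref] = true
}

// present reports whether ref is visible from the given namespace.
func (s imageStore) present(namespace, ref string) bool {
	return s[namespace][ref]
}

const (
	defaultNamespace = "default" // what ctr uses when no namespace is given
	criNamespace     = "k8s.io"  // what the CRI plugin uses
)

func main() {
	const ref = "gcr.io/k8s-minikube/cache-bug:latest"
	store := imageStore{}

	// Importing without a namespace lands the image in "default" ...
	store.importImage(defaultNamespace, ref)
	fmt.Println("visible to ctr:    ", store.present(defaultNamespace, ref))
	// ... so a lookup from the CRI namespace does not find it.
	fmt.Println("visible to kubelet:", store.present(criNamespace, ref))

	// Importing into the CRI namespace makes it visible to the kubelet.
	store.importImage(criNamespace, ref)
	fmt.Println("after k8s.io import:", store.present(criNamespace, ref))
}
```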

@priyawadhwa
Author

Hey @afbjorklund, thanks for looking into this -- ctr cri load doesn't seem to be working.

Failed to cache and load images: loading cached images: loading image /Users/priyawadhwa/.minikube/cache/images/gcr.io/k8s-minikube/cache-bug: containerd load /tmp/cache-bug: command failed: sudo ctr cri load /tmp/cache-bug
stdout: 
stderr: ctr: failed to load image: rpc error: code = Unknown desc = failed to import image: image config "sha256:48eb125734951ed6638bd39b9d8e9abac01ff72a758db972fbe7c5446254d123" not found
: Process exited with status 1

But I tried with sudo ctr -n=k8s.io images import and that is working! I'll incorporate that change into my PR and see if that solves the issue. Thanks!
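
For reference, a minimal sketch of how the working command line could be assembled in Go. The helper name `importArgs` is hypothetical and this is not the actual minikube change. It only rebuilds the exact command reported as working above, with /tmp/cache-bug taken from the error output as an example path.

```go
package main

import (
	"fmt"
	"strings"
)

// criNamespace is the containerd namespace named in the working command.
const criNamespace = "k8s.io"

// importArgs returns the argument vector for importing an image tarball
// into the k8s.io namespace, matching: sudo ctr -n=k8s.io images import <tarball>
func importArgs(tarball string) []string {
	return []string{"sudo", "ctr", "-n=" + criNamespace, "images", "import", tarball}
}

func main() {
	// Prints: sudo ctr -n=k8s.io images import /tmp/cache-bug
	fmt.Println(strings.Join(importArgs("/tmp/cache-bug"), " "))
}
```

Returning a slice rather than one string keeps the tarball path as a single argument, so a path containing spaces would not be split if the slice were later handed to a process runner.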

@afbjorklund
Collaborator

The fix was, confusingly, included in #4717 as ee4cbbb.

4 participants