
pull images failed #80

Closed
haozi4263 opened this issue Dec 2, 2022 · 8 comments
Labels
bug Something isn't working

@haozi4263

finch pull --platform=amd64 xxx

FATA[1167] failed to extract layer sha256:9cc8d31519b533c03cd8347147f9ea0b9bfbda4650200d388a1495a34812283f: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount3705620677: failed to Lchown "/var/lib/containerd/tmpmounts/containerd-mount3705620677/kubeflow/src" for UID 29511686, GID 1085706827: lchown /var/lib/containerd/tmpmounts/containerd-mount3705620677/kubeflow/src: invalid argument (Hint: try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): unknown
FATA[1168] exit status 1

@haozi4263 haozi4263 added the bug Something isn't working label Dec 2, 2022
@estesp
Contributor

estesp commented Dec 2, 2022

Is this image public/shareable? It looks like an image that uses extremely large UIDs and/or GIDs, which, when running rootless (or simply via a runtime with user namespaces enabled), exhausts the standard ~65k (2^16) range of UIDs/GIDs used to map filesystem ownership. I expect this image will not run on any rootless or user-namespace-enabled container runtime unless the /etc/sub{u,g}id files are set up to allow a significantly larger range of subordinate IDs within containers.

I'm not quite sure what the value of using IDs in the very high range is (that UID is somewhere above 2^24?; the GID is even larger!), but if you own the image, I would be curious why the owner and group need such extremely large integers.
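To illustrate the mismatch, here is a minimal sketch comparing the IDs from the error message above against the default 65536-entry subordinate range that useradd creates on most distributions (the variable names are illustrative, not from any tool):

```shell
# Default subordinate ID range created by useradd on most distributions
default_range=65536           # 2^16
image_uid=29511686            # UID from the lchown error above (> 2^24)
image_gid=1085706827          # GID from the lchown error above (> 2^30)

# Rootless mode can only map container IDs that fall inside the
# subordinate range; anything beyond it makes lchown fail with EINVAL.
for id in "$image_uid" "$image_gid"; do
  if [ "$id" -ge "$default_range" ]; then
    echo "$id exceeds the default range of $default_range"
  fi
done
```

Both IDs are hundreds to thousands of times larger than the default range, which is why the layer extraction fails only under user namespaces.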

@haozi4263
Author

The image ccr.ccs.tencentyun.com/cube-studio/kubeflow-dashboard:2022.09.01 is public.
Pulling it with docker pull ccr.ccs.tencentyun.com/cube-studio/kubeflow-dashboard:2022.09.01 works fine.

@ningziwen
Member

ningziwen commented Dec 5, 2022

Reproduced in Finch.

FATA[0125] failed to extract layer sha256:9cc8d31519b533c03cd8347147f9ea0b9bfbda4650200d388a1495a34812283f: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount3084210000: failed to Lchown "/var/lib/containerd/tmpmounts/containerd-mount3084210000/kubeflow/src" for UID 29511686, GID 1085706827: lchown /var/lib/containerd/tmpmounts/containerd-mount3084210000/kubeflow/src: invalid argument (Hint: try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): unknown
FATA[0114] exit status 1

However, it worked with the nerdctl built from the v1.0.0 tag, which is what we are using in Finch. Will continue the investigation.

@estesp
Contributor

estesp commented Dec 5, 2022

It's important to compare nerdctl (or any other runtime tool) running the same way it runs inside Finch, which, based on the output, is inside a user namespace ("rootless" mode, specifically). The container shown will probably work on any container runtime that does not run containers within a user namespace (either "rootless" mode or simply a root-created user namespace with a specific range of subordinate UIDs and GIDs). If you use the nerdctl install that sets up rootless mode on a Linux system, you should be able to reproduce the same issue, unless you use an extremely large subordinate mapping for the ID ranges.

@ningziwen
Member

Reproduced with nerdctl in the Finch VM shell.

FATA[0139] failed to extract layer sha256:9cc8d31519b533c03cd8347147f9ea0b9bfbda4650200d388a1495a34812283f: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount1146161846: failed to Lchown "/var/lib/containerd/tmpmounts/containerd-mount1146161846/kubeflow/src" for UID 29511686, GID 1085706827: lchown /var/lib/containerd/tmpmounts/containerd-mount1146161846/kubeflow/src: invalid argument (Hint: try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): unknown

@ningziwen
Member

ningziwen commented Dec 6, 2022

Validated that the pull works after extending the subuid and subgid ranges.

[ningziwe@lima-finch ningziwe]$ cat /etc/subuid
ningziwe:100000:29700000
[ningziwe@lima-finch ningziwe]$ cat /etc/subgid
ningziwe:100000:1085800000
[ningziwe@lima-finch ningziwe]$
logout
➜  ~ finch pull ccr.ccs.tencentyun.com/cube-studio/kubeflow-dashboard:2022.09.01
...
elapsed: 339.7s                                                                   total:  942.4  (2.8 MiB/s)
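As a quick sanity check on the ranges above (a sketch using the values from the error message, not output from any Finch tool): a subordinate range of length N can map container IDs up to roughly N, so each range must exceed the corresponding image ID.

```shell
# Range lengths from the /etc/subuid and /etc/subgid entries above
subuid_count=29700000
subgid_count=1085800000
# IDs required by the image (from the lchown error)
image_uid=29511686
image_gid=1085706827

# A subordinate range of length N covers container IDs up to roughly N,
# so both checks below must pass for the layer extraction to succeed.
[ "$subuid_count" -gt "$image_uid" ] && echo "subuid range covers UID"
[ "$subgid_count" -gt "$image_gid" ] && echo "subgid range covers GID"
```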

Workaround:

# Log in to the VM shell
LIMA_HOME=/Applications/Finch/lima/data /Applications/Finch/lima/bin/limactl shell finch

# In the VM shell, increase the subordinate ID ranges in /etc/subuid and /etc/subgid
sudo vi /etc/subuid
sudo vi /etc/subgid

# Log out of the VM shell and restart the Finch VM
finch vm stop
finch vm start

# Try to pull the image again
finch pull ccr.ccs.tencentyun.com/cube-studio/kubeflow-dashboard:2022.09.01

@ningziwen
Member

As @estesp mentioned, the root cause is that the image uses extremely large UID/GID values, while the default subordinate ID range in Finch is 65536.

I found a relevant issue in Kubernetes. According to that issue, 65536 is the default subordinate UID/GID range on most distributions, and the issue tracks fixing the extremely large UID/GID on the image side.

I suggest referring to that issue and checking whether the UID/GID of your image should or could be adjusted.

If you find it necessary to use images with extremely large UID/GID values, please elaborate on the use case here. We can discuss making subuid/subgid configurable if the use case can be justified.

@ningziwen
Member

The large UID/GID issue was resolved by switching to a rootful container runtime inside the VM. #196
