Starting a container with gVisor and an NVIDIA GPU hangs #10997
Logs from containerd
Stuck case screenshot
…ess(). Loader.createContainerProcess() => createDeviceFiles() calls devutil.GoferClientFromContext() when using nvidia-container-runtime-hook. devutil.GoferClientFromContext() expects `Kernel.containerNames` map to be initialized with the current container's ID -> name mapping. However, for sub-containers we were not initializing this map before Loader.createContainerProcess(). This change fixes that. We hadn't hit this yet because we never had multi-container usages of nvidia-container-runtime-hook. Fixes #10997 PiperOrigin-RevId: 683064310
I can see from the logs that you have a multi-container setup. The first container is a pause container. The second container is the GPU workload container.

I think #10998 should fix this issue. Let me know if it works.
@ayushr2 It isn't getting stuck now. Thanks 👍🏼. One container started successfully, but after that, all containers remained running. They should exit after displaying the output of the command.

gVisor logs: there are some unsupported syscalls and some CPU architecture-related warnings, but it ran successfully once. See runsc.log.20241007-111033.311499.gofer.txt.

Containerd logs say the container started successfully.
…ess(). Loader.createContainerProcess() => createDeviceFiles() calls devutil.GoferClientFromContext() when using nvidia-container-runtime-hook. devutil.GoferClientFromContext() expects `Kernel.containerNames` map to be initialized with the current container's ID -> name mapping. However, for sub-containers we were not initializing this map before Loader.createContainerProcess(). This change fixes that. We hadn't hit this yet because we never had multi-container usages of nvidia-container-runtime-hook. Updates #10997 PiperOrigin-RevId: 683064310
…ess(). Loader.createContainerProcess() => createDeviceFiles() calls devutil.GoferClientFromContext() when using nvidia-container-runtime-hook. devutil.GoferClientFromContext() expects `Kernel.containerNames` map to be initialized with the current container's ID -> name mapping. However, for sub-containers we were not initializing this map before Loader.createContainerProcess(). This change fixes that. We hadn't hit this yet because we never had multi-container usages of nvidia-container-runtime-hook. Updates #10997 PiperOrigin-RevId: 683256638
Can you show all log files? Were there no others?

With GPU containers, there is a "device gofer connection". This connection is only cleaned up when the container is destroyed (see lines 1289 to 1290 at cbbd0b4). The container is destroyed with `runsc delete`. The boot logs show that the container application exited.

Does this issue not happen without gVisor? I.e., try using just runc. Is the container status STOPPED without calling `runsc delete`?
…nning. This is helpful in a scenario where: - The container application is a GPU application. - The container application exits. Its mount namespace is released, and as a result all gofer mounts are released. So all gofer connections are closed, except the device gofer connection which is not associated with any mount. - The gofer process continues to hang waiting for the device gofer connection to exit. Before this change, the device gofer connection was only cleaned upon container destruction (e.g. via `runsc delete`). After this change, the gofer process is cleaned up without an explicit `runsc delete`. The gofer monitor is tracking the rootfs mount's gofer connection. When the application exits, it is notified that the connection is closed. We can use this signal to clean up the device gofer connection too. Updates #10997 PiperOrigin-RevId: 683307750
If somehow this is the case, then #11003 should fix the issue. Could you give that a shot as well?
#11003 did not work. There were no …
I think that was for a different container. The container ID and sandbox ID in the …
@PSKP-95 Does this issue not occur with runc (without gVisor)?
Yes, it does not occur with runc. I have added a screenshot of it in my previous comment and also added the containerd logs with runc.
@PSKP-95: Have you considered using this setup instead: https://gvisor.dev/docs/user_guide/containerd/quick_start/? Our own k8s setup on GKE uses a shim for containerd (containerd-shim-runsc-v1), which translates commands from containerd into runsc commands. This may be why some commands that @ayushr2 references don't appear in the logs. Your setup is not something I've seen before.
Hey @zkoopmans, I've followed that same doc to get gVisor and containerd up and running, which is working out great so far... but when I try to use it with the GPU, I'm getting an error. It is only with the GPU. In the case of GPU, I have to use the NVIDIA shim; that's what I understood from the docs.
@PSKP-95: You shouldn't need to use the NVIDIA shim. My guess is that the error you're running into with the gVisor shim is that the container crashes rather than hangs? This is because the sandboxed application tries to reach out to the GPU resource, but the sandbox doesn't have GPU support enabled. I suspect you need to set flags in the gVisor shim configuration, usually in the runsc config used by the shim. You want the `nvproxy` flag (see the sketch below).
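A minimal sketch of what enabling `nvproxy` in the shim-side runsc config could look like, assuming the `ConfigPath`-based layout from the gVisor containerd configuration docs; the path and values here are illustrative, not taken from this thread:

```shell
# Illustrative sketch: a runsc config file that the containerd shim can point
# at via ConfigPath; the nvproxy flag enables GPU passthrough in the sandbox.
cat <<'EOF' | sudo tee /etc/containerd/runsc.toml
[runsc_config]
  nvproxy = "true"
EOF
sudo systemctl restart containerd
```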
@zkoopmans From the debug logs provided above, it looks like the nvproxy flag is set.
I think that's with the NVIDIA shim method he was using, no? If he's using the gVisor shim, he'll need to set the flags in the runsc config instead of the custom runsc call he has above.
Not sure if there is an NVIDIA shim involved here. He's using the `nvidia-container-runtime` setup described in the issue.

@PSKP-95 I mostly have the environment set up, and I am able to run non-GPU pods. But in your case you are using a different runtime configuration; you might need to update that.

P.S. I have still not been able to configure the GPU part of my setup.
Even though I fixed the …
Okay, I was able to fix my setup (the issue was that there was some lingering pod in the background somehow). I can confirm that fixing the runtime configuration in `/etc/containerd/config.toml` works.

I will update our docs to mention this. This is not obvious. Unfortunately, …
I found the nvidia-container-toolkit pointers for this:
Hence, the only way to get containerd to use the runsc shim correctly alongside nvidia-container-runtime right now is to manually modify `/etc/containerd/config.toml` after running `nvidia-ctk runtime configure`, as sketched below.
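To make the manual edit concrete, here is a sketch of the intended end state; the stanza name follows the commit message below, and the verification commands are only suggestions:

```shell
# After `nvidia-ctk runtime configure --runtime=containerd`, edit
# /etc/containerd/config.toml so the "nvidia" runtime handler uses the runsc
# shim instead of the NVIDIA shim. The stanza should end up roughly like:
#
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
#     runtime_type = "io.containerd.runsc.v1"
#
sudo containerd config dump | grep -A 3 'runtimes.nvidia'   # check the effective config
sudo systemctl restart containerd                           # pick up the change
```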
…nerd. The only way to get containerd to use runsc shim correctly alongside nvidia-container-runtime right now is to manually modify /etc/containerd/config.toml after running `nvidia-ctk runtime configure` and updating `plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia` => `runtime_type` to "io.containerd.runsc.v1". See #10997 (comment). Fixes #10997 PiperOrigin-RevId: 687431610
Hi @ayushr2, are you able to get output from …? I was hoping it would mount the necessary devices and executables like it does in the case of …
Ah, I see what's going on. In my setup, I am also passing an additional flag. When I remove that flag, I get the same error you are getting. Upon inspecting the OCI spec being passed to runsc, I don't see the prestart hook in the OCI spec of the container.

Let me try to figure out what's going wrong. Until then, you can use …
Okay I figured it out.
This is to instruct the runtime shim to invoke the `nvidia-container-runtime` binary. However, this format of options is not understood by the runsc shim. To specify options for the runsc shim, we need to use the format described in https://gvisor.dev/docs/user_guide/containerd/configuration/. Here is what my setup looks like, and I can confirm that your demo works:
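The actual config from that setup is not reproduced above; below is a hedged sketch of the options format the linked doc describes, applied to the `nvidia` handler (the handler name and file paths are assumptions):

```shell
# Sketch: the runsc shim takes its options from a TOML file referenced by
# ConfigPath, not from runc-style BinaryName options.
#
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
#     runtime_type = "io.containerd.runsc.v1"
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
#     TypeUrl = "io.containerd.runsc.v1.options"
#     ConfigPath = "/etc/containerd/runsc.toml"
#
sudo systemctl restart containerd   # reload containerd after editing config.toml
```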
I will update #11060 to mention this.
…nerd. The only way to get containerd to use runsc shim correctly alongside nvidia-container-runtime right now is to manually modify /etc/containerd/config.toml after running `nvidia-ctk runtime configure`. See #10997 (comment) and #10997 (comment). Fixes #10997 PiperOrigin-RevId: 687431610
Hi @ayushr2, thank you so much! It's working now. I have a question, though: don't you think the …
Got it now. Keeping …
Yeah, you can have multiple toml config files for runsc. Here is how you could set up containerd:
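A sketch of how that could look with two handlers, each pointing at its own runsc config; the handler names and file paths here are made up for illustration:

```shell
# Give the GPU handler its own runsc config so nvproxy stays off for ordinary pods.
cat <<'EOF' | sudo tee /etc/containerd/runsc-gpu.toml
[runsc_config]
  nvproxy = "true"
EOF
# In /etc/containerd/config.toml, point each handler at its own file, roughly:
#
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc.options]
#     TypeUrl = "io.containerd.runsc.v1.options"
#     ConfigPath = "/etc/containerd/runsc.toml"       # nvproxy not set
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
#     TypeUrl = "io.containerd.runsc.v1.options"
#     ConfigPath = "/etc/containerd/runsc-gpu.toml"   # nvproxy enabled
#
sudo systemctl restart containerd
```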
I also realized that you don't even need …
Description
Using a `g4dn.xlarge` EC2 instance on AWS which has a Tesla T4. Installed NVIDIA driver `550.90.07`, which is supported by nvproxy (checked by running `runsc nvproxy list-supported-drivers`).
Then installed `nvidia-container-runtime` using the below command and configured it for `containerd`. It added the runtime in `/etc/containerd/config.toml`.
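The original commands are not preserved here; a typical sequence on Amazon Linux 2023 would look something like the sketch below (package name and steps follow NVIDIA's container-toolkit documentation and are assumptions in this context):

```shell
# Assumes the NVIDIA container toolkit repository is already configured.
sudo dnf install -y nvidia-container-toolkit             # provides nvidia-container-runtime and the hook
sudo nvidia-ctk runtime configure --runtime=containerd   # registers an "nvidia" runtime in /etc/containerd/config.toml
sudo systemctl restart containerd
```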
As per the gVisor documentation, created a bash script with the below content and kept it on PATH (`/usr/local/bin/runscgpu`).
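The script body is not shown above; presumably it is a thin wrapper that forwards to runsc with GPU flags enabled, along the lines of the sketch below (the runsc path and the exact flags are assumptions):

```shell
# Wrapper invoked by nvidia-container-runtime in place of runc; it forces the
# GPU-related runsc flags on every container it starts.
cat <<'EOF' | sudo tee /usr/local/bin/runscgpu
#!/bin/bash
exec /usr/local/bin/runsc --nvproxy --nvproxy-docker "$@"
EOF
sudo chmod +x /usr/local/bin/runscgpu
```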
Here is the nvidia-container-runtime configuration. Added `runscgpu` as the runtime.
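The configuration itself is not reproduced above; presumably `/etc/nvidia-container-runtime/config.toml` lists `runscgpu` as the low-level runtime, roughly as in the sketch below (an assumption, not the reporter's verbatim config):

```shell
# Check which low-level runtimes nvidia-container-runtime will try, in order.
# The expectation here is an entry along the lines of:
#
#   [nvidia-container-runtime]
#   runtimes = ["runscgpu"]
#
cat /etc/nvidia-container-runtime/config.toml
```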
Steps to reproduce
`lspci | grep -i nvidia`

Created `pod.json` and `container.json` like below:

pod.json

container.json
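For illustration, a typical crictl flow for this kind of repro might look like the following; the `nvidia` runtime handler name and the file names are assumptions:

```shell
# Create the pod sandbox with the NVIDIA runtime handler, then create and
# start the GPU container inside it. In the reported failure, the container
# gets stuck at this point instead of running to completion.
POD_ID=$(sudo crictl runp --runtime=nvidia pod.json)
CTR_ID=$(sudo crictl create "$POD_ID" container.json pod.json)
sudo crictl start "$CTR_ID"
sudo crictl ps -a   # inspect container state
```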
runsc version
docker version (if using docker)
uname
Linux ip-172-21-148-200.eu-central-1.compute.internal 6.1.109-118.189.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Sep 10 08:59:12 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
kubectl (if using Kubernetes)
No response
repo state (if built from source)
No response
runsc debug logs (if available)