[Improvement] Persist CUDA compat libraries paths to prevent reset on apt-get#30784

Merged
vllm-bot merged 2 commits into vllm-project:main from emricksini-h:fix/persist-cuda-compat-config
Jan 13, 2026

Conversation

@emricksini-h
Contributor

@emricksini-h emricksini-h commented Dec 16, 2025

Currently, the Dockerfile registers CUDA compatibility libraries using a transient RUN ldconfig /path/to/compat command. This updates the cache but does not persist the configuration.

If a user extends this image or runs a debug container and executes apt-get install (which triggers a default ldconfig), the custom compatibility path is wiped from the cache. This causes the container to silently fall back to the host driver's native CUDA version (e.g., 12.4) instead of the container's optimized version (12.9), potentially degrading performance or raising version compatibility mismatch errors.

This PR makes the setup more robust by writing the path to /etc/ld.so.conf.d/00-cuda-compat.conf before running ldconfig. This ensures the compatibility layer persists regardless of future package installations or cache rebuilds.
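The change can be sketched roughly as follows (a minimal illustration, not the exact Dockerfile lines; `CONF_DIR` defaults to a temp directory here so the sketch runs unprivileged, whereas the real image writes to `/etc/ld.so.conf.d`):

```shell
# Persist the CUDA compat path so any future ldconfig run keeps it.
# In the image this file lives under /etc/ld.so.conf.d/; a temp dir is
# used here only so the sketch can run without root.
CUDA_VERSION="${CUDA_VERSION:-12.9.1}"
COMPAT_DIR="/usr/local/cuda-$(echo "$CUDA_VERSION" | cut -d. -f1,2)/compat"
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
echo "$COMPAT_DIR" > "$CONF_DIR/00-cuda-compat.conf"
# ldconfig   # in the real image: rebuild the cache from the conf files
cat "$CONF_DIR/00-cuda-compat.conf"
```

Because the path now lives in a conf file, a bare `ldconfig` triggered by `apt-get` rebuilds the cache from that file instead of silently dropping the compat entry.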

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly persists the CUDA compatibility library path by creating a configuration file in /etc/ld.so.conf.d/, which is a good improvement for the Docker image's robustness. However, the implementation introduces a critical command injection vulnerability by using an unquoted build argument within a command substitution. I have provided specific comments and suggestions to address this. This pattern of unquoted variables appears elsewhere in the Dockerfile, and I strongly recommend a full audit to fix all instances and secure the build process.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@emricksini-h emricksini-h force-pushed the fix/persist-cuda-compat-config branch from 47bf56c to 83b9527 Compare December 16, 2025 17:30
@mergify mergify bot added the multi-modality Related to multi-modality (#4194) label Dec 16, 2025
@emricksini-h emricksini-h force-pushed the fix/persist-cuda-compat-config branch from 83b9527 to 18ff788 Compare December 16, 2025 17:33
Collaborator

@wangshangsam wangshangsam left a comment


Looks reasonable, but could you include in the PR description a concrete example of where the existing code fails? Because

If a user extends this image or runs a debug container and executes apt-get install (which triggers a default ldconfig), the custom compatibility path is wiped from the cache.

This I can understand. But

This causes the container to silently fall back to the host driver's native CUDA version (e.g., 12.4) instead of the container's optimized version (12.9), potentially degrading performance or raising version compatibility mismatch errors.

This I don't quite understand. I thought that, the container's CUDA version is just the container's CUDA version, and the whole /usr/local/cuda-$(echo $CUDA_VERSION | cut -d. -f1,2)/compat/ thing is just to enable compatibility mode (i.e., you can run a later CUDA version on a older driver version)?

Member

@mgoin mgoin left a comment


I see, this makes sense to me as a fragile configuration right now. But I agree with @wangshangsam to clarify

@github-project-automation github-project-automation bot moved this to Ready in NVIDIA Jan 12, 2026
@emricksini-h
Contributor Author

Thanks @wangshangsam & @mgoin for the review!

To give a concrete example, I ran two versions of the vLLM Docker image on a cluster node with CUDA 12.4 (driver 550.163.01) installed. The first image is the base vllm-openai. The second uses the same base but executes a setup script via a CMD argument to install debug utilities with apt-get.

In the first docker image (default), I have:

<<K9s-Shell>> Pod: inference/test-6fb4b645d8-f4bg5 | Container: test
root@test-6fb4b645d8-f4bg5:/vllm-workspace# nvidia-smi

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01             Driver Version: 550.163.01     CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|

In the second (debug), I have:

<<K9s-Shell>> Pod: inference/dev-66dc7d79c9-x5v6h | Container: dev
root@dev-66dc7d79c9-x5v6h:/vllm-workspace# nvidia-smi

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01             Driver Version: 550.163.01     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|

Ultimately, there is only one driver version, but that driver may be compatible with different CUDA versions (including newer ones). Incompatibility issues can arise when code compiled with CUDA 12.9 (or newer) is executed in a Docker container that lacks the necessary compatibility layer, causing it to fall back to the node's version (12.4).

In my case, I encountered the following error when loading Qwen_Qwen3-VL-4B-Instruct-FP8 in the debug image:

(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] EngineCore failed to start.
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] Traceback (most recent call last):
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 834, in run_engine_core
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 610, in __init__
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     super().__init__(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 109, in __init__
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 235, in _initialize_kv_caches
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     available_gpu_memory = self.model_executor.determine_available_memory()
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/serial_utils.py", line 479, in run_method
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return func(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return func(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 324, in determine_available_memory
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     self.model_runner.profile_run()
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4322, in profile_run
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     dummy_encoder_outputs = self.model.embed_multimodal(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1512, in embed_multimodal
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     video_embeddings = self._process_video_input(multimodal_input)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1413, in _process_video_input
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     video_embeds = self.visual(pixel_values_videos, grid_thw=grid_thw)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 560, in forward
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     hidden_states = blk(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                     ^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 237, in forward
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     x = x + self.attn(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]             ^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_5_vl.py", line 398, in forward
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     context_layer = vit_flash_attn_wrapper(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                     ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/attention/ops/vit_attn_wrappers.py", line 82, in vit_flash_attn_wrapper
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return torch.ops.vllm.flash_attn_maxseqlen_wrapper(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 1255, in __call__
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self._op(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/attention/ops/vit_attn_wrappers.py", line 36, in flash_attn_maxseqlen_wrapper
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     output = flash_attn_varlen_func(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]              ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/flash_attn_interface.py", line 253, in flash_attn_varlen_func
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     out, softmax_lse = torch.ops._vllm_fa2_C.varlen_fwd(
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]   File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 1255, in __call__
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]     return self._op(*args, **kwargs)
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] torch.AcceleratorError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] Search for `cudaErrorUnsupportedPtxVersion' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
(EngineCore_DP0 pid=1159) ERROR 01-13 03:55:38 [core.py:843] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Running the following command resolved the issue:

ldconfig /usr/local/cuda-12.9/compat/

The fix in this PR prevents the issue from appearing by ensuring the compat layer is always enabled.

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
@emricksini-h emricksini-h force-pushed the fix/persist-cuda-compat-config branch from 18ff788 to e64ddb2 Compare January 13, 2026 12:08
@wangshangsam
Collaborator

Thanks @emricksini-h ! Now this makes sense.

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 13, 2026
@mgoin mgoin enabled auto-merge (squash) January 13, 2026 20:22
@vllm-bot vllm-bot merged commit 2a60ac9 into vllm-project:main Jan 13, 2026
94 of 97 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 13, 2026
@huydhn
Contributor

huydhn commented Jan 15, 2026

Unfortunately, I think this change doesn't work with newer drivers. The PyTorch x vLLM benchmark jobs use driver 580.105.08 to support the newer CUDA 13.0:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA B200                    Off |   00000000:D1:00.0 Off |                    0 |
| N/A   32C    P0            141W /  750W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

With this change, importing PyTorch fails right away:

python3 -c 'import torch; torch.cuda.is_available()'
/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:182: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:119.)
  return torch._C._cuda_getDeviceCount() > 0

Here is an example failure https://github.com/pytorch/pytorch-integration-testing/actions/runs/21017403967/job/60426060877#step:19:1452
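
This failure is consistent with the compat libraries being forward-only: forcing the image's 12.9 compat layer onto a driver that already supports CUDA 13.0 pairs the driver with older user-mode libraries. A conditional guard along these lines could avoid that (a hypothetical sketch, not code from this PR; the version strings are hardcoded stand-ins for values that would be parsed from `nvidia-smi` and from the image):

```shell
# Hypothetical guard: enable the compat layer only when the container's CUDA
# version exceeds what the driver natively supports (compat is forward-only).
driver_cuda="13.0"      # stand-in for a value parsed from nvidia-smi
container_cuda="12.9"   # stand-in for the image's CUDA version
newest="$(printf '%s\n%s\n' "$driver_cuda" "$container_cuda" | sort -V | tail -n1)"
if [ "$newest" = "$container_cuda" ] && [ "$driver_cuda" != "$container_cuda" ]; then
  echo "enable compat"   # container CUDA is newer than the driver
else
  echo "skip compat"     # driver already supports this CUDA (or is newer)
fi
```

With the values above this prints "skip compat", which is the behavior the benchmark jobs would need on a CUDA 13.0 driver.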

sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026
… `apt-get` (vllm-project#30784)

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
emricksini-h added a commit to emricksini-h/vllm that referenced this pull request Jan 16, 2026
…reset on `apt-get` (vllm-project#30784)"

This reverts commit 2a60ac9.

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
… `apt-get` (vllm-project#30784)

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
… `apt-get` (vllm-project#30784)

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
wangshangsam added a commit to CentML/vllm that referenced this pull request Jan 24, 2026
wangshangsam added a commit to CentML/vllm that referenced this pull request Jan 25, 2026
zhandaz pushed a commit to CentML/vllm that referenced this pull request Jan 25, 2026
wangshangsam added a commit to CentML/vllm that referenced this pull request Jan 25, 2026
* [Docker][Dev] Fix libnccl-dev version for the CUDA 13.0.1 devel image

[Docker][Dev] Fix libnccl-dev version conflict for the CUDA 13.0.1 devel image

Further update

* feat: Support FA4 for mm-encoder-attn-backend for qwen models

* feat: Kernel warmup for vit fa4

* fix: Fix some minor conflicts due to the introduction of flash_attn.cute

* Revert "[Docker][Dev] Fix libnccl-dev version for the CUDA 13.0.1 devel image"

This reverts commit ab76b28.

* chore: Update requirements and revert README.md

* chore: Install git for flash_attn cute installation

* lint: Fix linting

* Revert "[Improvement] Persist CUDA compat libraries paths to prevent reset on `apt-get` (vllm-project#30784)" (#31)

This reverts commit 2a60ac9.

---------

Co-authored-by: Shang Wang <shangw@nvidia.com>
zhandaz added a commit to CentML/vllm that referenced this pull request Feb 4, 2026
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
… `apt-get` (vllm-project#30784)

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>

Labels

ci/build multi-modality Related to multi-modality (#4194) nvidia ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done

5 participants