38 changes: 23 additions & 15 deletions sgl-kernel/python/sgl_kernel/load_utils.py
@@ -205,20 +205,28 @@ def _find_cuda_home():


 def _preload_cuda_library():
     """Preload the CUDA runtime library to help avoid 'libcudart.so.12 not found' issues."""
     cuda_home = Path(_find_cuda_home())

-    if (cuda_home / "lib").is_dir():
-        cuda_path = cuda_home / "lib"
-    elif (cuda_home / "lib64").is_dir():
-        cuda_path = cuda_home / "lib64"
-    else:
-        # Search for 'libcudart.so.12' in subdirectories
-        for path in cuda_home.rglob("libcudart.so.12"):
-            cuda_path = path.parent
-            break
-        else:
-            raise RuntimeError("Could not find CUDA lib directory.")
-
-    cuda_include = (cuda_path / "libcudart.so.12").resolve()
-    if cuda_include.exists():
-        ctypes.CDLL(str(cuda_include), mode=ctypes.RTLD_GLOBAL)
+    candidate_dirs = [
+        cuda_home / "lib",
+        cuda_home / "lib64",
+        Path("/usr/lib/x86_64-linux-gnu"),
+        Path("/usr/lib/aarch64-linux-gnu"),
+        Path("/usr/lib64"),
+        Path("/usr/lib"),

@Inokinoki Inokinoki Nov 12, 2025

I think one promising way to detect the lib in the env would be to check whether an nvidia-cuda-runtime-like package is installed in the current env (such packages can be found on PyTorch's PyPI index, and NVIDIA publishes nvidia-cuda-runtime-cu11, nvidia-cuda-runtime-cu12, and nvidia-cuda-runtime).

$ pip list
...
nvidia-cuda-runtime-cu12 (12.0.107)
...

If so, its lib paths could also be candidate dirs:

>>> import nvidia.cuda_runtime.lib
>>> nvidia.cuda_runtime.lib.__path__
['/home/ubuntu/dev/test-cudart/lib/python3.6/site-packages/nvidia/cuda_runtime/lib']
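
A minimal sketch of the idea above, assuming the wheel keeps the `nvidia/cuda_runtime/lib` layout shown in the session. The helper name `pip_cuda_runtime_dirs` is hypothetical and not part of this PR; it only locates directories and does not load anything.

```python
import importlib.util
from pathlib import Path
from typing import List


def pip_cuda_runtime_dirs() -> List[Path]:
    """Return lib directories of a pip-installed CUDA runtime wheel, if any.

    Hypothetical helper: assumes the wheel installs its shared objects under
    <site-packages>/nvidia/cuda_runtime/lib, as in the session above.
    """
    try:
        # find_spec imports the parent package ("nvidia") and raises
        # ModuleNotFoundError (an ImportError) when it is not installed.
        spec = importlib.util.find_spec("nvidia.cuda_runtime")
    except (ImportError, ValueError):
        return []
    if spec is None or not spec.submodule_search_locations:
        return []

    dirs = []
    for location in spec.submodule_search_locations:
        lib_dir = Path(location) / "lib"
        if lib_dir.is_dir():
            dirs.append(lib_dir)
    return dirs
```

The returned directories could then be appended to `candidate_dirs`; when no such wheel is installed the function returns an empty list, so the existing search order is unaffected.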

Contributor Author

Agreed on that. However, given that we currently have no reason to believe we need to keep this pre-loading logic, maybe it's not worth the extra complexity right now? In particular, I don't see any such logic in the vLLM repo. What do you think?

+    ]
+
+    for base in candidate_dirs:
+        candidate = base / "libcudart.so.12"
+        if candidate.exists():
+            try:
+                cuda_runtime_lib = candidate.resolve()
+                ctypes.CDLL(str(cuda_runtime_lib), mode=ctypes.RTLD_GLOBAL)
+                logger.debug(f"Preloaded CUDA runtime under {cuda_runtime_lib}")
+                return
+            except Exception as e:
+                logger.debug(f"Failed to load {cuda_runtime_lib}: {e}")
+                continue
+
+    logger.debug("[sgl_kernel] Could not preload CUDA runtime library")
Contributor

medium

The previous implementation used rglob to perform a recursive search for the CUDA library within cuda_home. While the new list of explicit paths is a great improvement for performance and common cases, removing the recursive search entirely might be a regression for setups where the library is in a non-standard subdirectory of CUDA_HOME. To improve robustness, consider adding the rglob search back as a final fallback mechanism after the main loop.

Suggested change
-    logger.debug("[sgl_kernel] Could not preload CUDA runtime library")
+    # Fallback to a recursive search in cuda_home as a last resort
+    if cuda_home.is_dir():
+        for candidate in cuda_home.rglob("libcudart.so.12"):
+            if not candidate.is_file():
+                continue
+            try:
+                cuda_runtime_lib = candidate.resolve()
+                ctypes.CDLL(str(cuda_runtime_lib), mode=ctypes.RTLD_GLOBAL)
+                logger.debug(f"Preloaded CUDA runtime under {cuda_runtime_lib} (found via rglob)")
+                return
+            except Exception as e:
+                logger.debug(f"Failed to load {candidate} (found via rglob): {e}")
+                continue
+    logger.debug("[sgl_kernel] Could not preload CUDA runtime library")
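
One way to keep the two-stage search order easy to test is to separate candidate discovery from loading. The sketch below is only an illustration of that split: `iter_cudart_candidates` and `LIB_NAME` are hypothetical names that do not appear in this PR, and the function yields paths without calling `ctypes.CDLL`.

```python
from pathlib import Path
from typing import Iterable, Iterator

LIB_NAME = "libcudart.so.12"  # same file name the diff searches for


def iter_cudart_candidates(
    cuda_home: Path, extra_dirs: Iterable[Path] = ()
) -> Iterator[Path]:
    """Yield existing library paths: explicit dirs first, then an rglob fallback."""
    seen = set()

    # Stage 1: the cheap, explicit locations.
    for base in [cuda_home / "lib", cuda_home / "lib64", *extra_dirs]:
        candidate = base / LIB_NAME
        if candidate.is_file():
            resolved = candidate.resolve()
            if resolved not in seen:
                seen.add(resolved)
                yield resolved

    # Stage 2: recursive search under cuda_home as a last resort.
    if cuda_home.is_dir():
        for candidate in sorted(cuda_home.rglob(LIB_NAME)):
            if candidate.is_file():
                resolved = candidate.resolve()
                if resolved not in seen:
                    seen.add(resolved)
                    yield resolved
```

A caller would try `ctypes.CDLL(str(path), mode=ctypes.RTLD_GLOBAL)` on each yielded path and stop at the first success. Because the generator is lazy, the recursive walk only runs if every explicit location fails to load.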