
Add vllm#28931

Merged
h-vetinari merged 165 commits into conda-forge:main from maresb:vllm
Jul 29, 2025

Conversation

@maresb
Contributor

@maresb maresb commented Jan 25, 2025

Very rough draft. I will almost certainly require help.

Opened on the advice of @h-vetinari in conda-forge/xformers-feedstock#42

Direct and transitive dependencies:

Closes #24710
Fixes #29105

Checklist

  • Title of this PR is meaningful: e.g. "Adding my_nifty_package", not "updated meta.yaml".
  • License file is packaged (see here for an example).
  • Source is from official source.
  • Package does not vendor other packages. (If a package uses the source of another package, they should be separate packages or the licenses of all packages need to be packaged).
  • If static libraries are linked in, the license of the static library is packaged.
  • Package does not ship static libraries. If static libraries are needed, follow CFEP-18.
  • Build number is 0.
  • A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details).
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there.
  • When in trouble, please check our knowledge base documentation before pinging a team.

@github-actions
Contributor

Hi! This is the staged-recipes linter and your PR looks excellent! 🚀

@conda-forge-admin
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found some lint.

Here's what I've got...

For recipes/vllm/recipe.yaml:

  • ❌ license_file entry is missing, but is required.
  • ❌ Non noarch packages should have python requirement without any version constraints.
  • ❌ Non noarch packages should have python requirement without any version constraints.

For recipes/vllm/recipe.yaml:

  • ℹ️ Please depend on pytorch directly. If your package definitely requires the CUDA version, please depend on pytorch =*=cuda*.
  • ℹ️ Use importlib-metadata instead of importlib_metadata
  • ℹ️ PyPI default URL is now pypi.org, and not pypi.io. You may want to update the default source url.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12962027449. Examine the logs at this URL for more detail.

@conda-forge-admin
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found it was in an excellent condition.

@maresb
Contributor Author

maresb commented Jan 25, 2025

Interesting, we're getting different results between CUDA 11.8 and 12.0.

Both fail in the following command:

['cmake', '$SRC_DIR', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', 
'-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=$PREFIX/bin/python', 
'-DVLLM_PYTHON_PATH=$PREFIX/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process:$PREFIX/lib/python39.zip:$PREFIX/lib/python3.9:$PREFIX/lib/python3.9/lib-dynload:$PREFIX/lib/python3.9/site-packages:$PREFIX/lib/python3.9/site-packages/setuptools/_vendor', 
'-DFETCHCONTENT_BASE_DIR=$SRC_DIR/.deps', '-DNVCC_THREADS=1',
'-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=2']

12.0 fails earlier at CUDA detection:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.2.0
 │ │       -- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "12.0")
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
 │ │         Caffe2: CUDA cannot be found.  Depending on whether you are building Caffe2
 │ │         or a Caffe2 dependent library, the next warning / error will give you more
 │ │         info.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       CMake Error at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
 │ │         Your installed Caffe2 version uses CUDA but I cannot find the CUDA
 │ │         libraries.  Please set the proper CUDA prefixes and / or install CUDA.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)

11.8 gets further:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.2.0
 │ │       -- Found CUDA: /usr/local/cuda (found version "11.8")
 │ │       -- The CUDA compiler identification is NVIDIA 11.8.89 with host compiler GNU 11.4.0
 │ │       -- Detecting CUDA compiler ABI info
 │ │       -- Detecting CUDA compiler ABI info - done
 │ │       -- Check for working CUDA compiler: $PREFIX/bin/nvcc - skipped
 │ │       -- Detecting CUDA compile features
 │ │       -- Detecting CUDA compile features - done
 │ │       -- Found CUDAToolkit: /usr/local/cuda/include (found version "11.8.89")
 │ │       -- Caffe2: CUDA detected: 11.8
 │ │       -- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
 │ │       -- Caffe2: CUDA toolkit directory: /usr/local/cuda
 │ │       -- Caffe2: Header version is: 11.8
 │ │       -- Found Python: $PREFIX/bin/python (found version "3.9.21") found components: Interpreter
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
 │ │         Failed to compute shorthash for libnvrtc.so
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       CMake Warning (dev) at $PREFIX/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
 │ │         The package name passed to `find_package_handle_standard_args` (nvtx3) does
 │ │         not match the name of the calling package (Caffe2).  This can lead to
 │ │         problems in calling code that expects `find_package` result variables
 │ │         (e.g., `_FOUND`) to follow a certain pattern.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:174 (find_package_handle_standard_args)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       This warning is for project developers.  Use -Wno-dev to suppress it.
 │ │       
 │ │       -- Could NOT find nvtx3 (missing: nvtx3_dir)
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:180 (message):
 │ │         Cannot find NVTX3, find old NVTX instead
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Automatic GPU detection failed. Building for common architectures.
 │ │       -- Autodetected CUDA architecture(s): 3.5;5.0;8.0;8.6;8.9;9.0;8.9+PTX;9.0+PTX
 │ │       -- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_89,code=compute_89;-gencode;arch=compute_90,code=compute_90
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       -- CUDA target architectures: 3.5;5.0;8.0;8.6;8.9;9.0
 │ │       -- CUDA supported target architectures: 8.0;8.6;8.9;9.0
 │ │       -- FetchContent base directory: $SRC_DIR/.deps
 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/share/cmake-3.31/Modules/ExternalProject.cmake:3041 (_ep_add_download_command)
 │ │         CMakeLists.txt:29 (ExternalProject_Add)
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Automatic GPU detection failed. Building for common architectures.
 │ │       -- Autodetected CUDA architecture(s): 3.5;5.0;8.0;8.6;8.9;9.0;8.9+PTX;9.0+PTX
 │ │       -- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_89,code=compute_89;-gencode;arch=compute_90,code=compute_90
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       -- CUDA target architectures: 3.5;5.0;8.0;8.6;8.9;9.0
 │ │       -- CUDA supported target architectures: 8.0;8.6;8.9;9.0
 │ │       -- FetchContent base directory: $SRC_DIR/.deps
 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/share/cmake-3.31/Modules/ExternalProject.cmake:3041 (_ep_add_download_command)
 │ │         CMakeLists.txt:29 (ExternalProject_Add)
 │ │       
 │ │       
 │ │       -- Configuring incomplete, errors occurred!

@conda-forge-admin
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found some lint.

Here's what I've got...

For recipes/vllm/recipe.yaml:

  • ❌ Selectors in comment form no longer work in v1 recipes. Instead, if / then / else maps must be used. See lines [39, 41, 46, 48, 49].

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12967644561. Examine the logs at this URL for more detail.
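For anyone hitting the same lint: a v0-style comment selector such as `- libgomp  # [linux]` has to be rewritten as an if / then / else map in v1 recipes. A minimal sketch of the conversion (the package names here are illustrative, not taken from this recipe):

```yaml
requirements:
  host:
    # v0 form (no longer supported in recipe.yaml):
    #   - libgomp      # [linux]
    #   - llvm-openmp  # [osx]
    # v1 form:
    - if: linux
      then:
        - libgomp
      else:
        - llvm-openmp
```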

@conda-forge-admin
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found it was in an excellent condition.

@h-vetinari
Member

Thanks @maresb! I had forgotten that there's already #24710, perhaps @mediocretech would be interested in collaborating?

W.r.t. CUDA, we need to move on from 12.0 here, which isn't used anywhere else in conda-forge anymore - it's just that staged-recipes seems to have been forgotten in the context of conda-forge/conda-forge-pinning-feedstock#6630.

@maresb
Contributor Author

maresb commented Jan 26, 2025

Oh, I didn't notice that effort, thanks @h-vetinari! Although that's old it looks like @rongou is eager to help! 🚀

Do you think that CUDA 12.0 is actually causing a problem here? I was thinking (i.e. wildly guessing) that we need to patch CMakeLists.txt, but I've never used cmake. 😞

@h-vetinari
Member

Mainly I want to avoid redundant work. As soon as #28938 is in and we have merged main here, I'll be happy to take a look what's going on.

@h-vetinari
Member

In any case, you'll have to address

 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate

@maresb
Contributor Author

maresb commented Jan 27, 2025

Woah, after adding git as a host dependency it's compiling on CUDA 11.8 until it runs out of memory and crashes. Maybe I can add some swap. CUDA 12.0 is still not being discovered.

...
 │ │ Building wheels for collected packages: vllm
 │ │   Building wheel for vllm (pyproject.toml): started
 │ │   Building wheel for vllm (pyproject.toml): still running...
...
 │ │   Building wheel for vllm (pyproject.toml): still running...
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
 │ │   Building wheel for vllm (pyproject.toml): still running...
 │ │   Building wheel for vllm (pyproject.toml): still running...
 │ │   Building wheel for vllm (pyproject.toml): still running...
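For reference, the git fix amounts to a one-line addition to the recipe (shown as a sketch; the surrounding keys are assumed, not copied from the actual recipe.yaml):

```yaml
requirements:
  host:
    # ExternalProject_Add in vllm's CMakeLists clones CUTLASS at configure
    # time, so git must be available in the build-time environment.
    - git
```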

@github-actions
Contributor

github-actions Bot commented Jan 27, 2025

Hi! This is the staged-recipes linter and I found some lint.

It looks like some changes were made outside the recipes/ directory. To ensure everything runs smoothly, please make sure that recipes are only added to the recipes/ directory and no other files are changed.

If these changes are intentional (and you aren't submitting a recipe), please add a maintenance label to the PR.

File-specific lints and/or hints:

  • .scripts/debug_osx_arch.sh:

    • lints:
      • Do not edit files outside of the recipes/ directory.
  • conda-forge.yml:

    • lints:
      • Do not edit files outside of the recipes/ directory.
  • .scripts/new_run_osx_build.sh:

    • lints:
      • Do not edit files outside of the recipes/ directory.
  • .scripts/new_run_docker_build.sh:

    • lints:
      • Do not edit files outside of the recipes/ directory.

@maresb
Contributor Author

maresb commented Jan 27, 2025

Ah, hmm, I just added swap to conda-forge.yml. Not sure how that's supposed to work here on staged-recipes. 🤔

EDIT: Oh good, the linter is complaining, so that will help us to remember to revert it before merging.

EDIT2: Hmm, it seems that the swap setting works on linux_64 but fails on linux_64_cuda_*:

[screenshot: swap is applied on linux_64 but fails on the linux_64_cuda_* builds]

@maresb
Contributor Author

maresb commented Jan 29, 2025

Hi @h-vetinari!

As soon as #28938 is in and we have merged main here, I'll be happy to take a look what's going on.

As a brief summary of the above, I merged main into this branch after #28938 was merged into main. It didn't seem to change anything with respect to the errors.

On CUDA 12.x I'm hitting the error:

Your installed Caffe2 version uses CUDA but I cannot find the CUDA
 │ │         libraries.  Please set the proper CUDA prefixes and / or install CUDA

On 11.8, after adding git as a host dependency, compilation starts but it runs out of memory. I tried to add swap by editing conda-forge.yml, but it didn't apply to the CUDA builds.

I'd be grateful for any advice you could provide. Thanks!

@h-vetinari
Member

On CUDA 12.x I'm hitting the error:

We're (now) aware of the CUDA-angle of conda-forge/pytorch-cpu-feedstock#333

On 11.8, after adding git as a host dependency, compilation starts but it runs out of memory. I tried to add swap by editing conda-forge.yml, but it didn't apply to the CUDA builds.

#28979

@maresb
Contributor Author

maresb commented Feb 1, 2025

I would have hoped to get more out of setting VERBOSE=1. The only logs I get are:

 │ │   Building wheel for vllm (pyproject.toml): still running...

VERBOSE=1 is supposed to add the flag -DCMAKE_VERBOSE_MAKEFILE=ON. Not sure what exactly that does.

Here's the corresponding Python code to go from the envvar to get the flag:

https://github.com/vllm-project/vllm/blob/a1fc18c030e4d0466f2b23cb7dd4d11ce4b85603/vllm/envs.py#L138-L140

https://github.com/vllm-project/vllm/blob/a1fc18c030e4d0466f2b23cb7dd4d11ce4b85603/setup.py#L132-L134
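The pattern in those two links can be sketched as follows (a simplified stand-in, not vllm's exact code): a truthy integer `VERBOSE` in the environment turns into `-DCMAKE_VERBOSE_MAKEFILE=ON`, which for Makefile generators makes the generated build echo the full compiler command lines.

```python
def cmake_verbose_flags(environ):
    """Sketch of vllm's env-var hand-off: VERBOSE is parsed as an integer
    and, when truthy, appends the verbose-makefile flag to the cmake args."""
    verbose = bool(int(environ.get("VERBOSE", "0")))
    flags = []
    if verbose:
        flags.append("-DCMAKE_VERBOSE_MAKEFILE=ON")
    return flags

print(cmake_verbose_flags({"VERBOSE": "1"}))  # ['-DCMAKE_VERBOSE_MAKEFILE=ON']
print(cmake_verbose_flags({}))                # []
```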

@maresb maresb closed this Feb 4, 2025
@maresb maresb reopened this Feb 4, 2025
@shermansiu shermansiu mentioned this pull request Feb 12, 2025
@shermansiu
Contributor

Hmm, it still appears broken after conda-forge/pytorch-cpu-feedstock#339.

Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "12.6")

Is CUDA_INCLUDE_DIRS properly set?

@shermansiu
Contributor

shermansiu commented Feb 12, 2025

The CUDA 11.8 build probably fails because it's out of disk space and/or RAM, but that's just speculation:

##[warning]Free disk space on / is lower than 5%; Currently used: 95.08% (x5)

##[warning]Free memory is lower than 5%; Currently used: 96.11% (x5)

@maresb
Contributor Author

maresb commented Feb 12, 2025

Hey @shermansiu, great to have you around!!!

I'm a bit lost since I'm not very familiar with CUDA.

I was just now having some trouble getting the CI to rerun the CUDA builds, but rebasing seems to have fixed it.

Also, post-rebase things seem to be proceeding slightly further for 12.6:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.3.0
 │ │       -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"
 │ │       -- Found CUDAToolkit: $PREFIX/targets/x86_64-linux/include (found version "12.6.85")
 │ │       -- Check for working CUDA compiler: $PREFIX/bin/nvcc - skipped
 │ │       -- Detecting CUDA compile features
 │ │       -- Detecting CUDA compile features - done
 │ │       -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"
 │ │       -- Caffe2: CUDA detected: 12.6.85
 │ │       -- Caffe2: CUDA nvcc is: $PREFIX/bin/nvcc
 │ │       -- Caffe2: CUDA toolkit directory:
 │ │       -- Caffe2: Header version is: 12.6
 │ │       CMake Error at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:107 (get_target_property):
 │ │         get_target_property() called with non-existent target "CUDA::nvrtc".
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- Found Python: $PREFIX/bin/python (found version "3.9.21") found components: Interpreter
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:116 (message):
 │ │         Failed to compute shorthash for libnvrtc.so
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       CMake Warning (dev) at $PREFIX/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
 │ │         The package name passed to `find_package_handle_standard_args` (nvtx3) does
 │ │         not match the name of the calling package (Caffe2).  This can lead to
 │ │         problems in calling code that expects `find_package` result variables
 │ │         (e.g., `_FOUND`) to follow a certain pattern.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:154 (find_package_handle_standard_args)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       This warning is for project developers.  Use -Wno-dev to suppress it.
 │ │       
 │ │       -- Could NOT find nvtx3 (missing: nvtx3_dir)
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:160 (message):
 │ │         Cannot find NVTX3, find old NVTX instead
 │ │ Failed to build vllm
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Added CUDA NVCC flags for:
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
 │ │         static library kineto_LIBRARY-NOTFOUND not found.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       CMake Error at CMakeLists.txt:122 (message):
 │ │         Can't find CUDA or HIP installation.

I'm not too sure what this means or how to fix it. I'd be very grateful for any suggestions.

@shermansiu
Contributor

Hmm, I'd like to build the recipe locally to diagnose this further, but at a glance, the following line looks a bit concerning:

 -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"

Member

@h-vetinari h-vetinari left a comment


Well, you need more than just {{ compiler("cuda") }} to get all the CUDA components you need.

Looks like you need at minimum

    - cuda-version =={{ cuda_compiler_version }}
    - cuda-cudart-dev
    - cuda-nvrtc-dev
    - libcublas-dev

in the host environment. Also note that we're still figuring out an issue with nvtx, see conda-forge/pytorch-cpu-feedstock#357
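Translated into the v1 syntax used in recipe.yaml, the suggestion above would look roughly like this (a sketch; the selector structure in the final recipe may differ):

```yaml
requirements:
  host:
    - if: cuda
      then:
        - cuda-version ==${{ cuda_compiler_version }}
        - cuda-cudart-dev
        - cuda-nvrtc-dev
        - libcublas-dev
```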

Comment thread recipes/vllm/recipe.yaml Outdated
Comment on lines +33 to +38
- cmake
- git
- ${{ stdlib('c') }}
- ${{ compiler('c') }}
- ${{ compiler('cxx') }}
- ${{ compiler('cuda') }}
Member


All this (+ninja) should move to the build environment.

Contributor

@shermansiu shermansiu left a comment


This seems to resolve the nvtx issue, but then it complains about not being able to find kineto.

Using USE_KINETO=0 doesn't seem to work because the existing PyTorch .cmake files in the environment already have kineto enabled.

lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake

if(ON)
  append_torchlib_if_found(kineto)
endif()

See:

Comment thread recipes/vllm/recipe.yaml
Comment thread recipes/vllm/recipe.yaml
@shermansiu
Contributor

shermansiu commented Jul 21, 2025

Thanks so much, @h-vetinari, for your incredibly detailed review!

Member

@h-vetinari h-vetinari left a comment


getting there!

Comment thread recipes/vllm/recipe.yaml Outdated
Comment thread recipes/vllm/recipe.yaml Outdated
Member

@h-vetinari h-vetinari left a comment


This PR basically LGTM now! Thanks for all the hard work! It's still marked as a draft — are you missing anything else? Some tasks related to the GPU server are still ahead, and there are some suggestions for further improvements below.

Comment thread recipes/vllm/recipe.yaml Outdated
Comment thread recipes/vllm/conda_build_config.yaml
Comment thread recipes/vllm/conda_build_config.yaml Outdated
Comment thread recipes/vllm/recipe.yaml Outdated
Comment thread recipes/vllm/recipe.yaml
Comment thread recipes/vllm/patches/0003-Manually-define-gettid.patch
@shermansiu
Contributor

shermansiu commented Jul 24, 2025

Thanks for looking at this, @h-vetinari! I should be able to get to the rest of the things later in the week, hopefully!

@shermansiu
Contributor

If there are no other requested changes, I'm good to have this merged! The pull request is no longer just a draft, but I don't have the permissions to change it.

@shermansiu
Contributor

@maresb Please add "Closes #24710" and "fixes #29105" to the initial PR description so that we can close the other vllm PR and the package request issue automatically once this gets merged

@shermansiu
Contributor

Anyways, the CUDA 12.6 build works locally and the tests pass:

 │ Installing test environment
 │ ✔ Successfully updated the test environment
 │ Testing commands:
 │ ============================= test session starts ==============================
 │ platform linux -- Python 3.10.18, pytest-8.4.1, pluggy-1.6.0
 │ rootdir: $PREFIX/etc/conda/test-files/vllm/2
 │ plugins: anyio-4.9.0
 │ collected 22 items
 │ vllm/tests/core/test_scheduler.py ......................                 [100%]
 │ ============================== 22 passed in 6.97s ==============================
 │
 ╰─────────────────── (took 72 seconds)
 ✔ all tests passed!

Member

@h-vetinari h-vetinari left a comment


This is still marked as a draft (intentional?), and you'll have to do the procedures to get the rights to the opengpu server, but the PR itself LGTM!

@shermansiu
Contributor

"Only those with write access to this repository can mark a draft pull request as ready for review." - I'm unable to change this!

@h-vetinari h-vetinari marked this pull request as ready for review July 29, 2025 03:40
@shermansiu
Contributor

Please add "Closes #24710" and "fixes #29105" to the initial PR description so that we can close the other vllm PR and the package request issue automatically once this gets merged

Thanks @h-vetinari for adding that in! 😄

@h-vetinari h-vetinari merged commit ee83645 into conda-forge:main Jul 29, 2025
6 of 8 checks passed
@maresb maresb deleted the vllm branch July 30, 2025 05:23
@maresb maresb restored the vllm branch August 17, 2025 10:20