python3Packages.vllm: 0.10.1.1 -> 0.10.2 #447722
I'm actually getting a build failure when building the package with CUDA support. See this link for more details and logs. Relevant lines from failed build log:
-- FlashMLA is available at /nix/store/ab1x9ra2sc9k9r2nxsrhhcm0izncxbsz-flashmla-1.0.0
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found Python: /nix/store/829wb290i87wngxlh404klwxql5v18p4-python3-3.13.7/bin/python3.13 (found version "3.13.7") found components: Interpreter Development.Module Development.SABIModule
CMake Warning at /nix/store/s9xqh0qzpk19hx5n0i52r2rf2fpxhw9a-vllm-flash-attn-2.7.4.post1/CMakeLists.txt:75 (message):
Pytorch version 2.4.0 expected for CUDA build, saw 2.8.0 instead.
-- CUDA target architectures: 7.5;8.0;8.6;8.9;9.0;10.0;12.0
-- CUDA supported target architectures: 8.0;8.6;8.9;9.0;10.0;12.0
-- FA2_ARCHS: 8.0+PTX
-- FA3_ARCHS: 9.0a;8.0
-- vllm-flash-attn is available at /nix/store/s9xqh0qzpk19hx5n0i52r2rf2fpxhw9a-vllm-flash-attn-2.7.4.post1
-- Configuring done (28.7s)
CMake Error at /nix/store/a94d5zmalqava26y3hqsnj5l11l5kl5y-cmake-3.31.7/share/cmake-3.31/Modules/FindPython/Support.cmake:4240 (add_library):
Cannot find source file:
/nix/store/ab1x9ra2sc9k9r2nxsrhhcm0izncxbsz-flashmla-1.0.0/csrc/kernels_fp8/flash_fwd_mla_fp8_sm90.cu
Call Stack (most recent call first):
/nix/store/a94d5zmalqava26y3hqsnj5l11l5kl5y-cmake-3.31.7/share/cmake-3.31/Modules/FindPython.cmake:692 (__Python_add_library)
cmake/utils.cmake:462 (Python_add_library)
cmake/external_projects/flashmla.cmake:53 (define_gpu_extension_target)
CMakeLists.txt:942 (include)
CMake Error at /nix/store/a94d5zmalqava26y3hqsnj5l11l5kl5y-cmake-3.31.7/share/cmake-3.31/Modules/FindPython/Support.cmake:4240 (add_library):
No SOURCES given to target: _flashmla_C
Call Stack (most recent call first):
/nix/store/a94d5zmalqava26y3hqsnj5l11l5kl5y-cmake-3.31.7/share/cmake-3.31/Modules/FindPython.cmake:692 (__Python_add_library)
cmake/utils.cmake:462 (Python_add_library)
cmake/external_projects/flashmla.cmake:53 (define_gpu_extension_target)
CMakeLists.txt:942 (include)
CMake Generate step failed. Build files cannot be regenerated correctly.
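The CMake error above reports that a source file listed for the `_flashmla_C` target cannot be found under the flashmla-1.0.0 store path. A minimal sketch, in Python, of checking a source tree for expected files before starting a long build. The helper name `missing_sources` is hypothetical, and the only path taken from the log is `csrc/kernels_fp8/flash_fwd_mla_fp8_sm90.cu`:

```python
from pathlib import Path


def missing_sources(root, relative_paths):
    """Return the entries of relative_paths that are not regular files under root."""
    base = Path(root)
    return [rel for rel in relative_paths if not (base / rel).is_file()]


# The single path below comes from the failed build log; a real check would
# need the full source list from vLLM's cmake/external_projects/flashmla.cmake.
EXPECTED = ["csrc/kernels_fp8/flash_fwd_mla_fp8_sm90.cu"]
```

Calling `missing_sources("/nix/store/ab1x9ra2sc9k9r2nxsrhhcm0izncxbsz-flashmla-1.0.0", EXPECTED)` on a machine that has that store path would list whichever of the expected files are absent.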
|
Command to reproduce building with CUDA support from my fork, for reference:
nix-build \
  -I nixpkgs=https://github.com/daniel-fahey/nixpkgs/archive/fbc1629b8775a9eeb932f1b79d14fa973adb67ec.tar.gz \
  --expr 'with import <nixpkgs> { config = { allowUnfree = true; cudaSupport = true; }; }; python313Packages.vllm'
Edit: see nixbuild.net failed build #4558385 for derivation
|
|
Thanks for the PR @daniel-fahey!
Not yet, good suggestion! I'm busy with other stuff, so I'm just showing my working as I go along in dribs and drabs, in case someone else needs to pick this up.
3c5d6cc to
9058b43
Compare
142c7e5 to
e7fb896
Compare
|
|
Now the CPU build fails with: |
|
How was it building on this branch earlier today??? |
44bd507 to
e7fb896
Compare
|
This builds the working Python 3.12 CPU version:
[daniel@laptop:~/Source/nixpkgs]$ nix-build -I nixpkgs=https://github.com/daniel-fahey/nixpkgs/archive/fbc1629b8775a9eeb932f1b79d14fa973adb67ec.tar.gz --expr 'with import <nixpkgs> { }; python312Packages.vllm'
/nix/store/zdm5svz7ygij3dj13djsizm1zggbnapw-python3.12-vllm-0.10.2
And this is the broken one:
[daniel@laptop:~/Source/nixpkgs]$ nix-build -I nixpkgs=https://github.com/daniel-fahey/nixpkgs/archive/e7fb8960241ab2211c471004bcd2b04683767c33.tar.gz --expr 'with import <nixpkgs> { }; python312Packages.vllm'
this derivation will be built:
/nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv
building '/nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv' on 'ssh://eu.nixbuild.net'...
copying 0 paths...
[nixbuild.net] Cached build failure. The id of the previously failed build is 4565380. See this link for the failed build details and logs: https://nixbuild.net/builds/4565380?t=EtkBCm8KBWJ1aWxkCgpidWlsZDpyZWFkGAMiCQoHCAcSAxD5BSINCgsIBBIHOgUKAxiBCDImCiQKAggbEgYIBRICCAUaFgoECgIIBQoICgYg9eyp0wYKBBoCCAAyFgoUCgIIGxIOCAISAxiACBIFEITTlgISJAgAEiAGRu1AM51kzPYGl4nShP2nj4SwLZdIQEZhBaIO6KS15xpATQzlvJ7Y-SLutaWbRFkbSrm3qOrarrHr60v_1kS5vx08FGnOtkZSZjwNvMDFTJwiO1O-MfmWac0BNAXtrNz8ASIiCiAuWrCOa6uQpEanWP7s1LJ8iKWZhtugtJoeboULO7JD1w==
[nixbuild.net] To turn off build failure caching, see https://docs.nixbuild.net/settings/#reuse-build-failures
error: build of '/nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv' on 'ssh://eu.nixbuild.net' failed: Cached build failure (build 4565380): Cannot build '/nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv'.
Reason: builder failed with exit code 1.
Output paths:
/nix/store/cfj51613p4smfza7150mzsk21smfrzxl-python3.12-vllm-0.10.2
/nix/store/l2gjwc39xss9hg63rbpgh66is87ay4iw-python3.12-vllm-0.10.2-dist
error: Cannot build '/nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv'.
Reason: builder failed with exit code 1.
Output paths:
/nix/store/cfj51613p4smfza7150mzsk21smfrzxl-python3.12-vllm-0.10.2
/nix/store/l2gjwc39xss9hg63rbpgh66is87ay4iw-python3.12-vllm-0.10.2-dist
Last 2 log lines:
> [nixbuild.net] Cached build failure. The id of the previously failed build is 4565380. See this link for the failed build details and logs: https://nixbuild.net/builds/4565380?t=EtkBCm8KBWJ1aWxkCgpidWlsZDpyZWFkGAMiCQoHCAcSAxD5BSINCgsIBBIHOgUKAxiBCDImCiQKAggbEgYIBRICCAUaFgoECgIIBQoICgYg9eyp0wYKBBoCCAAyFgoUCgIIGxIOCAISAxiACBIFEITTlgISJAgAEiAGRu1AM51kzPYGl4nShP2nj4SwLZdIQEZhBaIO6KS15xpATQzlvJ7Y-SLutaWbRFkbSrm3qOrarrHr60v_1kS5vx08FGnOtkZSZjwNvMDFTJwiO1O-MfmWac0BNAXtrNz8ASIiCiAuWrCOa6uQpEanWP7s1LJ8iKWZhtugtJoeboULO7JD1w==
> [nixbuild.net] To turn off build failure caching, see https://docs.nixbuild.net/settings/#reuse-build-failures
For full logs, run:
nix log /nix/store/j98dgdx87kjahfj73cd0wvzp3c0px4dh-python3.12-vllm-0.10.2.drv
Comparing the build inputs:
[daniel@laptop:~/Source/nixpkgs]$ diff \
> <(nix derivation show --impure -I nixpkgs=https://github.com/daniel-fahey/nixpkgs/archive/fbc1629b8775a9eeb932f1b79d14fa973adb67ec.tar.gz --expr 'with import <nixpkgs> { }; python312Packages.vllm' | jq -r '.[] | .inputDrvs | keys[]') \
> <(nix derivation show --impure -I nixpkgs=https://github.com/daniel-fahey/nixpkgs/archive/e7fb8960241ab2211c471004bcd2b04683767c33.tar.gz --expr 'with import <nixpkgs> { }; python312Packages.vllm' | jq -r '.[] | .inputDrvs | keys[]')
57a58
> /nix/store/lnybvwfry02293c564dm33kv7ash7bck-python3.12-blake3-1.0.7.drv
71d71
< /nix/store/rc213bx4f77s732gj49y51r87z5mcjzy-python3.12-blake3-1.0.6.drv
Blake3-py was updated in 29c1e9e, maybe somehow relevant? It is a dependency.
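The same input comparison can be sketched in Python for cases where the raw `diff` output gets long. This is an illustrative sketch, not part of the PR: it assumes the JSON shape implied by the `jq` filter above (a top-level object whose values each carry an `inputDrvs` mapping keyed by store path), and the function names are hypothetical.

```python
import json
import re

# Derivation store paths look like /nix/store/<32-char hash>-<name>.drv
_STORE_RE = re.compile(r"^/nix/store/[a-z0-9]{32}-(?P<name>.+)\.drv$")


def input_paths(derivation_json):
    """Collect every inputDrvs key from `nix derivation show` JSON text."""
    data = json.loads(derivation_json)
    return {path for drv in data.values() for path in drv.get("inputDrvs", {})}


def short_name(path):
    """Strip the store prefix, hash and .drv suffix; fall back to the raw path."""
    match = _STORE_RE.match(path)
    return match.group("name") if match else path


def diff_inputs(old_json, new_json):
    """Return (removed, added) input names, comparing full store paths."""
    old, new = input_paths(old_json), input_paths(new_json)
    removed = sorted(short_name(p) for p in old - new)
    added = sorted(short_name(p) for p in new - old)
    return removed, added
```

Because the comparison is done on full store paths, an input whose name is unchanged but whose hash differs shows up in both lists, which matches what the `diff` invocation above would report.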
e7fb896 to
5d92454
Compare
|
a4e10ee to
1486864
Compare
|
Anyone got a better alternative than reverting 29c1e9e? I think it's valid to do so, and I'll report the blake3 v1.0.7 compatibility issue with Python 3.12 vLLM v0.10.2+CPU upstream.
|
Right, options:
Option 1: blake3 pin for Python 3.12 CPU builds:
# something like this (I've not tested it yet)
dependencies = [
  aioprometheus
  # Pin blake3 to 1.0.6 for Python 3.12 CPU builds
  # blake3 1.0.7 causes build failures - see https://github.com/NixOS/nixpkgs/pull/447722
  (if (cpuSupport && pythonOlder "3.13") then
    blake3.overrideAttrs (old: {
      version = "1.0.6";
      src = fetchPypi {
        pname = "blake3";
        version = "1.0.6";
        hash = "sha256-/Jhu3+7XSW024FMuUvwid3B95wSRfOH3T9yN3wMy0tY=";
      };
    })
  else
    blake3
  )
  cachetools
  cbor2
  # ... rest of deps ...
];
Option 2: just mark Python 3.12 CPU builds as broken:
meta = {
  description = "High-throughput and memory-efficient inference and serving engine for LLMs";
  # ... existing meta ...
  broken = (cpuSupport && pythonOlder "3.13"); # blake3 1.0.7 incompatibility, see https://github.com/NixOS/nixpkgs/pull/447722
  badPlatforms = [
    "x86_64-darwin"
  ];
};
I'm leaning toward Option 2 (marking broken) since:
Reverting the blake3 update globally would be wrong - that affects the entire ecosystem and blake3 1.0.7 is already in master.
The pin would work but adds complexity for a configuration that might not have many users.
Will mark as broken for now and file an upstream issue with vLLM about the blake3 1.0.7 incompatibility with Python 3.12 CPU builds?
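To make explicit which configurations the `cpuSupport && pythonOlder "3.13"` condition would touch, here is a small Python model of that predicate. It is only an illustration: `python_older` and `marked_broken` are hypothetical names, and the version comparison is a simplified stand-in for the nixpkgs helper, assuming plain dotted numeric versions.

```python
def parse_version(version):
    """Turn a dotted numeric version such as "3.12" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def python_older(current, bound):
    """Simplified model of pythonOlder: True when current < bound."""
    return parse_version(current) < parse_version(bound)


def marked_broken(cpu_support, python_version):
    """Model of the proposed condition: cpuSupport && pythonOlder "3.13"."""
    return cpu_support and python_older(python_version, "3.13")


# Enumerate the combinations discussed in this thread.
for cpu in (True, False):
    for py in ("3.12", "3.13"):
        print(f"cpuSupport={cpu} python={py} broken={marked_broken(cpu, py)}")
```

Under this model only the Python 3.12 CPU combination is flagged; CUDA builds and Python 3.13 builds are left alone, which is the narrow scope both options aim for.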
|
blake3-py 1.0.7 basically only updates dependencies, so I didn't expect it to break anything. If Python 3.12 support for vllm is not important, then I also think Option 2 (marking broken) is the better option until we figure out what the root cause is. Please file an issue with vllm; perhaps its authors have more insight.
1486864 to
470418e
Compare
470418e to
ded72cb
Compare
|
Could you do so @daniel-fahey please? You may then add a link to the issue in the derivation. |
Will do tomorrow, I'm out 🪩🕺 |
Notably adds Python 3.13 support (see vllm-project/vllm#13164) among many other things (https://github.com/vllm-project/vllm/releases/tag/v0.10.2). Co-authored-by: Gaétan Lepage <gaetan@glepage.com>
ded72cb to
cc60981
Compare
GaetanLepage
left a comment
Thanks @daniel-fahey !
Notably adds Python 3.13 support (see vllm-project/vllm#13164) among many other things (https://github.com/vllm-project/vllm/releases/tag/v0.10.2).
Supersedes existing PR #442802 from @r-ryantm.
Python 3.13 support is also needed for vLLM to appear in the builds on the new (and fantastic initiative, btw) nixos-cuda Hydra, configured by @GaetanLepage.
For reference I share part of my engineer's log re: the nixos-cuda builder herein:
The new hydra.nixos-cuda.org instance is building the cuda-packages jobset, which is defined by
release-cuda.nix. While python3Packages.vllm is explicitly listed in that file, it doesn't appear in any evaluation results.
This is because vLLM is being filtered out during the evaluation phase:
b6f6c61, see below)
disabled = pythonAtLeast "3.13"; in its package definition
error: vllm-0.10.1.1 not supported for interpreter python3.13
To check which nixpkgs revision the new nixos-cuda Hydra is currently using:
To reproduce the evaluation failure:
And to verify vLLM is missing from the evaluation:
Things done
passthru.tests.
nixpkgs-review on this PR. See nixpkgs-review usage.
./result/bin/.