libwholegraph wheels: use nvidia-nccl wheels instead of vendoring libnccl.so #284
jameslamb merged 2 commits into rapidsai:branch-25.10 from jameslamb:devendor-libnccl
/ok to test

I think this is ready for review!
But probably because of the ongoing

 echo "libwholegraph-${RAPIDS_PY_CUDA_SUFFIX} @ file://$(echo "${LIBWHOLEGRAPH_WHEELHOUSE}"/libwholegraph_*.whl)" >> "${PIP_CONSTRAINT}"

-export SKBUILD_CMAKE_ARGS="-DBUILD_SHARED_LIBS=ON;-DCMAKE_MESSAGE_LOG_LEVEL=VERBOSE;-DCUDA_STATIC_RUNTIME=ON;-DWHOLEGRAPH_BUILD_WHEELS=ON"
+export SKBUILD_CMAKE_ARGS="-DBUILD_SHARED_LIBS=ON;-DCMAKE_MESSAGE_LOG_LEVEL=VERBOSE;-DCUDA_STATIC_RUNTIME=ON"

Not strictly related, but noticed while I was looking at logs to confirm the NCCL stuff was working correctly.
CMake Warning:
  Manually-specified variables were not used by the project:
    CUDA_STATIC_RUNTIME
    WHOLEGRAPH_BUILD_WHEELS
Think we still want CUDA_STATIC_RUNTIME because it could get passed through here:
But WHOLEGRAPH_BUILD_WHEELS doesn't do anything anywhere in this project.
git grep '_BUILD_WHEEL'
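
The check described in this comment (reading the build log for flags that CMake reports as unused) could be automated along these lines. This is only a sketch: `unused_cmake_variables` is a hypothetical helper, not something in this repository, and it assumes the log follows the warning layout quoted above, with one variable name per line after the header.

```python
import re

# Header text CMake prints before listing unused -D variables.
MARKER = "Manually-specified variables were not used by the project:"
# A line holding nothing but a CMake-style variable name.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def unused_cmake_variables(log_text: str) -> list[str]:
    """Collect the variable names listed under each unused-variable warning."""
    names = []
    in_block = False         # currently inside a warning block
    block_has_names = False  # at least one name seen in the current block
    for line in log_text.splitlines():
        stripped = line.strip()
        if MARKER in stripped:
            in_block, block_has_names = True, False
            continue
        if not in_block:
            continue
        if not stripped:
            # A blank line before the first name is skipped;
            # a blank line after the names ends the block.
            if block_has_names:
                in_block = False
            continue
        if NAME_RE.fullmatch(stripped):
            names.append(stripped)
            block_has_names = True
        else:
            # Any other text means the warning block is over.
            in_block = False
    return names
```

A CI step could fail, or just print a notice, when the returned list is non-empty, which would have flagged `WHOLEGRAPH_BUILD_WHEELS` without anyone reading the logs by hand.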

/ok to test

This will need an admin merge because of the failing

@jameslamb you have my permission to admin merge once we have packaging/CI approval

Thank you!

gforsyth left a comment
Very nice improvement, let's gooooo!

# PyPI limit is 100 MiB, fail CI before we get too close to that
max_allowed_size_compressed = '75M'
max_allowed_size_compressed = '10Mi'

# PyPI limit is 100 MiB, fail CI before we get too close to that
max_allowed_size_compressed = '80Mi'
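
To make the size settings above concrete, here is a rough sketch of how a limit string such as '75M' or '10Mi' might be compared against a wheel file on disk. The tool that actually reads `max_allowed_size_compressed` is not shown in this thread, so `parse_size` and `exceeds_limit` are hypothetical helpers, and the unit table is an assumption that follows the usual SI/IEC convention ('M' = 10^6 bytes, 'Mi' = 2^20 bytes).

```python
import os
import re

# Assumed unit table: SI suffixes are powers of 1000, IEC suffixes powers of 1024.
_UNITS = {
    "": 1, "B": 1,
    "K": 10**3, "M": 10**6, "G": 10**9,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
}


def parse_size(text: str) -> int:
    """Convert a size string like '80Mi' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def exceeds_limit(path: str, limit: str) -> bool:
    """Return True when the file at `path` is larger than `limit`."""
    return os.path.getsize(path) > parse_size(limit)
```

Under these assumptions, '80Mi' is 83,886,080 bytes, which leaves some headroom under a 100 MiB (104,857,600 byte) cap.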

Thanks @gforsyth! Ok I'm going to admin-merge this and then manually trigger the nightly tests.

All of the wheel tests passed on the manually-triggered nightly run! https://github.com/rapidsai/cugraph-gnn/actions/runs/17250350223 Just one conda test failed, and it looks to me like something temporary while downloading a large dataset. I just restarted it... I'm hopeful it'll pass and that nightlies will be working here again.

Fixes #281
Similar to rapidsai/cuvs#827, proposes that wheels get their copy of libnccl.so at runtime from nvidia-nccl-cu{12,13} wheels, instead of vendoring a copy inside libwholegraph.

Notes for Reviewers
Benefits of these changes
… libnccl between this project and pytorch" may be to blame for the failing nightly tests
… libwholegraph wheels
… pypi.org without needing an exception 🎉 🎉 🎉
… libnccl.so
… libnccl.so being loaded
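
For anyone wanting to confirm which libnccl.so a given environment would pick up, a small diagnostic along these lines may help. It is a sketch only: `find_nccl_libraries` is a hypothetical helper, and the `nvidia/nccl/lib/` layout inside site-packages is an assumption about how the nvidia-nccl wheels are laid out, not something stated in this PR.

```python
import glob
import os
import sys


def find_nccl_libraries(search_roots=None):
    """List libnccl.so* files found under <root>/nvidia/nccl/lib for each root.

    By default every entry on sys.path is searched; the directory layout
    is an assumption about the nvidia-nccl wheels.
    """
    roots = sys.path if search_roots is None else search_roots
    found = []
    for root in roots:
        pattern = os.path.join(root, "nvidia", "nccl", "lib", "libnccl.so*")
        for candidate in sorted(glob.glob(pattern)):
            # Resolve symlinks so the same file is not reported twice.
            real = os.path.realpath(candidate)
            if real not in found:
                found.append(real)
    return found


if __name__ == "__main__":
    # More than one entry suggests multiple copies are installed.
    for path in find_nccl_libraries():
        print(path)
```

If exactly one path is printed, libwholegraph and other packages that depend on the same nvidia-nccl wheel would be resolving the same file, which is the situation this change is aiming for.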