WIP: wheels CI: stricter torch index selection, test oldest versions of dependencies #413

jameslamb wants to merge 17 commits into rapidsai:release/26.04
Conversation
```yaml
# (useful in CI scripts where we want to tightly control which indices 'pip' uses).
- matrix:
    include_torch_extra_index: "false"
  packages:
```
rapids-dependency-file-generator uses the first matching matrix (see https://github.com/rapidsai/dependency-file-generator?tab=readme-ov-file#how-dependency-lists-are-merged).

This will only affect cases where `include_torch_extra_index=false` is passed (as in CI here). Other cases (like RAPIDS devcontainers) will fall through to other groups that pull in `--extra-index-url` lines.

So this should not break any other uses of this file.
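For illustration, a minimal sketch of how a CI script might opt in to this matrix entry — the file key and matrix values here are placeholders, not the project's actual ones:

```shell
# Generate a requirements file with the torch extra-index lines excluded.
# "py_test_pylibwholegraph" and the matrix values are illustrative.
rapids-dependency-file-generator \
  --output requirements \
  --file-key py_test_pylibwholegraph \
  --matrix "cuda=12.9;arch=$(arch);py=3.13;include_torch_extra_index=false" \
  | tee ./test-requirements.txt

# The generated file should contain no '--extra-index-url' lines, so 'pip'
# resolves everything from the single index the CI script points it at.
grep -c -- '--extra-index-url' ./test-requirements.txt || true
```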
```yaml
build_type: pull-request
script: ci/test_wheel_pylibwholegraph.sh
matrix_filter: map(select(.ARCH == "amd64"))
matrix_type: 'nightly'
```
```yaml
matrix_type: 'nightly'
```
TODO: revert this. Just added here for testing, to confirm this will fix the issues we've been seeing in nightlies.
```yaml
build_type: pull-request
script: ci/test_wheel_cugraph-pyg.sh
matrix_filter: map(select(.ARCH == "amd64"))
matrix_type: 'nightly'
```
```yaml
matrix_type: 'nightly'
```
Revert before merging.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
```yaml
- output_types: [conda]
  packages:
    - torchdata
    - pydantic
```
Just moving this here so `depends_on_pytorch` only ever contains `torch` / `pytorch`.

This `test_python_common` group is used everywhere that `depends_on_pytorch` is.
dependencies.yaml
```yaml
# 2.6.0 is the oldest version on https://download.pytorch.org/whl/cu126 with CUDA wheels
- torch==2.6.0
```
@alexbarghi-nv see this note.
There aren't CUDA 12 wheels available for PyTorch older than 2.6.0.
```shell
pip download \
  --isolated \
  --no-deps \
  --index-url=https://download.pytorch.org/whl/cu126 \
  'torch==2.3.0'
# ERROR: Could not find a version that satisfies the requirement torch==2.3.0
# (from versions: 2.6.0+cu126, 2.7.0+cu126, 2.7.1+cu126, 2.8.0+cu126, 2.9.0+cu126, 2.9.1+cu126, 2.10.0+cu126)
```

Do you want to bump the floor in dependency metadata here to `>=2.6.0`? Or leave it at `>=2.3` so that these libraries are still installable alongside older PyTorch releases (for example, if people build PyTorch 2.4 from source)?
Your call.
I would vote for bumping the floor to `>=2.6.0`. It's a little over a year old at this point. https://github.com/pytorch/pytorch/releases/tag/v2.6.0
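As a quick double-check of that floor, one option is pip's `index versions` subcommand (still marked experimental by pip, so the output format may vary):

```shell
# List the torch versions the cu126 index actually serves; everything reported
# should be >= 2.6.0, matching the comment in dependencies.yaml above.
pip index versions torch \
  --index-url=https://download.pytorch.org/whl/cu126
```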
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Nightly CI here has been failing for a couple of weeks, and the root cause is "some jobs are installing incorrect `torch` wheels". That's tracked in #410 and being worked on in #413. That work unfortunately uncovered some other significant compatibility issues that will require RAPIDS-wide fixes:

* rapidsai/build-planning#256
* rapidsai/build-planning#257
* rapidsai/cugraph#5443

As a short-term patch, this proposes allowing `cugraph-gnn` nightlies to fail for a few more weeks, so regular PR CI can be unblocked while we focus on the more permanent fix. Targeting the more permanent fix (and reverting this back to 7 days) for the 26.04 release (so over the next few weeks).

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Alex Barghi (https://github.com/alexbarghi-nv)
- Gil Forsyth (https://github.com/gforsyth)

URL: #419
Contributes to #5443
Related to rapidsai/build-planning#143

`libcugraph.so` dynamically links to several CUDA Toolkit libraries

```console
$ ldd /pyenv/versions/3.11.14/lib/python3.11/site-packages/libcugraph/lib64/libcugraph.so
...
	libcusolver.so.12 => /usr/local/cuda/lib64/libcusolver.so.12 (0x00007c616aba7000)
	libcublas.so.13 => /usr/local/cuda/lib64/libcublas.so.13 (0x00007c61675d5000)
	libcublasLt.so.13 => /usr/local/cuda/lib64/libcublasLt.so.13 (0x00007c6143c83000)
	libcusparse.so.12 => /usr/local/cuda/lib64/libcusparse.so.12 (0x00007c613987c000)
	libcurand.so.10 => /usr/local/cuda/lib64/libcurand.so.10 (0x00007c6131161000)
...
	libnvJitLink.so.13 => /usr/local/cuda/lib64/libnvJitLink.so.13 (0x00007c612af5f000)
...
```

This proposes getting them from `cuda-toolkit` wheels, instead of system installations.

## Notes for Reviewers

### Benefits of this change

* reduces the risk of multiple copies of the same library being loaded
* allows the use of Python package versioning to manage compatibility
* consistency with other RAPIDS libraries (see rapidsai/build-planning#35)
* reduces the risk of runtime issues with other libraries that use CTK wheels, like `torch` (rapidsai/cugraph-gnn#413 (comment))

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #5444
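As a sanity check (not part of the PR itself, just a sketch assuming the wheel layout shown in the `ldd` output above), one can re-run `ldd` after the change and confirm the CUDA libraries resolve from `site-packages` instead of `/usr/local/cuda`:

```shell
# Locate libcugraph.so inside the installed wheel and inspect its dynamic
# dependencies; after this change, the CUDA math libraries should resolve
# under site-packages rather than /usr/local/cuda.
LIBCUGRAPH_SO="$(python -c '
import pathlib
import libcugraph
print(pathlib.Path(libcugraph.__file__).parent / "lib64" / "libcugraph.so")
')"

ldd "${LIBCUGRAPH_SO}" | grep -E 'cusolver|cublas|cusparse|curand|nvJitLink'
```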
This PR uncovered a bunch of other issues that have spiraled off into other efforts: #410 (comment)

Most of the changes here are now on #425, and the rest will be addressed soon in rapidsai/build-planning#256.

Moving this to draft for now, but it might end up getting closed depending on how #425 goes.
Since this effort started, this work has spun out into a bunch of other PRs (#410 (comment)). I'm going to close this. It'll be replaced by:
…an optional dependency (#425)

Contributes to rapidsai/build-planning#256 and #410

## Ensures that `torch` CUDA wheels are always installed in CI

*"unintentionally installed a CPU-only `torch` from PyPI"* is a failure mode I've seen a few times for CI in this project, and each time it's happened it has taken a bit of time to figure out that that was the root cause.

This PR tries to fix that by:

* using a stricter install pattern to guarantee a CUDA `torch` wheel is installed if a compatible one exists
* using local versions like `+cu130` to help prevent pulling in packages from PyPI
* adding `dependencies.yaml` items for "oldest" dependencies so CI covers a range of supported versions

I tested similar patterns in rapidsai/rmm#2279 and saw them work well there.

## Makes `torch` truly optional

We want these packages to be installable and importable without `torch`, for use in RAPIDS DLFW builds (where we don't install `torch` alongside RAPIDS because it's built in other processes). I'd started relying on the assumption that they worked that way in this PR, but quickly realized that that isn't true... `torch` is used unconditionally in many ways in these libraries.

This PR fixes that. It makes `torch` optional and adds testing to ensure it stays that way:

* copying the `import_optional` machinery from `cugraph-pyg` into `pylibwholegraph`
* using `import_optional()` for all `torch` imports in the libraries
* using `pytest.importorskip("torch")` in tests
* deferring some `torch` imports from import time to run time
* adding the `flake8-tidy-imports:banned-api` check from `ruff` to enforce that `import torch` isn't used anywhere in library code or test code
* explicitly testing that the libraries are still installable and at least 1 unit test can run successfully after a `pip uninstall torch`
* adding a check in `ci/validate_wheel.sh` confirming that `torch` doesn't make it into any wheel metadata *(which could happen due to mistakes in `dependencies.yaml` / `pyproject.toml`)*

## Notes for Reviewers

Pulling these changes out of #413, so CI in this repo can immediately benefit from them and so #419 can be reverted. When this is merged, #413 will have a smaller diff and just be focused on testing against a range of CTKs.

Authors:
- James Lamb (https://github.com/jameslamb)
- Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
- Alex Barghi (https://github.com/alexbarghi-nv)
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #425
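A condensed sketch of the two mechanisms described above — the version and index are illustrative (borrowed from the cu126 discussion earlier in this thread), not the exact pins used in CI:

```shell
# 1) Stricter install: requiring a local version like +cu126 means pip can
#    only satisfy the requirement from the PyTorch index, so a CPU-only
#    wheel from pypi.org can never be silently substituted.
python -m pip install \
  --index-url=https://download.pytorch.org/whl/cu126 \
  'torch==2.6.0+cu126'

# 2) "torch is truly optional" smoke test: uninstall torch, then confirm
#    the library still imports without it.
python -m pip uninstall --yes torch
python -c 'import pylibwholegraph; print(pylibwholegraph.__version__)'
```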
Fixes #410

There, some nightly wheels tests were failing because CUDA 13 packages were being installed but testing against CUDA 12 `pylibwholegraph` packages. This fixes that, along with some other improvements to wheel testing:

* CUDA wheels for `torch` are always installed (no fallback to pypi.org CPU-only packages)
* the correct `torch` index (based on CUDA major version) is used
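A minimal sketch of what "the correct `torch` index based on CUDA major version" could look like in a CI script — the variable names and index URLs are assumptions based on the cu126/cu130 indices mentioned above, not the script's actual code:

```shell
# Pick the PyTorch index matching the CUDA major version under test.
RAPIDS_CUDA_VERSION="${RAPIDS_CUDA_VERSION:-12.9.1}"
case "${RAPIDS_CUDA_VERSION%%.*}" in
  12) TORCH_INDEX_URL="https://download.pytorch.org/whl/cu126" ;;
  13) TORCH_INDEX_URL="https://download.pytorch.org/whl/cu130" ;;
  *)  echo "unsupported CUDA version: ${RAPIDS_CUDA_VERSION}" >&2; exit 1 ;;
esac

# --index-url (not --extra-index-url), so pip cannot fall back to pypi.org
# and pull in a CPU-only torch build.
python -m pip install --index-url="${TORCH_INDEX_URL}" torch
```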