
Use cuda.bindings and cuda.core for Linker#133

Merged
gmarkall merged 91 commits into NVIDIA:main from brandon-b-miller:cuda-core-linker
Jun 27, 2025

Conversation

@brandon-b-miller
Contributor

WIP
xref #129

@leofang
Member

leofang commented Feb 22, 2025

Thanks, @brandon-b-miller. Remember our goal is to drop all of the Linker subclasses inside Numba in favor of cuda.core.Linker. The current PR is not what we want. Also note that to help phase out pynvjitlink, we already have rapidsai/pynvjitlink#111, which is essentially what this PR does today.

@brandon-b-miller brandon-b-miller changed the title Use cuda.bindings and cuda.core for nvjitlink Use cuda.bindings and cuda.core for Linker Feb 24, 2025
@copy-pr-bot

copy-pr-bot bot commented Feb 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@gmarkall gmarkall added the "2 - In Progress" (currently a work in progress) label Mar 7, 2025
@brandon-b-miller
Contributor Author

/ok to test

@brandon-b-miller
Contributor Author

@gmarkall @leofang numba-cuda contains nvjitlink tests; should we maintain support for these as part of this PR, or drop them in favor of testing upstream in cuda-python?

@gmarkall
Contributor

@gmarkall @leofang numba-cuda contains nvjitlink tests; should we maintain support for these as part of this PR, or drop them in favor of testing upstream in cuda-python?

I think we need to maintain the tests that exercise Numba-CUDA's interaction with the linker, like the TestLinkerUsage class and the test_*_with_linkable_code tests (tests with names like that). I don't think we need to keep the tests that purely test the PyNvJitLinker API, like the ones that test passing different flags etc. to it.

@gmarkall
Contributor

Also, I think we can probably delete the PyNvJitLinker class in this PR as well - is there any reason to keep it around?

I'm comfortable with:

  • Using cuda.core.Linker when the user asks for pynvjitlink or the NVIDIA bindings, and
  • Using the ctypes linker otherwise

which is what this PR seems to offer. (correct me if I've read it wrong 🙂)

@brandon-b-miller
Contributor Author

Also, I think we can probably delete the PyNvJitLinker class in this PR as well - is there any reason to keep it around?

I'm comfortable with:

  • Using cuda.core.Linker when the user asks for pynvjitlink or the NVIDIA bindings, and
  • Using the ctypes linker otherwise

which is what this PR seems to offer. (correct me if I've read it wrong 🙂)

Correct, this is the outcome I am aiming for.

@brandon-b-miller
Contributor Author

@gmarkall on second thought, we might need to leave the MVCLinker in, in some capacity, as long as we're supporting CUDA 11. I don't think cuda-python supports the functionality that cubinlinker enables.

@gmarkall
Contributor

@brandon-b-miller Sorry, yes - I had that in mind but didn't write it down.

@brandon-b-miller
Contributor Author

@brandon-b-miller Sorry, yes - I had that in mind but didn't write it down.

Ok, just to have it written down somewhere, after this PR we will:

For CUDA 11, maintain the current way of configuring which bindings to use:

  • Default ctypes bindings, optional cuda-python bindings with NUMBA_CUDA_USE_NVIDIA_BINDING=1, optional MVCLinker with NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY=1.

For CUDA 12, we will have:

  • Default ctypes bindings, optional cuda-python bindings with NUMBA_CUDA_USE_NVIDIA_BINDING=1
  • Use of pynvjitlink through cuda-python if NUMBA_CUDA_ENABLE_PYNVJITLINK=1

This will leave us with 3 linkers:

  • The ctypes linker, which is used by default regardless of CUDA version
  • The MVC linker, which is used in a CUDA 11 environment when MVC is required, regardless of which binding is being used
  • The new linker, which is used in a CUDA 12 environment when either the cuda-python bindings or pynvjitlink is enabled
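The selection rules above can be sketched as a small dispatch function. This is a hypothetical illustration only: `select_linker` and the returned class names are stand-ins, not the real numba-cuda API, and the boolean arguments model the config flags discussed in this thread.

```python
def select_linker(cuda_major, use_nvidia_binding, enable_mvc, enable_pynvjitlink):
    """Illustrative sketch of the three-linker plan from this thread.

    All names here are stand-ins for illustration, not real numba-cuda classes.
    """
    if cuda_major == 11:
        if enable_mvc:
            # CUDA 11 with minor version compatibility needs cubinlinker,
            # so the MVC linker stays, regardless of which binding is in use
            return "MVCLinker"
        return "CtypesLinker"
    # CUDA 12: the new cuda.core-based linker covers both the NVIDIA
    # bindings and the former pynvjitlink use case
    if use_nvidia_binding or enable_pynvjitlink:
        return "NewLinker"
    return "CtypesLinker"
```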

@gmarkall
Contributor

Thanks for the summary! To look a little further ahead, we want to end up with only one linker, which is the new linker. This would be achieved by deprecating / removing the other linkers as soon as appropriate:

  • MVCLinker can be removed as soon as CUDA 11 support is dropped.
  • The ctypes linker can be deprecated and removed once we can have a hard dependency on cuda.core and have tested the new linker in use for a while to shake out any issues.

Are you in alignment with the above plan @brandon-b-miller ?

@brandon-b-miller
Contributor Author

Thanks for the summary! To look a little further ahead, we want to end up with only one linker, which is the new linker. This would be achieved by deprecating / removing the other linkers as soon as appropriate:

  • MVCLinker can be removed as soon as CUDA 11 support is dropped.
  • The ctypes linker can be deprecated and removed once we can have a hard dependency on cuda.core and have tested the new linker in use for a while to shake out any issues.

Are you in alignment with the above plan @brandon-b-miller ?

Yup this sounds good to me.

@brandon-b-miller
Contributor Author

/ok to test 85f8710

@brandon-b-miller
Contributor Author

/ok to test 547dab5

@brandon-b-miller
Contributor Author

/ok to test 3bd469d

@brandon-b-miller
Contributor Author

/ok to test ba5c20a

@brandon-b-miller
Contributor Author

/ok to test 134f6ee

@gmarkall
Contributor

I thought the code changes looked good, but I'm now hitting an error locally with this PR when the NVIDIA binding is disabled. For example:

$ NUMBA_CUDA_USE_NVIDIA_BINDING=0 python -m numba.runtests numba.cuda.tests -v
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/gmarkall/numbadev/numba/numba/runtests.py", line 9, in <module>
    sys.exit(0 if _main(sys.argv) else 1)
                  ^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba/numba/testing/_runtests.py", line 25, in _main
    return run_tests(argv, defaultTest='numba.tests',
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba/numba/testing/__init__.py", line 54, in run_tests
    prog = NumbaTestProgram(argv=argv,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba/numba/testing/main.py", line 204, in __init__
    super(NumbaTestProgram, self).__init__(*args, **kwargs)
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/main.py", line 101, in __init__
    self.parseArgs(argv)
  File "/home/gmarkall/numbadev/numba/numba/testing/main.py", line 293, in parseArgs
    super(NumbaTestProgram, self).parseArgs(argv)
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/main.py", line 150, in parseArgs
    self.createTests()
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/main.py", line 161, in createTests
    self.test = self.testLoader.loadTestsFromNames(self.testNames,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/loader.py", line 232, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/loader.py", line 232, in <listcomp>
    suites = [self.loadTestsFromName(name, module) for name in names]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.11/unittest/loader.py", line 162, in loadTestsFromName
    module = __import__(module_name)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/__init__.py", line 57, in <module>
    raise RuntimeError("nvJitLink requires the NVIDIA CUDA bindings. ")
RuntimeError: nvJitLink requires the NVIDIA CUDA bindings. 

Just looking into why this is and why it doesn't seem to occur on the ctypes binding test in CI.

            "in place of pynvjitlink."
        )
    else:
        raise RuntimeError("nvJitLink requires the NVIDIA CUDA bindings. ")
@gmarkall
Contributor

gmarkall commented Jun 27, 2025

Because config.CUDA_ENABLE_PYNVJITLINK is enabled automatically if it's found in the environment, disabling the NVIDIA bindings if pynvjitlink is installed now leads to this exception being hit.

@gmarkall
Contributor

gmarkall left a comment

I think we can solve the issue of it not being possible to disable the NVIDIA bindings if pynvjitlink is installed by keeping track of whether it was enabled automatically, and just ignoring it if it was, like:

diff --git a/numba_cuda/numba/cuda/__init__.py b/numba_cuda/numba/cuda/__init__.py
index 430b3b7..e944fe0 100644
--- a/numba_cuda/numba/cuda/__init__.py
+++ b/numba_cuda/numba/cuda/__init__.py
@@ -9,6 +9,9 @@ import warnings
 # 1. Config setting "CUDA_ENABLE_PYNVJITLINK" (highest priority)
 # 2. Environment variable "NUMBA_CUDA_ENABLE_PYNVJITLINK"
 # 3. Auto-detection of pynvjitlink module (lowest priority)
+
+pynvjitlink_auto_enabled = False
+
 if getattr(config, "CUDA_ENABLE_PYNVJITLINK", None) is None:
     if (
         _pynvjitlink_enabled_in_env := _readenv(
@@ -17,9 +20,10 @@ if getattr(config, "CUDA_ENABLE_PYNVJITLINK", None) is None:
     ) is not None:
         config.CUDA_ENABLE_PYNVJITLINK = _pynvjitlink_enabled_in_env
     else:
-        config.CUDA_ENABLE_PYNVJITLINK = (
+        pynvjitlink_auto_enabled = (
             importlib.util.find_spec("pynvjitlink") is not None
         )
+        config.CUDA_ENABLE_PYNVJITLINK = pynvjitlink_auto_enabled
 
 # Upstream numba sets CUDA_USE_NVIDIA_BINDING to 0 by default, so it always
 # exists. Override, but not if explicitly set to 0 in the envioronment.
@@ -53,6 +57,11 @@ if config.CUDA_ENABLE_PYNVJITLINK:
             "NVIDIA bindings are enabled. cuda.core will be used "
             "in place of pynvjitlink."
         )
+    elif pynvjitlink_auto_enabled:
+        # Ignore the fact that pynvjitlink is enabled, because that was an
+        # automatic decision based on discovering pynvjitlink was present; the
+        # user didn't ask for it
+        pass
     else:
         raise RuntimeError("nvJitLink requires the NVIDIA CUDA bindings. ")
 

Does this seem like a workable solution? (It allows me to disable the NVIDIA binding for testing locally)
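The precedence logic in the diff above (explicit config setting first, then the environment variable, then auto-detection of the installed module) can be modeled as a standalone function. This is a sketch for illustration: `resolve_pynvjitlink` is a hypothetical name, and the real logic lives in numba_cuda/numba/cuda/__init__.py.

```python
def resolve_pynvjitlink(config_value, env_value, module_present):
    """Return (enabled, auto_enabled) following the precedence in the diff:

    1. Config setting CUDA_ENABLE_PYNVJITLINK (highest priority)
    2. Environment variable NUMBA_CUDA_ENABLE_PYNVJITLINK
    3. Auto-detection of the pynvjitlink module (lowest priority)

    auto_enabled is True only when the decision came from auto-detection,
    so callers can silently ignore the setting (rather than raise) when
    the NVIDIA bindings are disabled and the user never asked for it.
    """
    if config_value is not None:
        return bool(config_value), False
    if env_value is not None:
        return bool(env_value), False
    # Auto-detected: record that the user didn't explicitly opt in
    return module_present, module_present
```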

@leofang
Member

leofang commented Jun 27, 2025

I need to catch up with the progress here, but are we still keeping both cuda.core.Linker and pynvjitlink?

@brandon-b-miller
Contributor Author

I need to catch up with the progress here, but are we still keeping both cuda.core.Linker and pynvjitlink?

No, pynvjitlink won't be required. We're just puzzling over what to do for existing users who try to enable it, or have it in their environment, after this PR :)

@brandon-b-miller
Contributor Author

/ok to test 99c87f3

@gmarkall gmarkall merged commit 489045f into NVIDIA:main Jun 27, 2025
39 checks passed
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Jul 2, 2025
- Updates for recent API changes (NVIDIA#313)
- Fix lineinfo generation when compile_internal used (NVIDIA#271) (NVIDIA#287)
- Build docs with NVIDIA Sphinx theme (NVIDIA#312)
- Don't skip debug tests when LTO enabled by default (NVIDIA#311)
- Use `cuda.bindings` and `cuda.core` for `Linker` (NVIDIA#133)
- Enable LTO by default when pynvjitlink is available (NVIDIA#310)
@gmarkall gmarkall mentioned this pull request Jul 2, 2025
gmarkall added a commit that referenced this pull request Jul 2, 2025
- Updates for recent API changes (#313)
- Fix lineinfo generation when compile_internal used (#271) (#287)
- Build docs with NVIDIA Sphinx theme (#312)
- Don't skip debug tests when LTO enabled by default (#311)
- Use `cuda.bindings` and `cuda.core` for `Linker` (#133)
- Enable LTO by default when pynvjitlink is available (#310)

Labels

4 - Waiting on author (Waiting for author to respond to review)


7 participants