[Refactor][NFC] Vendor-in cpython components for future CUDA-specific changes #468
atmnp wants to merge 14 commits into NVIDIA:main
Conversation
Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.
/ok to test
| """This module provides the unsafe things for targets/numbers.py""" | ||
|
|
||
| from numba.core import types, errors | ||
| from numba.core.extending import intrinsic |
Does this work with `numba.cuda.extending.intrinsic` now?
```python
from numba.core import types, typing, errors
from numba.cuda.cgutils import alloca_once
from numba.core.extending import intrinsic
```
Same question about `intrinsic` here.
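For reference, a quick way to check both call sites might be a smoke test like the sketch below (hypothetical example; it assumes `numba.cuda.extending` exposes an `intrinsic` decorator with the same contract as `numba.core.extending.intrinsic`):

```python
# Hypothetical smoke test: define a trivial intrinsic through the
# CUDA-target decorator and call it from a kernel. Assumes
# numba.cuda.extending.intrinsic mirrors numba.core.extending.intrinsic.
from numba import cuda, types
from numba.cuda.extending import intrinsic


@intrinsic
def double_it(typingctx, x):
    if isinstance(x, types.Integer):
        sig = types.int64(types.int64)

        def codegen(context, builder, signature, args):
            (val,) = args
            # Emit a plain LLVM add: val + val.
            return builder.add(val, val)

        return sig, codegen


@cuda.jit
def kernel(out):
    out[0] = double_it(21)


out = cuda.to_device([0])
kernel[1, 1](out)
print(out.copy_to_host())  # expected: [42]
```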
```python
# SPDX-License-Identifier: BSD-2-Clause

"""
Implementation of enums.
"""
```
Could we vendor a few of the enum-related tests? Things appear to work with the CUDA target:
```python
from enum import IntEnum

import numpy as np
from numba import cuda


class Color(IntEnum):
    RED = 1
    GREEN = 2
    BLUE = 3


@cuda.jit(device=True)
def is_primary(color):
    return int(color) in (Color.RED, Color.BLUE)


@cuda.jit
def kernel(input, output):
    idx = cuda.grid(1)
    if idx < input.size:
        output[idx] = is_primary(input[idx])


input = cuda.to_device([1, 2, 3])
output = cuda.to_device([0, 0, 0])
kernel[1, 3](input, output)
print(output.copy_to_host())  # prints: [1 0 1]
```
We don't need everything, but things might map reasonably well to simple kernels that test the CPU implementations of Enum properties on perhaps a single CUDA thread. For example:
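A single-thread sketch along those lines (illustrative names, not from the PR) that exercises a couple of the CPU Enum behaviors:

```python
# Illustrative single-thread check of Enum behaviors on the CUDA target:
# member .value access and member comparisons, one output slot each.
from enum import IntEnum

import numpy as np
from numba import cuda


class Shape(IntEnum):
    CIRCLE = 1
    SQUARE = 2


@cuda.jit
def check_enum(out):
    out[0] = Shape.SQUARE.value            # attribute lowering
    out[1] = Shape.CIRCLE == Shape.CIRCLE  # equality between members
    out[2] = Shape.CIRCLE != Shape.SQUARE  # inequality between members


out = cuda.to_device(np.zeros(3, dtype=np.int64))
check_enum[1, 1](out)
print(out.copy_to_host())  # expected: [2 1 1]
```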
```python
# SPDX-License-Identifier: BSD-2-Clause

"""
Implementation of the range object for fixed-size integers.
"""
```
These also appear to be supported, so it would be nice to include some basic tests:
```python
from numba import cuda


@cuda.jit
def kernel(output):
    idx = cuda.grid(1)
    if idx < output.size:
        r = range(10)
        sum = 0
        for i in r:
            sum += 1
        output[idx] = sum


output = cuda.to_device([0, 0, 0])
kernel[1, 3](output)
print(output.copy_to_host())  # prints: [10 10 10]
```
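A variant with non-default start/stop/step would cover a bit more of the vendored range lowering; a sketch with illustrative names:

```python
# Sketch: sum a stepped range so start/stop/step handling in the vendored
# range lowering gets exercised, not just range(stop).
import numpy as np
from numba import cuda


@cuda.jit
def stepped_range(out):
    idx = cuda.grid(1)
    if idx < out.size:
        total = 0
        for i in range(2, 20, 3):  # visits 2, 5, 8, 11, 14, 17
            total += i
        out[idx] = total


out = cuda.to_device(np.zeros(3, dtype=np.int64))
stepped_range[1, 3](out)
print(out.copy_to_host())  # expected: [57 57 57]
```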
```python
# SPDX-License-Identifier: BSD-2-Clause

"""
Implementation of tuple objects
"""
```
Could use a few test cases, since this ports to CUDA: things like iteration, possibly `__getitem__`, etc.
```python
from numba import cuda


@cuda.jit
def kernel(output):
    idx = cuda.grid(1)
    if idx < output.size:
        t = (1, 2, 3)
        sum = 0
        for i in t:
            sum += 1
        output[idx] = sum


output = cuda.to_device([0, 0, 0])
kernel[1, 3](output)
print(output.copy_to_host())  # prints: [3 3 3]
```
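For the `__getitem__` case, something like this sketch (illustrative names) could sit alongside the iteration test:

```python
# Sketch: constant-index __getitem__ and tuple unpacking, both of which
# route through the vendored tuple lowering.
import numpy as np
from numba import cuda


@cuda.jit
def tuple_getitem(out):
    idx = cuda.grid(1)
    if idx < out.size:
        t = (10, 20, 30)
        a, b, c = t              # unpacking
        out[idx] = t[1] + a + c  # 20 + 10 + 30


out = cuda.to_device(np.zeros(3, dtype=np.int64))
tuple_getitem[1, 3](out)
print(out.copy_to_host())  # expected: [60 60 60]
```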
brandon-b-miller left a comment:
This generally looks good. I think we should at least create a few basic test cases for each of the builtin objects for which lowering is being vendored here. It need not be comprehensive, but let's put a few files in place to get things started, and we can move more things over if it feels like there are gaps later on.
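As a starting point, such files could follow the usual harness; a sketch (the file and test names are hypothetical, and it assumes `numba.cuda.testing.CUDATestCase` is available here as in upstream Numba):

```python
# Hypothetical starter file, e.g. tests/test_vendored_lowering.py.
import unittest

import numpy as np
from numba import cuda
from numba.cuda.testing import CUDATestCase


class TestVendoredLowering(CUDATestCase):
    def test_tuple_iteration(self):
        @cuda.jit
        def kernel(out):
            total = 0
            for v in (1, 2, 3):
                total += v
            out[0] = total

        out = cuda.to_device(np.zeros(1, dtype=np.int64))
        kernel[1, 1](out)
        self.assertEqual(out.copy_to_host()[0], 6)


if __name__ == "__main__":
    unittest.main()
```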
This change vendors in parts of the cpython components for future CUDA-specific changes. In particular, it contains the implementations that do not explicitly require additional tests from Numba to verify behavior. It also includes a few small cleanup fixes, such as removing the defunct CPUTarget, which is unnecessary in our implementation of the CUDA target.
This PR has been temporarily abandoned in favor of #493, which is a bulk change.