
Conversation

@kaeun97 (Contributor) commented Nov 27, 2025

Fixes #483.

As mentioned here, the bug was due to:

  • empty arrays incorrectly identified as broadcast arrays
  • incorrect contiguity flags for empty arrays

When running this:

from numba import cuda, typeof
import numpy as np

# A (10, 0)-shaped array: zero elements, but still 2-D.
h_values = np.random.randint(low=0, high=2, size=(10, 0))
d_values = cuda.to_device(h_values)

print(typeof(h_values))
print(typeof(d_values))

# Host and device arrays should infer to the same Numba type.
assert typeof(h_values) == typeof(d_values)

Below is the output:

array(int64, 2d, C)
array(int64, 2d, C)
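
For reference, NumPy itself reports every 0-size array as both C and F contiguous regardless of shape — the host-side behaviour this fix mirrors on the device. A quick check (illustrative only):

import numpy as np

h = np.empty((10, 0), dtype=np.int64)
# NumPy flags any 0-size array as both C and F contiguous.
print(h.flags['C_CONTIGUOUS'], h.flags['F_CONTIGUOUS'])  # True True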

@copy-pr-bot (bot) commented Nov 27, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@kaeun97 force-pushed the kaeun97/fix-empty-device-array branch from f0ea6c7 to 098fbe0 on November 27, 2025 01:10
@kaeun97 changed the title from "fix: 0-size arrays are contiguous" to "fix: Pass NULL pointer for cffi.FFI().from_buffer(empty array)" on Nov 27, 2025
@kaeun97 changed the title from "fix: Pass NULL pointer for cffi.FFI().from_buffer(empty array)" to "fix: empty array type mismatch between host and device" on Nov 27, 2025
@kaeun97 marked this pull request as ready for review on November 27, 2025 01:22
@greptile-apps (bot) commented Nov 27, 2025

Greptile Summary

Fixed type mismatch between host and device empty arrays by correctly identifying empty arrays as contiguous and preventing them from being misclassified as broadcast arrays.

  • Added an early check in dummyarray._compute_layout() to report all 0-size arrays as both C and F contiguous
  • Modified broadcast array detection in devicearray.__type_name__() to exclude empty arrays (size == 0) from being treated as broadcast (both conditions are sketched just below)
  • Added test coverage across 8 different empty array shapes, with type inference validation
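
Both fixes boil down to two small conditions; a minimal sketch, lifted from the changed files (the full context is in gmarkall's diff further down):

# dummyarray.Array._compute_layout(): all 0-size arrays are
# considered contiguous, even if they are multidimensional.
if self.size == 0:
    return {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}

# devicearray: an empty array is never treated as broadcast,
# even though its strides may contain 0.
broadcast = 0 in self.strides and (self.size != 0)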

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are minimal, well-targeted, and follow NumPy's behavior for empty arrays. The fix addresses two specific edge cases with simple boolean conditions, includes comprehensive test coverage across multiple empty array shapes, and aligns with the referenced NumPy implementation.
  • No files require special attention

Important Files Changed

Filename | Score | Overview
numba_cuda/numba/cuda/cudadrv/devicearray.py | 5/5 | Fixed broadcast array detection to exclude empty arrays by adding a (self.size != 0) condition
numba_cuda/numba/cuda/cudadrv/dummyarray.py | 5/5 | Added an early return marking all 0-size arrays as contiguous before the broadcast check
numba_cuda/numba/cuda/tests/nocuda/test_dummyarray.py | 5/5 | Added test coverage for empty array contiguity flags and type inference

Sequence Diagram

sequenceDiagram
    participant User
    participant NumPy
    participant cuda.to_device
    participant DeviceNDArrayBase
    participant Array as dummyarray.Array
    participant typeof

    User->>NumPy: Create empty array (e.g., shape=(10, 0))
    NumPy-->>User: Returns array with size=0, strides may contain 0
    
    User->>cuda.to_device: Transfer array to device
    cuda.to_device->>DeviceNDArrayBase: Create device array
    
    DeviceNDArrayBase->>Array: _compute_layout() to set flags
    Note over Array: Check if self.size == 0
    Array-->>DeviceNDArrayBase: Return {C_CONTIGUOUS: True, F_CONTIGUOUS: True}
    
    User->>typeof: Get type of device array
    typeof->>DeviceNDArrayBase: Call __type_name__()
    Note over DeviceNDArrayBase: broadcast = 0 in strides AND size != 0
    Note over DeviceNDArrayBase: For empty arrays: broadcast = False
    DeviceNDArrayBase->>DeviceNDArrayBase: Check flags["C_CONTIGUOUS"]
    DeviceNDArrayBase-->>typeof: Return Array(dtype, ndim, 'C')
    
    User->>User: Compare host and device types
    Note over User: Types now match: array(dtype, ndim, C)

@greptile-apps (bot) left a comment

3 files reviewed, no comments

@gmarkall (Contributor) commented:

/ok to test e4e1b57

@gmarkall added the "3 - Ready for Review" (Ready for review by team) label on Nov 27, 2025
@gmarkall (Contributor) left a comment
Many thanks for the PR! I think the tests look great, and the fix moves things in the right direction without regressing any functionality.

Whilst examining your changes, I did become quite suspicious of the way we're handling contiguity computation (see comment on the diff). I began to believe that if we can compute contiguity correctly, then we shouldn't need to special-case the computation of _numba_type_ in devicearray.py on whether the array is broadcasted or not.

Furthermore, a zero stride shouldn't preclude contiguity either (based on the implementation and comments in the NumPy version of this functionality). That made me feel we ought to be able to apply:

diff --git a/numba_cuda/numba/cuda/cudadrv/devicearray.py b/numba_cuda/numba/cuda/cudadrv/devicearray.py
index 188b2f5a..cfd25b5a 100644
--- a/numba_cuda/numba/cuda/cudadrv/devicearray.py
+++ b/numba_cuda/numba/cuda/cudadrv/devicearray.py
@@ -178,11 +178,9 @@ class DeviceNDArrayBase(_devicearray.DeviceArray):
         # of which will be 0, will not match those hardcoded in for 'C' or 'F'
         # layouts.
 
-        broadcast = 0 in self.strides and (self.size != 0)
-
-        if self.flags["C_CONTIGUOUS"] and not broadcast:
+        if self.flags["C_CONTIGUOUS"]:
             layout = "C"
-        elif self.flags["F_CONTIGUOUS"] and not broadcast:
+        elif self.flags["F_CONTIGUOUS"]:
             layout = "F"
         else:
             layout = "A"
diff --git a/numba_cuda/numba/cuda/cudadrv/dummyarray.py b/numba_cuda/numba/cuda/cudadrv/dummyarray.py
index 16545954..8b0515c4 100644
--- a/numba_cuda/numba/cuda/cudadrv/dummyarray.py
+++ b/numba_cuda/numba/cuda/cudadrv/dummyarray.py
@@ -275,19 +275,11 @@ class Array(object):
         # 13661ac70).
         # https://github.com/numpy/numpy/blob/maintenance/1.19.x/numpy/core/src/multiarray/flagsobject.c#L123-L191
 
+        flags = {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}
+
         # Records have no dims, and we can treat them as contiguous
         if not self.dims:
-            return {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}
-
-        # All 0-size arrays are considered contiguous, even if they are multidimensional
-        if self.size == 0:
-            return {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}
-
-        # If this is a broadcast array then it is not contiguous
-        if any([dim.stride == 0 for dim in self.dims]):
-            return {"C_CONTIGUOUS": False, "F_CONTIGUOUS": False}
-
-        flags = {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}
+            return flags
 
         # Check C contiguity
         sd = self.itemsize

to allow broadcasted arrays to also be considered contiguous. If we didn't have the check for zero stride then we wouldn't be excluding zero-size arrays from being considered contiguous.
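
As a minimal illustration of the NumPy behaviour referred to here (assuming a 64-bit platform; not part of the proposed diff): a zero stride in a length-1 dimension does not stop NumPy from reporting contiguity.

import numpy as np

# Broadcasting a 1-D array to (1, 3) produces a zero stride in dim 0.
a = np.broadcast_to(np.arange(3), (1, 3))
print(a.strides)                # (0, 8): zero stride in the broadcast dim
print(a.flags['C_CONTIGUOUS'])  # True: length-1 dims may have any stride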

However, this does lead to one failure in test_devicearray_broadcast_host_copy() because the array elements seem to get transposed during the copy over to the device.

I don't want to blow up the scope of this PR and stall a perfectly good fix to jump down a rabbit hole though, so I'd like to instead merge this and potentially follow up on other items later.

return {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}

# If this is a broadcast array then it is not contiguous
if any([dim.stride == 0 for dim in self.dims]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noted that the NumPy implementation that this is following (from the _UpdateContiguousFlags implementation referenced above) doesn't have this check for zero strides.

It's not directly related to this PR, but I think it is suspicious that we still differ in our implementation.
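
For reference, that NumPy logic amounts to roughly the following — a minimal Python paraphrase of _UpdateContiguousFlags (a sketch; the function and variable names here are mine, not NumPy's):

def compute_contiguity(shape, strides, itemsize):
    # A 0-size array is contiguous by definition.
    if 0 in shape:
        return {"C_CONTIGUOUS": True, "F_CONTIGUOUS": True}

    # C order: walk dims from innermost to outermost, accumulating the
    # expected stride. Length-1 dims are skipped, so a zero stride there
    # never breaks contiguity — there is no zero-stride special case.
    c_contig = True
    sd = itemsize
    for extent, stride in zip(reversed(shape), reversed(strides)):
        if extent != 1:
            if stride != sd:
                c_contig = False
            sd *= extent

    # F order: the same walk, from outermost to innermost.
    f_contig = True
    sd = itemsize
    for extent, stride in zip(shape, strides):
        if extent != 1:
            if stride != sd:
                f_contig = False
            sd *= extent

    return {"C_CONTIGUOUS": c_contig, "F_CONTIGUOUS": f_contig}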

@gmarkall merged commit 835845c into NVIDIA:main on Nov 27, 2025
71 checks passed
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request on Nov 27, 2025
Aims to align more closely with NumPy contiguity logic.

The example in the commit message from NVIDIA#612 still runs correctly with
this change.

I think this needs a little more consideration for now.
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request on Nov 27, 2025
- Revert NVIDIA#536 "perf: remove context threading in various pointer abstractions" (NVIDIA#611)
- fix: empty array type mismatch between host and device (NVIDIA#612)
- fix: warp vote operations must use a constant int for the `mode` parameter (NVIDIA#606)
@gmarkall mentioned this pull request on Nov 27, 2025
gmarkall added a commit that referenced this pull request on Nov 27, 2025

- Revert #536 "perf: remove context threading in various pointer abstractions" (#611)
- fix: empty array type mismatch between host and device (#612)
- fix: warp vote operations must use a constant int for the `mode` parameter (#606)

Labels

3 - Ready for Review (Ready for review by team)


Development

Successfully merging this pull request may close these issues.

Pass NULL pointer for cffi.FFI().from_buffer(empty array)
