4 changes: 4 additions & 0 deletions src/tl_templates/cuda/atomic.h
@@ -31,6 +31,10 @@ template <> struct normalize_atomic_type<bfloat16_t> {
};
#endif

template <> struct normalize_atomic_type<int64_t> {
  using type = unsigned long long;
};
Comment on lines +34 to +36

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the structure of the atomic.h file
wc -l src/tl_templates/cuda/atomic.h

Repository: tile-ai/tilelang

Length of output: 94


🏁 Script executed:

# Read the relevant sections of the file to understand the implementation
cat -n src/tl_templates/cuda/atomic.h | head -200

Repository: tile-ai/tilelang

Length of output: 8917


🏁 Script executed:

# Search for AtomicMax and AtomicMin definitions and their usage of normalize_atomic_type
rg -n "normalize_atomic_type|AtomicMax|AtomicMin" src/tl_templates/cuda/atomic.h -A 5

Repository: tile-ai/tilelang

Length of output: 3496


🏁 Script executed:

# Search for usages of AtomicMax and AtomicMin to see if they're called with int64_t
rg -n "AtomicMax|AtomicMin" --type cpp --type py -B 2 -A 2 | grep -v "^src/tl_templates/cuda/atomic.h" | head -100

Repository: tile-ai/tilelang

Length of output: 7421


🏁 Script executed:

# Look for any comments in atomic.h explaining the int64_t design choice
cat src/tl_templates/cuda/atomic.h | grep -A 5 -B 5 "int64_t"

Repository: tile-ai/tilelang

Length of output: 333


🏁 Script executed:

# Check git history to understand why this mapping was added
git log -p --all -- src/tl_templates/cuda/atomic.h | grep -A 10 -B 10 "int64_t" | head -60

Repository: tile-ai/tilelang

Length of output: 42


🏁 Script executed:

# Search for tests or documentation about int64_t atomic operations
find . -name "*.py" -o -name "*.md" -o -name "*.txt" | xargs grep -l "int64.*atomic\|atomic.*int64" 2>/dev/null | head -20

Repository: tile-ai/tilelang

Length of output: 42


🏁 Script executed:

# Check if there are any CUDA driver/runtime constraints on int64_t atomics
# Look for comments in the code about why int64_t is normalized to unsigned long long
rg -n "normalize_atomic_type|int64.*unsigned" src/ -B 3 -A 3 | grep -A 5 -B 5 "int64"

Repository: tile-ai/tilelang

Length of output: 5124


🏁 Script executed:

# Check what versions of CUDA atomic support int64_t
# Look at the CUDART_VERSION guards in atomic.h
cat -n src/tl_templates/cuda/atomic.h | sed -n '28,80p'

Repository: tile-ai/tilelang

Length of output: 2259


🏁 Script executed:

# Search for test files that might use atomic operations
find . -type f \( -name "*test*.py" -o -name "*test*.cc" -o -name "*test*.h" \) | head -20

Repository: tile-ai/tilelang

Length of output: 1074


🏁 Script executed:

# Check HIP atomic header to see how it handles int64_t
cat -n src/tl_templates/hip/atomic.h | head -100

Repository: tile-ai/tilelang

Length of output: 4870


🏁 Script executed:

# Search for any recent commits or PRs that added int64_t normalization
git log --all --oneline -- src/tl_templates/cuda/atomic.h | head -20

Repository: tile-ai/tilelang

Length of output: 208


🏁 Script executed:

# Look at the specific commit that added int64_t support
git show c11e07d --stat

Repository: tile-ai/tilelang

Length of output: 50372


🏁 Script executed:

# Get the full details of what changed in that commit
git show c11e07d

Repository: tile-ai/tilelang

Length of output: 50376


🏁 Script executed:

# Look at the atomic test file to see what's being tested
cat -n testing/python/language/test_tilelang_language_atomic.py | grep -A 5 -B 5 "int64\|AtomicMax\|AtomicMin" | head -150

Repository: tile-ai/tilelang

Length of output: 42


🏁 Script executed:

# Search more broadly for actual usage of AtomicMax/Min with int64_t
rg "AtomicMax|AtomicMin" testing/ -A 3 -B 3 | grep -i "int64\|i64" -A 5 -B 5 | head -80

Repository: tile-ai/tilelang

Length of output: 42


🏁 Script executed:

# Search for any actual usage of int64_t with atomic operations
rg "int64.*Atomic|Atomic.*int64" . -g "!.git" --type cpp --type py | head -30

Repository: tile-ai/tilelang

Length of output: 42


🏁 Script executed:

# Check if there are any tests that combine int64 and atomic operations
rg "test.*atomic" . -i --type py | xargs grep -l "int64\|i64" | head -10

Repository: tile-ai/tilelang

Length of output: 2946


The mapping is correct for AtomicAdd, but introduces incorrect semantics for AtomicMax/AtomicMin if used with int64_t.

The normalize_atomic_type<int64_t> specialization maps int64_t to unsigned long long. For AtomicAdd this is correct: two's-complement addition is bit-identical for signed and unsigned operands. However, AtomicMax and AtomicMin use cuda::atomic_ref<NT1>::fetch_max() / fetch_min(), which perform unsigned comparisons instead of signed comparisons when NT1 is unsigned long long.
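
A minimal CUDA sketch of that call path (illustrative only; AtomicMaxSketch and the surrounding scaffolding are assumptions reconstructed from the review text, not the actual tilelang source):

#include <cuda/atomic>
#include <cstdint>

template <typename T> struct normalize_atomic_type { using type = T; };
template <> struct normalize_atomic_type<int64_t> {
  using type = unsigned long long; // the specialization under review
};

template <typename T1>
__device__ T1 AtomicMaxSketch(T1 *address, T1 val) {
  using NT1 = typename normalize_atomic_type<T1>::type;
  cuda::atomic_ref<NT1, cuda::thread_scope_device> ref(
      *reinterpret_cast<NT1 *>(address));
  // For T1 = int64_t, NT1 = unsigned long long, so fetch_max compares the
  // operands as unsigned: the bit pattern of -1 is the largest value.
  return static_cast<T1>(ref.fetch_max(static_cast<NT1>(val)));
}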

Example: AtomicMax(&val, 1) where val = int64_t(-1) should leave val == 1, but the unsigned comparison keeps -1 as the maximum (since its bit pattern 0xFFFFFFFFFFFFFFFF > 1 in unsigned arithmetic).
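
A host-side C++ demonstration of both behaviors (hypothetical, for illustration only; not part of the patch):

#include <algorithm>
#include <cstdint>
#include <cstdio>

int main() {
  int64_t val = -1;
  // Same bit pattern viewed as unsigned: 0xFFFFFFFFFFFFFFFF.
  unsigned long long bits = static_cast<unsigned long long>(val);

  // Addition is bit-identical under two's complement, so AtomicAdd is safe:
  printf("signed add:   %lld\n", (long long)(val + 1));            // 0
  printf("unsigned add: %lld\n", (long long)(int64_t)(bits + 1));  // 0

  // Comparison is not: the unsigned view of -1 is the maximum value.
  printf("signed max:   %lld\n",
         (long long)std::max<int64_t>(val, 1));                    // 1
  printf("unsigned max: %lld\n",
         (long long)(int64_t)std::max(bits, 1ULL));                // -1
  return 0;
}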

Since no tests or codebase usage of AtomicMax/AtomicMin with int64_t was found, this appears to be an unsupported use case. Add a comment documenting this limitation:

template <> struct normalize_atomic_type<int64_t> {
  // Note: This mapping is valid for AtomicAdd (two's complement arithmetic),
  // but AtomicMax/AtomicMin with int64_t will use unsigned comparison semantics
  // and are not supported.
  using type = unsigned long long;
};
🤖 Prompt for AI Agents
In `@src/tl_templates/cuda/atomic.h` around lines 34-36, update the
normalize_atomic_type<int64_t> specialization to document the signed/unsigned
semantics: add a comment above template <> struct normalize_atomic_type<int64_t>
stating that mapping int64_t to unsigned long long is done to support AtomicAdd
(two's complement), but that AtomicMax/AtomicMin (which call
cuda::atomic_ref<NT1>::fetch_max()/fetch_min()) will perform unsigned
comparisons and therefore int64_t usage with AtomicMax/AtomicMin is not
supported; keep the using type = unsigned long long unchanged.


template <typename T1, typename T2> TL_DEVICE T1 cuda_cast(T2 val) {
  return T1(val);
}