[Compat] Add CUDA version check for `__nv_fp8_e8m0` type #1537

LeiWang1999 merged 1 commit into tile-ai:main
Conversation
__nv_fp8_e8m0 is only available in CUDA 12.6+. Add conditional compilation to provide a placeholder struct for older CUDA versions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 Walkthrough

Introduces conditional CUDA `fp8_e8_t` type support in the `cuda_fp8.h` header. For CUDA 12.6+, defines `fp8_e8_t` as an alias to `__nv_fp8_e8m0` with a public `TL_HAS_FP8_E8M0` macro set to 1. For earlier CUDA versions, provides a placeholder struct and sets the macro to 0.
Actionable comments posted: 1
📜 Review details

- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro

📒 Files selected for processing (1)

- `src/tl_templates/cuda/cuda_fp8.h`
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T06:24:11.411Z
Learnt from: Rachmanino
Repo: tile-ai/tilelang PR: 1175
File: src/op/math.cc:44-52
Timestamp: 2025-11-03T06:24:11.411Z
Learning: In tilelang's `src/op/math.cc`, the `infinity_op` function uses `std::numeric_limits<float>::infinity()` as a placeholder for all float types (including float64 and bfloat16). The codegen layer (PrintConst:Inf) handles the correct infinity value based on the dtype field of the FloatImm node, so the specific C++ template argument doesn't matter.
Applied to files:
src/tl_templates/cuda/cuda_fp8.h
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
- GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
- GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
```cpp
// __nv_fp8_e8m0 is only available in CUDA 12.6+
#if __CUDACC_VER_MAJOR__ > 12 ||                                               \
    (__CUDACC_VER_MAJOR__ == 12 && __CUDACC_VER_MINOR__ >= 6)
using fp8_e8_t = __nv_fp8_e8m0;
#define TL_HAS_FP8_E8M0 1
#else
// Placeholder for CUDA < 12.6
struct fp8_e8_t {
  unsigned char data;
};
#define TL_HAS_FP8_E8M0 0
#endif
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify TL_HAS_FP8_E8M0 usage across the codebase
echo "=== Searching for TL_HAS_FP8_E8M0 usage (excluding cuda_fp8.h) ==="
rg -n 'TL_HAS_FP8_E8M0' --type-list | head -1
rg -n 'TL_HAS_FP8_E8M0' | grep -v 'src/tl_templates/cuda/cuda_fp8.h'
echo -e "\n=== Searching for fp8_e8_t usage patterns ==="
rg -nP 'fp8_e8_t' --type c --type cpp --type h -C 2
echo -e "\n=== Searching for __nv_fp8_e8m0 usage ==="
rg -nP '__nv_fp8_e8m0' --type c --type cpp --type h -C 2
```

Repository: tile-ai/tilelang
Length of output: 6603
🏁 Script executed:

```shell
#!/bin/bash
# More thorough search for TL_HAS_FP8_E8M0 across all file types
echo "=== All occurrences of TL_HAS_FP8_E8M0 ==="
rg -r 'TL_HAS_FP8_E8M0' --type-add 'header:*.h' --type-add 'source:*.cc' --type-add 'cuda:*.cu' --type-add 'cpp:*.cpp'
echo -e "\n=== File count containing TL_HAS_FP8_E8M0 ==="
rg -l 'TL_HAS_FP8_E8M0' | wc -l
echo -e "\n=== Check if hip_fp8.h defines its own fp8_e8_t ==="
head -30 src/tl_templates/hip/hip_fp8.h
```

Repository: tile-ai/tilelang
Length of output: 1427
🏁 Script executed:

```shell
#!/bin/bash
# Simpler search for TL_HAS_FP8_E8M0 usage
echo "=== All lines containing TL_HAS_FP8_E8M0 ==="
rg 'TL_HAS_FP8_E8M0'
echo -e "\n=== Check for conditional compilation using this macro ==="
rg '#if.*TL_HAS_FP8_E8M0|#ifdef.*TL_HAS_FP8_E8M0'
echo -e "\n=== Verify placeholder struct size at compile time ==="
echo "Note: unsigned char is guaranteed to be 1 byte by C++ standard"
echo "sizeof(unsigned char) == 1 (C++ standard guaranteed)"
```

Repository: tile-ai/tilelang
Length of output: 454
Conditional compilation approach is sound, but the TL_HAS_FP8_E8M0 macro is unused dead code.
The version check correctly identifies CUDA 12.6+, and the placeholder struct is properly sized—unsigned char is guaranteed to be 1 byte by the C++ standard, matching __nv_fp8_e8m0. All operations on fp8_e8_t in the codebase are simple assignments and member access, which are compatible with both the native type and the placeholder struct.
However, the TL_HAS_FP8_E8M0 macro is defined but never referenced anywhere in the codebase. Either remove this macro if feature detection is not needed, or integrate it by using #if TL_HAS_FP8_E8M0 to gate any fp8_e8m0-specific operations (conversions, intrinsics, etc.) that should only run on CUDA 12.6+.
Summary

- Add a CUDA version check for the `__nv_fp8_e8m0` type, which is only available in CUDA 12.6+
- Define a `TL_HAS_FP8_E8M0` macro to allow compile-time feature detection

Test plan

- Build with CUDA < 12.6: no longer fails with an `__nv_fp8_e8m0` undefined error

🤖 Generated with Claude Code