[BugFix] Fix FP4 related vectorized cast by chaospointer · Pull Request #1741 · tile-ai/tilelang

chaospointer · 2026-01-27T09:04:58Z

Summary by CodeRabbit

Release Notes

Improvements
- Extended CUDA casting support for additional floating-point type combinations, including float16, float32, bfloat16, and FP8 formats.
- Added explicit validation for type conversion operations to ensure correctness.
Tests
- Reorganized floating-point conversion tests into dedicated test groups for improved coverage of specialized type conversion scenarios.

github-actions · 2026-01-27T09:05:08Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai · 2026-01-27T09:05:17Z

Walkthrough

Narrowed vectorized FP8/FP4 cast paths by adding explicit 32-bit width checks to casting logic in CUDA codegen, while expanding type-casting rules in utilities to support additional float type conversions. Test suite reorganized to isolate FP8 conversion tests into dedicated blocks.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CUDA Vectorization Logic: `src/target/codegen_cuda.cc`, `src/target/utils.cc` | Tightened FP8/FP4 vectorized cast paths with explicit 32-bit width checks on source/target types; expanded IsCudaVectorizableCast to support float16 ↔ float32, bfloat16 ↔ float32, FP8 (E4M3/E5M2/E8M0), and FP4 conversions with appropriate bit-width guards. |
| Test Suite Reorganization: `testing/python/language/test_tilelang_language_vectorized_cast.py` | Removed FP8 conversions from main test_vectorized_cast parameterization and relocated to dedicated FP8-focused test blocks for clearer test organization. |
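
To illustrate why an explicit width check matters, here is a minimal sketch. It is not the code from this PR. It assumes the unguarded condition tested only the type code, so any float width could match a path meant for float32. `Code`, `DType`, `IsFloat`, `LooseFloatToFp4`, `GuardedFloat32ToFp4`, and `CheckGuards` are hypothetical names invented for this example and do not exist in TVM or TileLang.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not TVM or TileLang types.
enum class Code { kFloat, kFloat8E4M3, kFloat4E2M1 };

struct DType {
  Code code;
  int bits;
};

// Type-code-only predicate: true for float16, float32 and float64 alike.
constexpr bool IsFloat(DType t) { return t.code == Code::kFloat; }

// Unguarded selection (assumed shape of the problem): any float source
// is routed to the path intended for float32 -> FP4.
constexpr bool LooseFloatToFp4(DType from, DType to) {
  return IsFloat(from) && to.code == Code::kFloat4E2M1;
}

// Guarded selection: the source must additionally be exactly 32 bits wide.
constexpr bool GuardedFloat32ToFp4(DType from, DType to) {
  return IsFloat(from) && from.bits == 32 && to.code == Code::kFloat4E2M1;
}

// Small self-check contrasting the two predicates.
inline void CheckGuards() {
  constexpr DType f16{Code::kFloat, 16};
  constexpr DType f32{Code::kFloat, 32};
  constexpr DType fp4{Code::kFloat4E2M1, 4};
  assert(LooseFloatToFp4(f16, fp4));       // float16 slips through without the guard
  assert(!GuardedFloat32ToFp4(f16, fp4));  // rejected by the width check
  assert(GuardedFloat32ToFp4(f32, fp4));   // float32 is still accepted
}
```

With a guard of this kind, a float16 source would no longer match the float32 path. What it falls back to instead is not described in this summary.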

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Possibly related PRs

[Feature] Enhance vectorized conversion support in CUDA codegen #1095 — Directly tightens 32-bit width guards and expands IsCudaVectorizableCast for FP8/FP4 vectorized casts in the same codepaths.
[Feature] Support E8M0 related type conversion and vectorized cast #1731 — Modifies CUDA FP8 conversion handling in codegen_cuda.cc and utils.cc with FP8 (E8M0) support.
[Dtype] Improve host codegen handling for subtype #1517 — Adds 32-bit bitwidth guards and FP4/FP8 vectorized cast paths in the same cast handling code.

Suggested reviewers

LeiWang1999
xwhzz
tzj-fxz

Poem

🐰 Precision paths with guards so tight,
Floats and ints aligned just right,
Thirty-two bits, no more, no less,
Vectorized casts pass the test! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title '[BugFix] Fix FP4 related vectorized cast' accurately describes the main change: narrowing and expanding vectorized cast paths for FP4 and related float types with explicit bit-width checks. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

SiriusNEO

LGTM. BTW, I think we need methods like is_float32() and is_float64() in tvm DataType.
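
As a rough sketch of what such helpers could look like, the snippet below defines width-specific predicates on a toy descriptor. `MiniDataType` is a hypothetical struct made up for this example. It is not TVM's `DataType`, and the method bodies are an assumption about the intended semantics (float type code plus an exact bit width).

```cpp
#include <cassert>

// Hypothetical minimal type descriptor for illustration only;
// TVM's real DataType has a different, richer interface.
struct MiniDataType {
  enum Code { kInt, kFloat };
  Code code;
  int bits;

  // True for any float width.
  constexpr bool is_float() const { return code == kFloat; }
  // Width-specific helpers of the kind suggested in the comment above.
  constexpr bool is_float32() const { return is_float() && bits == 32; }
  constexpr bool is_float64() const { return is_float() && bits == 64; }
};
```

Helpers like these would let call sites state the intended width in one named check instead of pairing a type-code test with a separate `bits() == 32` comparison.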

[BugFix] Fix FP4 related vectorized cast

c5f6b1e

SiriusNEO requested review from LJC00118 and SiriusNEO January 27, 2026 09:06

SiriusNEO approved these changes Jan 27, 2026

LeiWang1999 approved these changes Jan 27, 2026

LeiWang1999 merged commit f5525ea into tile-ai:main Jan 27, 2026
7 checks passed

LeiWang1999 merged 1 commit into tile-ai:main from chaospointer:fix-fp4-0127