[BugFix] Change default rounding mode for fp4 conversions #1580

LJC00118 · 2025-12-31T07:23:01Z

Summary by CodeRabbit

Bug Fixes
- Improved FP4 conversion accuracy by updating rounding behavior to use nearest-value rounding instead of truncation for multiple data types, resulting in more precise numerical conversions.

github-actions · 2025-12-31T07:23:11Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai · 2025-12-31T07:23:15Z

📝 Walkthrough

This change updates the CUDA rounding mode used by the FP4 conversion functions in a single header file. The mode changes from cudaRoundZero (truncation toward zero) to cudaRoundNearest across eight conversion wrapper pairs, which changes the numerical results of these type conversions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| FP4 Rounding Mode Updates `src/tl_templates/cuda/cuda_fp4.h` | Replaced `cudaRoundZero` with `cudaRoundNearest` in 8 conversion wrapper pairs: `half↔fp4`, `half2↔fp4x2`, `float↔fp4`, `float2↔fp4x2`, `double↔fp4`, `double2↔fp4x2`, `bfloat16↔fp4`, `bfloat162↔fp4x2`. Changes numerical behavior from truncation toward zero to nearest-value rounding. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 The rounding hops from truncation's way,
To nearest values, bright as day,
Eight conversions dance in sync so true,
Where zero yields to rounding new! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: switching the CUDA rounding mode for fp4 conversions from truncation to round-to-nearest. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 53ea96c and d65f516.

📒 Files selected for processing (1)

src/tl_templates/cuda/cuda_fp4.h

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

🔇 Additional comments (5)

src/tl_templates/cuda/cuda_fp4.h (5)

209-216: LGTM! Consistent rounding mode change for float conversions.

The rounding mode change is applied consistently to both scalar and vector float-to-FP4 conversions, matching the pattern used in half conversions.

237-244: LGTM! Consistent rounding mode change for double conversions.

The rounding mode change is applied consistently to both scalar and vector double-to-FP4 conversions.

263-273: LGTM! Consistent rounding mode change for bfloat16 conversions.

The rounding mode change is applied consistently to both scalar and vector bfloat16-to-FP4 conversions, completing the uniform update across all conversion types.

179-273: Test coverage for FP4 rounding mode change is incomplete.

While tests exist in `testing/python/language/test_tilelang_language_vectorized_cast.py` that verify the conversion functions are called, they explicitly skip numerical validation for FP4 types (early return at lines 61-62). All other data types are validated with `torch.testing.assert_close()`, but FP4 is excluded. Because this change alters numerical results (rounding mode from cudaRoundZero to cudaRoundNearest), add numerical validation tests for FP4 conversions to confirm that the new rounding behavior produces the expected values.
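One way such a numerical check could look is sketched below in plain Python. It is not taken from the TileLang test suite: `decode_e2m1` and `is_nearest` are hypothetical helper names, the FP4 format is assumed to be E2M1 (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit), and how the 4-bit codes are extracted from a kernel's packed output is deliberately left out. The check is a property test: the decoded result must be at least as close to the input as any other representable value, which holds for round-to-nearest and fails for truncation.

```python
def decode_e2m1(code: int) -> float:
    """Decode a 4-bit pattern, assuming E2M1 layout: sign | 2 exponent bits | 1 mantissa bit."""
    sign = -1.0 if code & 0x8 else 1.0
    exp = (code >> 1) & 0x3
    man = code & 0x1
    if exp == 0:
        mag = 0.5 * man  # subnormal range: 0 or 0.5
    else:
        mag = (1.0 + 0.5 * man) * 2.0 ** (exp - 1)  # normal range, exponent bias 1
    return sign * mag


# Every value the 16 codes can represent (+0 and -0 collapse into one entry).
ALL_VALUES = sorted({decode_e2m1(c) for c in range(16)})


def is_nearest(x: float, code: int) -> bool:
    """True if the decoded code is one of the representable values closest to x."""
    best = min(abs(v - x) for v in ALL_VALUES)
    return abs(decode_e2m1(code) - x) == best
```

Because exact ties (for example 2.5, halfway between 2.0 and 3.0) accept either neighbour, this property does not pin down the tie-breaking rule; a stricter test would compare against a reference quantizer.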

179-188: FP4 conversion rounding mode change is properly supported by CUDA API.

The change from cudaRoundZero to cudaRoundNearest is valid and consistently applied across all 8 conversion wrapper pairs (scalar and vector variants for half, float, double, and bfloat16). cudaRoundNearest implements round-to-nearest-even, which is the IEEE 754 default and improves numerical accuracy compared to truncation toward zero. The CUDA Math API fully supports cudaRoundNearest as a valid rounding mode for all __nv_cvt_*_to_fp4* intrinsics.
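To make the difference between the two modes concrete, here is a small host-side Python sketch. It does not call the CUDA intrinsics; it only models the two rounding rules on the E2M1 value grid. The function names are illustrative, and clamping out-of-range inputs to ±6 is an assumption about saturation behavior, not something stated in this PR. NaN inputs are not handled.

```python
import math

# Non-negative magnitudes representable in FP4 E2M1; the list index equals the 3-bit code.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def round_toward_zero(x: float) -> float:
    """Model of truncation: largest grid magnitude not exceeding |x|."""
    mag = min(abs(x), E2M1_GRID[-1])  # assumed saturation to the largest finite value
    q = max(v for v in E2M1_GRID if v <= mag)
    return math.copysign(q, x)


def round_nearest_even(x: float) -> float:
    """Model of round-to-nearest, ties to the code with an even mantissa bit."""
    mag = min(abs(x), E2M1_GRID[-1])  # assumed saturation to the largest finite value
    lo_idx = max(i for i, v in enumerate(E2M1_GRID) if v <= mag)
    if lo_idx == len(E2M1_GRID) - 1:
        return math.copysign(E2M1_GRID[-1], x)
    lo, hi = E2M1_GRID[lo_idx], E2M1_GRID[lo_idx + 1]
    if mag - lo < hi - mag:
        q = lo
    elif mag - lo > hi - mag:
        q = hi
    else:
        q = lo if lo_idx % 2 == 0 else hi  # tie: pick the even code
    return math.copysign(q, x)


for x in (1.4, 2.9, -2.9, 2.5, 3.5):
    print(x, round_toward_zero(x), round_nearest_even(x))
```

Under this model, an input of 2.9 truncates to 2.0 but rounds to 3.0 under nearest-even, which is the kind of accuracy gain the PR is after.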

Change default rounding mode for fp4 conversions

d65f516

LeiWang1999 approved these changes Dec 31, 2025

LeiWang1999 merged commit 15c457f into tile-ai:main Dec 31, 2025
7 checks passed

kurisu6912 mentioned this pull request Feb 11, 2026

[LoopVectorize] Loop Independent Var Optimization in IfThenElse Expr kurisu6912/tilelang#2

Closed
