[Bugfix] Fix incorrect alignment of vectorized subtype #1726

Merged
LeiWang1999 merged 1 commit into tile-ai:main from LeiWang1999:align_0123
Jan 23, 2026
@LeiWang1999 (Member) commented Jan 23, 2026

This pull request reduces the alignment of the fp4_e2_4_t struct in cuda_fp4.h from 4 bytes to 2 bytes. The previous 4-byte alignment was incorrect for this vectorized subtype; the smaller alignment may also improve memory efficiency or compatibility with certain data layouts.

  • Reduced the alignment of the fp4_e2_4_t struct from 4 bytes to 2 bytes in cuda_fp4.h.
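
As a rough illustration of why the alignment qualifier matters, the sketch below uses standard C++ `alignas` as a stand-in for `__CUDA_ALIGN__`. The struct names and the two-byte payload are hypothetical and are not the actual definitions in cuda_fp4.h; the sketch only assumes that a vector of four 4-bit values occupies two bytes. Because `sizeof` is always a multiple of the alignment, over-aligning such a type adds tail padding and doubles the array stride.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a 2-byte payload holding four packed 4-bit values.
// These are NOT the real definitions from cuda_fp4.h.
struct alignas(4) Packed4OverAligned {
  std::uint8_t lo;  // two 4-bit values
  std::uint8_t hi;  // two more 4-bit values
};

struct alignas(2) Packed4NaturallyAligned {
  std::uint8_t lo;  // two 4-bit values
  std::uint8_t hi;  // two more 4-bit values
};

// sizeof is rounded up to a multiple of the alignment, so the over-aligned
// version carries 2 bytes of tail padding after its 2 payload bytes.
static_assert(alignof(Packed4OverAligned) == 4, "requested alignment");
static_assert(sizeof(Packed4OverAligned) == 4, "2 payload + 2 padding bytes");
static_assert(alignof(Packed4NaturallyAligned) == 2, "requested alignment");
static_assert(sizeof(Packed4NaturallyAligned) == 2, "payload only, no padding");

// Byte offset of element i in a contiguous array of T.
template <typename T>
constexpr std::size_t element_offset(std::size_t i) {
  return i * sizeof(T);
}

// With the larger alignment, element 3 sits at byte 12 instead of byte 6,
// so indexing a tightly packed buffer through that type would skip data.
static_assert(element_offset<Packed4OverAligned>(3) == 12, "stride of 4 bytes");
static_assert(element_offset<Packed4NaturallyAligned>(3) == 6, "stride of 2 bytes");
```

All checks here are compile-time `static_assert`s, so the snippet verifies itself when compiled as C++11 or later; a similar assertion next to the real struct could guard against the alignment regressing.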

Summary by CodeRabbit

  • Bug Fixes
    • Optimized memory alignment for 4-bit floating-point data types to improve performance and memory efficiency in CUDA operations.

@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@LeiWang1999 LeiWang1999 merged commit 5fe8b84 into tile-ai:main Jan 23, 2026
3 of 4 checks passed
@coderabbitai bot (Contributor) commented Jan 23, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

The alignment qualifier of the fp4_e2_4_t struct is reduced from 4 bytes to 2 bytes in the CUDA FP4 header file. No functional logic, constructors, or methods are affected.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CUDA FP4 Alignment: src/tl_templates/cuda/cuda_fp4.h | Reduced struct alignment from __CUDA_ALIGN__(4) to __CUDA_ALIGN__(2) for fp4_e2_4_t |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A struct stands tall with shoulders wide,
Four bytes of padding, side by side.
But now we squeeze with gentle care—
Two bytes align, more compact there! ✨
