fix(intrinsics): add missing _legalize_to_buffer_region in SM70 emitter #1786
Conversation
📝 Walkthrough: A new static method, `_legalize_to_buffer_region`, is added to the SM70 Tensor Core intrinsic emitter (`mma_sm70_macro_generator.py`), mirroring the helper already present in the generic MMA and MFMA emitters.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
There are too many questions about "pytest", and I'm not sure how to solve them.
@Coloured-glaze Thanks, that makes sense. Just a note: some tests haven't been run on sm70, as we are currently using sm90 for testing |
Pull request overview
Adds the missing _legalize_to_buffer_region helper to the SM70 Tensor Core intrinsic emitter so that Volta/SM70 GEMM lowering no longer fails with AttributeError and behaves consistently with the generic MMA and MFMA emitters.
Changes:
- Add the necessary imports (`tvm.tir` as `tir`, `Range`, `BufferLoad`, and `get_buffer_region_from_load`) to support buffer-region legalization.
- Implement `TensorCoreIntrinEmitter._legalize_to_buffer_region` in `mma_sm70_macro_generator.py`, matching the semantics used in `mma_macro_generator.py` and `mfma_macro_generator.py`, and wire it into `ldmatrix_a`/`ldmatrix_b` (a sketch of the helper follows below).
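For orientation, here is a minimal sketch of what such a helper can look like, based on the semantics described above. It is not the PR's exact code: the class stub is trimmed to the one method, and `get_buffer_region_from_load` is assumed to come from the imports this PR adds (its module path is not reproduced here).

```python
from tvm import tir
from tvm.ir import Range
from tvm.tir import BufferLoad

# get_buffer_region_from_load is among the imports this PR adds;
# its exact module path is not reproduced in this sketch.


class TensorCoreIntrinEmitter:  # trimmed stub; the real SM70 emitter defines much more
    @staticmethod
    def _legalize_to_buffer_region(obj):
        """Normalize a Buffer or BufferLoad argument into a BufferRegion."""
        if isinstance(obj, tir.Buffer):
            # Whole-buffer region: min = 0, extent = shape[i] for every dimension.
            ranges = [
                Range.from_min_extent(tir.IntImm("int32", 0), extent)
                for extent in obj.shape
            ]
            return tir.BufferRegion(obj, ranges)
        if isinstance(obj, BufferLoad):
            # Prefer the region recoverable from the load itself.
            region = get_buffer_region_from_load(obj)  # assumed import added by this PR
            if region is not None:
                return region
            # Fallback: a single-element region at the load's indices.
            ranges = [
                Range.from_min_extent(index, tir.IntImm("int32", 1))
                for index in obj.indices
            ]
            return tir.BufferRegion(obj.buffer, ranges)
        return obj
```

Note that the hard-coded `int32` constants in this sketch are exactly what the CodeRabbit review below flags; a dtype-aware variant following that suggestion is sketched after the review comment.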
Actionable comments posted: 1
In `@tilelang/intrinsics/mma_sm70_macro_generator.py`:
- Around lines 512-523: The code builds `BufferRegion` ranges with hard-coded `tir.IntImm("int32", ...)` constants, which can produce mixed dtypes. Use `tir.const(value, dtype)` (or an equivalent) so each min/extent takes the dtype of the corresponding shape or index element: in the `BufferRegion(obj, ranges)` path, the mins should use the dtype of each element of `obj.shape`; in the `BufferRegion(obj.buffer, ranges)` fallback for `BufferLoad`, the unit extents should use the dtype of each element of `obj.indices`. That way `Range.from_min_extent` always receives a min and extent of matching dtype (refer to `BufferRegion`, `Range.from_min_extent`, `BufferLoad`, `get_buffer_region_from_load`, `obj.shape`, and `obj.indices`). A before/after sketch follows below.
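To make the suggestion concrete, here is a hedged before/after sketch. The helper names `_whole_buffer_ranges` and `_point_load_ranges` are hypothetical, introduced only to isolate the two range-building sites the review refers to; the actual code at lines 512-523 may be shaped differently.

```python
from tvm import tir
from tvm.ir import Range
from tvm.tir import Buffer, BufferLoad


def _whole_buffer_ranges(buf: Buffer):
    # Before (as flagged): hard-coded int32 mins can mismatch e.g. int64 shapes:
    #   Range.from_min_extent(tir.IntImm("int32", 0), extent)
    # After (suggested): take the constant's dtype from the matching shape element.
    return [
        Range.from_min_extent(tir.const(0, extent.dtype), extent)
        for extent in buf.shape
    ]


def _point_load_ranges(load: BufferLoad):
    # Same idea for the BufferLoad fallback: unit extents take their dtype
    # from the corresponding index expression instead of a fixed int32.
    return [
        Range.from_min_extent(index, tir.const(1, index.dtype))
        for index in load.indices
    ]
```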
#1782
device1 simple test:
device2 simple test:
pytest testing/python/kernel/ :

https://github.com/tile-ai/tilelang/blob/main/CONTRIBUTING.md#test-locally