[Bugfix] Add predicate to loads inside predicated stores in LowerLDGSTG pass #1767
- Introduced a new configuration option to disable loop unswitching.
- Enhanced the lowering logic so that loads within predicated stores are also treated as predicated, preventing out-of-bounds memory access.
- Added a test case to verify the correct behavior of predicated loads and stores in the transformation pipeline.
- Included debug print statements to visualize the module state before and after the lowering process.
📝 Walkthrough
The PR adds a new configuration flag for the loop unswitching pass and enhances the `lower_ldg_stg` transformation to support predicated load operations within predicated store contexts, along with a corresponding test to validate the behavior.
Sequence Diagram
sequenceDiagram
participant BufferLoad as BufferLoad Visitor
participant Ctx as Predicate Context
participant LDG as LDG Lowering
participant Store as Predicated Store
BufferLoad->>Ctx: Check current_predicate_
alt Predicated Context Active
Ctx-->>BufferLoad: predicate exists
BufferLoad->>LDG: Route to LowerToLDGPredicated
LDG-->>BufferLoad: Predicated load instruction
else Non-Predicated Context
Ctx-->>BufferLoad: predicate is null
BufferLoad->>LDG: Route to LowerToLDG
LDG-->>BufferLoad: Standard load instruction
end
Store->>Ctx: Save and set current_predicate_
Store->>BufferLoad: Evaluate store value (with nested loads)
BufferLoad->>Ctx: Check updated predicate context
BufferLoad->>LDG: Nested load uses predicated path
Ctx->>Store: Restore previous predicate context
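The save/set/restore flow in the diagram can be modeled with a small Python sketch. This is a toy illustration only: `ToyRewriter`, `Load`, `Store`, and the emitted `ldg`/`stg` strings are hypothetical names invented here, not the actual C++ rewriter or the TileLang API.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass
class Load:
    buffer: str
    index: str


@dataclass
class Store:
    buffer: str
    index: str
    value: Load
    predicate: Optional[str] = None  # None means the store is unconditional


class ToyRewriter:
    """Hypothetical stand-in for the rewriter; not the real TileLang class."""

    def __init__(self) -> None:
        # Mirrors the role of current_predicate_ in the diagram.
        self.current_predicate: Optional[str] = None

    @contextmanager
    def predicate_scope(self, predicate: str):
        # Save the active predicate, install the new one, and always restore.
        saved = self.current_predicate
        self.current_predicate = predicate
        try:
            yield
        finally:
            self.current_predicate = saved

    def visit_load(self, load: Load) -> str:
        # Loads pick the predicated path whenever a predicate is active.
        if self.current_predicate is not None:
            return f"ldg_pred({load.buffer}, {load.index}, {self.current_predicate})"
        return f"ldg({load.buffer}, {load.index})"

    def visit_store(self, store: Store) -> str:
        if store.predicate is None:
            value = self.visit_load(store.value)
            return f"stg({store.buffer}, {store.index}, {value})"
        # The nested load is visited while the store's predicate is active.
        with self.predicate_scope(store.predicate):
            value = self.visit_load(store.value)
        return f"stg_pred({store.buffer}, {store.index}, {value}, {store.predicate})"
```

Using a context manager for the scope is one way to guarantee the previous predicate is restored even if visiting the store value raises; the actual pass may handle restoration differently.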
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Summary
When `if (pred) { B[i] = A[i] }` is lowered, both the load and the store should use predicated versions to avoid out-of-bounds memory access when `pred` is false.
Problem
Before this fix, the generated CUDA code was:
When `pred` is false, the load could access invalid memory addresses (e.g., negative offsets).
Solution
Introduce a `current_predicate_` context variable in `LowerLDGSTGRewriter` to track when we're inside a predicated store. When processing loads in this context, use the predicated load version.
After fix:
Test plan
- `test_predicated_store_with_load` test case

🤖 Generated with Claude Code
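As a rough illustration of the property such a test is after, the sketch below emulates the memory semantics in plain Python. It assumes that a predicated load returns a default of zero when its predicate is false; that default, and the helper names `load_unpredicated`, `load_predicated`, and `guarded_copy`, are assumptions made for this example rather than details taken from the PR.

```python
def load_unpredicated(memory, index):
    """Always touches memory; an invalid index models an out-of-bounds access."""
    if not 0 <= index < len(memory):
        raise IndexError(f"out-of-bounds load at index {index}")
    return memory[index]


def load_predicated(memory, index, pred, default=0):
    """Touches memory only when pred is true (default value is an assumption)."""
    return load_unpredicated(memory, index) if pred else default


def guarded_copy(dst, src, i, pred, predicate_load):
    """Models `if (pred) { dst[i] = src[i] }` after lowering.

    predicate_load=False models the old behavior: the load is issued
    unconditionally and only the store is guarded.
    """
    if predicate_load:
        value = load_predicated(src, i, pred)
    else:
        value = load_unpredicated(src, i)
    if pred:  # the store itself is guarded in both variants
        dst[i] = value
```

With `pred` false and a negative index, the `predicate_load=True` variant leaves `dst` untouched, while the `predicate_load=False` variant raises on the load even though the store is skipped.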