[Bugfix] Add predicate to loads inside predicated stores in LowerLDGSTG pass #1767

Merged
LeiWang1999 merged 2 commits into tile-ai:main from LeiWang1999:fix_predicated_ldg_0202
Feb 2, 2026

LeiWang1999 commented Feb 2, 2026

Summary

  • Fix a bug where loads inside predicated stores were lowered to unconditional load intrinsics instead of predicated ones
  • When the pattern if (pred) { B[i] = A[i] } is lowered, both the load and the store should use their predicated versions, so that no out-of-bounds memory access occurs when pred is false

Problem

Before this fix, the generated CUDA code was:

ulonglong4 v_ = tl::load_global_256(&...);  // unconditional load - may access invalid memory!
tl::store_global_256_conditional(..., pred);  // conditional store

When pred is false, the load could access invalid memory addresses (e.g., negative offsets).

Solution

Introduce a current_predicate_ context variable in LowerLDGSTGRewriter that records the predicate of the predicated store currently being processed. Any load visited while this variable is set is lowered to the predicated load version with the same predicate.

After fix:

ulonglong4 v_ = tl::load_global_256_conditional(&..., pred);  // conditional load
tl::store_global_256_conditional(..., pred);  // conditional store
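
To make the difference between the two generated snippets concrete, here is a minimal Python model of the before/after behavior. It is only a sketch: GlobalMemory, load_conditional, store_conditional and the two copy_* helpers are hypothetical names invented for illustration, not TileLang APIs, and the placeholder value returned by a skipped load (0 here) is an assumption about the semantics, not something this PR states.

```python
class GlobalMemory:
    """Toy stand-in for device global memory with strict bounds checking."""

    def __init__(self, values):
        self.values = list(values)

    def _check(self, index):
        # Unlike a Python list, a negative index here counts as an invalid address.
        if not 0 <= index < len(self.values):
            raise IndexError(f"invalid address: {index}")

    def load(self, index):
        self._check(index)
        return self.values[index]

    def store(self, index, value):
        self._check(index)
        self.values[index] = value


def load_conditional(mem, index, pred, fallback=0):
    # Assumption: a predicated load skips the memory access entirely when
    # pred is false and yields a placeholder value instead.
    return mem.load(index) if pred else fallback


def store_conditional(mem, index, value, pred):
    # The store only touches memory when pred is true.
    if pred:
        mem.store(index, value)


def copy_before_fix(src, dst, index, pred):
    value = src.load(index)  # unconditional: runs even when pred is false
    store_conditional(dst, index, value, pred)


def copy_after_fix(src, dst, index, pred):
    value = load_conditional(src, index, pred)  # guarded by the same predicate
    store_conditional(dst, index, value, pred)
```

With pred false and an index of -1, copy_before_fix raises IndexError from the unguarded load, while copy_after_fix returns without touching either buffer.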

Test plan

  • Added test_predicated_store_with_load test case
  • All existing tests pass

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added configuration option to disable loop unswitching in compilation optimization passes
  • Improvements

    • Enhanced support for predicated memory operations: loads and stores now correctly handle nested conditional contexts, improving both correctness and performance for conditional memory access patterns
  • Tests

    • Expanded test coverage for complex predicated memory operations

- Introduced a new configuration option to disable loop unswitching.
- Enhanced the lowering logic to ensure that loads within predicated stores are also treated as predicated, preventing out-of-bounds memory access.
- Added a test case to verify the correct behavior of predicated loads and stores in the transformation pipeline.
- Included debug print statements to visualize the module state before and after the lowering process.

github-actions bot commented Feb 2, 2026

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai bot commented Feb 2, 2026

Walkthrough

The PR adds a new loop unswitching pass configuration flag and enhances the lower_ldg_stg transformation to support predicated load operations within predicated store contexts, along with a corresponding test to validate the behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Pass Configuration: src/op/builtin.cc | Registers a new pass config option kDisableLoopUnswitching of type Bool for controlling loop unswitching behavior. |
| Predicated Load/Store Lowering: src/transform/lower_ldg_stg.cc | Adds predicate context tracking via current_predicate_ member variable. Enables predicated load paths (LowerToLDGPredicated) for both vectorized (Ramp) and single-element loads when within a predicated context. Propagates predicate context through nested loads within predicated stores. |
| Test Coverage: testing/python/transform/test_tilelang_transform_lower_ldgstg.py | Adds new test test_predicated_store_with_load that validates predicated stores containing loads generate both predicated load (ldg128) and predicated store (stg128) intrinsics. |

Sequence Diagram

sequenceDiagram
    participant BufferLoad as BufferLoad Visitor
    participant Ctx as Predicate Context
    participant LDG as LDG Lowering
    participant Store as Predicated Store
    
    BufferLoad->>Ctx: Check current_predicate_
    alt Predicated Context Active
        Ctx-->>BufferLoad: predicate exists
        BufferLoad->>LDG: Route to LowerToLDGPredicated
        LDG-->>BufferLoad: Predicated load instruction
    else Non-Predicated Context
        Ctx-->>BufferLoad: predicate is null
        BufferLoad->>LDG: Route to LowerToLDG
        LDG-->>BufferLoad: Standard load instruction
    end
    
    Store->>Ctx: Save and set current_predicate_
    Store->>BufferLoad: Evaluate store value (with nested loads)
    BufferLoad->>Ctx: Check updated predicate context
    BufferLoad->>LDG: Nested load uses predicated path
    Ctx->>Store: Restore previous predicate context
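
The save/set/restore flow in the diagram can be sketched with a small toy rewriter. This is not the C++ code in src/transform/lower_ldg_stg.cc: the Load, Add and Store node classes, the ToyLDGSTGRewriter class, and the ldg/stg strings it emits are all hypothetical stand-ins chosen for illustration. Only the idea of a current_predicate_ member that is saved, set, and restored around a predicated store's value comes from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Load:
    buffer: str
    index: str


@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object


@dataclass(frozen=True)
class Store:
    buffer: str
    index: str
    value: object
    predicate: Optional[str] = None


class ToyLDGSTGRewriter:
    def __init__(self):
        # Plays the role of current_predicate_: None outside a predicated store.
        self.current_predicate = None

    def visit_expr(self, expr):
        if isinstance(expr, Load):
            addr = f"{expr.buffer}[{expr.index}]"
            if self.current_predicate is not None:
                # Predicated context active: route to the predicated load path.
                return f"ldg_predicated({addr}, {self.current_predicate})"
            return f"ldg({addr})"
        if isinstance(expr, Add):
            return f"({self.visit_expr(expr.lhs)} + {self.visit_expr(expr.rhs)})"
        return str(expr)

    def visit_store(self, store):
        addr = f"{store.buffer}[{store.index}]"
        if store.predicate is None:
            return f"stg({addr}, {self.visit_expr(store.value)})"
        saved = self.current_predicate           # save the previous context
        self.current_predicate = store.predicate  # enter predicated context
        try:
            value = self.visit_expr(store.value)  # nested loads see the predicate
        finally:
            self.current_predicate = saved        # restore on the way out
        return f"stg_predicated({addr}, {value}, {store.predicate})"
```

Restoring the saved value rather than resetting to None is what lets the context survive nesting: once the store has been processed, later loads fall back to whatever context was active before it.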

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 With predicates in paw, loads now dance with care,
Nested stores whisper secrets through the looping air,
Unswitching loops, predicated paths so bright,
This rabbit's transformation brings vectorized light! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main bug fix: enabling predicated loads inside predicated stores in the LowerLDGSTG pass, which is the primary focus of the file changes. |


LeiWang1999 merged commit c4748da into tile-ai:main on Feb 2, 2026
7 checks passed