Skip to content

[Analysis] Enhance NestedLoopChecker with tile op cases#1358

Merged
LeiWang1999 merged 2 commits intotile-ai:mainfrom
SiriusNEO:chaofan/fix2
Dec 1, 2025
Merged

[Analysis] Enhance NestedLoopChecker with tile op cases#1358
LeiWang1999 merged 2 commits intotile-ai:mainfrom
SiriusNEO:chaofan/fix2

Conversation

@SiriusNEO
Copy link
Collaborator

@SiriusNEO SiriusNEO commented Dec 1, 2025

This PR explicitly adds check for cases like:

for i in T.Parallel():
    T.tile_op(...)

For now we only let atomic operations like T.atomic_add and TIR operators present in T.Parallel.

Summary by CodeRabbit

  • Tests

    • Added tests validating that tiled operations inside parallel loops are rejected, including matmul/atomic patterns and reference checks.
  • Bug Fixes

    • Enforced validation to prevent TiledOp usage within parallel constructs with clearer semantic errors.
  • Chores

    • Improved debug logging for AST printing.
    • Updated internal operator routing to use the tile-op registration/pathway.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link

github-actions bot commented Dec 1, 2025

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 1, 2025

Walkthrough

Adds a semantic check that forbids TiledOp calls inside T.Parallel constructs, implements an is_tile_op detection helper and a visit_call_ visitor to raise errors in parallel contexts, updates AST printer logging, adjusts a phase file formatting, and adds tests exercising tiled ops inside parallel loops.

Changes

Cohort / File(s) Summary
Tests
testing/python/analysis/test_tilelang_nested_loop_checker.py
Adds a new test module and helper functions: matmul_with_parallel, run_gemm_tiled_op_with_parallel, tir_op_with_parallel, customize_op_with_parallel, and test_tiled_op_with_parallel. Validates that TiledOp inside T.Parallel triggers a compilation/semantic error and compares kernels against reference behavior.
Nested-loop semantic checker
tilelang/analysis/nested_loop_checker.py
Adds is_tile_op(op: Call) -> bool detection, imports Call from tvm.tir, replaces some recursive body visits with super().visit_for_(op), and implements visit_call_() to raise a semantic error when a tile-op is encountered inside a parallel loop.
AST printer
tilelang/analysis/ast_printer.py
Enhances debug logging to include the full statement representation in pre-visit logs (prints type(statement), statement).
Phase wiring / formatting
tilelang/engine/phase.py
Minor whitespace/formatting adjustment in PreLowerSemanticCheck (removed extra blank lines between checks).
Operator registration updates
src/op/operator.h, src/op/atomic_add.cc, src/op/copy.cc, src/op/fill.cc, src/op/finalize_reducer.cc, src/op/gemm.cc, src/op/gemm_py.cc, src/op/gemm_sp.cc, src/op/reduce.cc, src/op/region.cc, ...
Introduces TIR_REGISTER_TL_TILE_OP usage and changes registered op name prefixes from "tl." to "tl.tileop." across many operator registrations (atomic_add, copy, fill, finalize_reducer, gemm*, reduce, region, etc.). Replaces direct TVM registration for region with the macro-based tileop registration.
Language-level intrinsic key updates
tilelang/language/atomic.py, tilelang/language/copy.py, tilelang/language/fill.py, tilelang/language/gemm.py, tilelang/language/reduce.py, tilelang/language/experimental/gemm_sp.py, tilelang/language/utils.py
Updates tile-language intrinsics to call the new tileop keys (e.g., tl.tileop.atomicadd, tl.tileop.copy, tl.tileop.c2d_im2col, tl.tileop.fill, tl.tileop.gemm*, tl.tileop.reduce, tl.tileop.cumsum, tl.tileop.finalize_reducer, tl.tileop.region). No public signatures changed.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay attention to nested_loop_checker.py:
    • correctness of is_tile_op detection (covers all TLOpBuilder paths)
    • proper interaction between visit_for_ overrides and super() traversal to avoid skipping checks
    • error message clarity and test alignment
  • Review operator registration changes:
    • ensure op name prefix changes ("tl.""tl.tileop.") are consistent across registrations and language intrinsics
    • confirm removal of direct TVM_REGISTER_OP("tl.region") and replacement with macro is safe for consumers
  • Verify tests in testing/python/analysis/test_tilelang_nested_loop_checker.py:
    • that they correctly assert compilation/semantic failures and reference comparisons
    • kernel patterns (atomic_add/atomic semantics) align with changed intrinsic names

Possibly related PRs

Suggested reviewers

  • LeiWang1999
  • tzj-fxz

Poem

🐰 I hopped through loops both wide and narrow,
Found tileops dancing in a parallel meadow,
I thumped my foot and whispered "Not here!" —
Now TiledOps wait, outside the peer,
Happy kernels hum in ordered rows. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: enhancing NestedLoopChecker to handle tile op cases, which is a core theme across the PR's analysis improvements and operator registration updates.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tilelang/analysis/nested_loop_checker.py (1)

22-26: Potential IndexError or AttributeError if op.args is empty or first arg is not a string literal.

The function assumes op.args[0] exists and has a .value attribute. If tir.call_extern is somehow invoked with no arguments or a non-string first argument, this will raise an unhandled exception. Consider adding a guard:

 def is_atomic_op(op: Call) -> bool:
     """Check if a call is an atomic operation."""
-
-    return op.op == Op.get("tir.call_extern") and op.args[0].value.startswith("Atomic")
+    if op.op != Op.get("tir.call_extern"):
+        return False
+    if len(op.args) == 0:
+        return False
+    first_arg = op.args[0]
+    return hasattr(first_arg, "value") and isinstance(first_arg.value, str) and first_arg.value.startswith("Atomic")
testing/python/analysis/test_tilelang_nested_loop_checker.py (1)

604-678: Consider extracting shared test setup to reduce duplication.

This function duplicates significant logic from run_gemm_mixed_pp (lines 472-546), including the reference program and compilation setup. Consider extracting shared setup into a helper function.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6a19fb and 74b7bfa.

📒 Files selected for processing (4)
  • testing/python/analysis/test_tilelang_nested_loop_checker.py (1 hunks)
  • tilelang/analysis/ast_printer.py (1 hunks)
  • tilelang/analysis/nested_loop_checker.py (4 hunks)
  • tilelang/engine/phase.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tilelang/engine/phase.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-12T09:47:46.474Z
Learnt from: kurisu6912
Repo: tile-ai/tilelang PR: 794
File: tilelang/transform/add_bufstore_wrapper.py:30-33
Timestamp: 2025-09-12T09:47:46.474Z
Learning: In TVM's PyStmtExprMutator, visit_block_ methods typically call super().visit_block_(op) to process child nodes and update internal state, but return the original op when the block itself doesn't need transformation. The pattern `return op` is correct for blocks that serve as containers where mutations happen at deeper levels.

Applied to files:

  • tilelang/analysis/nested_loop_checker.py
📚 Learning: 2025-11-14T07:56:11.098Z
Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

  • testing/python/analysis/test_tilelang_nested_loop_checker.py
🪛 Ruff (0.14.6)
tilelang/analysis/nested_loop_checker.py

63-66: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Quick Lint
🔇 Additional comments (7)
tilelang/analysis/ast_printer.py (1)

17-17: LGTM! Enhanced debug output.

The additional statement representation improves debugging visibility during AST traversal.

tilelang/analysis/nested_loop_checker.py (2)

50-52: LGTM! Correct use of super().visit_for_(op).

Using the base class traversal ensures child nodes (including Call nodes) are visited, which is necessary for the new visit_call_ hook to function correctly. This follows TVM's PyStmtExprVisitor patterns.


61-66: Verify whether TIR built-in operations like T.max reach visit_call_ and whether they are correctly classified by is_atomic_op.

The visit_call_ method rejects all non-atomic calls inside a parallel context by checking is_atomic_op(op). However, the test tir_op_with_parallel reportedly uses T.max inside a parallel loop. If T.max produces a Call node and is_atomic_op returns False for it, this would raise an exception. The fact that the test passes suggests either TIR operations bypass this visitor, use different node types, or is_atomic_op handles them differently than the current logic implies.

Additionally, the error message states "Only elementwise operations are allowed" but the check only permits atomic operations—clarify whether these terms are intentionally distinct or if the message should be updated.

testing/python/analysis/test_tilelang_nested_loop_checker.py (4)

597-598: Good test case for the new restriction.

This correctly tests the scenario where a tile operation (T.gemm) is placed inside a T.Parallel(1) block, which should trigger the new semantic check.


681-694: LGTM! Good coverage for TIR operators inside parallel.

This test correctly verifies that TIR built-in operations like T.max are permitted inside parallel loops, distinguishing them from tile operations.


697-711: LGTM! Good coverage for atomic operations inside parallel.

This test correctly verifies that atomic operations (T.atomic_add) are permitted inside parallel loops, as expected by the new checker logic.


714-723: Comprehensive test coverage for the new tile op restriction.

The test function exercises all three key scenarios: rejection of tile operations in parallel, acceptance of TIR built-ins, and acceptance of atomic operations.

def is_atomic_op(op: Call) -> bool:
"""Check if a call is an atomic operation."""

return op.op == Op.get("tir.call_extern") and op.args[0].value.startswith("Atomic")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit tricky, we can collect the definition of tile intrins, let me take a look

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tilelang/language/reduce.py (1)

245-257: Update cumsum docstring to match the new tl.tileop.cumsum intrinsic

The implementation now emits tir.op.Op.get("tl.tileop.cumsum"), but the docstring still says it emits the tl.cumsum intrinsic, which is misleading.

Suggest updating the docstring to reference the new intrinsic name:

-    Negative `dim` indices are normalized (Python-style). If `dst` is None, the operation is performed in-place into `src`. Raises ValueError when `dim` is out of bounds for `src.shape`. When `src.scope() == "local.fragment"`, this delegates to `cumsum_fragment`; otherwise it emits the `tl.cumsum` intrinsic.
+    Negative `dim` indices are normalized (Python-style). If `dst` is None, the operation is performed in-place into `src`. Raises ValueError when `dim` is out of bounds for `src.shape`. When `src.scope() == "local.fragment"`, this delegates to `cumsum_fragment`; otherwise it emits the `tl.tileop.cumsum` intrinsic.

Also applies to: 260-310

🧹 Nitpick comments (1)
tilelang/analysis/nested_loop_checker.py (1)

60-65: Clarify the error message to accurately reflect allowed operations.

The error message states "Only elementwise operations are allowed inside a parallel loop," but according to the PR summary, atomic operations should also be permitted within T.Parallel. If atomic operations are indeed allowed, the message should be updated to reflect this.

Consider updating the error message to be more specific:

     def visit_call_(self, op: Call) -> None:
         if self.in_parallel_context and is_tile_op(op):
             raise ValueError("[Tilelang Semantic Check] "
-                             "Only elementwise operations are allowed inside a parallel loop. " \
+                             "Tile operations (except atomic operations) are not allowed inside a parallel loop. " \
                              f"Got a tile-op \"{op.op}\"."
                              )

Note: This suggestion assumes atomic operations should be excluded from the check. Please verify based on the outcome of the verification script in the previous comment.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74b7bfa and bb54a40.

📒 Files selected for processing (18)
  • src/op/atomic_add.cc (1 hunks)
  • src/op/copy.cc (2 hunks)
  • src/op/fill.cc (1 hunks)
  • src/op/finalize_reducer.cc (1 hunks)
  • src/op/gemm.cc (1 hunks)
  • src/op/gemm_py.cc (1 hunks)
  • src/op/gemm_sp.cc (1 hunks)
  • src/op/operator.h (1 hunks)
  • src/op/reduce.cc (2 hunks)
  • src/op/region.cc (1 hunks)
  • tilelang/analysis/nested_loop_checker.py (4 hunks)
  • tilelang/language/atomic.py (1 hunks)
  • tilelang/language/copy.py (2 hunks)
  • tilelang/language/experimental/gemm_sp.py (1 hunks)
  • tilelang/language/fill.py (1 hunks)
  • tilelang/language/gemm.py (2 hunks)
  • tilelang/language/reduce.py (9 hunks)
  • tilelang/language/utils.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-14T07:56:11.098Z
Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

  • src/op/gemm_py.cc
📚 Learning: 2025-09-12T09:47:46.474Z
Learnt from: kurisu6912
Repo: tile-ai/tilelang PR: 794
File: tilelang/transform/add_bufstore_wrapper.py:30-33
Timestamp: 2025-09-12T09:47:46.474Z
Learning: In TVM's PyStmtExprMutator, visit_block_ methods typically call super().visit_block_(op) to process child nodes and update internal state, but return the original op when the block itself doesn't need transformation. The pattern `return op` is correct for blocks that serve as containers where mutations happen at deeper levels.

Applied to files:

  • tilelang/analysis/nested_loop_checker.py
🧬 Code graph analysis (11)
tilelang/language/utils.py (1)
tilelang/language/tir/op.py (1)
  • call_intrin (120-145)
tilelang/language/copy.py (1)
tilelang/language/tir/op.py (1)
  • call_intrin (120-145)
src/op/finalize_reducer.cc (2)
tilelang/ir.py (1)
  • FinalizeReducerOp (53-54)
tilelang/language/reduce.py (1)
  • finalize_reducer (313-330)
src/op/gemm_sp.cc (2)
tilelang/ir.py (1)
  • GemmSP (48-49)
tilelang/language/experimental/gemm_sp.py (1)
  • gemm_sp (9-87)
src/op/copy.cc (2)
tilelang/ir.py (2)
  • Copy (20-21)
  • Conv2DIm2ColOp (25-26)
tilelang/language/copy.py (2)
  • copy (13-94)
  • c2d_im2col (97-128)
src/op/fill.cc (3)
src/op/fill.h (1)
  • Fill (45-50)
tilelang/ir.py (1)
  • Fill (10-11)
tilelang/language/fill.py (1)
  • fill (8-36)
src/op/region.cc (1)
tilelang/language/utils.py (1)
  • region (7-10)
tilelang/language/fill.py (1)
tilelang/language/tir/op.py (1)
  • call_intrin (120-145)
src/op/gemm_py.cc (1)
tilelang/tileop/gemm/__init__.py (1)
  • GemmPy (56-192)
tilelang/language/atomic.py (1)
tilelang/language/tir/op.py (1)
  • call_intrin (120-145)
src/op/atomic_add.cc (1)
tilelang/ir.py (1)
  • AtomicAdd (15-16)
🪛 Ruff (0.14.6)
tilelang/analysis/nested_loop_checker.py

62-65: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
🔇 Additional comments (25)
tilelang/analysis/nested_loop_checker.py (3)

4-4: LGTM!

The Call import is correctly added to support the new is_tile_op function and visit_call_ method.


34-58: LGTM!

The switch from manual traversal (self.visit_stmt(child) and self.visit_stmt(op.body)) to super().visit_for_(op) is a good improvement. This ensures all nested statements are properly visited, making the nested loop checking more thorough and less prone to missing edge cases.


21-24: Verify that atomic operations are not incorrectly flagged as tile-ops.

The PR summary states that "only atomic operations (such as T.atomic_add) and TIR operators" are permitted within T.Parallel. Since atomic operations now use the tl.tileop.* namespace (e.g., tl.tileop.atomicadd), they may have the "TLOpBuilder" attribute set and be incorrectly blocked by this check. This requires verification by examining the atomic operation registrations to confirm whether they set the "TLOpBuilder" attribute, and if so, whether is_tile_op() should explicitly exclude them.

tilelang/language/utils.py (1)

10-10: LGTM!

The namespace migration from tl.region to tl.tileop.region aligns with the broader refactoring across the codebase to standardize tile operations under the tl.tileop.* namespace.

tilelang/language/gemm.py (2)

119-119: LGTM!

Namespace migration from tl.gemm to tl.tileop.gemm is consistent with the PR's objective to standardize tile operations.


148-148: LGTM!

Namespace migration from tl.gemm_py to tl.tileop.gemm_py is consistent with the PR's objective to standardize tile operations.

tilelang/language/experimental/gemm_sp.py (1)

73-73: LGTM!

Namespace migration from tl.gemm_sp to tl.tileop.gemm_sp is consistent with the broader refactoring to standardize tile operations.

src/op/reduce.cc (2)

481-484: LGTM!

The registration macro change from TIR_REGISTER_TL_OP to TIR_REGISTER_TL_TILE_OP aligns with the PR's migration to tile-based operator registrations. The inputs and attributes remain unchanged.


566-569: LGTM!

The registration macro change from TIR_REGISTER_TL_OP to TIR_REGISTER_TL_TILE_OP for CumSumOp is consistent with the broader migration to tile-based operator registrations.

tilelang/language/fill.py (1)

35-36: LGTM!

Namespace migration from tl.fill to tl.tileop.fill aligns with the PR's standardization of tile operations under the tl.tileop.* namespace.

src/op/fill.cc (1)

212-215: LGTM!

The registration macro change from TIR_REGISTER_TL_OP to TIR_REGISTER_TL_TILE_OP is consistent with the broader migration to tile-based operator registrations across the codebase.

tilelang/language/atomic.py (1)

214-218: LGTM, but note potential interaction with nested loop checker.

The namespace migration from tl.atomicadd to tl.tileop.atomicadd is consistent with the broader refactoring. However, this change is relevant to the concern raised in tilelang/analysis/nested_loop_checker.py: if tile-region-based atomic operations now have the "TLOpBuilder" attribute (due to being registered as tile ops), they may be incorrectly blocked inside T.Parallel loops, despite the PR summary stating that atomic operations should be permitted.

This approval is contingent on verifying that the nested loop checker properly handles atomic operations (see verification request in nested_loop_checker.py review).

src/op/gemm_sp.cc (1)

305-308: LGTM! Operator registration macro updated consistently.

The registration macro change from TIR_REGISTER_TL_OP to TIR_REGISTER_TL_TILE_OP correctly aligns with the namespace migration to tl.tileop.*. The operator's inputs and attributes remain unchanged, maintaining backward compatibility at the functional level.

src/op/gemm.cc (1)

829-832: LGTM! Registration macro updated consistently.

The operator registration correctly uses the new TIR_REGISTER_TL_TILE_OP macro, migrating from tl.gemm to tl.tileop.gemm namespace while preserving all operator attributes.

src/op/gemm_py.cc (1)

321-324: LGTM! GemmPy registration updated correctly.

The registration macro change maintains consistency with the broader tile-op namespace migration. All operator attributes remain unchanged.

src/op/region.cc (1)

79-82: LGTM! RegionOp registration migrated to tile-op namespace.

The macro change correctly registers RegionOp under tl.tileop.region, consistent with the project-wide namespace migration. The operator's configuration (num_inputs=-1, TCallEffectKind=kPure) remains intact.

src/op/finalize_reducer.cc (1)

162-165: LGTM! FinalizeReducerOp registration updated.

The operator registration correctly migrates to the tile-op namespace while preserving all attributes (num_inputs=1, TCallEffectKind=kOpaque).

src/op/copy.cc (2)

2040-2043: LGTM! Copy operator registration updated.

Registration correctly uses the new tile-op macro and namespace.


2065-2068: LGTM! Conv2DIm2ColOp registration updated.

Consistent macro migration for the im2col operator, maintaining all attributes.

tilelang/language/copy.py (2)

93-94: LGTM! Python namespace updated to match C++ registration.

The copy intrinsic call correctly references tl.tileop.copy, aligning with the C++ operator registration namespace.


127-128: LGTM! Conv2D im2col namespace updated.

The c2d_im2col intrinsic call correctly uses the new tl.tileop.c2d_im2col namespace, maintaining consistency with the C++ backend.

src/op/operator.h (1)

80-88: Key infrastructure change: macro renamed and namespace migrated.

This macro definition change is the foundation for the tile-op categorization:

  • Macro renamed: TIR_REGISTER_TL_OPTIR_REGISTER_TL_TILE_OP
  • Namespace migrated: "tl.""tl.tileop."

The change enables the NestedLoopChecker and other analysis passes to distinguish tile operations from other TL operations.

Verify that all references to the old "tl.{op}" namespace have been migrated to "tl.tileop.{op}" throughout the codebase:

  • Search for remaining TIR_REGISTER_TL_OP macro usages (old macro name)
  • Search for Op::Get("tl.[^t]...) patterns in C++ (old namespace without "tileop")
  • Search for old namespace references in Python files (Op.get and tir.op.Op.get calls)
tilelang/language/reduce.py (2)

16-16: Centralizing the reduce intrinsic key via _REDUCE_OP_KEY looks good

Using a single _REDUCE_OP_KEY = "tl.tileop.reduce" and wiring all tir.call_intrin sites through tir.op.Op.get(_REDUCE_OP_KEY) is consistent and reduces the risk of string mismatches across the various reduce buffer-scope combinations. No functional issues spotted.

Also applies to: 56-62, 71-77, 84-90, 95-101


313-330: Finalize-reducer intrinsic rename is consistent and self‑documenting

The docstring and implementation both now target tl.tileop.finalize_reducer via tir.op.Op.get("tl.tileop.finalize_reducer"), aligning with the rest of the tileop-prefixed intrinsics and keeping the Python API unchanged. This looks correct.

src/op/atomic_add.cc (1)

542-545: Registration macro migration to TIR_REGISTER_TL_TILE_OP is consistent

Switching AtomicAdd to TIR_REGISTER_TL_TILE_OP(AtomicAdd, atomicadd) while keeping set_num_inputs(2) and the opaque call effect aligns this op with the new tl.tileop.* registration scheme without altering semantics. No issues from this change alone.

@LeiWang1999 LeiWang1999 merged commit b10ef75 into tile-ai:main Dec 1, 2025
6 checks passed
SigureMo pushed a commit to PFCCLab/tilelang-paddle that referenced this pull request Dec 5, 2025
* [Analysis] Enhance NestedLoopChecker with tile op cases

* fix tileop issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants