[Refactor] Backup Analyzer to get the appropriate arith informations #1311

LeiWang1999 · 2025-11-21T13:10:19Z

This pull request introduces several improvements and refactorings to the loop vectorization infrastructure, with a focus on making vectorization analysis more robust and analyzer-aware. The most significant changes are the extension of vectorization-related functions to accept an arith::Analyzer, updates to the vectorization planning logic, and corresponding changes throughout the codebase to utilize the enhanced APIs.

Key changes include:

Vectorization API and Infrastructure Improvements

Added overloads for GetVectorizeSize and VectorizeLoop to accept an arith::Analyzer*, enabling more accurate and context-aware vectorization analysis. (src/transform/loop_vectorize.h, src/transform/loop_vectorize.cc) [1] [2] [3]
Refactored VectorizePlanner and related classes to use IRMutatorWithAnalyzer instead of IRVisitorWithAnalyzer, allowing for mutating analysis and better integration with analyzers. (src/transform/loop_vectorize.cc) [1] [2] [3] [4]
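
The overload pattern described above can be sketched in isolation. The snippet below is a minimal illustration only, not TileLang's code: `ToyAnalyzer`, `ToyLoop`, and the lane-selection rule are hypothetical stand-ins for `arith::Analyzer`, `For`, and the real planner. It shows how a convenience overload can delegate to an analyzer-accepting overload, and how caller-supplied facts can change the result.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for arith::Analyzer: it only remembers that some
// symbolic variables are known to be multiples of a constant.
struct ToyAnalyzer {
  std::map<std::string, int> known_multiple_of;

  void Bind(const std::string &var, int factor) { known_multiple_of[var] = factor; }

  // Returns 1 when nothing is known about `var`.
  int MultipleOf(const std::string &var) const {
    auto it = known_multiple_of.find(var);
    return it == known_multiple_of.end() ? 1 : it->second;
  }
};

// Hypothetical stand-in for a loop that accesses buf[offset_var + i]
// for i in [0, extent).
struct ToyLoop {
  int extent;
  std::string offset_var;
};

// Analyzer-aware overload: the caller supplies the analysis context.
// Picks the widest power-of-two lane count that divides both the loop
// extent and the known alignment of the symbolic offset.
int GetVectorizeSize(const ToyLoop &loop, const ToyAnalyzer *analyzer,
                     int max_lanes = 8) {
  int lanes = max_lanes;
  int align = analyzer->MultipleOf(loop.offset_var);
  while (lanes > 1 && (loop.extent % lanes != 0 || align % lanes != 0)) {
    lanes /= 2;
  }
  return lanes;
}

// Convenience overload: falls back to an empty, stack-allocated analyzer.
int GetVectorizeSize(const ToyLoop &loop) {
  ToyAnalyzer analyzer;
  return GetVectorizeSize(loop, &analyzer);
}
```

In this toy model, a loop with extent 16 and an unknown offset falls back to scalar access, while the same loop vectorizes to 8 lanes once the caller's analyzer knows the offset is a multiple of 8.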

Propagation of Analyzer Context

Updated all call sites of VectorizeLoop and GetVectorizeSize to pass the appropriate analyzer argument, ensuring that vectorization decisions are made with full context. This includes changes in copy, fill, and parallel op lowering, as well as in layout inference and legalization passes. (src/op/copy.cc, src/op/fill.cc, src/op/parallel.cc, src/transform/layout_inference.cc, src/transform/legalize_vectorized_loop.cc) [1] [2] [3] [4] [5] [6] [7]

Layout Inference Enhancements

Modified BufferUseDefCollector in layout inference to maintain a vector of analyzers (analyzer_vec_), ensuring that each inference step uses the correct analyzer context. This includes proper cloning, size checks, and passing analyzers to InferLayout. (src/transform/layout_inference.cc) [1] [2] [3] [4] [5] [6]
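
A rough sketch of the per-inference analyzer idea follows. It assumes the collector snapshots its current analyzer each time it records an inference step; `ToyAnalyzer`, `ToyCollector`, `Clone`, `EnterScope`, and `AnalyzerFor` are hypothetical names chosen for illustration and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for arith::Analyzer: variable -> exclusive upper bound.
struct ToyAnalyzer {
  std::map<std::string, int> upper_bound;

  // Deep copy, so later scope changes do not affect earlier snapshots.
  std::unique_ptr<ToyAnalyzer> Clone() const {
    return std::make_unique<ToyAnalyzer>(*this);
  }
};

class ToyCollector {
 public:
  void EnterScope(const std::string &var, int bound) {
    current_.upper_bound[var] = bound;
  }
  void ExitScope(const std::string &var) { current_.upper_bound.erase(var); }

  // Record an inference step together with a snapshot of the current context.
  void AddInferStep(const std::string &op_name) {
    infer_list_.push_back(op_name);
    analyzer_vec_.push_back(current_.Clone());
  }

  // Look up the analyzer that was live when step `infer_id` was recorded.
  const ToyAnalyzer &AnalyzerFor(std::size_t infer_id) const {
    // Size invariant: exactly one analyzer per inference step.
    if (analyzer_vec_.size() != infer_list_.size()) {
      throw std::logic_error("analyzer_vec_ and infer_list_ are out of sync");
    }
    return *analyzer_vec_.at(infer_id);
  }

 private:
  ToyAnalyzer current_;
  std::vector<std::string> infer_list_;
  std::vector<std::unique_ptr<ToyAnalyzer>> analyzer_vec_;
};
```

The point of the snapshot is that a step recorded inside a loop scope keeps the loop variable's bound even after the walk has left that scope.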

Vectorization Analysis Logic

Improved the logic for determining if indices can be vectorized by simplifying expressions before divisibility checks, leading to more robust vectorization decisions. (src/transform/loop_vectorize.cc) [1] [2]
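
Why simplification before the divisibility check can help is easiest to see on a toy expression type. The sketch below assumes a linear index made of (coefficient, symbol) terms; `LinearExpr`, `SubstituteZero`, `Simplify`, and `CanProveDivisible` are hypothetical helpers that only mimic the shape of the real check, not TVM's `Substitute`, `Simplify`, or `CanProveEqual`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A linear expression as (coefficient, symbol) terms; the empty symbol ""
// denotes the constant term.
using Term = std::pair<long, std::string>;
using LinearExpr = std::vector<Term>;

// Substitute `var := 0`, which drops every term that mentions `var`.
LinearExpr SubstituteZero(const LinearExpr &expr, const std::string &var) {
  LinearExpr out;
  for (const Term &t : expr) {
    if (t.second != var) out.push_back(t);
  }
  return out;
}

// Merge terms that share a symbol and drop terms whose coefficient is zero.
LinearExpr Simplify(const LinearExpr &expr) {
  std::map<std::string, long> merged;
  for (const Term &t : expr) merged[t.second] += t.first;
  LinearExpr out;
  for (const auto &kv : merged) {
    if (kv.second != 0) out.emplace_back(kv.second, kv.first);
  }
  return out;
}

// Conservative proof: if every coefficient is a multiple of `lanes`, the
// expression is a multiple of `lanes` for all integer symbol values.
bool CanProveDivisible(const LinearExpr &expr, long lanes) {
  for (const Term &t : expr) {
    if (t.first % lanes != 0) return false;
  }
  return true;
}

// Base-offset check: substitute the loop variable with zero, simplify, then
// test divisibility of what remains.
bool BaseOffsetIsAligned(const LinearExpr &index, const std::string &loop_var,
                         long lanes) {
  return CanProveDivisible(Simplify(SubstituteZero(index, loop_var)), lanes);
}
```

For the index `3*n + 5*n + i + 16` with 8 lanes, the term-by-term check fails on the unsimplified base offset, but succeeds once the two `n` terms are merged into `8*n`.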

Dependency Updates

Updated the TVM submodule to a newer commit, pulling in upstream improvements and fixes. (3rdparty/tvm)

Summary by CodeRabbit

Refactor
- Loop vectorization and planning are now analyzer-aware, improving vector-size decisions, nested-loop handling, and consistency of transformations across inference runs.
- Layout inference now maintains per-inference analyzer state to preserve analysis context during inference and transformations.
- Vectorization APIs extended with analyzer-aware variants to enable finer-grained optimization control.
Chores
- Updated third-party submodule reference.

- Modified `VectorizeLoop` and related functions to accept an `arith::Analyzer` parameter, enhancing their capability to perform analysis during vectorization.
- Updated multiple instances in `copy.cc`, `fill.cc`, `parallel.cc`, and layout inference files to utilize the new analyzer parameter for improved performance and correctness.
- Ensured consistency across vectorization logic by integrating the analyzer into existing workflows, facilitating better optimization opportunities.

…e_1121

github-actions · 2025-11-21T13:10:30Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai · 2025-11-21T13:10:35Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Per-inference arith::Analyzer instances are added and threaded through layout inference and lowering; vectorization APIs and planner were refactored to accept analyzer pointers; multiple call sites and the TVM submodule pointer were updated.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Submodule update: `3rdparty/tvm` | Submodule pointer advanced from `bc31e7ad9f9fafd7659dfabafe359fd55a0ffc1e` to `cd2b2b6013d155b5822300b0a0740fa65320dd9e`. |
| Lowering call-site updates: `src/op/copy.cc`, `src/op/fill.cc`, `src/transform/legalize_vectorized_loop.cc` | Updated `VectorizeLoop` call sites to pass an additional `arith::Analyzer*` argument. |
| Parallel/vectorize size usage: `src/op/parallel.cc` | `GetVectorizeSize` call sites changed to pass `T.analyzer` (threaded analyzer into vector-size computation). |
| Layout inference analyzer management: `src/transform/layout_inference.cc` | Added per-infer `analyzer_vec_` (vector of `unique_ptr<arith::Analyzer>`), clone/push during visitation, thread `cur_analyzer` into `InferLayout`/`RunInferStep`, and added runtime size validation. |
| Vectorization refactor & API: `src/transform/loop_vectorize.cc`, `src/transform/loop_vectorize.h` | Refactored planner/visitors to analyzer-aware mutator/visitor, added overloads `GetVectorizeSize(const For&, arith::Analyzer*)` and `VectorizeLoop(const For&, arith::Analyzer*, int)`, adjusted visitor/mutator return types, and added analyzer ownership/constructors. |

Sequence Diagram

sequenceDiagram
    autonumber
    participant LI as LayoutInference
    participant AV as analyzer_vec_
    participant RI as RunInferStep
    participant VV as VectorizePlanner / VectorizeLoop

    LI->>AV: clone current analyzer and push per-infer
    LI->>RI: RunInferStep(cur_infer_id)
    RI->>AV: retrieve analyzer_vec_[cur_infer_id]
    AV-->>RI: cur_analyzer
    RI->>VV: VectorizeLoop(for_node, cur_analyzer)
    VV->>VV: use analyzer for GetVectorizeSize & expr simplification
    VV-->>RI: return transformed loop (vectorized or original)

    note right of AV: assert analyzer_vec_.size() == infer_list_.size()

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pay extra attention to:
- src/transform/loop_vectorize.cc / .h (visitor → mutator refactor, new overloads, analyzer ownership)
- src/transform/layout_inference.cc (analyzer lifecycle, cloning, indexing and assertion)
- All updated call sites (src/op/copy.cc, src/op/fill.cc, src/op/parallel.cc, src/transform/legalize_vectorized_loop.cc) for consistent analyzer passing

Possibly related PRs

[Bugfix][Enhancement] Fix a bug in previous commit and enhance cuda backend #887 — modifies loop_vectorize implementation (planner/visitor changes) similar to this PR.
[TMA] Automatically lower 1d tma in appropriate cases #788 — threads arith::Analyzer through lowering/inference and updates related call sites.
[Bugfix] Resolve mixed stride dtype issue (inconsistent int32/int64 values) #1119 — related edits to loop_vectorize (functions like VectorizeLoop/GetVectorizeSize/VectorizePlanner).

Suggested reviewers

chengyupku
xysmlx

Poem

🐰 I hopped through loops and cloned a mind,

Per-infer analyzers now I find.
Mutators hum where visitors played,
Vector lanes carved with care and braid.
Hop, simplify, and ship — carrot-lined! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title references backing up/propagating analyzer for arithmetic information, which directly aligns with the main refactoring objective of making vectorization analysis analyzer-aware. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab44b74 and 6221326.

📒 Files selected for processing (1)

src/transform/loop_vectorize.cc (8 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-09-12T09:47:46.474Z

Learnt from: kurisu6912
Repo: tile-ai/tilelang PR: 794
File: tilelang/transform/add_bufstore_wrapper.py:30-33
Timestamp: 2025-09-12T09:47:46.474Z
Learning: In TVM's PyStmtExprMutator, visit_block_ methods typically call super().visit_block_(op) to process child nodes and update internal state, but return the original op when the block itself doesn't need transformation. The pattern `return op` is correct for blocks that serve as containers where mutations happen at deeper levels.

Applied to files:

src/transform/loop_vectorize.cc

🧬 Code graph analysis (1)

src/transform/loop_vectorize.cc (3)

src/transform/common/loop_fusion_utils.h (2)

VisitStmt_ (61-67)

VisitStmt_ (104-222)

src/transform/loop_partition.cc (8)

node (41-48)

node (41-41)

node (49-56)

node (49-49)

node (171-184)

node (171-171)

node (234-242)

node (234-234)

src/transform/atomicadd_vectorize.cc (2)

UpdateVectorSize (111-151)

UpdateVectorSize (111-112)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Quick Lint

🔇 Additional comments (3)

src/transform/loop_vectorize.cc (3)

100-116: Nested loop detection logic is now correct.

The refactored nested loop detection visits only node->body (line 100), which correctly excludes the current ForNode itself. The conditional GCD adjustment (lines 106-116) will now properly run for true innermost loops.
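
As a rough illustration of why the adjustment matters for innermost loops, the sketch below assumes it amounts to taking the GCD of the planned vector size and a constant loop extent. `ClampVectorSizeToExtent` is a hypothetical helper written for this note, not a function from `loop_vectorize.cc`, and the real logic may handle symbolic extents differently.

```cpp
#include <cassert>
#include <numeric>  // std::gcd (C++17)

// Hypothetical helper: shrink the planned vector size so that it divides the
// loop extent. Loops that still contain nested loops are left untouched,
// because only the innermost loop is a vectorization candidate.
int ClampVectorSizeToExtent(int vector_size, int extent,
                            bool contains_nested_for) {
  if (contains_nested_for) {
    return vector_size;
  }
  // std::gcd(0, n) == n, so an unset (zero) vector size adopts the extent.
  return std::gcd(vector_size, extent);
}
```

With a planned size of 8 and an extent of 12, the clamped size is 4, so `extent % vector_size == 0` holds afterwards. If nested-loop detection wrongly reports `true` for every loop, the clamp never runs and the size stays at 8, which does not divide 12.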

247-254: Analyzer ownership pattern is safe and well-designed.

The two overloads provide flexibility: one creates a local stack-allocated analyzer for convenience (lines 247-250), while the other accepts an external analyzer for context-aware analysis (lines 252-254). The stack allocation ensures the analyzer outlives the planner usage without leaks.

325-347: VectorizeLoop overloads follow consistent pattern.

Both overloads correctly handle analyzer ownership and provide appropriate defaults when vectorize_hint <= 0. The implementation is consistent with the GetVectorizeSize overload pattern.

Copilot

Pull Request Overview

This pull request refactors the loop vectorization infrastructure to make vectorization analysis more robust and analyzer-aware. The changes enable vectorization functions to accept an arith::Analyzer parameter, allowing for more accurate context-aware vectorization decisions based on arithmetic information available outside the vectorization pass.

- Extends GetVectorizeSize and VectorizeLoop to accept an optional arith::Analyzer* parameter for context-aware analysis
- Refactors VectorizePlanner to use IRMutatorWithAnalyzer instead of IRVisitorWithAnalyzer to support mutating analysis
- Propagates analyzer context throughout the codebase to all vectorization call sites

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/transform/loop_vectorize.h | Adds overloaded function declarations for `GetVectorizeSize` and `VectorizeLoop` that accept an `arith::Analyzer*` parameter |
| src/transform/loop_vectorize.cc | Refactors `VectorizePlanner` to inherit from `IRMutatorWithAnalyzer` and implements analyzer-aware vectorization logic with improved expression simplification |
| src/transform/legalize_vectorized_loop.cc | Updates `VectorizeLoop` call to pass the analyzer context |
| src/transform/layout_inference.cc | Maintains a vector of analyzers and propagates appropriate analyzer to `InferLayout` and `VectorizeLoop` calls |
| src/op/parallel.cc | Updates `GetVectorizeSize` call to pass the analyzer from layout inference arguments |
| src/op/fill.cc | Updates all `VectorizeLoop` calls to pass the analyzer parameter |
| src/op/copy.cc | Updates `VectorizeLoop` calls to pass the analyzer parameter |
| 3rdparty/tvm | Updates TVM submodule to newer commit |

Copilot · 2025-11-21T13:11:40Z

src/transform/loop_vectorize.cc

+    bool contains_nested_for = false;
+    // Must analysis vectorization on the innermost loop
+    PostOrderVisit(Downcast<Stmt>(node), [&](const ObjectRef &obj) {
+      if (obj.as<ForNode>()) {


The logic for detecting nested loops incorrectly includes the current ForNode itself. PostOrderVisit will visit node and set contains_nested_for = true even when there are no nested loops. This causes the vectorization analysis to be skipped for all loops. Consider checking if the visited ForNode is different from node before setting the flag.

Suggested change

-      if (obj.as<ForNode>()) {
+      const ForNode* for_node = obj.as<ForNode>();
+      if (for_node && for_node != node) {
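
The pitfall described in this comment can be reproduced on a toy tree. The following is a minimal sketch under stated assumptions: `ToyStmt` and this `PostOrderVisit` are simplified stand-ins written for illustration, not TVM's `Stmt` or `tir::PostOrderVisit`. The only property relied on is that a post-order traversal started at a node also visits that node.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Simplified statement node; for a loop, `children` form its body.
struct ToyStmt {
  bool is_for = false;
  std::vector<std::shared_ptr<ToyStmt>> children;
};

// Post-order traversal: visit all children first, then the node itself.
void PostOrderVisit(const ToyStmt &node,
                    const std::function<void(const ToyStmt &)> &fvisit) {
  for (const auto &child : node.children) PostOrderVisit(*child, fvisit);
  fvisit(node);
}

// Buggy variant: the traversal starts at the loop itself, so the flag is set
// for every loop, including innermost ones.
bool ContainsNestedForBuggy(const ToyStmt &loop) {
  bool found = false;
  PostOrderVisit(loop, [&](const ToyStmt &s) {
    if (s.is_for) found = true;
  });
  return found;
}

// Fixed variant: ignore the starting loop by comparing node identity.
bool ContainsNestedFor(const ToyStmt &loop) {
  bool found = false;
  PostOrderVisit(loop, [&](const ToyStmt &s) {
    if (s.is_for && &s != &loop) found = true;
  });
  return found;
}
```

Starting the traversal at the loop body instead of the loop is an equivalent way to exclude the current node.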

Copilot · 2025-11-21T13:11:40Z

src/transform/loop_vectorize.cc

-  if (!analyzer->CanProveEqual(
-          FloorMod(Substitute(expr, {{var, zero}}), target_size_for_expr), 0)) {
+  if (!analyzer->CanProveEqual(FloorMod(simplified_expr, target_size_for_expr),
+                               zero)) {


Comparing FloorMod(simplified_expr, target_size_for_expr) against zero is incorrect. The result of FloorMod has the same dtype as simplified_expr (i.e., expr.dtype()), but zero has type var.dtype(). These types may differ, causing the comparison to fail. Use make_const(expr.dtype(), 0) instead of zero.

Suggested change

-                               zero)) {
+                               make_const(expr.dtype(), 0))) {

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

src/op/parallel.cc (1)

453-491: Free-mode vector-size planning now analyzer-aware

Switching to GetVectorizeSize(maybe_remapped_root_, T.analyzer) lets ParallelOpNode::InferLayout reuse the per-infer analyzer state, aligning vectorization planning with layout inference. Assuming LayoutInferArgs::analyzer is always populated (as done in BufferUseDefCollector::RunInferStep), this is a sound change. You might optionally add an ICHECK(T.analyzer) near the top of InferLayout for robustness.

src/transform/loop_vectorize.cc (1)

170-199: Analyzer-aware helpers and improved base-offset check in vectorization are solid

UpdateVectorSize now correctly passes analyzer_ into CanProveIndependent and IndiceCanVectorize, matching the new signatures.

In IndiceCanVectorize, simplifying Substitute(expr, {{var, zero}}) before checking FloorMod(..., target_size_for_expr) is a good robustness improvement: it tests divisibility of the base offset rather than the full symbolic expression, which is often unprovable.

The analyzer-aware overloads of GetVectorizeSize and VectorizeLoop are thin wrappers that cleanly reuse VectorizePlanner(analyzer) and the existing VectorizeRewriter, preserving behavior while allowing callers to thread in their own analyzers.

Once the planner’s analyzer ownership and innermost-loop detection are fixed, this analyzer plumbing should make vectorization decisions more stable and context-aware.

Also applies to: 272-321, 248-252, 334-344

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf90a5f and d9560ea.

📒 Files selected for processing (8)

3rdparty/tvm (1 hunks)
src/op/copy.cc (2 hunks)
src/op/fill.cc (2 hunks)
src/op/parallel.cc (1 hunks)
src/transform/layout_inference.cc (8 hunks)
src/transform/legalize_vectorized_loop.cc (1 hunks)
src/transform/loop_vectorize.cc (8 hunks)
src/transform/loop_vectorize.h (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (7)

src/transform/legalize_vectorized_loop.cc (2)

src/transform/loop_vectorize.cc (4)

VectorizeLoop (323-332)

VectorizeLoop (323-323)

VectorizeLoop (334-344)

VectorizeLoop (334-335)

src/transform/vectorize_loop.cc (2)

VectorizeLoop (867-879)

VectorizeLoop (867-867)

src/op/parallel.cc (1)

src/transform/loop_vectorize.cc (4)

GetVectorizeSize (248-248)

GetVectorizeSize (248-248)

GetVectorizeSize (250-252)

GetVectorizeSize (250-250)

src/op/fill.cc (1)

src/transform/loop_vectorize.cc (4)

VectorizeLoop (323-332)

VectorizeLoop (323-323)

VectorizeLoop (334-344)

VectorizeLoop (334-335)

src/op/copy.cc (1)

src/transform/loop_vectorize.cc (4)

VectorizeLoop (323-332)

VectorizeLoop (323-323)

VectorizeLoop (334-344)

VectorizeLoop (334-335)

src/transform/loop_vectorize.h (1)

src/transform/loop_vectorize.cc (8)

GetVectorizeSize (248-248)

GetVectorizeSize (248-248)

GetVectorizeSize (250-252)

GetVectorizeSize (250-250)

VectorizeLoop (323-332)

VectorizeLoop (323-323)

VectorizeLoop (334-344)

VectorizeLoop (334-335)

src/transform/layout_inference.cc (1)

src/transform/loop_vectorize.cc (4)

VectorizeLoop (323-332)

VectorizeLoop (323-323)

VectorizeLoop (334-344)

VectorizeLoop (334-335)

src/transform/loop_vectorize.cc (2)

src/transform/common/loop_fusion_utils.h (2)

VisitStmt_ (61-67)

VisitStmt_ (104-222)

src/transform/loop_partition.cc (8)

node (41-48)

node (41-41)

node (49-56)

node (49-49)

node (171-184)

node (171-171)

node (234-242)

node (234-234)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Quick Lint

🔇 Additional comments (7)

src/transform/legalize_vectorized_loop.cc (1)

68-77: Analyzer-aware vectorization hookup looks correct

Passing analyzer_ into VectorizeLoop keeps this legalizer aligned with the new analyzer-aware overload and reuses the same analysis context used during mutation. No additional issues here.

src/op/fill.cc (1)

203-237: Consistent use of analyzer-aware VectorizeLoop

All three updated call sites (local.fragment, local, shared/global) correctly pass the existing arith::Analyzer* into VectorizeLoop, reusing the same context already used for PartitionLoop and layout inference. This is aligned with the new API.

src/transform/loop_vectorize.h (1)

36-44: Analyzer-aware vectorization API is well-shaped

The added overloads for GetVectorizeSize and VectorizeLoop cleanly extend the API to take an arith::Analyzer* while keeping existing signatures intact, which should preserve compatibility for older call sites.

src/op/copy.cc (1)

842-869: Vectorization now correctly reuses the lowering analyzer

Both CPU and GPU branches now pass the existing arith::Analyzer* into VectorizeLoop, so vector-size planning can leverage the same simplification and bounds information used for MakeSIMTLoop, fusion, layout inference, and PartitionLoop. This looks consistent and correct.

src/transform/layout_inference.cc (2)

15-17: Per-infer analyzer propagation in layout inference is consistent

The new analyzer_vec_ machinery (clone on each operator/parallel-for, size checks in Run(), and retrieval in RunInferStep) cleanly associates a dedicated arith::Analyzer instance with each infer_list_ entry. Using cur_analyzer in LayoutInferArgs ensures operators get a stable, context-appropriate analyzer during InferLayout. The use of std::unique_ptr plus the explicit size invariants makes indexing safe. This design looks sound.

Also applies to: 89-116, 262-276, 434-461, 531-553, 682-695

1034-1036: Layout-time vectorization now uses the pass analyzer

Changing VectorizeLoop(for_node) to VectorizeLoop(for_node, analyzer_) in LayoutInferencer::VisitStmt_(const ForNode*) ensures that vectorization planning during layout inference reuses the same arithmetic analyzer as the mutator itself, which is aligned with the rest of the analyzer-aware API.

src/transform/loop_vectorize.cc (1)

48-71: Visitor→mutator refactor for access detection and planning is reasonable

Switching VectorizeFindGlobalAccess to StmtExprVisitor and adapting the various VisitStmt_/VisitExpr_ overrides in VectorizePlanner to call IRMutatorWithAnalyzer::Visit* is consistent with the mutator-based planner design. The behavior of global/shared access detection and the special handling of extern calls and casts is preserved while simplifying dependencies on analyzer-aware visitors.

Also applies to: 121-158, 164-168

coderabbitai · 2025-11-21T13:19:10Z

3rdparty/tvm

@@ -1 +1 @@
-Subproject commit bc31e7ad9f9fafd7659dfabafe359fd55a0ffc1e
+Subproject commit cd2b2b6013d155b5822300b0a0740fa65320dd9e


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify TVM submodule commit validity, stability, and compatibility.

This change updates the TVM submodule pointer to a new commit. Since this PR makes vectorization API changes (adding arith::Analyzer* parameters), we need to ensure:

The commit cd2b2b6013d155b5822300b0a0740fa65320dd9e is valid, reachable, and stable (not a WIP branch).

The TVM version at this commit is compatible with all the vectorization-related changes in this PR.

There are no breaking API changes in TVM that would affect the modified call sites across src/op/copy.cc, src/op/fill.cc, src/op/parallel.cc, and src/transform/*.cc.

The specific commit choice is documented (e.g., in the PR description or a commit message).

Consider whether pinning to a stable TVM release tag (instead of a commit SHA) would improve maintainability and reproducibility.

Run the following script to verify the TVM submodule commit:

🏁 Script executed:

#!/bin/bash
# Description: Verify TVM submodule commit validity and fetch commit details.

# Navigate to TVM submodule and fetch commit information
cd 3rdparty/tvm 2>/dev/null || { echo "TVM submodule not found"; exit 1; }

# Check if commit exists and is reachable
git log --oneline -1 cd2b2b6013d155b5822300b0a0740fa65320dd9e 2>/dev/null && echo "✓ Commit found" || echo "✗ Commit not found"

# Get commit message and author
echo -e "\nCommit details:"
git log -1 --format="%H%n%an%n%ai%n%s" cd2b2b6013d155b5822300b0a0740fa65320dd9e 2>/dev/null || echo "Unable to retrieve commit details"

# Check if commit is on main or master branch
echo -e "\nBranch status:"
git branch -r --contains cd2b2b6013d155b5822300b0a0740fa65320dd9e 2>/dev/null | head -5 || echo "Unable to determine branch"

# Get the diff summary between old and new commits
echo -e "\nChanges between commits (if both are reachable):"
git diff --stat bc31e7ad9f9fafd7659dfabafe359fd55a0ffc1e cd2b2b6013d155b5822300b0a0740fa65320dd9e 2>/dev/null | head -20 || echo "Unable to compute diff"

Length of output: 135

I need to verify whether the TVM commit exists in the public repository. The local submodule clone may not have fetched it. Let me search for information about this commit.

Let me verify the submodule reference itself and check for additional context:

Critical: TVM submodule points to non-existent commit

The TVM submodule pointer cd2b2b6013d155b5822300b0a0740fa65320dd9e cannot be found in the TVM repository. Verification shows:

Local submodule clone reports commit not found

Public TVM commit history (recent main branch) contains no such commit

The commit is not reachable from any TVM branch

This invalid reference will break reproducibility and cause build failures. Before merging, either:

Verify the correct TVM commit hash and update the reference

Confirm whether a stable TVM release tag should be used instead

Document the rationale for this specific commit choice if intentional

Without a valid, publicly-accessible commit, this PR cannot be built or reproduced.

🤖 Prompt for AI Agents

In 3rdparty/tvm around lines 1 to 1, the submodule is pointing to a non-existent commit (cd2b2b6013d155b5822300b0a0740fa65320dd9e) which will break builds; verify the correct TVM commit or tag by checking the upstream TVM repo (git fetch --all and git ls-remote or search for the intended commit/tag), then update the submodule pointer to a publicly-accessible commit or stable release (checkout the correct commit/tag in the submodule, git add it, and commit the updated submodule SHA and any .gitmodules changes), or if the unreachable commit is intentional, replace it with an explicit stable tag and add a short note in the PR or repository docs explaining why that exact ref is required.

src/transform/loop_vectorize.cc

- Updated the PostOrderVisit function to analyze the body of the loop node instead of the node itself, ensuring proper handling of nested loops during vectorization analysis.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (3)

src/transform/loop_vectorize.cc (3)

73-78: Heap-allocated analyzer leaks memory.

The default constructor allocates an arith::Analyzer on the heap with new, but IRMutatorWithAnalyzer does not take ownership of this pointer, resulting in a memory leak. Consider making the planner own a stack-allocated arith::Analyzer member and passing its address to the base class in the default constructor.

Based on learnings
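
One way to realise the ownership suggestion is the base-from-member idiom, sketched below. All names (`ToyAnalyzer`, `ToyMutatorWithAnalyzer`, `OwnedAnalyzerHolder`, `ToyPlanner`) are hypothetical; the sketch assumes only that the base class stores a raw analyzer pointer without taking ownership, as the comment above states.

```cpp
#include <cassert>

// Hypothetical stand-in for arith::Analyzer.
struct ToyAnalyzer {
  int facts = 0;
};

// Stand-in for a base class that stores, but does not own, an analyzer.
class ToyMutatorWithAnalyzer {
 public:
  explicit ToyMutatorWithAnalyzer(ToyAnalyzer *analyzer) : analyzer_(analyzer) {}
  virtual ~ToyMutatorWithAnalyzer() = default;

 protected:
  ToyAnalyzer *analyzer_;
};

// Base-from-member idiom: listing this holder first among the base classes
// guarantees the owned analyzer is constructed before the mutator base
// receives a pointer to it.
struct OwnedAnalyzerHolder {
  ToyAnalyzer owned;
};

class ToyPlanner : private OwnedAnalyzerHolder, public ToyMutatorWithAnalyzer {
 public:
  // Default: use the analyzer owned by this planner; nothing is heap-allocated.
  ToyPlanner() : OwnedAnalyzerHolder(), ToyMutatorWithAnalyzer(&owned) {}

  // Analyzer-aware: borrow the caller's analyzer; the owned one stays unused.
  explicit ToyPlanner(ToyAnalyzer *external)
      : OwnedAnalyzerHolder(), ToyMutatorWithAnalyzer(external) {}

  // Non-copyable: a copy would keep pointing at the source object's analyzer.
  ToyPlanner(const ToyPlanner &) = delete;
  ToyPlanner &operator=(const ToyPlanner &) = delete;

  bool UsesOwnAnalyzer() const { return analyzer_ == &owned; }
  void RecordFact() { ++analyzer_->facts; }
};
```

The owned analyzer is destroyed together with the planner, so the default constructor cannot leak. The cost is one unused analyzer when an external one is supplied.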

97-119: Nested loop detection incorrectly includes the current node.

The PostOrderVisit traversal includes node itself, so contains_nested_for is always set to true even for innermost loops. This prevents the extent-based ZeroAwareGCD adjustment from ever running, which can lead to ICHECK failures when extent % vector_size_ != 0.

Apply this diff to exclude the current node:

     PostOrderVisit(Downcast<Stmt>(node->body), [&](const ObjectRef &obj) {
-      if (obj.as<ForNode>()) {
-        contains_nested_for = true;
+      if (auto *inner = obj.as<ForNode>()) {
+        if (inner != node) {
+          contains_nested_for = true;
+        }
       }
     });

291-294: Type mismatch in divisibility check.

The constant zero has type var.dtype() (line 286), but FloorMod(simplified_expr, target_size_for_expr) has type expr.dtype(). This type mismatch can cause the comparison to fail when the types differ.

Apply this diff:
-                               zero)) {
+                               make_const(expr.dtype(), 0))) {

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9560ea and 7484f05.

📒 Files selected for processing (1)

src/transform/loop_vectorize.cc (8 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/transform/loop_vectorize.cc (3)

src/op/parallel.cc (4)

VisitStmt_ (131-147)

VisitStmt_ (131-131)

VisitStmt_ (149-161)

VisitStmt_ (149-149)

src/transform/common/loop_fusion_utils.h (2)

VisitStmt_ (61-67)

VisitStmt_ (104-222)

src/transform/loop_partition.cc (8)

node (41-48)

node (41-41)

node (49-56)

node (49-49)

node (171-184)

node (171-171)

node (234-242)

node (234-234)

🔇 Additional comments (7)

src/transform/loop_vectorize.cc (7)

121-135: LGTM!

The refactor to return PrimExpr and use IRMutatorWithAnalyzer is correct, and the method properly delegates to the base class.

137-143: LGTM!

The refactor to return Stmt and use IRMutatorWithAnalyzer is correct.

145-148: LGTM!

The refactor is correct.

150-158: LGTM!

The refactor is correct.

164-168: LGTM!

The refactor is correct.

250-252: LGTM!

The new analyzer-aware overload correctly propagates the analyzer to the planner.

334-344: LGTM!

The new analyzer-aware overload correctly propagates the analyzer to the planner and maintains the same vectorization logic as the original function.

src/transform/loop_vectorize.cc

LeiWang1999 · 2025-11-21T15:13:53Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/transform/layout_inference.cc

…ile-ai#1311)

* [Refactor] Update Vectorization Functions to Accept Analyzer Parameter
  - Modified `VectorizeLoop` and related functions to accept an `arith::Analyzer` parameter, enhancing their capability to perform analysis during vectorization.
  - Updated multiple instances in `copy.cc`, `fill.cc`, `parallel.cc`, and layout inference files to utilize the new analyzer parameter for improved performance and correctness.
  - Ensured consistency across vectorization logic by integrating the analyzer into existing workflows, facilitating better optimization opportunities.
* [Fix] Corrected PostOrderVisit call in loop_vectorize.cc
  - Updated the PostOrderVisit function to analyze the body of the loop node instead of the node itself, ensuring proper handling of nested loops during vectorization analysis.
* fix
* lint fix
* fix

LeiWang1999 added 2 commits November 21, 2025 21:07

Merge branch 'main' of https://github.com/tile-ai/tilelang into assum…

d9560ea

…e_1121

LeiWang1999 requested a review from Copilot November 21, 2025 13:10

Copilot AI reviewed Nov 21, 2025

coderabbitai bot reviewed Nov 21, 2025

[Fix] Corrected PostOrderVisit call in loop_vectorize.cc

7484f05

- Updated the PostOrderVisit function to analyze the body of the loop node instead of the node itself, ensuring proper handling of nested loops during vectorization analysis.

coderabbitai bot reviewed Nov 21, 2025

fix

be0d319

chatgpt-codex-connector bot reviewed Nov 21, 2025

LeiWang1999 added 2 commits November 21, 2025 23:29

lint fix

ab44b74

fix

6221326

LeiWang1999 merged commit 9f7bac4 into tile-ai:main Nov 23, 2025
6 checks passed
