[Bugfix] Resolve mixed stride dtype issue (inconsistent int32/int64 values) #1119

LeiWang1999 · 2025-10-24T08:17:46Z

This pull request introduces a new helper utility for argument binding in the TVM codebase, specifically under the tvm::tl namespace. The main addition is the implementation of ArgBinder, which provides a consistent way to match and bind function arguments, handle symbolic buffers, and generate necessary assertions and initializations. Several supporting changes were made to integrate this new utility and improve type safety in related vectorization logic.

Key changes:

New Argument Binding Utility

Added new files arg_binder.cc and arg_binder.h implementing the ArgBinder class, which provides methods to bind primitive expressions, arrays, buffers, and DLTensor handles, while generating necessary variable definitions, assertions, and initialization statements. This utility is designed to standardize argument binding and constraint checking across TVM transformations.

Integration and Refactoring

Updated include statements in make_packed_api.cc to use the new arg_binder.h path, replacing the previous include from tir/transforms/arg_binder.h with the local header.

Vectorization Improvements

Improved type safety and correctness in IndiceCanVectorize within loop_vectorize.cc by ensuring that all constants and variables used in vectorization checks and substitutions match the relevant data types. This reduces the risk of subtle bugs due to type mismatches.

Minor Cleanups

Minor whitespace cleanup in make_packed_api.cc for consistency.

Summary by CodeRabbit

New Features
- Added a comprehensive argument binding and validation system to improve shape, dtype, device and buffer checks during transforms.
Refactor
- Aligned vectorization logic to use type-consistent size handling for safer and more correct transformations.
Style
- Adjusted static analysis configuration to refine which headers are linted.
Chores
- Minor include path and formatting adjustments.

github-actions · 2025-10-24T08:18:00Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai · 2025-10-24T08:22:01Z

Walkthrough

Adds a new ArgBinder utility implementing argument-to-value binding and validations (arrays, Buffer, DLTensor), updates vectorization checks to use dtype-consistent size constants, and adjusts an include path and clang-tidy header filter.

Changes

| Cohort / File(s) | Summary |
|---|---|
| New ArgBinder utility: `src/transform/arg_binder.h`, `src/transform/arg_binder.cc` | Adds ArgBinder class and `BinderAddAssert` to bind/validate single args, arrays, Buffer ↔ Buffer mappings, and DLTensor-like handles. Generates defs, asserts, and nested init stmts (LetStmt, DeclBuffer, AssertStmt, IfThenElse) and uses arith::Analyzer for simplification. |
| Type-consistent vectorization: `src/transform/loop_vectorize.cc` | Replaces raw `target_vectorized_size` uses with dtype-specific constants (`target_size_for_iter`, `target_size_for_expr`, `target_size_for_var`) and updates substitutions, thread-range bindings, and Vectorizer construction for type-aligned computations. |
| Include path update: `src/transform/make_packed_api.cc` | Changes include from `"tir/transforms/arg_binder.h"` to `"arg_binder.h"` and removes an adjacent empty line. |
| Clang-tidy header filter: `.clang-tidy` | Replaces `ExcludeHeaderFilterRegex` with `HeaderFilterRegex` using negative lookahead to exclude 3rdparty and tvm headers from analysis. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CallSite as Call Site
  participant ArgBinder as ArgBinder
  participant Analyzer as arith::Analyzer
  participant AST as AST Builder

  Note over CallSite,ArgBinder: Binding request (arg, value, name)

  CallSite->>ArgBinder: Bind / BindArray / BindBuffer / BindDLTensor
  ArgBinder->>Analyzer: Simplify conditions / compute simplified exprs
  Analyzer-->>ArgBinder: simplified exprs
  opt validation checks
    ArgBinder->>ArgBinder: generate assertions (BinderAddAssert)
  end
  ArgBinder->>AST: emit LetStmt / DeclBuffer / AssertStmt / IfThenElse
  AST-->>ArgBinder: nested init stmts
  ArgBinder-->>CallSite: defs, asserts, init_nest

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I bind the shapes with careful paws,
I check the strides and mind the laws,
Buffers hum, DLTensors sing,
Assertions keep the correctness ring,
A vector hops in type-safe claws. 🎉

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "[Bugfix] Resolve mixed stride dtype issue (inconsistent int32/int64 values)" is partially related to the changeset. The title refers to a real aspect of the changes—dtype consistency improvements do appear in the PR, particularly in the BindDLTensor method's stride validation and in the IndiceCanVectorize function's type-consistent comparisons in loop_vectorize.cc. However, the title does not capture the primary focus of the changeset, which is the introduction of the new ArgBinder utility class that provides comprehensive argument binding and validation for expressions, arrays, buffers, and DLTensor handles. While the dtype-related improvements are genuine components of the changes, the title's framing as a targeted bugfix somewhat understates the significance of the new ArgBinder feature. |


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

src/transform/arg_binder.h (2)
88-97: Docs: clarify fuzzy_match semantics.

Implementation allows value to have extra leading 1-dims compared to arg (value.rank >= arg.rank). Update comment accordingly.
- * \param fuzzy_match If enabled, we allow value's dimension to be smaller
- * than arg, as long as arg's higher dimensions are of 1.
+ * \param fuzzy_match If enabled, allow value to have extra leading dimensions of size 1
+ * (i.e., value.rank >= arg.rank, with value.shape[0:diff] == 1).
112-126: Docs: minor typos.

Fix “statemtn” -> “statement”, “Intializing” -> “Initializing”.
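
The fuzzy_match rule from the first nitpick above (value.rank >= arg.rank, with the extra leading dimensions equal to 1) can be sketched on plain integer shapes. This is only an illustration: `FuzzyShapeMatch` is a hypothetical helper, not a function in ArgBinder, and the real binder works on symbolic shape expressions rather than concrete integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ArgBinder): returns true when `value`
// equals `arg` after dropping extra leading dimensions of size 1.
bool FuzzyShapeMatch(const std::vector<int64_t>& arg,
                     const std::vector<int64_t>& value) {
  // value.rank >= arg.rank is required.
  if (value.size() < arg.size()) return false;
  const std::size_t diff = value.size() - arg.size();
  // The extra leading dimensions of `value` must all be 1.
  for (std::size_t i = 0; i < diff; ++i) {
    if (value[i] != 1) return false;
  }
  // The remaining dimensions must match `arg` one by one.
  for (std::size_t i = 0; i < arg.size(); ++i) {
    if (value[diff + i] != arg[i]) return false;
  }
  return true;
}
```

Under this reading, a value of shape (1, 1, 4, 8) binds to an arg of shape (4, 8), while (2, 4, 8) does not.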
src/transform/arg_binder.cc (1)

40-51: Error text polish (optional).

"Bind have an unmet assertion" → "Binding has an unmet assertion" for clearer logs. Low priority.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50e789d and becde64.

📒 Files selected for processing (4)

src/transform/arg_binder.cc (1 hunks)
src/transform/arg_binder.h (1 hunks)
src/transform/loop_vectorize.cc (1 hunks)
src/transform/make_packed_api.cc (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

src/transform/arg_binder.cc (2)

tilelang/language/ast/ir.py (3)

LetStmt (880-908)

handle (1467-1497)

decl_buffer (1137-1205)

tilelang/language/tir/op.py (3)

truncmod (3047-3070)

isnullptr (2649-2665)

if_then_else (2907-2937)

src/transform/arg_binder.h (1)

src/transform/arg_binder.cc (10)

Bind (78-81)

Bind (78-79)

BindArray (83-93)

BindArray (83-85)

BindBuffer (95-156)

BindBuffer (95-96)

BindDLTensor (163-373)

BindDLTensor (163-165)

Bind_ (54-76)

Bind_ (54-55)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

🔇 Additional comments (1)

src/transform/make_packed_api.cc (1)

39-39: Include path consistency verified across the codebase.

The verification confirms that all references to arg_binder.h use the consistent local include path (#include "arg_binder.h"). Both arg_binder.cc and make_packed_api.cc follow the same pattern, and no stale old-style includes remain in the repository.

src/transform/arg_binder.cc

coderabbitai · 2025-10-24T08:28:21Z

src/transform/arg_binder.h

+ * \brief Helper utility to generate match and bind of arguments.
+ *
+ * \note There is many places in TVM IR where we need argument bindings.
+ *
+ *  Consider a function f(tA(shape=var(n)), tB(shape=3), tC(shape=(n+2)).
+ *  Here n is a undefined variable that is decided by the outside, tB imposes
+ *  a constraint such that it can only take tensor with shape 3, tC imposes
+ *  another constraint that it's shape must equals n + 2.
+ *  So if we call it with f(bufferA, bufferB, bufferC), we need to generate
+ *  the following binding sequence:
+ *  - define n = bufferA.shape[0]
+ *  - assert bufferB.shape[0] == 3
+ *  - assert bufferB.shape[1] == n + 3
+ *
+ *  In general, this is a constraint solving problem. We have simplified
+ * assumption over the binding declaration, such that we require the variable
+ * occurred in constraint must be declared in argument list. So it is illegal to
+ * have signature f(tA(shape=(n+3))) without any argument variable corresponds
+ * to n, even though it is already enough to derive n from the input argument.
+ */


⚠️ Potential issue | 🟡 Minor

Docs: fix example typos and mismatches.

Several doc issues can mislead:

"binded" -> "bound".

Example says tC(shape=(n+2)) but later mentions "== n + 3" and uses bufferB. Should be bufferC and n + 2.

Proposed patch:

- * \brief Helper utility to generate match and bind of arguments.
+ * \brief Helper utility to generate match and bind of arguments.
...
- *  So if we call it with f(bufferA, bufferB, bufferC), we need to generate
+ *  So if we call it with f(bufferA, bufferB, bufferC), we need to generate
  *  the following binding sequence:
  *  - define n = bufferA.shape[0]
  *  - assert bufferB.shape[0] == 3
- *  - assert bufferB.shape[1] == n + 3
+ *  - assert bufferC.shape[0] == n + 2

🤖 Prompt for AI Agents

In src/transform/arg_binder.h around lines 41 to 60, fix the doc typos and mismatches: change "binded" to "bound", correct the example so tC(shape=(n+2)) corresponds to bufferC and its assertion reads "assert bufferC.shape[0] == n + 2" (not bufferB and not n + 3), and ensure the text consistently refers to bufferC where appropriate.
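
For reference, the corrected binding sequence from the doc comment (define n from bufferA, then check bufferB and bufferC against it) can be written as a small toy model on concrete shapes. `BindExample` is a hypothetical name used only here; the real ArgBinder emits TIR statements and assertions instead of evaluating the shapes directly.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy model of f(tA(shape=n), tB(shape=3), tC(shape=n+2)).
// Returns an empty string when every constraint holds, otherwise a message
// naming the first violated constraint. Illustrative only.
std::string BindExample(const std::vector<int64_t>& a_shape,
                        const std::vector<int64_t>& b_shape,
                        const std::vector<int64_t>& c_shape) {
  if (a_shape.size() != 1 || b_shape.size() != 1 || c_shape.size() != 1) {
    return "rank mismatch";
  }
  // define n = bufferA.shape[0]
  const int64_t n = a_shape[0];
  // assert bufferB.shape[0] == 3
  if (b_shape[0] != 3) return "bufferB.shape[0] != 3";
  // assert bufferC.shape[0] == n + 2
  if (c_shape[0] != n + 2) return "bufferC.shape[0] != n + 2";
  return "";
}
```

The order matters: n has to be defined from bufferA before the constraint on bufferC can be checked, which is why the doc comment requires every variable used in a constraint to appear in the argument list.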

coderabbitai · 2025-10-24T08:28:21Z

src/transform/loop_vectorize.cc

+  PrimExpr target_size_for_iter =
+      make_const(iter_var_size.dtype(), target_vectorized_size);
+  PrimExpr target_size_for_expr =
+      make_const(expr.dtype(), target_vectorized_size);
+  PrimExpr target_size_for_var =
+      make_const(var.dtype(), target_vectorized_size);
+  PrimExpr zero = make_const(var.dtype(), 0);
+


⚠️ Potential issue | 🟠 Major

Finish the dtype cleanup: use typed zero and cast Range extent.

Great move to make per-dtype target constants. Two follow-ups to avoid int32/int64 mismatches:

Compare against typed zero instead of raw 0.

Cast the Range extent for v1 to var.dtype() to avoid mixed dtypes (var vs iter_var_size).

Apply this diff:

-  PrimExpr target_size_for_iter =
-      make_const(iter_var_size.dtype(), target_vectorized_size);
-  PrimExpr target_size_for_expr =
-      make_const(expr.dtype(), target_vectorized_size);
-  PrimExpr target_size_for_var =
-      make_const(var.dtype(), target_vectorized_size);
-  PrimExpr zero = make_const(var.dtype(), 0);
+  PrimExpr target_size_for_iter = make_const(iter_var_size.dtype(), target_vectorized_size);
+  PrimExpr target_size_for_expr = make_const(expr.dtype(), target_vectorized_size);
+  PrimExpr target_size_for_var = make_const(var.dtype(), target_vectorized_size);
+  PrimExpr zero_var = make_const(var.dtype(), 0);
+  PrimExpr zero_iter = make_const(iter_var_size.dtype(), 0);
+  PrimExpr zero_expr = make_const(expr.dtype(), 0);

-  if (!analyzer->CanProveEqual(FloorMod(iter_var_size, target_size_for_iter),
-                               0))
+  if (!analyzer->CanProveEqual(FloorMod(iter_var_size, target_size_for_iter), zero_iter))
     return false;

-  if (!analyzer->CanProveEqual(
-          FloorMod(Substitute(expr, {{var, zero}}), target_size_for_expr), 0)) {
+  if (!analyzer->CanProveEqual(
+          FloorMod(Substitute(expr, {{var, zero_var}}), target_size_for_expr), zero_expr)) {
     return false;
   }

-  Var v0("v0", var.dtype()), v1("v1", var.dtype());
-  analyzer->Bind(v0, Range(zero, target_size_for_var));
-  analyzer->Bind(v1, Range(zero, analyzer->Simplify(FloorDiv(
-                                     iter_var_size, target_size_for_iter))));
+  Var v0("v0", var.dtype()), v1("v1", var.dtype());
+  analyzer->Bind(v0, Range(zero_var, target_size_for_var));
+  PrimExpr v1_extent = analyzer->Simplify(FloorDiv(iter_var_size, target_size_for_iter));
+  analyzer->Bind(v1, Range(zero_var, cast(var.dtype(), v1_extent)));

-  PrimExpr expr_transformed = analyzer->Simplify(
-      Substitute(expr, {{var, v0 + v1 * target_size_for_var}}));
+  PrimExpr expr_transformed =
+      analyzer->Simplify(Substitute(expr, {{var, v0 + v1 * target_size_for_var}}));

Also applies to: 273-276, 279-291

🤖 Prompt for AI Agents

In src/transform/loop_vectorize.cc around lines 265-272 (and similarly for 273-276 and 279-291), the code uses an untyped literal 0 and mixed dtypes when constructing target sizes and comparing Range extents; change the zero to a typed zero created with make_const(var.dtype(), 0) (already declared as zero) and cast the Range extent (v1) to var.dtype() when used (e.g., wrap v1 with Cast(var.dtype(), v1) or make_const with var.dtype()) so all comparisons and size constants use the same dtype and avoid int32/int64 mismatches.
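
The point of the per-dtype constants and the typed zero can be shown with a toy model that has no TVM dependency. Everything below is a stand-in under stated assumptions: `IntConst`, `MakeConst`, `CheckedFloorMod` and `DivisibleTyped` are hypothetical names, and `bits` only imitates the role of a TVM `DataType`. The sketch shows that once an operation rejects mixed widths, each constant has to be built in the dtype of the operand it meets.

```cpp
#include <cstdint>
#include <stdexcept>

// Toy stand-in for an integer constant that carries its dtype width.
struct IntConst {
  int bits;       // 32 or 64
  int64_t value;
};

// Plays the role of make_const(dtype, value): the constant takes the dtype
// of the expression it will be combined with.
IntConst MakeConst(int bits, int64_t value) { return IntConst{bits, value}; }

// A floor-mod that refuses mixed widths, the way a strict dtype check would.
IntConst CheckedFloorMod(const IntConst& a, const IntConst& b) {
  if (a.bits != b.bits) throw std::invalid_argument("mixed int32/int64 operands");
  if (b.value == 0) throw std::invalid_argument("modulo by zero");
  int64_t r = a.value % b.value;
  // Adjust C++ truncated remainder to floor semantics.
  if (r != 0 && ((r < 0) != (b.value < 0))) r += b.value;
  return IntConst{a.bits, r};
}

// True when `size` is divisible by `vec`; both the target constant and the
// zero it is compared against are created in the dtype of `size`.
bool DivisibleTyped(const IntConst& size, int64_t vec) {
  const IntConst target = MakeConst(size.bits, vec);  // per-dtype constant
  const IntConst zero = MakeConst(size.bits, 0);      // typed zero
  const IntConst rem = CheckedFloorMod(size, target);
  return rem.bits == zero.bits && rem.value == zero.value;
}
```

In this model, passing a 32-bit constant to an operation on a 64-bit size throws, which is the class of mismatch the review asks to rule out by construction.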

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

src/transform/arg_binder.cc (1)
221-227: Fix inconsistent Var usage and type annotation for shape handle.

The past review comment correctly identifies this issue. The code creates v_shape on line 221 and maps it in def_handle_dtype_ on line 222, but the actual buffer uses buf_shape->data (line 224). This inconsistency means downstream code expecting buf_shape->data in the map will not find it.

Additionally, line 222 uses make_const(tvm_shape_type, 0) while strides (line 242) and data (line 367) use tir::TypeAnnotation(...), creating another inconsistency.

Apply this diff to fix both issues:
- Var v_shape(shape_handle_name(), DataType::Handle());
- def_handle_dtype_.Set(v_shape, make_const(tvm_shape_type, 0));
+ def_handle_dtype_.Set(buf_shape->data, tir::TypeAnnotation(tvm_shape_type));
Based on past review comments.

🧹 Nitpick comments (1)

.clang-tidy (1)
7-7: Consider simplifying the regex pattern for clarity.

The negative lookahead pattern (?!.*(?:/|^)(3rdparty|tvm)/) works but is somewhat unclear because ^ inside (?:/|^) always refers to the string start, making the grouping confusing.

A clearer equivalent would be:
-HeaderFilterRegex: '^(?!.*(?:/|^)(3rdparty|tvm)/).*'
+HeaderFilterRegex: '^(?!.*/(?:3rdparty|tvm)/|^(?:3rdparty|tvm)/).*'
This explicitly separates the two cases: paths containing /3rdparty/ or /tvm/, and paths starting with 3rdparty/ or tvm/.
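
To compare the two patterns, the sketch below runs both through `std::regex` with its default ECMAScript grammar, which supports negative lookahead. This is an assumption made for illustration: clang-tidy uses its own regex engine, so the behaviour there may differ, and `KeptByOriginal` / `KeptBySuggested` are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// Pattern currently in the PR's .clang-tidy: keep a header unless its path
// contains a 3rdparty/ or tvm/ directory component.
bool KeptByOriginal(const std::string& path) {
  static const std::regex re(R"(^(?!.*(?:/|^)(3rdparty|tvm)/).*)");
  return std::regex_match(path, re);
}

// Pattern proposed in the nitpick, with the two cases spelled out.
bool KeptBySuggested(const std::string& path) {
  static const std::regex re(R"(^(?!.*/(?:3rdparty|tvm)/|^(?:3rdparty|tvm)/).*)");
  return std::regex_match(path, re);
}
```

On sample paths such as `src/transform/arg_binder.h`, `3rdparty/tvm/include/x.h` and `src/mytvm/a.h`, both helpers are expected to agree, which supports treating the suggestion as a readability change rather than a behavioural one.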

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between becde64 and ebbc8a9.

📒 Files selected for processing (2)

.clang-tidy (1 hunks)
src/transform/arg_binder.cc (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/transform/arg_binder.cc (2)

tilelang/language/ast/ir.py (3)

LetStmt (880-908)

handle (1467-1497)

decl_buffer (1137-1205)

tilelang/language/tir/op.py (3)

truncmod (3047-3070)

isnullptr (2649-2665)

if_then_else (2907-2937)

🔇 Additional comments (6)

src/transform/arg_binder.cc (6)

40-52: LGTM!

The assertion helper correctly simplifies conditions and handles three cases: always-false (fatal), always-true (no-op), and conditional (runtime assert).

54-81: LGTM!

The binding logic correctly handles new variable introductions, existing bindings, and non-variable arguments with appropriate dtype checks and assertions.

83-93: LGTM!

Array binding correctly validates size equality and binds each element with proper naming.

95-156: LGTM!

Buffer binding comprehensively validates scope, dtype, alignment, offset factors, and shape/strides with appropriate fuzzy matching support.

228-237: Verify the sub-byte dtype special case is intentional.

The code skips shape element binding for Int(4), UInt(4), and Int(1) dtypes by breaking out of the loop early. Please confirm this is the intended behavior, as it means shape elements are not bound or validated for these sub-byte types.

If this special handling is necessary, consider adding a comment explaining why sub-byte types bypass shape binding.

238-373: LGTM!

The stride, offset, device, and data pointer binding logic is comprehensive:

Handles three buffer types (compact, auto-broadcast, explicit strides) correctly

Properly distinguishes constant vs. variable offsets

Includes appropriate NULL checks with size-0 array special case

Consistently uses TypeAnnotation for strides and data pointer (lines 242, 367)
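
The three cases noted above for the assertion helper (lines 40-52: always-false is fatal, always-true is a no-op, anything else becomes a runtime assert) can be modelled roughly as follows. This is a sketch under assumptions, not the code in arg_binder.cc: `Simplified`, `AddAssert` and the `runtime_checks` list are hypothetical, and the real helper works on simplified TIR conditions and emits AssertStmt nodes.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Outcome of simplifying a condition at binding time (toy model).
enum class Simplified { kAlwaysTrue, kAlwaysFalse, kUnknown };

// Hypothetical three-way handling of a binding condition.
// `runtime_checks` collects conditions that must be evaluated at run time.
void AddAssert(Simplified status, std::function<bool()> cond,
               std::vector<std::function<bool()>>* runtime_checks) {
  switch (status) {
    case Simplified::kAlwaysFalse:
      // Provably violated: fail immediately while binding.
      throw std::runtime_error("binding has an unmet assertion");
    case Simplified::kAlwaysTrue:
      // Provably satisfied: nothing needs to be emitted.
      return;
    case Simplified::kUnknown:
      // Cannot be decided now: defer to a runtime check.
      runtime_checks->push_back(std::move(cond));
      return;
  }
}
```

Only the undecidable case leaves anything behind for run time, which keeps the generated prologue free of checks that are already known to hold.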

LeiWang1999 · 2025-10-24T10:26:13Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-10-24T10:31:26Z

src/transform/arg_binder.cc

+void ArgBinder::BindBuffer(const Buffer &arg, const Buffer &value,
+                           const std::string &arg_name, bool fuzzy_match) {
+  ICHECK_EQ(arg.scope(), value.scope())
+      << "Argument " << arg_name << " Buffer bind scope mismatch";
+  ICHECK_EQ(arg->dtype, value->dtype)
+      << "Argument " << arg_name << " Buffer bind data type mismatch";
+  if (value->data_alignment % arg->data_alignment != 0) {
+    LOG(WARNING) << "Trying to bind buffer to another one with lower alignment "
+                    "requirement "


Avoid modulo by zero when comparing buffer alignments

The new ArgBinder::BindBuffer warns when the provided buffer’s alignment is smaller than the required one by computing value->data_alignment % arg->data_alignment. However, BufferNode::data_alignment is zero by default when no alignment is requested. If a symbolic buffer without an explicit alignment is bound here (the common case), this expression executes a modulo with a zero divisor and triggers undefined behaviour before any warning is issued. The guard should skip the check when arg->data_alignment is zero or treat it as 1/“no requirement” to avoid a runtime crash in the binder.
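
One possible shape for the guard suggested above, written as a standalone function. `AlignmentSatisfied` is a hypothetical helper, and whether `data_alignment` can actually be zero at this point is the reviewer's claim rather than something confirmed here; the sketch only shows how treating a non-positive requirement as "no requirement" avoids the modulo by zero.

```cpp
// Hypothetical helper: true when a buffer aligned to `value_alignment` bytes
// satisfies a requirement of `arg_alignment` bytes.
bool AlignmentSatisfied(int value_alignment, int arg_alignment) {
  // A non-positive requirement means there is nothing to check, and
  // returning early keeps the modulo below from dividing by zero.
  if (arg_alignment <= 0) return true;
  return value_alignment % arg_alignment == 0;
}
```

With this guard, the warning path would only run when the argument buffer states a positive alignment requirement.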


…alues) (tile-ai#1119) * fix int32 dtype issue * lint fix * lint * lint fix --------- Co-authored-by: Zhiwen Mo <[email protected]>

Hamerlate added 2 commits October 24, 2025 16:14

fix int32 dtype issue

be08b4c

lint fix

becde64

coderabbitai bot reviewed Oct 24, 2025


Hamerlate added 2 commits October 24, 2025 16:31

lint

6e49df2

lint fix

ebbc8a9

coderabbitai bot reviewed Oct 24, 2025


chatgpt-codex-connector bot reviewed Oct 24, 2025


LeiWang1999 merged commit 65c4711 into tile-ai:main Oct 24, 2025
6 checks passed

coderabbitai bot mentioned this pull request Oct 25, 2025

[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi #1108

Merged

This was referenced Nov 5, 2025

[Feat] Add A Pass to Handle Negative Index #1192

Merged

[Fix] Fix buffer re-import typo in tilelang.languge #1214

Merged

[Fix] Fix a type that make wrong T.macro backtrace #1234

Merged

kurisu6912 mentioned this pull request Nov 12, 2025

[Language] Add type stubs for tir op #1239

Merged

coderabbitai bot mentioned this pull request Nov 14, 2025

[FFI] Use tvm ffi as the default execution backend #1259

Merged

This was referenced Nov 21, 2025

[Feat] Add missing support for uint32x2, add unsigned implicit cast in bitwise op, add T.Ref as macro annotation #1302

Closed

[Fix] Remove unused let_bindings_ in CodeGenC to fix #1300 #1305

Merged

[Fix] Fix frame scope error in T.macro #1308

Merged

coderabbitai bot mentioned this pull request Nov 21, 2025

[Refactor] Backup Analyzer to get the appropriate arith informations #1311

Merged

This was referenced Nov 27, 2025

[Refactor] Improve assertion handling in CodeGenCHost and ArgBinder #1352

Merged

[Enhancement] Improve error handling and assertion messages across runtime and argument binding #1356

Merged
