Skip to content

[Bugfix] Avoid reusing local.var allocations (#1678)#1683

Closed
LudovicoYIN wants to merge 1 commit intotile-ai:mainfrom
LudovicoYIN:Bugfix/alloc-var-local-var-reuse
Closed

[Bugfix] Avoid reusing local.var allocations (#1678)#1683
LudovicoYIN wants to merge 1 commit intotile-ai:mainfrom
LudovicoYIN:Bugfix/alloc-var-local-var-reuse

Conversation

@LudovicoYIN
Copy link

@LudovicoYIN LudovicoYIN commented Jan 16, 2026

Summary

This PR fixes incorrect local variable reuse in StorageRewrite that caused
multiple T.alloc_var() declarations to be merged into a single local variable,
leading to unexpected overwrites in generated CUDA code.

Root Cause

StorageRewrite previously allowed reuse of local.var allocations.
However, T.alloc_var() is semantically closer to a distinct scalar declaration
rather than a reusable buffer.

When multiple alloc_var are present in the same scope, reuse breaks
declaration and initialization ordering, producing invalid code such as:

int i = 0;
float tmp_row[4];
i = 1;
i = max(i, tmp_row[0]);  // overwrites previous alloc_var

Fix

  • Disable reuse for local.var in StorageRewrite (storage_rewrite.cc)
  • Ensure each T.alloc_var() results in a unique local variable
  • Preserve correct declaration and initialization semantics

This change only affects local.var and does not impact buffer or shared memory reuse.

Tests

  • test_tilelang_issue_1678.py

Related Issue

Fixes #1678

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved allocation handling for specific scope scenarios.
  • Tests

    • Added test coverage for allocation with mixed data type scenarios in CUDA environments.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2026

📝 Walkthrough

Walkthrough

Fixed a storage allocation bug in TileLang where multiple T.alloc_var() declarations caused variable overwrites. The fix modifies the FindAlloc function to force new allocations for ".var" scope tags instead of reusing previous entries. A new test case validates the fix for mixed-dtype allocation codegen.

Changes

Cohort / File(s) Summary
Core allocation logic fix
src/transform/storage_rewrite.cc
Added short-circuit path in FindAlloc to unconditionally allocate new storage when scope.tag == ".var", bypassing reuse eligibility checks and const_nbits logic for these scopes.
Test coverage for mixed-dtype allocation
testing/python/issue/test_tilelang_issue_1678.py
New test module with JIT kernel factory and test function that validates generated CUDA code correctly preserves type declarations ("int i" and "float amax_local") for mixed-dtype allocations.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • LeiWang1999

Poem

🐰 A tangled ".var" caused much dismay,
Variables lost in the allocation fray,
But now with fresh paths, each one stands tall,
Mixed types and precision, no overwrites at all! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly describes the main change: avoiding reuse of local.var allocations, directly addressing the bugfix objective of preventing multiple T.alloc_var() declarations from being merged.
Linked Issues check ✅ Passed The PR changes directly address issue #1678: storage_rewrite.cc now prevents reuse for local.var allocations, and the test validates that distinct T.alloc_var() declarations produce separate local variable declarations in generated CUDA code.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing issue #1678: modifications to FindAlloc in storage_rewrite.cc prevent local.var reuse, and the test validates the fix without introducing unrelated changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧹 Recent nitpick comments
testing/python/issue/test_tilelang_issue_1678.py (1)

21-26: Tighten source assertions to avoid false positives.

Plain substring checks like "int i" can match identifiers such as int idx. Consider word‑boundary regex (or more specific patterns) so the test fails if the declaration is missing.

♻️ Proposed tweak
+import re
 import tilelang
 import tilelang.testing
 from tilelang import language as T
@@
-    assert "int i" in source
-    assert "float amax_local" in source
+    assert re.search(r"\bint\s+i\b", source)
+    assert re.search(r"\bfloat\s+amax_local\b", source)

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65d5aec and 52e500d.

📒 Files selected for processing (2)
  • src/transform/storage_rewrite.cc
  • testing/python/issue/test_tilelang_issue_1678.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-06T05:20:51.649Z
Learnt from: silentCoder-dev
Repo: tile-ai/tilelang PR: 1606
File: testing/python/transform/test_tilelang_transform_hoist_broadcast_values.py:30-30
Timestamp: 2026-01-06T05:20:51.649Z
Learning: In `testing/python/transform/test_tilelang_transform_hoist_broadcast_values.py`, the test validates that the `hoist_broadcast_values` transformation pass correctly identifies and hoists broadcast operations by checking for patterns in the generated kernel source code. The specific literal values used (e.g., 430) are not important for the test's purpose, as it does not validate numerical precision or actual stored tensor values.

Applied to files:

  • testing/python/issue/test_tilelang_issue_1678.py
📚 Learning: 2025-12-18T04:50:00.512Z
Learnt from: silentCoder-dev
Repo: tile-ai/tilelang PR: 1464
File: testing/python/language/test_tilelang_language_rand.py:14-14
Timestamp: 2025-12-18T04:50:00.512Z
Learning: In `testing/python/language/test_tilelang_language_rand.py`, the TileLang kernel uses `blk_M = M` (single block) and calls `rng_rand()` four times per element to align results with the Triton implementation, which uses `blk_M = 128` (multiple blocks) and calls the RNG once per element. These differences compensate for internal RNG behavior differences between TileLang and Triton.

Applied to files:

  • testing/python/issue/test_tilelang_issue_1678.py
📚 Learning: 2025-11-14T07:56:11.098Z
Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

  • testing/python/issue/test_tilelang_issue_1678.py
🧬 Code graph analysis (2)
src/transform/storage_rewrite.cc (3)
src/target/codegen_cuda.cc (6)
  • op (407-422)
  • op (407-407)
  • op (2466-2468)
  • op (2466-2466)
  • op (2469-2471)
  • op (2469-2469)
src/transform/merge_shared_memory_allocations.cc (2)
  • op (77-84)
  • op (77-77)
src/transform/lower_opaque_block.cc (8)
  • op (62-109)
  • op (62-62)
  • op (110-116)
  • op (110-110)
  • op (118-152)
  • op (118-118)
  • op (171-184)
  • op (171-171)
testing/python/issue/test_tilelang_issue_1678.py (4)
tilelang/language/kernel.py (1)
  • threads (215-219)
tilelang/language/dtypes.py (2)
  • int32 (255-255)
  • float32 (310-310)
tilelang/language/allocate.py (1)
  • alloc_local (57-68)
tilelang/language/eager/builder.py (1)
  • source (727-728)
🔇 Additional comments (2)
src/transform/storage_rewrite.cc (1)

1097-1099: LGTM: .var allocations now always unique.

This enforces distinct alloc_var declarations and prevents reuse-related overwrites.

testing/python/issue/test_tilelang_issue_1678.py (1)

6-18: LGTM: minimal mixed‑dtype alloc_var repro is clear.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@LeiWang1999
Copy link
Member

Thanks for your contribution! We dug a bit deeper and decided to disable storage reuse to address the root cause of this issue. Therefore, we are closing this PR. We look forward to your future contributions!

@LudovicoYIN LudovicoYIN deleted the Bugfix/alloc-var-local-var-reuse branch January 20, 2026 01:11
@LudovicoYIN
Copy link
Author

Thanks for the explanation and review! Looking forward to contributing again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Multiple T.alloc_var() Declarations Cause Unexpected Behavior in TileLang

2 participants