[Bugfix] Do not consider local.var as local buffer during LowerTileOP #1628

Merged
LeiWang1999 merged 2 commits into tile-ai:main from LeiWang1999:tilefix_0107
Jan 7, 2026

@LeiWang1999 LeiWang1999 commented Jan 7, 2026

As the title says: treating local.var as a local buffer during LowerTileOP may lead to a significant performance regression when a value is assigned to a var.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Enhanced local buffer detection in parallel loop optimization, enabling thread binding and vectorization in additional scenarios.
  • Tests

    • Added runtime verification of generated kernel source code in test suite.

github-actions bot commented Jan 7, 2026

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai bot commented Jan 7, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

Broadens local buffer recognition in lower_tile_op.cc by replacing IsLocalBuffer(..., true) calls with IsLocalBuffer(...), affecting store detection, register-only checks, and non-local buffer validation. This propagates into parallel-loop partitioning and vectorization logic. Additionally, adds runtime kernel source verification in a test.
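
To make the effect of dropping the second argument concrete, here is a toy Python model of the predicate. It is only a sketch: the real IsLocalBuffer is a C++ helper whose signature and default are not shown on this page, so the parameter name `allow_var`, its default of `False`, and the helper `stores_only_into_local` are all assumptions chosen to match the PR title.

```python
# Toy model only: the real check is the C++ helper IsLocalBuffer used in
# src/transform/lower_tile_op.cc. The parameter name `allow_var`, its default,
# and the helper below are assumptions made for illustration.


def is_local_buffer(scope: str, allow_var: bool = False) -> bool:
    """Return True if a buffer with this storage scope counts as local."""
    if scope == "local":
        return True
    # Count single-variable storage as local only when explicitly requested.
    return allow_var and scope == "local.var"


def stores_only_into_local(store_scopes: list[str], allow_var: bool = False) -> bool:
    """Hypothetical register-only check: every store targets a local buffer."""
    return all(is_local_buffer(s, allow_var) for s in store_scopes)


if __name__ == "__main__":
    scopes = ["local", "local.var", "shared"]
    # Assumed behaviour before the fix: IsLocalBuffer(..., true)
    print([is_local_buffer(s, allow_var=True) for s in scopes])
    # Assumed behaviour after the fix: IsLocalBuffer(...)
    print([is_local_buffer(s) for s in scopes])
    # A loop that only writes a var is no longer classified as register-only.
    print(stores_only_into_local(["local.var"], allow_var=True))
    print(stores_only_into_local(["local.var"]))
```

Under these assumptions, a loop whose only store targets a local.var buffer stops being classified as register-only once the flag is dropped, which is consistent with the summary's claim that thread binding and vectorization now apply in additional cases.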

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core compiler logic: src/transform/lower_tile_op.cc | Removes second argument from all IsLocalBuffer() calls across multiple functions (store_into_local, local_register_only, has_non_local, parallel_loop, should_vectorize). Broadens buffer classification to enable parallel-loop partitioning and vectorization in additional cases by relaxing local-buffer detection criteria. |
| Test verification: testing/python/issue/test_tilelang_issue_1549.py | Adds post-execution runtime introspection: retrieves generated kernel source via get_kernel_source() and verifies that a specific multi-line loop snippet is present in the compiled code. |
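
The source check described for the test can be sketched as a small helper that looks for a multi-line snippet in generated code while ignoring indentation and blank lines. The actual assertion in test_tilelang_issue_1549.py is not shown on this page, so the helper names, the sample kernel text, and the snippet below are all hypothetical. In the real test the source string would come from get_kernel_source().

```python
def _normalized_lines(text: str) -> list[str]:
    """Strip indentation and drop blank lines so layout differences are ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def contains_snippet(source: str, snippet: str) -> bool:
    """Return True if the snippet's lines appear consecutively in the source."""
    src = _normalized_lines(source)
    want = _normalized_lines(snippet)
    if not want:
        return False
    # Slide a window of len(want) lines over the source.
    return any(
        src[i:i + len(want)] == want
        for i in range(len(src) - len(want) + 1)
    )


# Hypothetical stand-in for the string returned by get_kernel_source().
FAKE_SOURCE = """
extern "C" __global__ void main_kernel(float* A) {
  for (int i = 0; i < 4; ++i) {
    A[threadIdx.x * 4 + i] = 0.0f;
  }
}
"""

# Hypothetical loop snippet expected in the generated code.
SNIPPET = """
for (int i = 0; i < 4; ++i) {
    A[threadIdx.x * 4 + i] = 0.0f;
}
"""

if __name__ == "__main__":
    assert contains_snippet(FAKE_SOURCE, SNIPPET)
    print("snippet found")
```

Comparing normalized lines instead of using a raw substring check keeps such an assertion stable if the code generator changes indentation, at the cost of not detecting whitespace-only regressions.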

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • kurisu6912

Poem

🐰 Buffers hop with newfound cheer,
Local checks grow crystal clear,
Thread partitions dance with glee,
Vectorized loops wild and free,
No strict guards to hold us back—
Compilation speeds on track! 🚀

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8bf4f6 and b3297d1.

📒 Files selected for processing (2)
  • src/transform/lower_tile_op.cc
  • testing/python/issue/test_tilelang_issue_1549.py

@LeiWang1999 LeiWang1999 merged commit 358f899 into tile-ai:main Jan 7, 2026
3 of 4 checks passed