[Refactor] Treat local.var as local buffers when deciding vectorization for stable actions#1835
Conversation
|
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀 |
📝 WalkthroughWalkthroughThe PR modifies buffer locality detection in the tile operation lowering pass to treat var-mapped buffers as local, and adds a test case verifying vectorization behavior with var buffers during parallel execution. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@regression-perf |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@testing/python/language/test_tilelang_language_parallel.py`:
- Around line 69-77: The test uses the low-level T.parallel() in
_parallel_vectorize_local_and_var while other tests use the TileLang high-level
T.Parallel(); update the for loop in _parallel_vectorize_local_and_var to use
T.Parallel() instead of T.parallel() so the file remains consistent with the
TileLang API (search for the function _parallel_vectorize_local_and_var and
replace the T.parallel(...) usage with T.Parallel(...)).
🧹 Nitpick comments (1)
testing/python/language/test_tilelang_language_parallel.py (1)
79-82: Consider adding a CUDA-availability guard.Other tests in this file skip gracefully when CUDA is unavailable (
_require_cuda_tensor). Ifget_kernel_source()requires a CUDA-capable environment for compilation, this test will hard-fail in CPU-only CI. Consider wrapping it with a skip check similar to the other tests.
| @tilelang.jit | ||
| def _parallel_vectorize_local_and_var(): | ||
| with T.Kernel(1) as _: | ||
| x = T.alloc_fragment([256], T.float32) | ||
| y = T.alloc_fragment([256], T.float32) | ||
| z = T.alloc_var(T.float32) | ||
| for i in T.parallel(256): | ||
| y[i] = x[i] * z | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find and read the test file to verify the actual casing
fd test_tilelang_language_parallel.py --type fRepository: tile-ai/tilelang
Length of output: 118
🏁 Script executed:
# Read the specific lines mentioned in the review
if [ -f "testing/python/language/test_tilelang_language_parallel.py" ]; then
echo "=== Lines 69-77 (the disputed code) ===" && \
sed -n '69,77p' testing/python/language/test_tilelang_language_parallel.py && \
echo -e "\n=== Lines 18, 33, 36 (for comparison) ===" && \
sed -n '18p;33p;36p' testing/python/language/test_tilelang_language_parallel.py
fiRepository: tile-ai/tilelang
Length of output: 538
🏁 Script executed:
# Search for T.parallel and T.Parallel definitions in the codebase
rg "def parallel|def Parallel" --type py -C 2Repository: tile-ai/tilelang
Length of output: 10557
🏁 Script executed:
# Check how T is imported and what it refers to
rg "^import T|^from.*import.*T|^T = " testing/python/language/test_tilelang_language_parallel.py -C 1Repository: tile-ai/tilelang
Length of output: 42
🏁 Script executed:
# Check the imports at the top of the test file
head -30 testing/python/language/test_tilelang_language_parallel.pyRepository: tile-ai/tilelang
Length of output: 783
🏁 Script executed:
# Search for T usage and imports more broadly
rg "^import|^from" testing/python/language/test_tilelang_language_parallel.py | head -20Repository: tile-ai/tilelang
Length of output: 156
🏁 Script executed:
# Check what T module contains and where Parallel is accessible from
cat -n tilelang/language/loop.py | head -50Repository: tile-ai/tilelang
Length of output: 2267
🏁 Script executed:
# Check if parallel is aliased to Parallel or available in the same module
rg "parallel|Parallel" tilelang/language/__init__.py -C 3Repository: tile-ai/tilelang
Length of output: 375
🏁 Script executed:
# Check what else is exported from __init__.py, especially any lowercase parallel
rg "parallel" tilelang/language/__init__.pyRepository: tile-ai/tilelang
Length of output: 42
🏁 Script executed:
# Check if there's an alias or if parallel is defined elsewhere
rg "^parallel" tilelang/language/ -A 2Repository: tile-ai/tilelang
Length of output: 42
🏁 Script executed:
# Let's also check the specific loop.py file more thoroughly for any lowercase parallel
rg -i "parallel" tilelang/language/loop.py | grep -v "^\s*#"Repository: tile-ai/tilelang
Length of output: 999
🏁 Script executed:
# Check if parallel exists in tir/ir.py and ast/ir.py that was mentioned in the earlier grep
cat -n tilelang/language/tir/ir.py | grep -A 10 "^.*def parallel"Repository: tile-ai/tilelang
Length of output: 436
🏁 Script executed:
# Check if tir/ir.py or parallel (lowercase) is imported in __init__.py
cat tilelang/language/__init__.pyRepository: tile-ai/tilelang
Length of output: 4252
🏁 Script executed:
# Verify the full tir/ir.py file to understand its structure
wc -l tilelang/language/tir/ir.py
head -100 tilelang/language/tir/ir.pyRepository: tile-ai/tilelang
Length of output: 2726
🏁 Script executed:
# Check if parallel is explicitly exported from tir module
rg "from.*tir" tilelang/language/__init__.py -C 3Repository: tile-ai/tilelang
Length of output: 569
🏁 Script executed:
# Verify the function signatures one more time to be absolutely certain
grep -A 5 "def parallel(" tilelang/language/tir/ir.py | head -10
grep -A 5 "def Parallel(" tilelang/language/loop.py | head -10Repository: tile-ai/tilelang
Length of output: 454
🏁 Script executed:
# Check if there are other uses of T.parallel vs T.Parallel in the test file to see the pattern
rg "T\.parallel|T\.Parallel" testing/python/language/test_tilelang_language_parallel.py -nRepository: tile-ai/tilelang
Length of output: 227
Use T.Parallel() for consistency with the rest of the file.
Line 75 uses T.parallel() (the low-level TVM IR API), while lines 18, 33, and 36 use T.Parallel() (the TileLang high-level API). Both are valid, but the file should use T.Parallel() consistently for readability and to maintain the TileLang abstraction level.
🤖 Prompt for AI Agents
In `@testing/python/language/test_tilelang_language_parallel.py` around lines 69 -
77, The test uses the low-level T.parallel() in
_parallel_vectorize_local_and_var while other tests use the TileLang high-level
T.Parallel(); update the for loop in _parallel_vectorize_local_and_var to use
T.Parallel() instead of T.parallel() so the file remains consistent with the
TileLang API (search for the function _parallel_vectorize_local_and_var and
replace the T.parallel(...) usage with T.Parallel(...)).
Performance Regression Test ReportTriggered by: @LeiWang1999 Results
Artifacts
|
Summary by CodeRabbit
Refactor
Tests