-
Notifications
You must be signed in to change notification settings - Fork 15.5k
[LoopCacheAnalysis] Replace delinearization for fixed size array #164798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LoopCacheAnalysis] Replace delinearization for fixed size array #164798
Conversation
|
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis Author: Ryotaro Kasuga (kasuga-fj) ChangesThis patch replaces the delinearization function used in LoopCacheAnalysis, switching from one that depends on type information in GEPs to one that does not. Once this patch and #161822 are landed, we can delete Full diff: https://github.com/llvm/llvm-project/pull/164798.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopCacheAnalysis.h b/llvm/include/llvm/Analysis/LoopCacheAnalysis.h
index 3e22487e5e349..70ccd8aaed20f 100644
--- a/llvm/include/llvm/Analysis/LoopCacheAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopCacheAnalysis.h
@@ -102,7 +102,8 @@ class IndexedReference {
/// Attempt to delinearize \p AccessFn for fixed-size arrays.
bool tryDelinearizeFixedSize(const SCEV *AccessFn,
- SmallVectorImpl<const SCEV *> &Subscripts);
+ SmallVectorImpl<const SCEV *> &Subscripts,
+ const SCEV *ElementSize);
/// Return true if the index reference is invariant with respect to loop \p L.
bool isLoopInvariant(const Loop &L) const;
diff --git a/llvm/lib/Analysis/LoopCacheAnalysis.cpp b/llvm/lib/Analysis/LoopCacheAnalysis.cpp
index 050c32707596a..c1ce736bb51b0 100644
--- a/llvm/lib/Analysis/LoopCacheAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopCacheAnalysis.cpp
@@ -355,22 +355,18 @@ CacheCostTy IndexedReference::computeRefCost(const Loop &L,
}
bool IndexedReference::tryDelinearizeFixedSize(
- const SCEV *AccessFn, SmallVectorImpl<const SCEV *> &Subscripts) {
- SmallVector<int, 4> ArraySizes;
- if (!tryDelinearizeFixedSizeImpl(&SE, &StoreOrLoadInst, AccessFn, Subscripts,
- ArraySizes))
+ const SCEV *AccessFn, SmallVectorImpl<const SCEV *> &Subscripts,
+ const SCEV *ElementSize) {
+ const SCEV *Offset = SE.removePointerBase(AccessFn);
+ if (!delinearizeFixedSizeArray(SE, Offset, Subscripts, Sizes, ElementSize)) {
+ Sizes.clear();
return false;
+ }
- // Populate Sizes with scev expressions to be used in calculations later.
- for (auto Idx : seq<unsigned>(1, Subscripts.size()))
- Sizes.push_back(
- SE.getConstant(Subscripts[Idx]->getType(), ArraySizes[Idx - 1]));
-
- LLVM_DEBUG({
- dbgs() << "Delinearized subscripts of fixed-size array\n"
- << "GEP:" << *getLoadStorePointerOperand(&StoreOrLoadInst)
- << "\n";
- });
+ // Drop the last element of Sizes which is the same as ElementSize.
+ assert(!Sizes.empty() && Sizes.back() == ElementSize &&
+ "Expecting the last one to be the element size");
+ Sizes.pop_back();
return true;
}
@@ -397,7 +393,7 @@ bool IndexedReference::delinearize(const LoopInfo &LI) {
bool IsFixedSize = false;
// Try to delinearize fixed-size arrays.
- if (tryDelinearizeFixedSize(AccessFn, Subscripts)) {
+ if (tryDelinearizeFixedSize(AccessFn, Subscripts, ElemSize)) {
IsFixedSize = true;
// The last element of Sizes is the element size.
Sizes.push_back(ElemSize);
diff --git a/llvm/test/Analysis/LoopCacheAnalysis/interchange-refcost-overflow.ll b/llvm/test/Analysis/LoopCacheAnalysis/interchange-refcost-overflow.ll
index 7b6529601da32..30554d08a6181 100644
--- a/llvm/test/Analysis/LoopCacheAnalysis/interchange-refcost-overflow.ll
+++ b/llvm/test/Analysis/LoopCacheAnalysis/interchange-refcost-overflow.ll
@@ -9,8 +9,8 @@
; A[c][d][d] = 0;
; }
-; CHECK: Loop 'outer.loop' has cost = 9223372036854775807
; CHECK: Loop 'inner.loop' has cost = 9223372036854775807
+; CHECK: Loop 'outer.loop' has cost = 10000
@A = local_unnamed_addr global [11 x [11 x [11 x i32]]] zeroinitializer, align 16
|
|
|
||
| ; CHECK: Loop 'outer.loop' has cost = 9223372036854775807 | ||
| ; CHECK: Loop 'inner.loop' has cost = 9223372036854775807 | ||
| ; CHECK: Loop 'outer.loop' has cost = 10000 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was added in #111807. Based on the description of the PR, the test appears to ensure that no assertion errors are triggered, so I believe this change is not particularly significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjoerdmeijer Can you confirm? Seems that it is basically ceiling the cost number to UINT64_MAX, so inner.loop is still testing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjoerdmeijer ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, thanks for the ping, this dropped off my radar.
Hmmm, yeah, this is a bit of an interesting test case change. This change reveals that the test is fragile: apparently the accuracy of the analysis has improved, and it is now able to more bound the cost a lot better. The result is a small cost value, which is not going to overflow. So, from that point of view, the test is no longer testing what it is supposed to test, i.e. it is saturating to UINT_MAX and it is able to hold value 9223372036854775807.
Again, that is a problem with the test, not this change. But just wondering if you'd already looked into the test by any chance. I.e., do you see an easy way to modify that test and keep the behaviour so that it is not able to cost the loop and keep this behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you see an easy way to modify that test and keep the behaviour so that it is not able to cost the loop and keep this behaviour?
No. It seems like it's worth trying. I'll give it a shot. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test (added one more test). Does it make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think it does.
| << "GEP:" << *getLoadStorePointerOperand(&StoreOrLoadInst) | ||
| << "\n"; | ||
| }); | ||
| // Drop the last element of Sizes which is the same as ElementSize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think this is self-evident from the assert and line 369, but maybe the more interesting question is why we drop that element, so maybe you can leave a little comment about that.
| }); | ||
| // Drop the last element of Sizes which is the same as ElementSize. | ||
| assert(!Sizes.empty() && Sizes.back() == ElementSize && | ||
| "Expecting the last one to be the element size"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going leave a nit if we can make the message a little bit more descriptive, but now looking at this, my first question is why we need to pass ElementSize if it is last element in the Sizes list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have the following LLVM,
...
%gep = getelementptr i8, ptr %p, i64 %offset
store i32 0, %gepto delinearize %gep as an i32 array, %offset must be a multiple of 4. We need to tell this information to delinearization function, and it's now passed as ElementSize. For historical reasons, delinearizer appends it to the end of Sizes.
41a1523 to
3cf3107
Compare
sjoerdmeijer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks, LGTM.
Maybe give @Meinersbur one more chance to look at this before merging it.
🐧 Linux x64 Test Results
|
|
I'll wait a few days, and if there are no objections, I'll go ahead and merge. |
Meinersbur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I had a first try removing getIndexExpressionsFromGEP from Polly, only updating tests so far.
|
Thanks. I'll go ahead and remove |
|
Hello @kasuga-fj The following starts crashing with this patch: |
|
@mikaelholmen Thanks for the reproducer, I'll take a look. |
…m#164798) This patch replaces the delinearization function used in LoopCacheAnalysis, switching from one that depends on type information in GEPs to one that does not. Once this patch and llvm#161822 are landed, we can delete `tryDelinearizeFixedSize` from Delienarization, which is an optimization heuristic guided by GEP type information. After Polly eliminates its use of `getIndexExpressionsFromGEP`, we will be able to completely delete GEP-driven heuristics from Delinearization.
…m#164798) This patch replaces the delinearization function used in LoopCacheAnalysis, switching from one that depends on type information in GEPs to one that does not. Once this patch and llvm#161822 are landed, we can delete `tryDelinearizeFixedSize` from Delienarization, which is an optimization heuristic guided by GEP type information. After Polly eliminates its use of `getIndexExpressionsFromGEP`, we will be able to completely delete GEP-driven heuristics from Delinearization.
Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
`tryDelinearizeFixedSizeImpl` is a heuristic function relying on GEP's type information. Using these information to drive an optimization heuristic is not allowed, so this function should be removed. As llvm#161822 and llvm#164798 have eliminated all calls to this, this patch removes the function itself.
Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
`tryDelinearizeFixedSizeImpl` is a heuristic function relying on GEP's type information. Using these information to drive an optimization heuristic is not allowed, so this function should be removed. As llvm#161822 and llvm#164798 have eliminated all calls to this, this patch removes the function itself.
…m#164798) This patch replaces the delinearization function used in LoopCacheAnalysis, switching from one that depends on type information in GEPs to one that does not. Once this patch and llvm#161822 are landed, we can delete `tryDelinearizeFixedSize` from Delienarization, which is an optimization heuristic guided by GEP type information. After Polly eliminates its use of `getIndexExpressionsFromGEP`, we will be able to completely delete GEP-driven heuristics from Delinearization.
Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
`tryDelinearizeFixedSizeImpl` is a heuristic function relying on GEP's type information. Using these information to drive an optimization heuristic is not allowed, so this function should be removed. As llvm#161822 and llvm#164798 have eliminated all calls to this, this patch removes the function itself.
Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
`tryDelinearizeFixedSizeImpl` is a heuristic function relying on GEP's type information. Using these information to drive an optimization heuristic is not allowed, so this function should be removed. As llvm#161822 and llvm#164798 have eliminated all calls to this, this patch removes the function itself.
This patch replaces the delinearization function used in LoopCacheAnalysis, switching from one that depends on type information in GEPs to one that does not. Once this patch and #161822 are landed, we can delete
tryDelinearizeFixedSizefrom Delienarization, which is a optimization heuristic guided by GEP type information. After Polly eliminates its use ofgetIndexExpressionsFromGEP, we will be able to completely delete GEP-driven heuristics from Delinearization.