[BACKEND] Convert layout illegal mem access fix #2287
// Order
auto inOrder = triton::gpu::getOrder(srcEncoding);
auto outOrder = triton::gpu::getOrder(resSharedLayout);
assert(outVec * (maxPhase - 1) <= srcShape[outOrder[0]] &&
How is this formula derived?
It seems to me that outVec * maxPhase makes more sense? Just curious.
There are only phases 0, 1, ..., maxPhase - 1. If each one increments by vec, then the largest address is vec * (maxPhase - 1).
Wouldn't the largest address be vec * maxPhase - 1? I mean that the largest starting address is vec * (maxPhase - 1), but each access covers vec elements, so we actually touch every address below vec * maxPhase.
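The arithmetic in this exchange can be checked with a small standalone sketch. It assumes a simplified model rather than Triton's actual swizzling code: phase p starts at offset vec * p and touches vec consecutive elements. The function names `largestStartOffset` and `largestTouchedOffset` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model (an assumption, not Triton's implementation):
// phase p in [0, maxPhase) starts at offset vec * p and touches
// vec consecutive elements.

// Largest offset at which any phase begins its access.
int largestStartOffset(int vec, int maxPhase) {
  assert(vec >= 1 && maxPhase >= 1);
  int largest = 0;
  for (int p = 0; p < maxPhase; ++p)
    largest = std::max(largest, vec * p);
  return largest;
}

// Largest offset that any phase actually reads or writes.
int largestTouchedOffset(int vec, int maxPhase) {
  assert(vec >= 1 && maxPhase >= 1);
  int largest = 0;
  for (int p = 0; p < maxPhase; ++p)
    for (int i = 0; i < vec; ++i)
      largest = std::max(largest, vec * p + i);
  return largest;
}
```

Under this model, vec = 4 and maxPhase = 8 give a largest starting offset of 28 and a largest touched offset of 31, so the dimension needs at least vec * maxPhase = 32 elements.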
That's right. I was trying to keep the assert from rejecting cases where outVec > 1, maxPhase = 1, and srcShape[outOrder[0]] = 1. In that case we only use address range [0, 1], regardless of what outVec is, since we don't really swizzle. Maybe that case should also be illegal, even though in practice the code just works because we don't swizzle.
So the condition should be maxPhase == 1 || vec * maxPhase <= srcShape[outOrder[0]]. What do you think?
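To compare the two candidate conditions side by side, here is a minimal sketch. It covers only the visible first half of the assert in the diff, and the names `originalCheck`, `proposedCheck`, and `dimSize` (standing in for srcShape[outOrder[0]]) are hypothetical, not taken from the Triton source.

```cpp
#include <cassert>
#include <cstdint>

// Visible part of the condition in the diff: bounds only the largest
// starting offset, vec * (maxPhase - 1).
bool originalCheck(int64_t vec, int64_t maxPhase, int64_t dimSize) {
  return vec * (maxPhase - 1) <= dimSize;
}

// Condition proposed in the review: skip the bound when there is no
// swizzling (maxPhase == 1); otherwise require the full span of
// vec * maxPhase elements to fit in the dimension.
bool proposedCheck(int64_t vec, int64_t maxPhase, int64_t dimSize) {
  return maxPhase == 1 || vec * maxPhase <= dimSize;
}
```

Both checks accept the no-swizzle edge case (vec = 4, maxPhase = 1, dimSize = 1). They differ for vec = 4, maxPhase = 2, dimSize = 4: the original accepts it because 4 * 1 <= 4, while the proposed check rejects it because 4 * 2 > 4.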
* Unify slow/fast reduce codegen
  This aligns with the upstream changes in triton-lang/triton#2220
* Rebase from upstream #2287 triton-lang/triton#2287
* Port from (#2292) triton-lang/triton#2292
* Fix the JIT error when the signature is empty
* Fix the issue where threadsPerWarp Logic changed
* Delete unused code
* Update expected failure case
* Update Triton Commit