[AMDGPU] Assertion `(!LeaveBefore || Idx <= LeaveBefore) && "Interference"' failed #109294
Comments
@llvm/issue-subscribers-backend-amdgpu Author: Jay Foad (jayfoad)
With [this test case](https://github.com/user-attachments/files/17061455/zz.txt) I get:
```
llc -x mir zz.txt -start-after unreachable-mbb-elimination
llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SplitKit.cpp:1661: void llvm::SplitEditor::splitLiveThroughBlock(unsigned int, unsigned int, llvm::SlotIndex, unsigned int, llvm::SlotIndex): Assertion `(!LeaveBefore || Idx <= LeaveBefore) && "Interference"' failed.
```
The same assertion failed in #87721.
I tried reducing the MIR with I tried running with
Yes, this is just a register definition issue (and we should probably just get rid of the sp_reg pseudo register at this point).
PRs llvm#69924 and llvm#72140 modified SIInstrInfo::isBasicBlockPrologue to skip over EXEC modifications and spills when allocating VGPRs. But treating VGPR spills as part of the prologue can confuse the register allocator as in llvm#109294, so restrict it to SGPR spills, which were inserted during SGPR allocation which is done in an earlier pass. Fixes: llvm#109294 Fixes: SWDEV-485841
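The change described in this commit message can be pictured with a toy model. This is only a sketch of the idea, not the LLVM implementation: the `Instr` record, its flags, the instruction labels, and both predicate functions are hypothetical names made up for illustration, and the real `SIInstrInfo::isBasicBlockPrologue` operates on `MachineInstr` and has more cases than shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical stand-in for a machine instruction (not an LLVM type)."""
    name: str
    writes_exec: bool = False
    is_sgpr_spill: bool = False
    is_vgpr_spill: bool = False


def is_prologue_before(instr: Instr) -> bool:
    # Assumed model of the earlier behaviour: EXEC writes and any spill
    # are treated as part of the block prologue.
    return instr.writes_exec or instr.is_sgpr_spill or instr.is_vgpr_spill


def is_prologue_after(instr: Instr) -> bool:
    # Assumed model of the fix: only SGPR spills (inserted by the earlier
    # SGPR allocation pass) and EXEC writes count as prologue.
    return instr.writes_exec or instr.is_sgpr_spill


def prologue_length(block, predicate) -> int:
    """Count the leading instructions that the predicate accepts."""
    count = 0
    for instr in block:
        if not predicate(instr):
            break
        count += 1
    return count


# Illustrative block; the labels are placeholders, not real opcodes.
block = [
    Instr("exec_write", writes_exec=True),
    Instr("sgpr_spill", is_sgpr_spill=True),
    Instr("vgpr_spill", is_vgpr_spill=True),
    Instr("vector_add"),
]

print(prologue_length(block, is_prologue_before))
print(prologue_length(block, is_prologue_after))
```

In this model the VGPR spill no longer extends the prologue, so a VGPR allocator that inserts code "after the prologue" would place it before that spill rather than after it.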
I can take a stab at reducing the MIR; I still have a number of patches that never made it upstream. I forgot that I never pushed the block reduction, for instance.
The problem here is also avoidable: too much of the machineFunctionInfo was deleted. It needs to set sp_reg to the real value.
Thanks. I should not have deleted the definition of
I've made some progress reducing this, but this is a tough one. It's another case where the exact slot indexes matter, so you have to run multiple passes to avoid the fresh LiveIntervals.
Still really big: issue109294.mir.zip
I think this is so sensitive because the SIMachineFunctionFields used for WWM spilling are still not serialized. We're losing reserved registers by resuming the compile.
Update comment regarding barrier gfx908 hack: revert scheduling region when llvm/llvm-project#109294 is fixed
…gue (#109439) PRs #69924 and #72140 modified SIInstrInfo::isBasicBlockPrologue to skip over EXEC modifications and spills when allocating VGPRs. But treating VGPR spills as part of the prologue can confuse the register allocator as in #109294, so restrict it to SGPR spills, which were inserted during SGPR allocation which is done in an earlier pass. Fixes: #109294 Fixes: SWDEV-485841
PRs llvm#69924 and llvm#72140 modified SIInstrInfo::isBasicBlockPrologue to skip over EXEC modifications and spills when allocating VGPRs. But treating VGPR spills as part of the prologue can confuse the register allocator as in llvm#109294, so restrict it to SGPR spills, which were inserted during SGPR allocation which is done in an earlier pass. Fixes: llvm#109294 Fixes: SWDEV-485841 Change-Id: I328fb2edfca8110ea36c94812e60bc1d7663c266
The Kernel Storage V2 achieves the following goals

Major changes:
* Move GPU kernel image to separate files
  + Now organized as `aotriton.images/<vendor>-<arch>/<kernel_family>/<kernel_name>/FONLY__<functionals>___<GPU>.aks2'`
    - `<GPU>` can be a family of GPUs, e.g., `MI300X/MI300A/MI325X`
  + This enables per-architecture delivery
  + No more linking errors when binaries are bloated.
* Introduce `AKS2` file format to compress GPU kernels with LZMA. Reduce the total package down to 200MB (MI200/300/Navi31)
  + AKS2 means "Aotriton Kernel Storage version 2"
  + LZMA is picked over Zstandard for much better compression ratio (~7% vs ~12%) and acceptable performance (<0.1s)
* Look up kernel image relative to the `.so` file (achieved through `dladdr`)
* Can only build the C++ part by setting `cmake` option `AOTRITON_NOIMAGE_MODE` to `ON`

Minor changes:
* Release GIL in pybind11
* `aks2.py` tool is added to create `AKS2` file.
* Update Triton compiler to Oct/23/2024
  + This is to fix bug ```Assertion `(!LeaveBefore || Idx <= LeaveBefore) && "Interference"' failed```
  + See llvm/llvm-project#109294 for more details
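As a rough illustration of the layout and compression choice described in that commit message, the sketch below builds a path from the quoted template and round-trips a payload through Python's standard `lzma` module. It is an assumption-laden sketch: the function name `kernel_image_path` and all argument values are hypothetical, and the code does not reproduce the actual `AKS2` container format or the `dladdr`-based lookup, neither of which is specified here.

```python
import lzma
from pathlib import PurePosixPath


def kernel_image_path(vendor, arch, family, kernel, functionals, gpu):
    """Fill in the directory template quoted in the commit message.

    Hypothetical helper; the argument values used below are placeholders.
    """
    return (
        PurePosixPath("aotriton.images")
        / f"{vendor}-{arch}"
        / family
        / kernel
        / f"FONLY__{functionals}___{gpu}.aks2"
    )


# Placeholder values, chosen only to show the shape of the path.
path = kernel_image_path("vendor", "arch", "family", "kernel", "f0", "GPU")
print(path)

# LZMA round trip on a dummy payload (not a real kernel image, and not
# the AKS2 layout, which is not described in this thread).
payload = b"\x00" * 4096 + b"dummy-kernel-bytes"
packed = lzma.compress(payload)
restored = lzma.decompress(packed)
print(len(payload), len(packed), restored == payload)
```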
…gue (llvm#109439) PRs llvm#69924 and llvm#72140 modified SIInstrInfo::isBasicBlockPrologue to skip over EXEC modifications and spills when allocating VGPRs. But treating VGPR spills as part of the prologue can confuse the register allocator as in llvm#109294, so restrict it to SGPR spills, which were inserted during SGPR allocation which is done in an earlier pass. Fixes: llvm#109294 Fixes: SWDEV-485841 Change-Id: Ice1ab75074aa380c13e07c452a2854f78ff37ce7
…plicit_def Previously this would delete the IMPLICIT_DEF and not introduce the undef flag on the use operand. Fixes sub-issue found while reducing #109294