[AMDGPU][Scheduler] Revert all regions when remat fails to increase occ.#7
Closed
lucas-rami wants to merge 1 commit intousers/lucas-rami/stack/rematerialization-rollback-logi/2from
Conversation
When the rematerialization stage fails to increase occupancy in all regions, the current implementation only reverts the effect of re-scheduling in regions in which the increased occupancy target could not be achieved. However, given that re-scheduling with a higher occupancy target puts more pressure on the scheduler to achieve lower maximum RP at the cost of potentially lower ILP as well, region schedules made with higher occupancy targets are generally less desirable if the whole function is not able to meet that target. Therefore, if at least one region cannot reach its target, it makes sense to revert re-scheduling in all affected regions to go back to a schedule that was made with a lower occupancy target. This implements such logic for the rematerialization stage, and adds a test to showcase that re-scheduling is indeed interrupted/reverted as soon as a re-scheduled region that does not meet the increased target occupancy is encountered. As a minor improvement, this also sets higher occupancy targets for re-scheduling at the end of stage initialization in some cases. In cases where rematerializations alone are not able to achieve the target, this can push the scheduler to be more aggressive in reducing RP and achieve the target. stack-info: PR: #7, branch: users/lucas-rami/stack/rematerialization-rollback-logi/3
df77c9d to
f34930e
Compare
73f37c0 to
014a308
Compare
lucas-rami
pushed a commit
that referenced
this pull request
Jan 23, 2026
**This patch adds a marker to make hidden frames more explicit.** --- Hidden frames can be confusing for some users, who see that the indexes of the frames in a backtrace are not contiguous. This patch aims to lessen the confusion by adding a delimiter for the first and last non hidden frame, i.e the boundaries. IDE's like Xcode and VSCode represent those in the UI by having the hidden frames either greyed out or collapsed. It's not possible to do this in the CLI, therefore, this patch makes use of 2 unicode characters to mark the beginning and end of the hidden frames range. This patch depends on: - llvm#168603 # Examples In the example below, frame `#2` to `#7` are is hidden, and therefore, frame `#1` is the first non hidden frame of the range while frame `#8` is the last non hidden frame: <img width="488" height="112" alt="Screenshot 2025-11-18 at 18 41 11" src="https://github.com/user-attachments/assets/a21431da-9729-4cf0-a6bc-024aa306fc45" /> If the selected frame is one of the 2 boundary frames, we replace the delimiter character with the select character (`*`). <img width="487" height="111" alt="Screenshot 2025-11-18 at 18 41 03" src="https://github.com/user-attachments/assets/5616fa81-6db6-457d-9d1e-bbe46e710c26" /> <img width="488" height="111" alt="Screenshot 2025-11-18 at 18 40 55" src="https://github.com/user-attachments/assets/93dfa6cf-0956-4718-b31c-f965ec72b56d" />
lucas-rami
pushed a commit
that referenced
this pull request
Jan 23, 2026
… all redeclarations (llvm#176188) Fix handling of `lifetimebound` attributes on implicit `this` parameters across function redeclarations. Previously, the lifetime analysis would miss `lifetimebound` attributes on implicit `this` parameters if they were only present on certain redeclarations of a method. This could lead to false negatives in the lifetime safety analysis. This change ensures that if any redeclaration of a method has the attribute, it will be properly detected and used in the analysis. I can't seem to work around the crash in the earlier attempt llvm#172146. Reproducer of the original crash: ```cpp struct a { a &b() [[_Clang::__lifetimebound__]]; }; a &a::b() {} ``` This only crashes with `-target i686-w64-mingw32`. `bin/clang++ -c a.cpp` works fine. Problematic merging logic: ```cpp // If Old has lifetimebound but New doesn't, add it to New. if (OldLBAttr && !NewLBAttr) { QualType NewMethodType = New->getType(); QualType AttributedType = S.Context.getAttributedType(OldLBAttr, NewMethodType, NewMethodType); TypeLocBuilder TLB; TLB.pushFullCopy(NewTSI->getTypeLoc()); AttributedTypeLoc TyLoc = TLB.push<AttributedTypeLoc>(AttributedType); // Crashes. TyLoc.setAttr(OldLBAttr); New->setType(AttributedType); New->setTypeSourceInfo(TLB.getTypeSourceInfo(S.Context, AttributedType)); } ``` <details> <summary>Crash</summary> ``` clang++: /REDACTED//llvm-project/clang/lib/Sema/TypeLocBuilder.cpp:89: TypeLoc clang::TypeLocBuilder::pushImpl(QualType, size_t, unsigned int): Assertion `TLast == LastTy && "mismatch between last type and new type's inner type"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: bin/clang++ -target i686-w64-mingw32 -c /REDACTED//a.cpp 1. /REDACTED//a.cpp:4:11: current parser token '{' #0 0x000055971cfcb838 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /REDACTED//llvm-project/llvm/lib/Support/Unix/Signals.inc:842:13 #1 0x000055971cfc9374 llvm::sys::RunSignalHandlers() /REDACTED//llvm-project/llvm/lib/Support/Signals.cpp:109:18 #2 0x000055971cfcaf0c llvm::sys::CleanupOnSignal(unsigned long) /REDACTED//llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 0x000055971cf38116 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /REDACTED//llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5 #4 0x000055971cf38116 CrashRecoverySignalHandler(int) /REDACTED//llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51 #5 0x00007fe9ebe49df0 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0) #6 0x00007fe9ebe9e95c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #7 0x00007fe9ebe49cc2 raise ./signal/../sysdeps/posix/raise.c:27:6 #8 0x00007fe9ebe324ac abort ./stdlib/abort.c:81:3 llvm#9 0x00007fe9ebe32420 __assert_perror_fail ./assert/assert-perr.c:31:1 llvm#10 0x000055971f969ade clang::TypeLocBuilder::pushImpl(clang::QualType, unsigned long, unsigned int) /REDACTED//llvm-project/clang/lib/Sema/TypeLocBuilder.cpp:93:3 llvm#11 0x000055971f237255 clang::QualType::hasLocalQualifiers() const /REDACTED//llvm-project/clang/include/clang/AST/TypeBase.h:1065:37 llvm#12 0x000055971f237255 clang::ConcreteTypeLoc<clang::UnqualTypeLoc, clang::AttributedTypeLoc, clang::AttributedType, clang::AttributedLocInfo>::isKind(clang::TypeLoc const&) /REDACTED//llvm-project/clang/include/clang/AST/TypeLoc.h:392:26 llvm#13 0x000055971f237255 clang::AttributedTypeLoc clang::TypeLoc::castAs<clang::AttributedTypeLoc>() const /REDACTED//llvm-project/clang/include/clang/AST/TypeLoc.h:79:5 llvm#14 0x000055971f237255 clang::AttributedTypeLoc clang::TypeLocBuilder::push<clang::AttributedTypeLoc>(clang::QualType) /REDACTED//llvm-project/clang/lib/Sema/TypeLocBuilder.h:106:47 llvm#15 0x000055971f280cc8 clang::AttributedTypeLoc::setAttr(clang::Attr const*) /REDACTED//llvm-project/clang/include/clang/AST/TypeLoc.h:1035:30 llvm#16 0x000055971f280cc8 mergeLifetimeBoundAttrOnMethod(clang::Sema&, clang::CXXMethodDecl*, clang::CXXMethodDecl const*) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:4497:11 llvm#17 0x000055971f280cc8 clang::Sema::MergeCompatibleFunctionDecls(clang::FunctionDecl*, clang::FunctionDecl*, clang::Scope*, bool) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:4528:5 llvm#18 0x000055971f27eb1f clang::Sema::MergeFunctionDecl(clang::FunctionDecl*, clang::NamedDecl*&, clang::Scope*, bool, bool) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:0:0 llvm#19 0x000055971f29c256 clang::Sema::CheckFunctionDeclaration(clang::Scope*, clang::FunctionDecl*, clang::LookupResult&, bool, bool) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:12371:9 llvm#20 0x000055971f28dab0 clang::Declarator::setRedeclaration(bool) /REDACTED//llvm-project/clang/include/clang/Sema/DeclSpec.h:2738:51 llvm#21 0x000055971f28dab0 clang::Sema::ActOnFunctionDeclarator(clang::Scope*, clang::Declarator&, clang::DeclContext*, clang::TypeSourceInfo*, clang::LookupResult&, llvm::MutableArrayRef<clang::TemplateParameterList*>, bool&) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:10877:9 llvm#22 0x000055971f2890fc clang::Sema::HandleDeclarator(clang::Scope*, clang::Declarator&, llvm::MutableArrayRef<clang::TemplateParameterList*>) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:0:11 llvm#23 0x000055971f2aab99 clang::Sema::ActOnStartOfFunctionDef(clang::Scope*, clang::Declarator&, llvm::MutableArrayRef<clang::TemplateParameterList*>, clang::SkipBodyInfo*, clang::Sema::FnBodyKind) /REDACTED//llvm-project/clang/lib/Sema/SemaDecl.cpp:15904:15 llvm#24 0x000055971efab286 clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) /REDACTED//llvm-project/clang/lib/Parse/Parser.cpp:1364:23 llvm#25 0x000055971f013b40 clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::ParsedAttributes&, clang::Parser::ParsedTemplateInfo&, clang::SourceLocation*, clang::Parser::ForRangeInit*) /REDACTED//llvm-project/clang/lib/Parse/ParseDecl.cpp:2268:18 llvm#26 0x000055971efaa54f clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /REDACTED//llvm-project/clang/lib/Parse/Parser.cpp:0:10 llvm#27 0x000055971efa9e36 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /REDACTED//llvm-project/clang/lib/Parse/Parser.cpp:1202:12 llvm#28 0x000055971efa8df8 clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /REDACTED//llvm-project/clang/lib/Parse/Parser.cpp:0:14 llvm#29 0x000055971efa7574 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /REDACTED//llvm-project/clang/lib/Parse/Parser.cpp:743:10 llvm#30 0x000055971ef9c0ee clang::ParseAST(clang::Sema&, bool, bool) /REDACTED//llvm-project/clang/lib/Parse/ParseAST.cpp:169:5 llvm#31 0x000055971dbcdad6 clang::FrontendAction::Execute() /REDACTED//llvm-project/clang/lib/Frontend/FrontendAction.cpp:1317:10 llvm#32 0x000055971db3c5fd llvm::Error::getPtr() const /REDACTED//llvm-project/llvm/include/llvm/Support/Error.h:278:42 llvm#33 0x000055971db3c5fd llvm::Error::operator bool() /REDACTED//llvm-project/llvm/include/llvm/Support/Error.h:241:16 llvm#34 0x000055971db3c5fd clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /REDACTED//llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1006:23 llvm#35 0x000055971dcb4f9c clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /REDACTED//llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:310:25 llvm#36 0x000055971a5e655e cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /REDACTED//llvm-project/clang/tools/driver/cc1_main.cpp:304:15 llvm#37 0x000055971a5e29cb ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) /REDACTED//llvm-project/clang/tools/driver/driver.cpp:226:12 llvm#38 0x000055971a5e4c1d clang_main(int, char**, llvm::ToolContext const&)::$_0::operator()(llvm::SmallVectorImpl<char const*>&) const /REDACTED//llvm-project/clang/tools/driver/driver.cpp:0:12 llvm#39 0x000055971a5e4c1d int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) /REDACTED//llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12 llvm#40 0x000055971d9bfe79 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0::operator()() const /REDACTED//llvm-project/clang/lib/Driver/Job.cpp:442:30 llvm#41 0x000055971d9bfe79 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) /REDACTED//llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12 llvm#42 0x000055971cf37dbe llvm::function_ref<void ()>::operator()() const /REDACTED//llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:0:12 llvm#43 0x000055971cf37dbe llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /REDACTED//llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3 llvm#44 0x000055971d9bf5ac clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const /REDACTED//llvm-project/clang/lib/Driver/Job.cpp:442:7 llvm#45 0x000055971d98422c clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /REDACTED//llvm-project/clang/lib/Driver/Compilation.cpp:196:15 llvm#46 0x000055971d984447 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const /REDACTED//llvm-project/clang/lib/Driver/Compilation.cpp:246:13 llvm#47 0x000055971d99ee08 llvm::SmallVectorBase<unsigned int>::empty() const /REDACTED//llvm-project/llvm/include/llvm/ADT/SmallVector.h:83:46 llvm#48 0x000055971d99ee08 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) /REDACTED//llvm-project/clang/lib/Driver/Driver.cpp:2265:23 llvm#49 0x000055971a5e2303 clang_main(int, char**, llvm::ToolContext const&) /REDACTED//llvm-project/clang/tools/driver/driver.cpp:414:21 llvm#50 0x000055971a5f2527 main /usr/local/google/home/usx/build/tools/clang/tools/driver/clang-driver.cpp:17:10 llvm#51 0x00007fe9ebe33ca8 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3 llvm#52 0x00007fe9ebe33d65 call_init ./csu/../csu/libc-start.c:128:20 llvm#53 0x00007fe9ebe33d65 __libc_start_main ./csu/../csu/libc-start.c:347:5 llvm#54 0x000055971a5e0361 _start (bin/clang+++0x6636361) clang++: error: clang frontend command failed with exit code 134 (use -v to see invocation) clang version 23.0.0git (https://github.com/llvm/llvm-project.git 282a065) Target: i686-w64-windows-gnu Thread model: posix InstalledDir: /usr/local/google/home/usx/build/bin Build config: +assertions clang++: note: diagnostic msg: ******************** ``` </details>
lucas-rami
pushed a commit
that referenced
this pull request
Jan 30, 2026
This reverts commit 99fab01. llc was crashing in kernel-args.ll after this patch: ``` .---command stderr------------ | LLVM ERROR: Cannot select: t3: f32,ch = load<(non-temporal dereferenceable invariant load (s16), align 4, addrspace 7), sext from f16> t0, Constant:i32<36>, undef:i32 | In function: f16_arg | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug. | Stack dump: | 0. Program arguments: /b/ml-opt-devrel-x86-64-b1/build/bin/llc -mtriple=r600 -mcpu=redwood | 1. Running pass 'Function Pass Manager' on module '<stdin>'. | 2. Running pass 'Unnamed pass: implement Pass::getPassName()' on function '@f16_arg' | #0 0x0000561402607438 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x81a7438) | #1 0x0000561402604b75 llvm::sys::RunSignalHandlers() (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x81a4b75) | #2 0x00005614026081b1 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0 | #3 0x00007f55eb45a050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050) | #4 0x00007f55eb4a8eec (/lib/x86_64-linux-gnu/libc.so.6+0x8aeec) | #5 0x00007f55eb459fb2 raise (/lib/x86_64-linux-gnu/libc.so.6+0x3bfb2) | #6 0x00007f55eb444472 abort (/lib/x86_64-linux-gnu/libc.so.6+0x26472) | #7 0x0000561402567005 llvm::report_fatal_error(llvm::Twine const&, bool) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x8107005) | #8 0x00005614023e7ba7 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f87ba7) | llvm#9 0x00005614023e6a7d llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f86a7d) | llvm#10 0x00005614023dae94 llvm::SelectionDAGISel::DoInstructionSelection() (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f7ae94) | llvm#11 0x00005614023d9e6a llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f79e6a) | llvm#12 0x00005614023d7b5e llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f77b5e) | llvm#13 0x00005614023d4c30 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f74c30) | llvm#14 0x00005614023d22e0 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7f722e0) | llvm#15 0x0000561401611793 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x71b1793) | llvm#16 0x0000561401b790e5 llvm::FPPassManager::runOnFunction(llvm::Function&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x77190e5) | llvm#17 0x0000561401b80f72 llvm::FPPassManager::runOnModule(llvm::Module&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7720f72) | llvm#18 0x0000561401b79b56 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x7719b56) | llvm#19 0x00005613ff4858f4 compileModule(char**, llvm::SmallVectorImpl<llvm::PassPlugin>&, llvm::LLVMContext&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&) llc.cpp:0:0 | llvm#20 0x00005613ff482ed3 main (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x5022ed3) | llvm#21 0x00007f55eb44524a (/lib/x86_64-linux-gnu/libc.so.6+0x2724a) | llvm#22 0x00007f55eb445305 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27305) | llvm#23 0x00005613ff47ea21 _start (/b/ml-opt-devrel-x86-64-b1/build/bin/llc+0x501ea21) `----------------------------- ```
lucas-rami
pushed a commit
that referenced
this pull request
Feb 2, 2026
…lvm#178069) Kernel panic is a special case, and there is no signal or exception for that so we need to rely on special workaround called `dumptid`. FreeBSDKernel plugin is supposed to find this thread and set it manually through `SetStopInfo()` in `CalculateStopInfo()` like Mach core plugin does. Before (We had to find and select crashed thread list otherwise thread 1 was selected by default): ``` ➜ sudo lldb /boot/panic/kernel -c /var/crash/vmcore.last (lldb) target create "/boot/panic/kernel" --core "/var/crash/vmcore.last" Core file '/var/crash/vmcore.last' (x86_64) was loaded. (lldb) bt * thread #1, name = '(pid 12991) dtrace' * frame #0: 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff8015882f780, flags=259) at sched_ule.c:2448:26 frame #1: 0xffffffff80bd38d2 kernel`mi_switch(flags=259) at kern_synch.c:530:2 frame #2: 0xffffffff80c29799 kernel`sleepq_switch(wchan=0xfffff8014edff300, pri=0) at subr_sleepqueue.c:608:2 frame #3: 0xffffffff80c29b76 kernel`sleepq_catch_signals(wchan=0xfffff8014edff300, pri=0) at subr_sleepqueue.c:523:3 frame #4: 0xffffffff80c29d32 kernel`sleepq_timedwait_sig(wchan=<unavailable>, pri=<unavailable>) at subr_sleepqueue.c:704:11 frame #5: 0xffffffff80bd2e2d kernel`_sleep(ident=0xfffff8014edff300, lock=0xffffffff81df2880, priority=768, wmesg="uwait", sbt=2573804118162, pr=0, flags=512) at kern_synch.c:215:10 frame #6: 0xffffffff80be8622 kernel`umtxq_sleep(uq=0xfffff8014edff300, wmesg="uwait", timo=0xfffffe0279cb3d20) at kern_umtx.c:843:11 frame #7: 0xffffffff80bef87a kernel`do_wait(td=0xfffff8015882f780, addr=<unavailable>, id=0, timeout=0xfffffe0279cb3d90, compat32=1, is_private=1) at kern_umtx.c:1316:12 frame #8: 0xffffffff80bed264 kernel`__umtx_op_wait_uint_private(td=0xfffff8015882f780, uap=0xfffffe0279cb3dd8, ops=<unavailable>) at kern_umtx.c:3990:10 frame llvm#9: 0xffffffff80beaabe kernel`sys__umtx_op [inlined] kern__umtx_op(td=<unavailable>, obj=<unavailable>, op=<unavailable>, val=<unavailable>, uaddr1=<unavailable>, uaddr2=<unavailable>, ops=<unavailable>) at kern_umtx.c:4999:10 frame llvm#10: 0xffffffff80beaa89 kernel`sys__umtx_op(td=<unavailable>, uap=<unavailable>) at kern_umtx.c:5024:10 frame llvm#11: 0xffffffff81122cd1 kernel`amd64_syscall [inlined] syscallenter(td=0xfffff8015882f780) at subr_syscall.c:165:11 frame llvm#12: 0xffffffff81122c19 kernel`amd64_syscall(td=0xfffff8015882f780, traced=0) at trap.c:1208:2 frame llvm#13: 0xffffffff810f1dbb kernel`fast_syscall_common at exception.S:570 ``` After: ``` ➜ sudo ./build/bin/lldb /boot/panic/kernel -c /var/crash/vmcore.last (lldb) target create "/boot/panic/kernel" --core "/var/crash/vmcore.last" Core file '/var/crash/vmcore.last' (x86_64) was loaded. (lldb) bt * thread llvm#18, name = '(pid 5409) powerd (crashed)', stop reason = kernel panic * frame #0: 0xffffffff80bc6c91 kernel`__curthread at pcpu_aux.h:57:2 [inlined] frame #1: 0xffffffff80bc6c91 kernel`doadump(textdump=0) at kern_shutdown.c:399:2 frame #2: 0xffffffff804b3b7a kernel`db_dump(dummy=<unavailable>, dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>) at db_command.c:596:10 frame #3: 0xffffffff804b396d kernel`db_command(last_cmdp=<unavailable>, cmd_table=<unavailable>, dopager=true) at db_command.c:508:3 frame #4: 0xffffffff804b362d kernel`db_command_loop at db_command.c:555:3 frame #5: 0xffffffff804b7026 kernel`db_trap(type=<unavailable>, code=<unavailable>) at db_main.c:267:3 frame #6: 0xffffffff80c16aaf kernel`kdb_trap(type=3, code=0, tf=0xfffffe01b605b930) at subr_kdb.c:790:13 frame #7: 0xffffffff8112154e kernel`trap(frame=<unavailable>) at trap.c:614:8 frame #8: 0xffffffff810f14c8 kernel`calltrap at exception.S:285 frame llvm#9: 0xffffffff81da2290 kernel`cn_devtab + 64 frame llvm#10: 0xfffffe01b605b8b0 frame llvm#11: 0xffffffff84001c43 dtrace.ko`dtrace_panic(format=<unavailable>) at dtrace.c:652:2 frame llvm#12: 0xffffffff84005524 dtrace.ko`dtrace_action_panic(ecb=0xfffff80539cad580) at dtrace.c:7022:2 [inlined] frame llvm#13: 0xffffffff840054de dtrace.ko`dtrace_probe(id=88998, arg0=14343377283488, arg1=<unavailable>, arg2=<unavailable>, arg3=<unavailable>, arg4=<unavailable>) at dtrace.c:7665:6 frame llvm#14: 0xffffffff83e5213d systrace.ko`systrace_probe(sa=<unavailable>, type=<unavailable>, retval=<unavailable>) at systrace.c:226:2 frame llvm#15: 0xffffffff8112318d kernel`syscallenter(td=0xfffff801318d5780) at subr_syscall.c:160:4 [inlined] frame llvm#16: 0xffffffff81123112 kernel`amd64_syscall(td=0xfffff801318d5780, traced=0) at trap.c:1208:2 frame llvm#17: 0xffffffff810f1dbb kernel`fast_syscall_common at exception.S:570 ```
lucas-rami
pushed a commit
that referenced
this pull request
Feb 9, 2026
…n splitBlockBefore() (llvm#178895) (llvm#179392) llvm#178895 caused a clang crash(see https://lab.llvm.org/buildbot/#/builders/210/builds/8229), reverted in 6d52d26. The crash is assertion `DT && "DT should be available to update LoopInfo!"' failed. https://github.com/llvm/llvm-project/blob/ad8d5349d46734826aaeae4a2ebdc6f427a5bad8/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp#L1106 ``` #7 0x00007f5a380254e3 __assert_perror_fail (/usr/lib/libc.so.6+0x254e3) #8 0x0000563df5d8fde1 UpdateAnalysisInformation(llvm::BasicBlock*, llvm::BasicBlock*, llvm::ArrayRef<llvm::BasicBlock*>, llvm::DomTreeUpdater*, llvm::DominatorTree*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, bool, bool&) BasicBlockUtils.cpp:0:0 llvm#9 0x0000563df5d8f3bb llvm::splitBlockBefore(llvm::BasicBlock*, llvm::ilist_iterator_w_bits<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void, true, llvm::BasicBlock>, false, false>, llvm::DomTreeUpdater*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, llvm::Twine const&) llvm#10 0x0000563df5d8cb08 llvm::SplitEdge(llvm::BasicBlock*, llvm::BasicBlock*, llvm::DominatorTree*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, llvm::Twine const&) llvm#11 0x0000563df4ff5b59 (anonymous namespace)::CodeGenPrepare::splitLargeGEPOffsets()::$_1::operator()(long, llvm::Value*, llvm::GetElementPtrInst*) const CodeGenPrepare.cpp:0:0 llvm#12 0x0000563df4fc0ec8 (anonymous namespace)::CodeGenPrepare::_run(llvm::Function&) CodeGenPrepare.cpp:0:0 llvm#13 0x0000563df4fbb36c (anonymous namespace)::CodeGenPrepareLegacyPass::runOnFunction(llvm::Function&) CodeGenPrepare.cpp:0:0 ``` I think this happened when we get DominatorTree with `DT.get()` in `splitLargeGEPOffsets()` but `DT.reset()` already setting it to nullptr in https://github.com/llvm/llvm-project/blob/ad8d5349d46734826aaeae4a2ebdc6f427a5bad8/llvm/lib/CodeGen/CodeGenPrepare.cpp#L660. To fix this assertion failure, use `getDT()` for `splitLargeGEPOffsets()` to build the DominatorTree if it is set to nullptr by `DT.reset()`. I don't have a RSIC-V environment, so no reproducer. Checked that the crash is fixed by rerunning buildbot with this patch https://lab.llvm.org/buildbot/#/builders/210/builds/8248
lucas-rami
pushed a commit
that referenced
this pull request
Feb 10, 2026
…8306) In FreeBSD, allproc is a prepend list and new processes are appended at head. This results in reverse pid order, so we first need to order pid incrementally then print threads according to the correct order. Before: ``` Process 0 stopped * thread #1: tid = 101866, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff8015882f780, flags=259) at sched_ule.c:2448:26, name = '(pid 12991) dtrace' thread #2: tid = 101915, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff80158825780, flags=259) at sched_ule.c:2448:26, name = '(pid 11509) zsh' thread #3: tid = 101942, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff80142599000, flags=259) at sched_ule.c:2448:26, name = '(pid 11504) ftcleanup' thread #4: tid = 101545, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff80131898000, flags=259) at sched_ule.c:2448:26, name = '(pid 5599) zsh' thread #5: tid = 100905, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff80131899000, flags=259) at sched_ule.c:2448:26, name = '(pid 5598) sshd-session' thread #6: tid = 101693, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff8015886e780, flags=259) at sched_ule.c:2448:26, name = '(pid 5595) sshd-session' thread #7: tid = 101626, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801588be000, flags=259) at sched_ule.c:2448:26, name = '(pid 5592) sh' ... ``` After: ``` (lldb) thread list Process 0 stopped * thread #1: tid = 100000, 0xffffffff80bf9322 kernel`sched_switch(td=0xffffffff81abe840, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel' thread #2: tid = 100035, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d9780, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_0' thread #3: tid = 100036, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d9000, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_1' thread #4: tid = 100037, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d8780, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_2' thread #5: tid = 100038, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d8000, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_3' thread #6: tid = 100039, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d7780, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_4' thread #7: tid = 100040, 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff801052d7000, flags=259) at sched_ule.c:2448:26, name = '(pid 0) kernel/softirq_5' ... ``` Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
lucas-rami
pushed a commit
that referenced
this pull request
Feb 20, 2026
…er. (llvm#181941) The progress event reporter has a thread that reports events every 250 millisecond. and is destroyed in its destructor. When in event reporter desctructor, the event reporter may have pending event but the call mutex is destroyed leading to the crash. Relevant stack trace from CI. ``` [2026-02-13T17:46:13.577Z] libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument [2026-02-13T17:46:13.577Z] PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash report from ~/Library/Logs/DiagnosticReports/. [2026-02-13T17:46:13.577Z] #0 0x0000000102b6943c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008943c) [2026-02-13T17:46:13.577Z] #1 0x0000000102b67368 llvm::sys::RunSignalHandlers() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100087368) [2026-02-13T17:46:13.577Z] #2 0x0000000102b69f20 SignalHandler(int, __siginfo*, void*) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100089f20) [2026-02-13T17:46:13.577Z] #3 0x000000018bbdb744 (/usr/lib/system/libsystem_platform.dylib+0x1804e3744) [2026-02-13T17:46:13.577Z] #4 0x000000018bbd1888 (/usr/lib/system/libsystem_pthread.dylib+0x1804d9888) [2026-02-13T17:46:13.577Z] #5 0x000000018bad6850 (/usr/lib/system/libsystem_c.dylib+0x1803de850) [2026-02-13T17:46:13.577Z] #6 0x000000018bb85858 (/usr/lib/libc++abi.dylib+0x18048d858) [2026-02-13T17:46:13.577Z] #7 0x000000018bb744bc (/usr/lib/libc++abi.dylib+0x18047c4bc) [2026-02-13T17:46:13.577Z] #8 0x000000018b7a0424 (/usr/lib/libobjc.A.dylib+0x1800a8424) [2026-02-13T17:46:13.577Z] llvm#9 0x000000018bb84c2c (/usr/lib/libc++abi.dylib+0x18048cc2c) [2026-02-13T17:46:13.577Z] llvm#10 0x000000018bb88394 (/usr/lib/libc++abi.dylib+0x180490394) [2026-02-13T17:46:13.577Z] llvm#11 0x000000018bb8833c (/usr/lib/libc++abi.dylib+0x18049033c) [2026-02-13T17:46:13.577Z] llvm#12 0x000000018bb01b90 (/usr/lib/libc++.1.dylib+0x180409b90) [2026-02-13T17:46:13.577Z] llvm#13 0x000000018bb01b34 (/usr/lib/libc++.1.dylib+0x180409b34) [2026-02-13T17:46:13.577Z] llvm#14 0x000000018bb038a0 (/usr/lib/libc++.1.dylib+0x18040b8a0) [2026-02-13T17:46:13.577Z] llvm#15 0x0000000102b6fbac lldb_dap::DAP::Send(std::__1::variant<lldb_dap::protocol::Request, lldb_dap::protocol::Response, lldb_dap::protocol::Event> const&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008fbac) [2026-02-13T17:46:13.577Z] llvm#16 0x0000000102b6f890 lldb_dap::DAP::SendJSON(llvm::json::Value const&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008f890) [2026-02-13T17:46:13.577Z] llvm#17 0x0000000102b78788 std::__1::__function::__func<lldb_dap::DAP::DAP(lldb_dap::Log&, lldb_dap::ReplMode, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, bool, llvm::StringRef, lldb_private::transport::JSONTransport<lldb_dap::ProtocolDescriptor>&, lldb_private::MainLoopPosix&)::$_0, std::__1::allocator<lldb_dap::DAP::DAP(lldb_dap::Log&, lldb_dap::ReplMode, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, bool, llvm::StringRef, lldb_private::transport::JSONTransport<lldb_dap::ProtocolDescriptor>&, lldb_private::MainLoopPosix&)::$_0>, void (lldb_dap::ProgressEvent&)>::operator()(lldb_dap::ProgressEvent&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100098788) [2026-02-13T17:46:13.577Z] llvm#18 0x0000000102b8939c lldb_dap::ProgressEventManager::ReportIfNeeded() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000a939c) [2026-02-13T17:46:13.577Z] llvm#19 0x0000000102b8982c lldb_dap::ProgressEventReporter::ReportStartEvents() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000a982c) [2026-02-13T17:46:13.577Z] llvm#20 0x0000000102b8a038 void* std::__1::__thread_proxy[abi:nn200100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, lldb_dap::ProgressEventReporter::ProgressEventReporter(std::__1::function<void (lldb_dap::ProgressEvent&)>)::$_0>>(void*) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000aa038) [2026-02-13T17:46:13.577Z] llvm#21 0x000000018bbd1c08 (/usr/lib/system/libsystem_pthread.dylib+0x1804d9c08) [2026-02-13T17:46:13.577Z] llvm#22 0x000000018bbccba8 (/usr/lib/system/libsystem_pthread.dylib+0x1804d4ba8) ``` rdar://170331108
lucas-rami
pushed a commit
that referenced
this pull request
Feb 27, 2026
Using code/ideas from the x86 backend to optimize a select on a bitcast integer. The previous aarch64 approach was to individually extract the bits from the mask, which is kind of terrible. https://rust.godbolt.org/z/576sndT66 ```llvm define void @if_then_else8(ptr %out, i8 %mask, ptr %if_true, ptr %if_false) { start: %t = load <8 x i32>, ptr %if_true, align 4 %f = load <8 x i32>, ptr %if_false, align 4 %m = bitcast i8 %mask to <8 x i1> %s = select <8 x i1> %m, <8 x i32> %t, <8 x i32> %f store <8 x i32> %s, ptr %out, align 4 ret void } ``` turned into ```asm if_then_else8: // @if_then_else8 sub sp, sp, llvm#16 ubfx w8, w1, #4, #1 and w11, w1, #0x1 ubfx w9, w1, #5, #1 fmov s1, w11 ubfx w10, w1, #1, #1 fmov s0, w8 ubfx w8, w1, #6, #1 ldp q5, q2, [x3] mov v1.h[1], w10 ldp q4, q3, [x2] mov v0.h[1], w9 ubfx w9, w1, #2, #1 mov v1.h[2], w9 ubfx w9, w1, #3, #1 mov v0.h[2], w8 ubfx w8, w1, #7, #1 mov v1.h[3], w9 mov v0.h[3], w8 ushll v1.4s, v1.4h, #0 ushll v0.4s, v0.4h, #0 shl v1.4s, v1.4s, llvm#31 shl v0.4s, v0.4s, llvm#31 cmlt v1.4s, v1.4s, #0 cmlt v0.4s, v0.4s, #0 bsl v1.16b, v4.16b, v5.16b bsl v0.16b, v3.16b, v2.16b stp q1, q0, [x0] add sp, sp, llvm#16 ret ``` With this PR that instead emits ```asm if_then_else8: adrp x8, .LCPI0_1 dup v0.4s, w1 ldr q1, [x8, :lo12:.LCPI0_1] adrp x8, .LCPI0_0 ldr q2, [x8, :lo12:.LCPI0_0] ldp q4, q3, [x2] and v1.16b, v0.16b, v1.16b and v0.16b, v0.16b, v2.16b ldp q5, q2, [x3] cmeq v1.4s, v1.4s, #0 cmeq v0.4s, v0.4s, #0 bsl v1.16b, v2.16b, v3.16b bsl v0.16b, v5.16b, v4.16b stp q0, q1, [x0] ret ``` So substantially shorter. Instead of building the mask element-by-element, this approach (by virtue of not splitting) instead splats the mask value into all vector lanes, performs a bitwise and with powers of 2, and compares with zero to construct the mask vector. cc rust-lang/rust#122376 cc llvm#175769
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
[AMDGPU][Scheduler] Revert all regions when remat fails to increase occ.
When the rematerialization stage fails to increase occupancy in all
regions, the current implementation only reverts the effect of
re-scheduling in regions in which the increased occupancy target could
not be achieved. However, given that re-scheduling with a higher
occupancy target puts more pressure on the scheduler to achieve lower
maximum RP at the cost of potentially lower ILP as well, region
schedules made with higher occupancy targets are generally less
desirable if the whole function is not able to meet that target.
Therefore, if at least one region cannot reach its target, it makes
sense to revert re-scheduling in all affected regions to go back to
a schedule that was made with a lower occupancy target.
This implements such logic for the rematerialization stage, and adds a
test to showcase that re-scheduling is indeed interrupted/reverted as
soon as a re-scheduled region that does not meet the increased target
occupancy is encountered.
As a minor improvement, this also sets higher occupancy targets
for re-scheduling at the end of stage initialization in some cases. In
cases where rematerializations alone are not able to achieve the target,
this can push the scheduler to be more aggressive in reducing RP and
achieve the target.