Print backtraces for LLVM segfaults and aborts #79153

Open
tmandry opened this issue Nov 18, 2020 · 13 comments
Labels: A-LLVM, C-bug, E-help-wanted, T-compiler

Comments

@tmandry
Member

tmandry commented Nov 18, 2020

When a crash happens in an LLVM tool, it prints a nice backtrace:

> build/x86_64-unknown-linux-gnu/llvm/bin/llc -O0 crashy.ll
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/llc -O0 crashy.ll 
1.      Running pass 'Function Pass Manager' on module 'crashy.ll'.
2.      Running pass 'IRTranslator' on function '@_ZN3std3sys4unix2fs5lstat17h30bd1f0595542181E'
 #0 0x00007f75ffa4c47c PrintStackTraceSignalHandler(void*) (.llvm.14272527432730108163) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x133747c)
 #1 0x00007f75ffa49c3e llvm::sys::RunSignalHandlers() (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1334c3e)
 #2 0x00007f75ffa4c905 SignalHandler(int) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1337905)
 #3 0x00007f75fe6ec140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
 #4 0x00007f75ffe8c752 llvm::MachineRegisterInfo::addRegOperandToUseList(llvm::MachineOperand*) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1777752)
 #5 0x00007f760042d14a llvm::MachineIRBuilder::buildDirectDbgValue(llvm::Register, llvm::MDNode const*, llvm::MDNode const*) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1d1814a)
 #6 0x00007f76003e1da2 llvm::IRTranslator::translateKnownIntrinsic(llvm::CallInst const&, unsigned int, llvm::MachineIRBuilder&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1cccda2)
 #7 0x00007f76003e3045 llvm::IRTranslator::translateCall(llvm::User const&, llvm::MachineIRBuilder&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1cce045)
 #8 0x00007f76003e6aaa llvm::IRTranslator::translate(llvm::Instruction const&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1cd1aaa)
 #9 0x00007f76003e8e2b llvm::IRTranslator::runOnMachineFunction(llvm::MachineFunction&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x1cd3e2b)
#10 0x00007f75ffe0bfee llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x16f6fee)
#11 0x00007f75ffbb1721 llvm::FPPassManager::runOnFunction(llvm::Function&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x149c721)
#12 0x00007f75ffbb9803 llvm::FPPassManager::runOnModule(llvm::Module&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x14a4803)
#13 0x00007f75ffbb218d llvm::legacy::PassManagerImpl::run(llvm::Module&) (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/libLLVM-11-rust-1.49.0-nightly.so+0x149d18d)
#14 0x000000000020ae2e main (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x20ae2e)
#15 0x00007f75fe3f5cca __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x26cca)
#16 0x0000000000208129 _start (/usr/local/google/home/tmandry/frust/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x208129)

But when an LLVM crash happens inside rustc, no backtrace is printed, even with RUST_BACKTRACE=1.

This may be a simple matter of calling some LLVM function to set up hooks, but it might be more complicated than that.

@tmandry tmandry added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-help-wanted Call for participation: Help is requested to fix this issue. labels Nov 18, 2020
@in42
Contributor

in42 commented Nov 19, 2020

I would like to work on this.

@tmandry
Member Author

tmandry commented Nov 19, 2020

@in42 Awesome! You can claim the issue by adding a comment with

@rustbot claim

See the rustc dev guide for more information about contributing and working in the compiler codebase. Support is available on the #t-compiler/help Zulip stream. (I can also answer questions here, but I'll be on vacation next week.)

@in42
Contributor

in42 commented Nov 19, 2020

@rustbot claim

@in42
Contributor

in42 commented Nov 21, 2020

Is there a test case which will crash the rustc compiler?

@tmiasko
Contributor

tmiasko commented Nov 21, 2020

Stack overflow (in LLVM):

$ python3 -c 's = "fn main() {"; s += "\n".join("let x = 0;" for _ in range(6000)); s+= "}"; print(s)' > a.rs
$ rustc -Aunused -g a.rs
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
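
For anyone without Python at hand, the same generator can be sketched in Rust. The function name `generate_source` is only illustrative; the output matches what the one-liner above prints.

```rust
/// Builds a `main` function whose body is `count` copies of `let x = 0;`
/// joined by newlines, mirroring the Python one-liner.
fn generate_source(count: usize) -> String {
    let mut src = String::from("fn main() {");
    src.push_str(&vec!["let x = 0;"; count].join("\n"));
    src.push('}');
    src
}

fn main() {
    // Redirect stdout to a.rs, then run `rustc -Aunused -g a.rs`.
    println!("{}", generate_source(6000));
}
```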

Incorrect use of a naked attribute:

$ cat b.rs 
#![feature(asm)]
#![feature(naked_functions)]

#[naked]
pub unsafe fn f(a: usize, b: usize) -> ! {
    asm!("/*{0}*//*{1}*/", in(reg) a, in(reg) b);
    unreachable!()
}
$ rustc --crate-type=lib b.rs -O
terminated by signal SIGSEGV (Address boundary error)
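
When scripting these reproductions, a death by signal can be told apart from an ordinary compiler error by inspecting the child's exit status. Below is a rough, Unix-only sketch using only the standard library. The helper names `run` and `fatal_signal` are made up for illustration, and the self-killing `sh` command is just a stand-in for the crashing rustc invocation.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::{Command, ExitStatus};

/// Returns the number of the signal that terminated the process, if any.
fn fatal_signal(status: ExitStatus) -> Option<i32> {
    status.signal()
}

/// Runs a program to completion and returns its exit status.
fn run(program: &str, args: &[&str]) -> std::io::Result<ExitStatus> {
    Command::new(program).args(args).status()
}

fn main() -> std::io::Result<()> {
    // Stand-in for `rustc --crate-type=lib b.rs -O`: a shell that kills itself.
    let status = run("sh", &["-c", "kill -s SEGV $$"])?;
    match fatal_signal(status) {
        Some(sig) => println!("terminated by signal {}", sig),
        None => println!("exited normally: {}", status),
    }
    Ok(())
}
```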

@in42
Contributor

in42 commented Nov 22, 2020

So, I built the stage1 rustc compiler and reproduced the LLVM stack overflow above using rustc +stage1 -g -Aunused a.rs.

However, to check whether llc emits a backtrace when called separately, I emitted LLVM IR using build/x86_64-unknown-linux-gnu/stage1/bin/rustc -Aunused -g a.rs --emit llvm-ir and then ran llc a.ll, but this time it did not crash.

Where in the rustc code should I look to find where llc is called, so that I can see which options are used and reproduce the crash with them? I grepped for llc in compiler/ but could not find any useful results.

@in42
Contributor

in42 commented Dec 3, 2020

I may have found the root cause. LLVM tools generally call the InitLLVM constructor in their main function. In that constructor, at src/llvm-project/llvm/lib/Support/InitLLVM.cpp:29, there is the line sys::PrintStackTraceOnErrorSignal(Argv[0]); which appears to set up the printing of the stack trace.
The function is declared in src/llvm-project/llvm/include/llvm/Support/Signals.h:

  /// When an error signal (such as SIGABRT or SIGSEGV) is delivered to the
  /// process, print a stack trace and then exit.
  /// Print a stack trace if a fatal signal occurs.
  /// \param Argv0 the current binary name, used to find the symbolizer
  ///        relative to the current binary before searching $PATH; can be
  ///        StringRef(), in which case we will only search $PATH.
  /// \param DisableCrashReporting if \c true, disable the normal crash
  ///        reporting mechanisms on the underlying operating system.
  void PrintStackTraceOnErrorSignal(StringRef Argv0,
                                    bool DisableCrashReporting = false);

But the rustc code calls neither the InitLLVM constructor nor the PrintStackTraceOnErrorSignal function, so the stack trace is never printed.

@in42
Contributor

in42 commented Dec 3, 2020

Should I add code to call the InitLLVM constructor here, in compiler/rustc_driver/src/lib.rs (starting at line 1304)?

pub fn main() -> ! {
    let start = Instant::now();
    init_rustc_env_logger();
    let mut callbacks = TimePassesCallbacks::default();
    install_ice_hook();
    let exit_code = catch_with_exit_code(|| {
        let args = env::args_os()

@tmandry
Member Author

tmandry commented Dec 4, 2020

Nice find! Adding it directly in librustc_driver isn't ideal since most of the compiler interacts with codegen backends like LLVM through the librustc_codegen_ssa abstraction traits.

The actual call to InitLLVM should go somewhere in librustc_codegen_llvm. We should probably also wait until right before codegen starts to call it, to avoid doing work we don't have to if this invocation never gets to codegen. (The exception would be if we call into LLVM outside of codegen, but I can't think of rustc ever doing that.) This way you can drop the call into an existing method without having to plumb it through the traits in the ssa crate. I'm not sure of the best place to call it; the constructor of CodegenCx, maybe?

One question to keep in mind is whether we need to do this per codegen backend thread -- I’m guessing not, since it will install a global signal handler.

@in42
Contributor

in42 commented Dec 5, 2020

I am not quite sure where InitLLVM is supposed to be called when LLVM is used as a library, as it is in rustc. Here is the documentation in InitLLVM.h:

// The main() functions in typical LLVM tools start with InitLLVM which does
// the following one-time initializations:
//
//  1. Setting up a signal handler so that pretty stack trace is printed out
//     if a process crashes. A signal handler that exits when a failed write to
//     a pipe occurs may optionally be installed: this is on-by-default.
//
//  2. Set up the global new-handler which is called when a memory allocation
//     attempt fails.
//
//  3. If running on Windows, obtain command line arguments using a
//     multibyte character-aware API and convert arguments into UTF-8
//     encoding, so that you can assume that command line arguments are
//     always encoded in UTF-8 on any platform.
//
// InitLLVM calls llvm_shutdown() on destruction, which cleans up
// ManagedStatic objects.

So it is probably meant to be called once per process whereas CodegenCx is created per thread. Is there anything in librustc_codegen_llvm which is not called per thread but is called once as initialization before the threads are created?
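
If no such hook exists, one possible workaround is to guard the call with std::sync::Once, so that it can safely be invoked from every per-thread constructor and still run only once per process. Here is a minimal sketch of that idea. It assumes a hypothetical `install_llvm_signal_handlers` function standing in for the real FFI call, and none of the names below are rustc's actual API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

static INIT: Once = Once::new();
// Counts how many times the (hypothetical) installation actually ran.
static INSTALL_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the call that installs LLVM's signal handlers.
fn install_llvm_signal_handlers() {
    INSTALL_COUNT.fetch_add(1, Ordering::SeqCst);
}

/// Safe to call from every codegen thread; the installation runs at most once.
fn ensure_llvm_initialized() {
    INIT.call_once(install_llvm_signal_handlers);
}

/// Spawns `threads` workers that each request initialization, then returns
/// how many times the handlers were installed in total.
fn run_workers(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(ensure_llvm_initialized))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INSTALL_COUNT.load(Ordering::SeqCst)
}

fn main() {
    println!("handlers installed {} time(s)", run_workers(8));
}
```

Once::call_once also blocks concurrent callers until the first initialization finishes, so no thread would start codegen before the handlers are in place.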

@tmandry
Member Author

tmandry commented Dec 11, 2020

I think the query ongoing_codegen is what kicks off all the codegen threads? Try starting there.

@tmandry
Member Author

tmandry commented Dec 17, 2020

@in42 Checking in, were you able to make progress on this? Feel free to ping me on Zulip/Discord if that's easier.

@in42
Contributor

in42 commented Jan 10, 2021

@rustbot claim

fee1-dead added a commit to fee1-dead-contrib/rust that referenced this issue Jun 22, 2021
Implement printing of stack traces on LLVM segfaults and aborts

Implement rust-lang#79153

Based on discussion, try to extend the rust_backtrace=1 feature to handle segfault or aborts in the llvm backend
bors added a commit to rust-lang-ci/rust that referenced this issue Jul 2, 2021
Implement printing of stack traces on LLVM segfaults and aborts

Implement rust-lang#79153

Based on discussion, try to extend the rust_backtrace=1 feature to handle segfault or aborts in the llvm backend
@Noratrieb Noratrieb added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2023