This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

support for multiple text sections #6

Open
danielcdh opened this issue Jun 20, 2018 · 17 comments

@danielcdh

When the binary has multiple text sections, I see the following error:

perf2bolt: /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/RewriteInstance.cpp:1383: void llvm::bolt::RewriteInstance::discoverFileObjects(): Assertion `Section && "section for functions must be registered."' failed.

Any ideas on how to work around the issue?

@maksfb maksfb self-assigned this Jun 20, 2018
@maksfb
Contributor

maksfb commented Jun 20, 2018

Thanks for reporting the issue. If multiple text sections are a result of a compiler splitting the code, the workaround is to disable it with -fno-reorder-blocks-and-partition, and let BOLT do the splitting.

@danielcdh
Author

Thanks for the quick reply. BTW, great to see this finally get open-sourced. Thanks!

Unfortunately, the separate section is not created by the compiler and cannot easily be removed. Any suggestions?

@maksfb
Contributor

maksfb commented Jun 20, 2018

If functions are not split, then adding the support shouldn't be that difficult. We do process code in sections other than .text, e.g. in .init and .fini. If you can share the output of readelf -e, it might give me a clue about what's happening.
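For readers following along: the part of the readelf -e output that matters here is the section header table, in particular which sections are flagged as executable. As a rough illustration only, the sketch below lists those sections from a raw ELF image. It is not part of BOLT; the function name `executable_sections` is hypothetical, and it assumes a little-endian ELF64 file with a regular section header table.

```python
import struct

SHF_EXECINSTR = 0x4  # ELF section flag: section contains executable instructions


def executable_sections(data: bytes) -> list:
    """Return the names of sections flagged SHF_EXECINSTR.

    Assumption: `data` is a complete little-endian ELF64 image.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")

    # ELF64 header: e_shoff lives at offset 0x28; e_shentsize, e_shnum and
    # e_shstrndx are three consecutive 16-bit fields starting at 0x3A.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    def header(index):
        # First six fields of a section header:
        # sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        return struct.unpack_from("<IIQQQQ", data, e_shoff + index * e_shentsize)

    # sh_offset of the section-name string table
    strtab_offset = header(e_shstrndx)[4]

    names = []
    for index in range(e_shnum):
        sh_name, _type, sh_flags, _addr, _offset, _size = header(index)
        if sh_flags & SHF_EXECINSTR:
            start = strtab_offset + sh_name
            end = data.index(b"\0", start)  # names are NUL-terminated
            names.append(data[start:end].decode())
    return names
```

Calling `executable_sections(open(path, "rb").read())` on a binary like the one in this report would be expected to show more than one code section besides .init and .fini, which is the situation the assertion above complains about.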

@danielcdh
Author

Thanks, I managed to remove the section suffix, but I hit another assertion:

perf2bolt: /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/BinaryFunction.cpp:1726: bool llvm::bolt::BinaryFunction::buildCFG(): Assertion `ToBB && "cannot find BB containing TO branch"' failed.

@maksfb
Contributor

maksfb commented Jun 20, 2018

This sounds like either a PIC issue or an assembly code issue. We are working on adding more diagnostics and improving PIC support. Was the input binary built with relocations or without?

@danielcdh
Author

The binary was not built with PIC or dynamic relocations. It looks like it failed while building the CFG?

@maksfb
Contributor

maksfb commented Jun 21, 2018

Yes. That's typically a symptom of PIC or of assembly code with an embedded jump table. I have a fix for PIC. I'd like you to try it once it lands. If it doesn't work, I'll ask for more details.

@maksfb
Contributor

maksfb commented Jun 21, 2018

@danielcdh: could you try the latest version?

@danielcdh
Author

Thanks! That error is gone, but I hit another issue:

perf2bolt: /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Program.inc:312: llvm::sys::ProcessInfo llvm::sys::Wait(const llvm::sys::ProcessInfo&, unsigned int, bool, std::__cxx11::string*): Assertion `PI.Pid && "invalid pid to wait on, process not started?"' failed.
#0 0x0000563a8a6a4c3e llvm::sys::PrintStackTrace(llvm::raw_ostream&) /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Signals.inc:398:0
#1 0x0000563a8a6a4cd1 PrintStackTraceSignalHandler(void*) /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Signals.inc:462:0
#2 0x0000563a8a6a3176 llvm::sys::RunSignalHandlers() /usr/local/google/home/dehao/bolt/llvm/lib/Support/Signals.cpp:49:0
#3 0x0000563a8a6a45b3 SignalHandler(int) /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Signals.inc:252:0
#4 0x00007efe8ecac0c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x110c0)
#5 0x00007efe8d83dfcf gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x32fcf)
#6 0x00007efe8d83f3fa abort (/lib/x86_64-linux-gnu/libc.so.6+0x343fa)
#7 0x00007efe8d836e37 (/lib/x86_64-linux-gnu/libc.so.6+0x2be37)
#8 0x00007efe8d836ee2 (/lib/x86_64-linux-gnu/libc.so.6+0x2bee2)
#9 0x0000563a8a6a259b llvm::sys::Wait(llvm::sys::ProcessInfo const&, unsigned int, bool, std::__cxx11::basic_string<char, std::char_traits, std::allocator >*) /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Program.inc:314:0
#10 0x0000563a887e1485 llvm::bolt::DataAggregator::aggregate(llvm::bolt::BinaryContext&, std::map<unsigned long, llvm::bolt::BinaryFunction, std::less, std::allocator<std::pair<unsigned long const, llvm::bolt::BinaryFunction> > >&) /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/DataAggregator.cpp:367:0
#11 0x0000563a8885818e llvm::bolt::RewriteInstance::processProfileData() /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/RewriteInstance.cpp:2387:0
#12 0x0000563a8884d9a3 operator() /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/RewriteInstance.cpp:967:0
#13 0x0000563a8884d9a3 llvm::bolt::RewriteInstance::run()::'lambda'(std::set<unsigned long, std::less, std::allocator > const&)::operator()(std::set<unsigned long, std::less, std::allocator > const&) const (bin/perf2bolt+0x5189a3)
#14 0x0000563a8884dcf3 llvm::bolt::RewriteInstance::run() /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/RewriteInstance.cpp:996:0
#15 0x0000563a886f821e main /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/llvm-bolt.cpp:269:0
#16 0x00007efe8d82b2b1 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202b1)
#17 0x0000563a886f6eaa _start (bin/perf2bolt+0x3c1eaa)

@rafaelauler
Contributor

Do you have perf in your PATH? Which version?
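The trace above ends in DataAggregator::aggregate failing to wait on a child process, which is why the question is about perf being reachable. A minimal way to check that before running perf2bolt is sketched below. The helper name `tool_version` is hypothetical, and the sketch only assumes that the tool accepts `--version`, as perf does.

```python
import shutil
import subprocess


def tool_version(name: str = "perf"):
    """Return (path, version_output) for a tool found on PATH, or None if absent."""
    path = shutil.which(name)  # same lookup a shell would do through PATH
    if path is None:
        return None
    # Run "<tool> --version" and capture whatever it prints on stdout.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, result.stdout.strip()
```

If `tool_version("perf")` returns None, perf is not visible to the process that would launch it, and the PATH needs fixing before profile aggregation can work.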

@danielcdh
Author

You are right: after adding perf to PATH, the problem was resolved, and I managed to go through the whole process and create a new binary with BOLT. Unfortunately, that binary segfaults immediately when I execute it.

@rafaelauler
Contributor

Is it a trap or a segfault? Since the LLVM disassembler has problems with recent AVX512 instructions, BOLT will mutate functions that use AVX512 into a trap. We are working on improving AVX512 support.

@rafaelauler
Contributor

By the way, BOLT has a safer mode of operation if things are not quite working yet for your binary (either because you use AVX512 or because you have unusual hand-written assembly that causes BOLT to fail to read parts of the binary).

Simply disable relocations in the linker or use -relocs=false in BOLT. This mode is less effective for performance because it does not reorder functions; it only tries to reorder basic blocks in place without changing the rest of the binary. It is a far more conservative approach, but it can still lead to performance improvements.

However, even in relocation mode (our most aggressive processing, where all code in the binary gets rewritten), the binary shouldn't segfault unless something very unusual is happening. Traps can happen for AVX512, though.
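As an illustration of the conservative mode described in this comment, the sketch below assembles an llvm-bolt command line with -relocs=false. Only -relocs=false comes from this thread. The `-o` and `-data=` options are assumptions taken from typical BOLT usage of that period, and the helper name and file names are hypothetical.

```python
def conservative_bolt_argv(binary, output, profile, extra=()):
    """Build an argv list for a non-relocation-mode llvm-bolt run.

    Assumptions: "-o" names the output file and "-data=" names the profile;
    check both against the BOLT version in use.
    """
    return [
        "llvm-bolt",
        binary,
        "-o", output,
        f"-data={profile}",
        "-relocs=false",  # the conservative mode from the comment above
        *extra,           # any further flags the caller wants to pass through
    ]
```

The resulting list can be passed to `subprocess.run` directly, for example `subprocess.run(conservative_bolt_argv("app", "app.bolt", "perf.fdata"), check=True)`.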

@danielcdh
Author

It's a segfault. The binary was not built with AVX512. It was also not built with relocations, and I have to disable function reordering when invoking llvm-bolt.

Another issue is that llvm-bolt takes ~5 hours to process my 800MB binary and produces a 1.1GB binary. Is that expected?

Thanks

@rafaelauler
Contributor

If you are suffering from long processing times, you probably have deeply inlined functions with a lot of basic blocks. For these cases, it's better to use -reorder-blocks=cache instead of -reorder-blocks=cache+. The expected processing time ranges from 2 to 6 minutes for a ~100MB binary. If you use -update-debug-info, this time may climb to close to 10 minutes, or in some cases 20 minutes, depending on the code.

The resulting binary is larger in your case because you are using non-relocation mode. In that mode the original code section keeps its size, but the blocks are reordered and sometimes functions are split. If functions are split, their cold parts will account for the extra 300MB you are observing.
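To put the numbers from the last few comments side by side, here is a back-of-the-envelope sketch. It assumes that processing time scales linearly with binary size, which nobody in this thread claims, so treat the result as a rough baseline only. The function name is hypothetical.

```python
def linear_time_estimate(size_mb, minutes_per_100mb=(2, 6)):
    """Scale the quoted 2 to 6 minutes per ~100MB to another binary size.

    Assumption: time grows linearly with size.
    """
    low, high = minutes_per_100mb
    scale = size_mb / 100
    return low * scale, high * scale


print(linear_time_estimate(800))  # (16.0, 48.0) minutes for the 800MB binary
print(1100 - 800)                 # 300 (MB of growth, taking 1.1GB as 1100MB)
```

Under that assumption the 800MB binary would take well under an hour, so the reported ~5 hours (300 minutes) is far outside the baseline, which fits the suggestion that block reordering on very large functions dominates the run.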

@maksfb
Contributor

maksfb commented Jun 22, 2018

@danielcdh: you have a lot of code :) The fact that the binary crashes immediately probably indicates something trivial that we are not getting right. If you strip the binary after running BOLT, that could be the reason, as we break some assumptions that strip makes.

@danielcdh
Author

Yeah, it's a large binary :)

The segfault happens without stripping.

We will try debugging with some smaller binaries and see whether the problem can be reproduced.

aaupov pushed a commit that referenced this issue Dec 24, 2021
…he parser"

This reverts commit b0e8667.

ASAN/UBSAN bot is broken with this trace:

[ RUN      ] FlatAffineConstraintsTest.FindSampleTest
llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15: runtime error: signed integer overflow: 1229996100002 * 809999700000 cannot be represented in type 'long'
    #0 0x7f63ace960e4 in mlir::ceilDiv(long, long) llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15
    #1 0x7f63ace8587e in ceil llvm-project/mlir/include/mlir/Analysis/Presburger/Fraction.h:57:42
    #2 0x7f63ace8587e in operator* llvm-project/llvm/include/llvm/ADT/STLExtras.h:347:42
    #3 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> include/c++/v1/__memory/uninitialized_algorithms.h:36:62
    #4 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> llvm-project/llvm/include/llvm/ADT/SmallVector.h:490:5
    #5 0x7f63ace8587e in append<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, void> llvm-project/llvm/include/llvm/ADT/SmallVector.h:662:5
    #6 0x7f63ace8587e in SmallVector<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long> > llvm-project/llvm/include/llvm/ADT/SmallVector.h:1204:11
    #7 0x7f63ace8587e in mlir::FlatAffineConstraints::findIntegerSample() const llvm-project/mlir/lib/Analysis/AffineStructures.cpp:1171:27
    #8 0x7f63ae95a84d in mlir::checkSample(bool, mlir::FlatAffineConstraints const&, mlir::TestFunction) llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:37:23
    #9 0x7f63ae957545 in mlir::FlatAffineConstraintsTest_FindSampleTest_Test::TestBody() llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:222:3
maksfb pushed a commit that referenced this issue Jan 10, 2022
Segmentation fault in ompt_tsan_dependences function due to an unchecked NULL pointer dereference is as follows:

```
ThreadSanitizer:DEADLYSIGNAL
	==140865==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000050 (pc 0x7f217c2d3652 bp 0x7ffe8cfc7e00 sp 0x7ffe8cfc7d90 T140865)
	==140865==The signal is caused by a READ memory access.
	==140865==Hint: address points to the zero page.
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1012a
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 133b5
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1371a
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 13a58
	#0 ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 (libarcher.so+0x15652)
	#1 __kmpc_doacross_post /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:4280 (libomp.so+0x74d98)
	#2 .omp_outlined. for_ordered_01.c:? (for_ordered_01.exe+0x5186cb)
	#3 __kmp_invoke_microtask /ptmp/bhararit/llvm-project/openmp/runtime/src/z_Linux_asm.S:1166 (libomp.so+0x14e592)
	#4 __kmp_invoke_task_func /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:7556 (libomp.so+0x909ad)
	#5 __kmp_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:2284 (libomp.so+0x8461a)
	#6 __kmpc_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:308 (libomp.so+0x6db55)
	#7 main ??:? (for_ordered_01.exe+0x51828f)
	#8 __libc_start_main ??:? (libc.so.6+0x24349)
	#9 _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 (for_ordered_01.exe+0x4214e9)

	ThreadSanitizer can not provide additional info.
	SUMMARY: ThreadSanitizer: SEGV /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 in ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int)
	==140865==ABORTING
```

	To reproduce the error, use the following OpenMP code snippet:

```
/* initialise  testMatrixInt Matrix, cols, r and c */
	  #pragma omp parallel private(r,c) shared(testMatrixInt)
	    {
	      #pragma omp for ordered(2)
	      for (r=1; r < rows; r++) {
	        for (c=1; c < cols; c++) {
	          #pragma omp ordered depend(sink:r-1, c+1) depend(sink:r-1,c-1)
	          testMatrixInt[r][c] = (testMatrixInt[r-1][c] + testMatrixInt[r-1][c-1]) % cols ;
	          #pragma omp ordered depend (source)
	        }
	      }
	    }
```

	Compilation:
```
clang -g -stdlib=libc++ -fsanitize=thread -fopenmp -larcher test_case.c
```

	It seems that the changes introduced by commit https://reviews.llvm.org/D114005 cause this particular SEGV when using Archer.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D115328