support for multiple text sections #6
Comments
Thanks for reporting the issue. If multiple text sections are a result of the compiler splitting the code, the workaround is to disable it with
Thanks for the quick reply. BTW, great to see this finally get open-sourced. Thanks! Unfortunately, the separate section is not created by the compiler and cannot easily be removed. Any suggestions?
If functions are not split, then adding the support shouldn't be that difficult. We do process code in sections other than
Thanks, I managed to remove the section suffix. But I got another assertion: perf2bolt: /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/BinaryFunction.cpp:1726: bool llvm::bolt::BinaryFunction::buildCFG(): Assertion `ToBB && "cannot find BB containing TO branch"' failed.
This sounds like either a PIC or an assembly code issue. We are working on adding more diagnostics and improving PIC support. Does the input binary have relocations or not?
The binary was not built with PIC or dynamic relocations. It looks like it failed while building the CFG?
Yes. That's typically a symptom of PIC or of assembly code with an embedded jump table. I have a fix for PIC. I'd like you to try it once it lands. If it doesn't work, I'll ask for more details.
@danielcdh: could you try the latest version?
Thanks! That error is gone, but I hit another issue: perf2bolt: /usr/local/google/home/dehao/bolt/llvm/lib/Support/Unix/Program.inc:312: llvm::sys::ProcessInfo llvm::sys::Wait(const llvm::sys::ProcessInfo&, unsigned int, bool, std::__cxx11::string*): Assertion `PI.Pid && "invalid pid to wait on, process not started?"' failed.
Do you have perf in your PATH? Which version?
You are right, I added perf to PATH and the problem was resolved. I managed to go through the whole process and create a new binary with BOLT. Unfortunately, that binary segfaults immediately when I execute it.
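The "invalid pid to wait on" assertion above turned out to mean that perf2bolt could not launch `perf`. A minimal pre-flight check, sketched in Python under the assumption that perf2bolt looks `perf` up on `PATH` (as the exchange above suggests); the helper name `find_tool` is hypothetical and not part of BOLT:

```python
import shutil
import subprocess


def find_tool(name="perf"):
    """Return (path, version_text) for `name` on PATH, or (None, None) if missing."""
    path = shutil.which(name)
    if path is None:
        return None, None
    try:
        # `perf --version` prints a single version line; other tools may differ.
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return path, None
    return path, (result.stdout.strip() or None)


if __name__ == "__main__":
    path, version = find_tool("perf")
    if path is None:
        print("perf not found on PATH; perf2bolt would fail to start it")
    else:
        print(f"using {path} ({version})")
```

Running this before perf2bolt answers both questions asked above: whether `perf` is reachable and which version it is.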
Is it a trap or a segfault? Since the LLVM disassembler has problems with recent AVX512 instructions, BOLT will mutate functions that use AVX512 into a trap. We are working on improving AVX512 support.
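One way to tell a trap from a segfault is to look at the signal that killed the process. The sketch below assumes a POSIX system, where Python reports death by signal as a negative return code; the function names are hypothetical. On x86-64 Linux a trap instruction typically surfaces as SIGILL or SIGTRAP depending on the instruction used, and which one BOLT emits is not stated in this thread.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode >= 0:
        return f"exited with status {returncode}"
    try:
        # On POSIX, a negative return code -N means "killed by signal N".
        name = signal.Signals(-returncode).name
    except ValueError:
        name = f"signal {-returncode}"
    return f"killed by {name}"


def run_and_describe(argv):
    """Run a command (for example the BOLT-processed binary) and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)
```

For example, `run_and_describe(["./app.bolt"])` (a hypothetical output file name) reporting `killed by SIGSEGV` points to a real segfault rather than a trap.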
By the way, BOLT has a safer mode of operation if things are not quite working yet for your binary (either if you use AVX512 or if you have weird assembly-written code that is causing BOLT to fail to read the binary at some parts). Simply disable relocations in the linker or use -relocs=false in BOLT. This mode is less effective for performance because it does not reorder functions, and tries to reorder basic blocks in place without changing the rest of the binary. It is a far more conservative approach, but can still lead to performance improvements. However, even in relocation mode (our most aggressive processing, where all code in the binary gets rewritten), the binary shouldn't segfault unless there is something very weird happening. Traps can happen for AVX512, though.
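As a rough illustration of switching to the conservative mode described above, here is a small Python helper that assembles an llvm-bolt command line. Only `-relocs=false` comes from this thread; the `-o` and `-data=` arguments and all file names are assumptions for the sake of the example, so check them against your BOLT build before relying on them.

```python
def bolt_command(input_binary, output_binary, profile, relocs=True, extra_flags=()):
    """Build an argv list for llvm-bolt; nothing is executed here."""
    # Assumed basic invocation shape: llvm-bolt <input> -o <output> -data=<profile>
    cmd = ["llvm-bolt", input_binary, "-o", output_binary, f"-data={profile}"]
    if not relocs:
        # Flag quoted in the comment above: fall back to non-relocation mode.
        cmd.append("-relocs=false")
    cmd.extend(extra_flags)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(bolt_command("app", "app.bolt", "perf.fdata", relocs=False)))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later and to toggle the mode while narrowing down a failure.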
It's a segfault. The binary was not built with AVX512. It was also not built with relocations, and I have to disable function reordering when invoking llvm-bolt. Another issue is that llvm-bolt takes ~5 hours to process my 800MB binary and produces a 1.1GB binary. Is that expected? Thanks
If you are suffering from long processing times, you probably have deeply-inlined functions with a lot of basic blocks. For these cases, it's better to use -reorder-blocks=cache instead of -reorder-blocks=cache+. The expected processing time ranges from 2 to 6 minutes for a ~100MB binary. If you use -update-debug-info, this time may climb to close to 10 minutes, in some cases 20 minutes, depending on the code. The resulting binary is larger in your case because you are using non-relocation mode. So the original code section is kept the same size, but the blocks are reordered and sometimes functions are split. If split, the cold part of these functions will account for the extra 300MB you are observing.
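To put the numbers above side by side, here is a back-of-the-envelope calculation in Python. It assumes processing time scales linearly with binary size, which the thread does not claim, and treats 1.1GB as 1100MB.

```python
def scaled_minutes(binary_mb, low_per_100mb=2.0, high_per_100mb=6.0):
    """Scale the quoted 2 to 6 minutes per ~100MB linearly to another size."""
    factor = binary_mb / 100.0
    return low_per_100mb * factor, high_per_100mb * factor


low, high = scaled_minutes(800)   # 800MB input binary from the report
observed_minutes = 5 * 60         # "~5 hours" as reported
growth_mb = 1100 - 800            # 1.1GB output minus 800MB input

print(f"linear estimate: {low:.0f} to {high:.0f} minutes")
print(f"observed: {observed_minutes} minutes, {observed_minutes / high:.2f}x the upper estimate")
print(f"size growth: {growth_mb}MB")
```

Under that linear assumption the estimate is 16 to 48 minutes, so the reported 5 hours is well outside the expected range, while the 300MB growth matches the explanation about split cold code.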
@danielcdh: you have a lot of code :) The fact that the binary crashes immediately is probably an indication of something trivial that we are not getting right. If you strip the binary after BOLT, it could be the reason, as we break some assumptions that
Yeah, it's a large binary :) The segfault happens without stripping. We will try to debug with some smaller binaries and see if the problem can be reproduced.
If I have multiple text sections, I see the following error:
perf2bolt: /usr/local/google/home/dehao/bolt/llvm/tools/llvm-bolt/src/RewriteInstance.cpp:1383: void llvm::bolt::RewriteInstance::discoverFileObjects(): Assertion `Section && "section for functions must be registered."' failed.
Any ideas how to work around the issue?