Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax BALC[32] to 16-bit variant using trampolines #2

Open
wants to merge 13 commits into
base: nmips/gold_v7
Choose a base branch
from

Conversation

djtodoro
Copy link
Collaborator

@djtodoro djtodoro commented Aug 10, 2023

Enabled by default, to disable please use --no-relax-balc-trampolines.

The patch on the LLVM side: MediaTek-Labs/llvm-project#25

@farazs-github
Copy link
Member

Just for my reference, what kind of integration testing has this patch seen?

@djtodoro
Copy link
Collaborator Author

Just for my reference, what kind of integration testing has this patch seen?

  • check-all passes
  • built llvm-test-suite (no regression nor improvements)

@farazs-github
Copy link
Member

No changes in llvm-test-suite sounds suspect. This was with the accompanying llvm patch and -Os?

@djtodoro
Copy link
Collaborator Author

Well, I agree. I will mark this as DRAFT until I get back with the feedback.

@djtodoro djtodoro marked this pull request as draft August 13, 2023 12:16
@farazs-github
Copy link
Member

Well, if you are revising the patch, 2 points:

  1. We discussed that the placeholder relocation would be added to those calls that we do not want to be optimized. Since most of the code is expected to be compiled with -Os and have the optimization enabled, it would be more efficient to mark the ones we want to skip.
  2. Lets decide on a relocation name instead of NONE ... something to indicate that it is a non-stub/non-tramp call, something like R_NANOMIPS_NOTRAMP or R_NANOMIPS_DIRECT_CALL. The relocation number will be 77. I don't mind reviewing with R_NANOMIPS_NONE, but I am pretty sure we won't merge the patch without this change.

@djtodoro djtodoro marked this pull request as ready for review August 15, 2023 14:01
Copy link
Member

@farazs-github farazs-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest you evaluate this against some simple test cases, before resubmitting. I found it to be broken in the most obvious way, by inspecting a few random tests from llvm-test-suites (gcc-loops, sradKernel). Calls that get contracted in the Trampoline phase get expanded back in a later Expansion phase, except the ones where you are inserting the trampoline with an extra BC.

Here is a simple and obvious test:
.text
.linkrelax
.module pcrel
.globl test
.type test,@function
.ent test
test:
jrc $ra
.rept 512
nop32
.endm
.end test

.globl main
.type main,@function
.ent main

main:
balc test
nop
balc test
nop
balc test
nop
balc test
nop
balc test
nop
.end main

gold/nanomips-insn.def Show resolved Hide resolved
@farazs-github farazs-github changed the base branch from mtk/gold_v7 to nmips/gold_v7 September 27, 2023 04:17
gold/nanomips.cc Outdated
@@ -7297,6 +7653,10 @@ Target_nanomips<size, big_endian>::scan_reloc_section_for_transform(
i, prelocs))
continue;

unsigned int notramp_reloc =
has_notramp_reloc<size, big_endian>(r_offset, reloc_count, i, prelocs);
bool has_balc_stub2 = notramp_reloc == false;
Copy link
Member

@farazs-github farazs-github Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Parenthesis around the comparison expression.

  2. From my understanding, please confirm if this is right:
    Input: balc[32] with NOTRAMP -> no_tramp_reloc == true -> has_balc_stub2 == false
    Input: balc[32] without NOTRAMP -> no_tramp_reloc == false -> has_balc_stub2 == true
    Input: any other instruction without NOTRAMP -> no_tramp-reloc == false -> has_balc_stub2 == true??
    The question is, why is has_balc_stub2 true at all for instructions that are not 32-bit calls? At the least, the variable name is misnamed!

gold/nanomips.cc Outdated
@@ -7297,6 +7653,10 @@ Target_nanomips<size, big_endian>::scan_reloc_section_for_transform(
i, prelocs))
continue;

unsigned int notramp_reloc =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect type? Why not bool?

gold/nanomips.cc Outdated
{
unsigned int r_type = elfcpp::elf_r_type<size>(reloc.get_r_info());

if ((r_type != elfcpp::R_NANOMIPS_PC25_S1 || !has_balc_stub2)
Copy link
Member

@farazs-github farazs-github Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary nesting. Would be better to put all parts of the || expression at the same level.

Also note the comment about the name `has_balc_stub2' below. I'd expect has_balc_stub2 to imply that it is a 32-bit call instruction and already has the appropriate relocation and any checking to that effect would simply be in an assert( ).

AndrijaSyrmia and others added 9 commits January 22, 2024 13:02
gold/
	* nanomips-insn.def: Add new transformation.
	* nanomips.cc (Nanomips_expand_insn::type): Expand ADDIU[GP.B]
	to ADDIU[GP.W] if the offset is word-aligned and fits 21 bits.
	* testsuite/Makefile.am: Add new tests.
	* testsuite/Makefile.in: Re-generate.
	* testsuite/nanomips_gprel_out_of_range_small.s: New source.
	* testsuite/nanomips_gprel_out_of_range_small.t: New script.
	* testsuite/nanomips_gprel_out_of_range.sh: Check new test
	output.
Enabled by default, to disable use
  --no-relax-balc-trampolines
farazs-github pushed a commit that referenced this pull request Apr 28, 2024
When running test-case gdb.server/connect-with-no-symbol-file.exp on
aarch64-linux (specifically, an opensuse leap 15.5 container on a
fedora asahi 39 system), I run into:
...
(gdb) detach^M
Detaching from program: target:connect-with-no-symbol-file, process 185104^M
Ending remote debugging.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
...

The detailed backtrace of the corefile is:
...
 (gdb) bt
 #0  0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0
 #1  0x00000000007a86b4 in handle_fatal_signal (sig=6)
     at gdb/event-top.c:926
 #2  <signal handler called>
 #3  0x0000ffff74b977b4 in raise () from /lib64/libc.so.6
 #4  0x0000ffff74b98c18 in abort () from /lib64/libc.so.6
 #5  0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() ()
    from /usr/lib64/libstdc++.so.6
 #6  0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6
 #7  0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6
 bminor#8  0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6
 bminor#9  0x0000000001548870 in throw_it (reason=RETURN_ERROR,
     error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:203
 bminor#10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:211
 bminor#11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed")
     at gdbsupport/common-exceptions.cc:226
 bminor#12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2)
     at gdb/remote.c:9856
 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90,
     buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326
 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command
     (this=0x233d3d90, command_bytes=13, which_packet=17,
     remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88,
     attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567
 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3,
     st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38)
     at gdb/remote.c:12979
 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020,
     target_errno=0xfffff1a3cf38) at gdb/target.c:3315
 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400,
     abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467
 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *,
     void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400,
     sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955
 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *,
     stat *) () at gdb/gdb_bfd.c:956
 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020)
     at bfd/opncls.c:665
 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020)
     at bfd/bfdio.c:431
 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52
 #23 0x0000000000c3a3e8 in generic_mourn_inferior ()
     at gdb/target.c:3642
 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90)
     at gdb/remote.c:6067
 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90)
     at gdb/remote.c:10587
 #26 0x0000000000c387cc in target_mourn_inferior (
     ptid=<error reading variable: Cannot access memory at address 0x2d310>)
     at gdb/target.c:2738
 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6421
 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6436
 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1)
     at gdb/target.c:2526
 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1)
    at gdb/infcmd.c:2817
 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0)
     at gdb/cli/cli-decode.c:94
 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1)
     at gdb/cli/cli-decode.c:2741
 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1)
     at gdb/top.c:570
 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "")
     at gdb/event-top.c:566
 #35 0x00000000007a8290 in command_line_handler (rl=...)
     at gdb/event-top.c:802
 #36 0x0000000000c9092c in tui_command_line_handler (rl=...)
     at gdb/tui/tui-interp.c:103
 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach")
     at gdb/event-top.c:258
 #38 0x0000000000d910f4 in rl_callback_read_char ()
     at readline/readline/callback.c:290
 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept ()
     at gdb/event-top.c:194
 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper
     (client_data=0x22fbf640) at gdb/event-top.c:233
 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640)
     at gdb/ui.c:154
 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1)
     at gdbsupport/event-loop.cc:572
 #43 0x000000000154f21c in gdb_wait_for_event (block=1)
     at gdbsupport/event-loop.cc:693
 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1)
    at gdbsupport/event-loop.cc:263
 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400
 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464
 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58)
     at gdb/main.c:1338
 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58)
     at gdb/main.c:1357
 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8)
     at gdb/gdb.c:38
 (gdb)
...

The abort happens because a c++ exception escapes to c code, specifically
opncls_bstat in bfd/opncls.c.  Compiling with -fexceptions works around this.

Fix this by catching the exception just before it escapes, in stat_trampoline
and likewise in few similar spot.

Add a new template catch_exceptions to do so in a consistent way.

Tested on aarch64-linux.

Approved-by: Pedro Alves <[email protected]>

PR remote/31577
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577
farazs-github pushed a commit that referenced this pull request May 5, 2024
If threads are disabled, either by --disable-threading explicitely, or by
missing std::thread support, you get the following ASAN error when
loading symbols:

==7310==ERROR: AddressSanitizer: heap-use-after-free on address 0x614000002128 at pc 0x00000098794a bp 0x7ffe37e6af70 sp 0x7ffe37e6af68
READ of size 1 at 0x614000002128 thread T0
    #0 0x987949 in index_cache_store_context::store() const ../../gdb/dwarf2/index-cache.c:163
    #1 0x943467 in cooked_index_worker::write_to_cache(cooked_index const*, deferred_warnings*) const ../../gdb/dwarf2/cooked-index.c:601
    #2 0x1705e39 in std::function<void ()>::operator()() const /gcc/9/include/c++/9.2.0/bits/std_function.h:690
    #3 0x1705e39 in gdb::task_group::impl::~impl() ../../gdbsupport/task-group.cc:38

0x614000002128 is located 232 bytes inside of 408-byte region [0x614000002040,0x6140000021d8)
freed by thread T0 here:
    #0 0x7fd75ccf8ea5 in operator delete(void*, unsigned long) ../../.././libsanitizer/asan/asan_new_delete.cc:177
    #1 0x9462e5 in cooked_index::index_for_writing() ../../gdb/dwarf2/cooked-index.h:689
    #2 0x9462e5 in operator() ../../gdb/dwarf2/cooked-index.c:657
    #3 0x9462e5 in _M_invoke /gcc/9/include/c++/9.2.0/bits/std_function.h:300

It's happening because cooked_index_worker::wait always returns true in
this case, which tells cooked_index::wait it can delete the m_state
cooked_index_worker member, but cooked_index_worker::write_to_cache tries
to access it immediately afterwards.

Fixed by making cooked_index_worker::wait only return true if desired_state
is CACHE_DONE, same as if threading was enabled, so m_state will not be
prematurely deleted.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31694
Approved-By: Tom Tromey <[email protected]>
Fixed issues in align

Changed picking balc tramp candidates a bit

Reimplemented processing of R_NANOMIPS_NOTRAMP
farazs-github pushed a commit that referenced this pull request Jul 19, 2024
Similar to the x86_64 testcases, some .s files contain the corresponding
CFI directives.  This helps in validating the synthesized CFI by running
those tests with and without the --scfi=experimental command line
option.

GAS issues some diagnostics, enabled by default, with
--scfi=experimental.  The diagnostics have been added with an intent to
help user correct inadvertent errors in their hand-written asm.  An
error is issued when GAS finds that input asm is not amenable to
accurate CFI synthesis.  The existing scfi-diag-*.s tests in the
gas/testsuite/gas/scfi/x86_64 directory test some SCFI diagnostics
already:

      - (#1) "Warning: SCFI: Asymetrical register restore"
      - (#2) "Error: SCFI: usage of REG_FP as scratch not supported"
      - (#3) "Error: SCFI: unsupported stack manipulation pattern"
      - (#4) "Error: untraceable control flow for func 'XXX'"

In the newly added aarch64 testsuite, further tests for additional
diagnostics have been added:
 - scfi-diag-1.s in this patch highlights an aarch64-specific diagnostic:
   (#5) "Warning: SCFI: ignored probable save/restore op with reg offset"

Additionally, some testcases are added to showcase the (currently)
unsupported patterns, e.g., scfi-unsupported-1.s
        mov     x16, 4384
        sub     sp, sp, x16

gas/testsuite/:
	* gas/scfi/README: Update comment to include aarch64.
	* gas/scfi/aarch64/scfi-aarch64.exp: New file.
	* gas/scfi/aarch64/ginsn-arith-1.l: New test.
	* gas/scfi/aarch64/ginsn-arith-1.s: New test.
	* gas/scfi/aarch64/ginsn-cofi-1.l: New test.
	* gas/scfi/aarch64/ginsn-cofi-1.s: New test.
	* gas/scfi/aarch64/ginsn-ldst-1.l: New test.
	* gas/scfi/aarch64/ginsn-ldst-1.s: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.d: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.l: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.s: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.d: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.l: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.s: New test.
	* gas/scfi/aarch64/scfi-cb-1.d: New test.
	* gas/scfi/aarch64/scfi-cb-1.l: New test.
	* gas/scfi/aarch64/scfi-cb-1.s: New test.
	* gas/scfi/aarch64/scfi-cfg-1.d: New test.
	* gas/scfi/aarch64/scfi-cfg-1.l: New test.
	* gas/scfi/aarch64/scfi-cfg-1.s: New test.
	* gas/scfi/aarch64/scfi-cfg-2.d: New test.
	* gas/scfi/aarch64/scfi-cfg-2.l: New test.
	* gas/scfi/aarch64/scfi-cfg-2.s: New test.
	* gas/scfi/aarch64/scfi-cfg-3.d: New test.
	* gas/scfi/aarch64/scfi-cfg-3.l: New test.
	* gas/scfi/aarch64/scfi-cfg-3.s: New test.
	* gas/scfi/aarch64/scfi-cfg-4.l: New test.
	* gas/scfi/aarch64/scfi-cfg-4.s: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.d: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.l: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.s: New test.
	* gas/scfi/aarch64/scfi-diag-1.l: New test.
	* gas/scfi/aarch64/scfi-diag-1.s: New test.
	* gas/scfi/aarch64/scfi-diag-2.l: New test.
	* gas/scfi/aarch64/scfi-diag-2.s: New test.
	* gas/scfi/aarch64/scfi-diag-3.l: New test.
	* gas/scfi/aarch64/scfi-diag-3.s: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.d: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.l: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.s: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.d: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.l: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.s: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.d: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.l: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.s: New test.
	* gas/scfi/aarch64/scfi-strp-1.d: New test.
	* gas/scfi/aarch64/scfi-strp-1.l: New test.
	* gas/scfi/aarch64/scfi-strp-1.s: New test.
	* gas/scfi/aarch64/scfi-strp-2.d: New test.
	* gas/scfi/aarch64/scfi-strp-2.l: New test.
	* gas/scfi/aarch64/scfi-strp-2.s: New test.
	* gas/scfi/aarch64/scfi-unsupported-1.l: New test.
	* gas/scfi/aarch64/scfi-unsupported-1.s: New test.
	* gas/scfi/aarch64/scfi-unsupported-2.l: New test.
	* gas/scfi/aarch64/scfi-unsupported-2.s: New test.
farazs-github pushed a commit that referenced this pull request Jul 24, 2024
On arm-linux, I run into:
...
PASS: gdb.ada/mi_task_arg.exp: mi runto task_switch.break_me
Expecting: ^(-stack-list-arguments 1[^M
]+)?(\^done,stack-args=\[frame={level="0",args=\[\]},frame={level="1",args=\[{name="<_task>",value="0x[0-9A-Fa-f]+"}(,{name="<_taskL>",value="[0-9]+"})?\]},frame={level="2",args=\[({name="self_id",value="(0x[0-9A-Fa-f]+|<optimized out>)"})?\]},.*[^M
]+[(]gdb[)] ^M
[ ]*)
-stack-list-arguments 1^M
^done,stack-args=[frame={level="0",args=[]},frame={level="1",args=[{name="<_task>",value="0x40bc48"}]},frame={level="2",args=[]}]^M
(gdb) ^M
FAIL: gdb.ada/mi_task_arg.exp: -stack-list-arguments 1 (unexpected output)
...

The problem is that the test-case expects a level 3 frame, but there is none.

This can be reproduced using cli bt:
...
 $ gdb -q -batch outputs/gdb.ada/mi_task_arg/task_switch \
   -ex "b task_switch.break_me" \
   -ex run \
   -ex bt
 Breakpoint 1 at 0x34b4: file task_switch.adb, line 57.

 Thread 3 "my_caller" hit Breakpoint 1, task_switch.break_me () \
   at task_switch.adb:57
 57	      null;
 #0  task_switch.break_me () at task_switch.adb:57
 #1  0x00403424 in task_switch.caller (<_task>=0x40bc48) at task_switch.adb:51
 #2  0xf7f95a08 in ?? () from /lib/arm-linux-gnueabihf/libgnarl-12.so
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)
...

The purpose of the test-case is printing the frame at level 1, so I don't
think we should bother about the presence of the frame at level 3.

Fix this by allowing the backtrace to stop at level 2.

Tested on arm-linux.

Approved-By: Luis Machado <[email protected]>
Approved-By: Andrew Burgess <[email protected]>
farazs-github pushed a commit that referenced this pull request Jul 31, 2024
Since commit b1da98a ("gdb: remove use of alloca in
new_macro_definition"), if cached_argv is empty, we call macro_bcache
with a nullptr data.  This ends up caught by UBSan deep down in the
bcache code:

    $ ./gdb -nx -q --data-directory=data-directory  /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp -readnow
    Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    Expanding full symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    /home/smarchi/src/binutils-gdb/gdb/bcache.c:195:12: runtime error: null pointer passed as argument 2, which is declared to never be null

The backtrace:

    #1  0x00007ffff619a05d in __ubsan::__ubsan_handle_nonnull_arg_abort (Data=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:750
    #2  0x000055556337fba2 in gdb::bcache::insert (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.c:195
    #3  0x0000555564b49222 in gdb::bcache::insert<char const*, void> (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.h:158
    #4  0x0000555564b481fa in macro_bcache<char const*> (t=0x62100007ae70, addr=0x0, len=0) at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:117
    #5  0x0000555564b42b4a in new_macro_definition (t=0x62100007ae70, kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:573
    #6  0x0000555564b44674 in macro_define_internal (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:777
    #7  0x0000555564b44ae2 in macro_define_function (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:816
    bminor#8  0x0000555563f62fc8 in parse_macro_definition (file=0x6210000ab9e0, line=469, body=0x62a00003af2a "__va_arg_pack() __builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:203

This can be reproduced by running gdb.base/macscp.exp.  Avoid calling
macro_bcache if the macro doesn't have any arguments.

Change-Id: I33b5a7c3b3a93d5adba98983fcaae9c8522c383d
farazs-github pushed a commit that referenced this pull request Aug 2, 2024
Some flavors of indirect call and jmp instructions were not being
handled earlier, leading to a GAS error (#1):
  (#1) "Error: SCFI: unhandled op 0xff may cause incorrect CFI"

Not handling jmp/call (direct or indirect) ops is an error (as shown
above) because SCFI needs an accurate CFG to synthesize CFI correctly.
Recall that the presence of indirect jmp/call, however, does make the
CFG ineligible for SCFI. In other words, generating the ginsns for them
now, will eventually cause SCFI to bail out later with an error (#2)
anyway:
  (#2) "Error: untraceable control flow for func 'XXX'"

The first error (#1) gives the impression of missing functionality in
GAS.  So, it seems cleaner to synthesize a GINSN_TYPE_JUMP /
GINSN_TYPE_CALL now in the backend, and let SCFI machinery complain with
the error as expected.

The handling for these indirect jmp/call instructions is similar, so
reuse the code by carving out a function for the same.

Adjust the testcase to include the now handled jmp/call instructions as
well.

gas/
	* config/tc-i386-ginsn.c (x86_ginsn_indirect_branch): New
	function.
	(x86_ginsn_new): Refactor out functionality to above.

gas/testsuite/
	* gas/scfi/x86_64/ginsn-cofi-1.l: Adjust the output.
	* gas/scfi/x86_64/ginsn-cofi-1.s: Add further varieties of
	jmp/call opcodes.
farazs-github pushed a commit that referenced this pull request Sep 5, 2024
With test-case gdb.dwarf2/dw2-lines.exp on arm-linux, I run into:
...
(gdb) break bar_label^M
Breakpoint 2 at 0x4004f6: file dw2-lines.c, line 29.^M
(gdb) continue^M
Continuing.^M
^M
Breakpoint 2, bar () at dw2-lines.c:29^M
29        foo (2);^M
(gdb) PASS: $exp: cv=2: cdw=32: lv=2: ldw=32: continue to breakpoint: foo \(1\)
...

The pass is incorrect because the continue lands at line 29 with "foo (2)"
instead of line line 27 with "foo (1)".

A minimal version is:
...
$ gdb -q -batch dw2-lines.cv-2-cdw-32-lv-2-ldw-32 -ex "b bar_label"
Breakpoint 1 at 0x4f6: file dw2-lines.c, line 29.
...
where:
...
000004ec <bar>:
 4ec:	b580      	push	{r7, lr}
 4ee:	af00      	add	r7, sp, #0

000004f0 <bar_label>:
 4f0:	2001      	movs	r0, #1
 4f2:	f7ff fff1 	bl	4d8 <foo>

000004f6 <bar_label_2>:
 4f6:	2002      	movs	r0, #2
 4f8:	f7ff ffee 	bl	4d8 <foo>
...

So, how does this happen?  In short:
- skip_prologue_sal calls arm_skip_prologue with pc == 0x4ec,
- thumb_analyze_prologue returns 0x4f2
  (overshooting by 1 insn, PR tdep/31981), and
- skip_prologue_sal decides that we're mid-line, and updates to 0x4f6.

However, this is a test-case about .debug_line info, so why didn't arm_skip_prologue
use the line info to skip the prologue?

The answer is that the line info starts at bar_label, not at bar.

Fixing that allows us to work around PR tdep/31981.

Likewise in gdb.dwarf2/dw2-line-number-zero.exp.

Instead, add a new test-case gdb.arch/skip-prologue.exp that is dedicated to
checking quality of architecture-specific prologue analysis, without being
written in an architecture-specific way.

If fails on arm-linux for both marm and mthumb:
...
FAIL: gdb.arch/skip-prologue.exp: f2: $bp_addr == $prologue_end_addr (skipped too much)
FAIL: gdb.arch/skip-prologue.exp: f4: $bp_addr == $prologue_end_addr (skipped too much)
...
and passes for:
- x86_64-linux for {m64,m32}x{-fno-PIE/-no-pie,-fPIE/-pie}
- aarch64-linux.

Tested on arm-linux.
farazs-github pushed a commit that referenced this pull request Sep 8, 2024
When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the library's file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-ids extracted in core_target::build_file_mappings to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this I propose making the following changes to GDB:

(1) Rather than just recording the DT_SONAME to build-id mapping in
set_cbfd_soname_build_id, we should also record, the full filename to
build-id mapping, and also the memory ranges to build-id mapping for
every memory range covered by every mapped file.

(2) Add a new callback solib_ops::find_solib_addr.  This callback
takes a solib object and returns an (optional) address within the
inferior that is part of this library.  We can use this address to
find a mapped file using the stored memory ranges which will increase
the cases in which a match can be found.

(3) Move the mapped file record keeping out of solib.c and into
corelow.c.  Future commits will make use of this information from
other parts of GDB.  This information was never solib specific, it
lived in the solib.c file because that was the only user of the data,
but really, the data is all about the core file, and should be stored
in core_target, other parts of GDB can then query this data as needed.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, we also create a build-id to filename map.  This is
useful as often a shared library is implemented as a symbolic link to
the actual shared library file.  The mapped file information is stored
based on the actual, real file name, while the shared library
information holds the original symbolic link file name.

If when loading the shared library, we find the symbolic link has
disappeared, we can use the build-id to file name map to check if the
actual file is still around, if it is (and if the build-id matches)
then we can fall back to use that file.  This is another way in which
we can slightly increase the chances that GDB will find the required
files when loading a core file.

Adding all of the above required pretty much a full rewrite of the
existing set_cbfd_soname_build_id function and the corresponding
get_cbfd_soname_build_id function, so I have taken the opportunity to
move the information caching out of solib.c and into corelow.c where
it is now accessed through the function core_target_find_mapped_file.

At this point the benefit of this move is not entirely obvious, though
I don't think the new location is significantly worse than where it
was originally.  The benefit though is that the cached information is
no longer tied to the shared library loading code.

I already have a second set of patches (not in this series) that make
use of this caching from elsewhere in GDB.  I've not included those
patches in this series as this series is already pretty big, but even
if those follow up patches don't arrive, I think the new location is
just as good as the original location.

Rather that caching the information within the core file BFD via the
registry mechanism, the information used for the mapped file lookup is
now stored within the core_file target directly.
farazs-github pushed a commit that referenced this pull request Sep 9, 2024
The commit:

  commit c6b4867
  Date:   Thu Mar 30 19:21:22 2023 +0100

      gdb: parse pending breakpoint thread/task immediately

Introduce a use bug where the value of a temporary variable was being
used after it had gone out of scope.  This was picked up by the
address sanitizer and would result in this error:

  (gdb) maintenance selftest create_breakpoint_parse_arg_string
  Running selftest create_breakpoint_parse_arg_string.
  =================================================================
  ==2265825==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fbb08046511 at pc 0x000001632230 bp 0x7fff7c2fb770 sp 0x7fff7c2fb768
  READ of size 1 at 0x7fbb08046511 thread T0
      #0 0x163222f in create_breakpoint_parse_arg_string(char const*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, int*, int*, int*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, bool*) ../../src/gdb/break-cond-parse.c:496
      #1 0x1633026 in test ../../src/gdb/break-cond-parse.c:582
      #2 0x163391b in create_breakpoint_parse_arg_string_tests ../../src/gdb/break-cond-parse.c:649
      #3 0x12cfebc in void std::__invoke_impl<void, void (*&)()>(std::__invoke_other, void (*&)()) /usr/include/c++/13/bits/invoke.h:61
      #4 0x12cc8ee in std::enable_if<is_invocable_r_v<void, void (*&)()>, void>::type std::__invoke_r<void, void (*&)()>(void (*&)()) /usr/include/c++/13/bits/invoke.h:111
      #5 0x12c81e5 in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) /usr/include/c++/13/bits/std_function.h:290
      #6 0x18bb51d in std::function<void ()>::operator()() const /usr/include/c++/13/bits/std_function.h:591
      #7 0x4193ef9 in selftests::run_tests(gdb::array_view<char const* const>, bool) ../../src/gdbsupport/selftest.cc:100
      bminor#8 0x21c2206 in maintenance_selftest ../../src/gdb/maint.c:1172
      ... etc ...

The problem was caused by three lines like this one:

  thread_info *thr
    = parse_thread_id (std::string (t.get_value ()).c_str (), &tmptok);

After parsing the thread-id TMPTOK would be left pointing into the
temporary string which had been created on this line.  When on the
next line we did this:

  gdb_assert (*tmptok == '\0');

The value of *TMPTOK is undefined.

Fix this by creating the std::string earlier in the scope.  Now the
contents of the string will remain valid when we check *TMPTOK.  The
address sanitizer issue is now resolved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants