Squelches unhelpful console spam. by stellaraccident · Pull Request #48 · ROCm/TheRock

stellaraccident · 2025-02-01T03:18:47Z

Adds amd-llvm specific warning flags for sub-projects, disabling some very chatty low signal warnings (-Wno-documentation-unknown-command -Wno-documentation-pedantic -Wno-unused-command-line-argument).
Sets -DLLVM_ENABLE_PEDANTIC=OFF for LLVM if building on GCC since it seems that the project doesn't audit this carefully and is full of spam.
Disables verbose logs of sub-project installation sequences.

Updates #47.

* Adds amd-llvm specific warning flags for sub-projects, disabling some very chatty low signal warnings (`-Wno-documentation-unknown-command -Wno-documentation-pedantic -Wno-unused-command-line-argument`). * Sets `-DLLVM_ENABLE_PEDANTIC=OFF` for LLVM if building on GCC since it seems that the project doesn't audit this carefully and is full of spam. * Disables verbose logs of sub-project installation sequences. Updates #47.

stellaraccident · 2025-02-01T03:47:03Z

@ScottTodd for post-review

ScottTodd

Nice, 72000 lines of output before --> 18500 lines of output after :D!

When running test-case gdb.server/connect-with-no-symbol-file.exp on aarch64-linux (specifically, an opensuse leap 15.5 container on a fedora asahi 39 system), I run into: ... (gdb) detach^M Detaching from program: target:connect-with-no-symbol-file, process 185104^M Ending remote debugging.^M terminate called after throwing an instance of 'gdb_exception_error'^M ... The detailed backtrace of the corefile is: ... (gdb) bt #0 0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0 #1 0x00000000007a86b4 in handle_fatal_signal (sig=6) at gdb/event-top.c:926 #2 <signal handler called> #3 0x0000ffff74b977b4 in raise () from /lib64/libc.so.6 #4 0x0000ffff74b98c18 in abort () from /lib64/libc.so.6 #5 0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6 #6 0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6 #7 0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6 #8 0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6 #9 0x0000000001548870 in throw_it (reason=RETURN_ERROR, error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...) at gdbsupport/common-exceptions.cc:203 #10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...) at gdbsupport/common-exceptions.cc:211 #11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed") at gdbsupport/common-exceptions.cc:226 #12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2) at gdb/remote.c:9856 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90, buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command (this=0x233d3d90, command_bytes=13, which_packet=17, remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88, attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3, st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38) at gdb/remote.c:12979 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020, target_errno=0xfffff1a3cf38) at gdb/target.c:3315 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400, abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *, void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *, stat *) () at gdb/gdb_bfd.c:956 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020) at bfd/opncls.c:665 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020) at bfd/bfdio.c:431 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52 #23 0x0000000000c3a3e8 in generic_mourn_inferior () at gdb/target.c:3642 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90) at gdb/remote.c:6067 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90) at gdb/remote.c:10587 #26 0x0000000000c387cc in target_mourn_inferior ( ptid=<error reading variable: Cannot access memory at address 0x2d310>) at gdb/target.c:2738 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90, inf=0x22fce540, from_tty=1) at gdb/remote.c:6421 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90, inf=0x22fce540, from_tty=1) at gdb/remote.c:6436 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1) at gdb/target.c:2526 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1) at gdb/infcmd.c:2817 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0) at gdb/cli/cli-decode.c:94 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1) at gdb/cli/cli-decode.c:2741 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1) at gdb/top.c:570 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "") at gdb/event-top.c:566 #35 0x00000000007a8290 in command_line_handler (rl=...) at gdb/event-top.c:802 #36 0x0000000000c9092c in tui_command_line_handler (rl=...) at gdb/tui/tui-interp.c:103 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach") at gdb/event-top.c:258 #38 0x0000000000d910f4 in rl_callback_read_char () at readline/readline/callback.c:290 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept () at gdb/event-top.c:194 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper (client_data=0x22fbf640) at gdb/event-top.c:233 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640) at gdb/ui.c:154 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1) at gdbsupport/event-loop.cc:572 #43 0x000000000154f21c in gdb_wait_for_event (block=1) at gdbsupport/event-loop.cc:693 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1) at gdbsupport/event-loop.cc:263 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58) at gdb/main.c:1338 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58) at gdb/main.c:1357 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8) at gdb/gdb.c:38 (gdb) ... The abort happens because a c++ exception escapes to c code, specifically opncls_bstat in bfd/opncls.c. Compiling with -fexceptions works around this. Fix this by catching the exception just before it escapes, in stat_trampoline and likewise in few similar spot. Add a new template catch_exceptions to do so in a consistent way. Tested on aarch64-linux. Approved-by: Pedro Alves <pedro@palves.net> PR remote/31577 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577

…rget_breakpoint::check_status ROCgdb handles target events very slowly when running a test case like this, where a breakpoint is preset on HipTest::vectorADD: for (int i=0; i < numDevices; ++i) { HIPCHECK(hipSetDevice(i)); hipLaunchKernelGGL(HipTest::vectorADD, dim3(blocks), dim3(threadsPerBlock), 0, stream[i], static_cast<const int*>(A_d[i]), static_cast<const int*>(B_d[i]), C_d[i], N); } What happens is: - A kernel is launched - The internal runtime breakpoint is hit during the second hipLaunchKernelGGL call, which causes amd_dbgapi_target_breakpoint::check_status to be called - Meanwhile, all waves of the kernel hit the breakpoint on vectorADD - amd_dbgapi_target_breakpoint::check_status calls process_event_queue, which pulls the thousand of breakpoint hit events from the kernel - As part of handling the breakpoint hit events, we write the PC of the waves that stopped to decrement it. Because the forward progress requirement is not disabled, this causes a suspend/resume of the queue each time, which is time-consuming. The stack trace where this all happens is: #32 0x00007ffff6b9abda in amd_dbgapi_write_register (wave_id=..., register_id=..., offset=0, value_size=8, value=0x7fffea9fdcc0) at /home/smarchi/src/amd-dbgapi/src/register.cpp:587 #33 0x00005555588c0bed in amd_dbgapi_target::store_registers (this=0x55555c7b1d20 <the_amd_dbgapi_target>, regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2504 #34 0x000055555a5186a1 in target_store_registers (regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/target.c:3973 #35 0x0000555559fab831 in regcache::raw_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:890 #36 0x0000555559fabd2b in regcache::cooked_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:915 #37 0x0000555559fc3ca5 in regcache::cooked_write<unsigned long, void> (this=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:850 #38 0x0000555559fab09a in regcache_cooked_write_unsigned (regcache=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:858 #39 0x0000555559fb0678 in regcache_write_pc (regcache=0x507000002240, pc=0x7ffff62bd900) at /home/smarchi/src/wt/amd/gdb/regcache.c:1460 #40 0x00005555588bb37d in process_one_event (event_id=..., event_kind=AMD_DBGAPI_EVENT_KIND_WAVE_STOP) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1873 #41 0x00005555588bbf7b in process_event_queue (process_id=..., until_event_kind=AMD_DBGAPI_EVENT_KIND_BREAKPOINT_RESUME) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2006 #42 0x00005555588b1aca in amd_dbgapi_target_breakpoint::check_status (this=0x511000140900, bs=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:890 #43 0x0000555558c50080 in bpstat_stop_status (aspace=0x5070000061b0, bp_addr=0x7fffed0b9ab0, thread=0x518000026c80, ws=..., stop_chain=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/breakpoint.c:6126 #44 0x000055555984f4ff in handle_signal_stop (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:7169 #45 0x000055555984b889 in handle_inferior_event (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:6621 #46 0x000055555983eab6 in fetch_inferior_event () at /home/smarchi/src/wt/amd/gdb/infrun.c:4750 #47 0x00005555597caa5f in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/wt/amd/gdb/inf-loop.c:42 #48 0x00005555588b838e in handle_target_event (client_data=0x0) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1513 Fix that performance problem by disabling the forward progress requirement in amd_dbgapi_target_breakpoint::check_status, before calling process_event_queue, so that we can process all events efficiently. Since the same performance problem could theoritically happen any time process_event_queue is called with forward progress requirement enabled, add an assert to ensure that forward progress requirement is disabled when process_event_queue is invoked. This makes it necessary to add a require_forward_progress call to amd_dbgapi_finalize_core_attach. It looks a bit strange, since core files don't have execution, but it doesn't hurt. Add a test that replicates this scenario. The test launches a kernel that hits a breakpoint (with an always false condition) repeatedly. Meanwhile, the host process loads an unloads a code object, causing check_status to be called. Bug: SWDEV-482511 Change-Id: Ida86340d679e6bd8462712953458c07ba3fd49ec Approved-by: Lancelot Six <lancelot.six@amd.com> (cherry picked from commit bb7c679902ece35d1427a88e9a79305544b009e1)

stellaraccident merged commit 5d7de8e into main Feb 1, 2025

stellaraccident deleted the squelch_spam branch February 1, 2025 03:46

ScottTodd reviewed Feb 1, 2025

View reviewed changes

erman-gurses mentioned this pull request Oct 10, 2025

[Windows] Enable rocThrust tests by packaging .hip.exe #1753

Merged

1 task

erman-gurses mentioned this pull request Oct 22, 2025

[Issue]: One of the rocthrust tests failed #1865

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Squelches unhelpful console spam.#48

Squelches unhelpful console spam.#48
stellaraccident merged 1 commit into
mainfrom
squelch_spam

stellaraccident commented Feb 1, 2025

Uh oh!

stellaraccident commented Feb 1, 2025

Uh oh!

ScottTodd left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

stellaraccident commented Feb 1, 2025

Uh oh!

stellaraccident commented Feb 1, 2025

Uh oh!

ScottTodd left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants