Refactor: Use GDB for stack decode#410
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 415-419: The code incorrectly treats the result of
platform.get_package_dir(...) as an object with a .path attribute; change the
gdb path construction in the block that uses
get_package_dir("tool-riscv32-esp-elf-gdb") to use the returned string directly
(e.g., use pkg or pkg_path returned from get_package_dir) when building gdb_bin
in the filter_exception_decoder module so that gdb_bin = os.path.join(pkg,
"bin", "riscv32-esp-elf-gdb") (or equivalent) instead of referencing pkg.path;
keep the existing pkg truthiness check and error handling.
- Around line 559-563: The addr2line subprocess.check_output(...) calls that
decode addresses (the invocation that sets raw =
subprocess.check_output(args).decode(enc) and the other addr2line call later)
must include a timeout (use timeout=10) and their except blocks should catch
subprocess.TimeoutExpired in addition to subprocess.CalledProcessError; on
TimeoutExpired mirror the existing failure handling by populating
self._addr_cache[(addr, elf_path)] = None for each addr and returning, following
the same pattern used elsewhere in this module for addr2line timeouts.
- Around line 49-50: The module currently calls DefaultEnvironment() and
env.PioPlatform() at import time (symbols DefaultEnvironment, env, PioPlatform),
which triggers a NameError when _RSP_SERVER_MODE is True because
DefaultEnvironment isn’t available; wrap that initialization in a guard that
checks the module-level _RSP_SERVER_MODE flag before calling
DefaultEnvironment()/env.PioPlatform(), and when _RSP_SERVER_MODE is True set
env and platform to None (or defer creating them until needed) so
_run_rsp_server() can run without importing DefaultEnvironment; update any later
references to use the guarded values or lazily initialize them.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
monitor/filter_exception_decoder.py (1)
39-43:⚠️ Potential issue | 🔴 CriticalImport
DefaultEnvironmentbefore it is called.Line 40 uses
DefaultEnvironment()but the symbol is never imported, so monitor mode can fail at import time withNameError.💡 Proposed fix
if not _RSP_SERVER_MODE: + from SCons.Script import DefaultEnvironment env = DefaultEnvironment() platform = env.PioPlatform() from platformio.compat import IS_WINDOWSRun this read-only check; expected current output is
used=Trueandimported_from_scons_script=False:#!/bin/bash python - <<'PY' import ast from pathlib import Path path = Path("monitor/filter_exception_decoder.py") tree = ast.parse(path.read_text(encoding="utf-8")) used = any(isinstance(n, ast.Name) and n.id == "DefaultEnvironment" for n in ast.walk(tree)) imported = any( isinstance(n, ast.ImportFrom) and n.module == "SCons.Script" and any(alias.name == "DefaultEnvironment" for alias in n.names) for n in ast.walk(tree) ) print({"used": used, "imported_from_scons_script": imported}) PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 39 - 43, DefaultEnvironment is used in the block guarded by _RSP_SERVER_MODE but never imported; add an import for DefaultEnvironment (e.g. from SCons.Script import DefaultEnvironment) before calling DefaultEnvironment() so env = DefaultEnvironment() succeeds, keeping the import positioned so it only executes when not _RSP_SERVER_MODE or at module top if appropriate to other callers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1222-1227: The packet-parsing loop that reads from stdin
(variables stdin, c, data) can spin forever if stdin.read(1) returns b'' (EOF);
update the loop to check for EOF after reading c (e.g., if c == b'': handle EOF
by breaking/returning or raising a specific exception and/or logging), and
ensure any partial data is discarded or handled consistently before exiting so
the function (the packet parsing routine that contains this while True loop)
does not infinite-loop when the stream closes.
---
Duplicate comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 39-43: DefaultEnvironment is used in the block guarded by
_RSP_SERVER_MODE but never imported; add an import for DefaultEnvironment (e.g.
from SCons.Script import DefaultEnvironment) before calling DefaultEnvironment()
so env = DefaultEnvironment() succeeds, keeping the import positioned so it only
executes when not _RSP_SERVER_MODE or at module top if appropriate to other
callers.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
monitor/filter_exception_decoder.py (1)
1228-1230:⚠️ Potential issue | 🟡 MinorValidate checksum trailer read before processing packet.
stdin.read(2)may return fewer than 2 bytes on EOF. In that case, the server should exit instead of acknowledging and processing a truncated packet.💡 Proposed fix
while True: c = stdin.read(1) if not c: # EOF mid-packet — discard partial data and exit sys.exit(0) if c == b"#": - stdin.read(2) # checksum bytes + checksum = stdin.read(2) # checksum bytes + if len(checksum) < 2: + sys.exit(0) break data += c🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 1228 - 1230, When you encounter the trailer marker (the branch where c == b"#"), read the two checksum bytes into a variable and validate that stdin.read(2) returned exactly 2 bytes before breaking and continuing packet processing; if the read returns fewer than 2 bytes treat it as EOF/truncated packet and terminate/exit (or return an error) instead of breaking and acknowledging the packet. Update the code around the c == b"#" branch (the call to stdin.read(2)) to assign the result, check len(checksum_bytes) == 2, and bail out cleanly if not.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1228-1230: When you encounter the trailer marker (the branch where
c == b"#"), read the two checksum bytes into a variable and validate that
stdin.read(2) returned exactly 2 bytes before breaking and continuing packet
processing; if the read returns fewer than 2 bytes treat it as EOF/truncated
packet and terminate/exit (or return an error) instead of breaking and
acknowledging the packet. Update the code around the c == b"#" branch (the call
to stdin.read(2)) to assign the result, check len(checksum_bytes) == 2, and bail
out cleanly if not.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
monitor/filter_exception_decoder.py (1)
343-351: Nice addition for UX.Auto-installing
tool-esp-rom-elfswhen missing improves the user experience by enabling ROM address decoding without manual intervention. The try/except ensures this doesn't block the filter if installation fails.One consideration: this could cause a noticeable delay on first monitor run if the package isn't already installed. You may want to add a brief print statement before attempting installation to set user expectations.
💡 Optional: Add pre-install message
if not rom_elfs_dir or not os.path.isdir(rom_elfs_dir): - print("ESP ROM ELFs tool not found, installing...") + print("ESP ROM ELFs tool not found, installing (this may take a moment)...") try:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 343 - 351, Before calling platform.install_package to auto-install the missing tool, add a brief user-facing message so users know the monitor may take longer on first run; locate the block that checks rom_elfs_dir (uses platform.get_package_dir("tool-esp-rom-elfs")) and right before platform.install_package("tool-esp-rom-elfs") print a short notice like "Installing tool-esp-rom-elfs; this may take a minute..." (keep the existing try/except and subsequent reassignment of rom_elfs_dir unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 343-351: Before calling platform.install_package to auto-install
the missing tool, add a brief user-facing message so users know the monitor may
take longer on first run; locate the block that checks rom_elfs_dir (uses
platform.get_package_dir("tool-esp-rom-elfs")) and right before
platform.install_package("tool-esp-rom-elfs") print a short notice like
"Installing tool-esp-rom-elfs; this may take a minute..." (keep the existing
try/except and subsequent reassignment of rom_elfs_dir unchanged).
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1018-1031: _build_riscv_stack_data currently appends words
sequentially ignoring each line's address, which shifts memory when lines are
missing or out of order; fix by using the per-line addr (from m.group(1)) to
compute the byte offset from base_addr and pad stack_data with zero bytes up to
that offset before appending the packed words (use struct.pack("<I", int(w,16))
for each word and advance the expected address by 4 per word); ensure base_addr
is set on the first seen addr and keep using addr offsets for every line so gaps
are preserved when returning (base_addr, stack_data).
- Around line 1092-1095: The add-symbol-file argument is vulnerable to unescaped
paths (self.rom_elf_path) containing spaces or backslashes; update the code that
builds gdb_args (the branch that appends "-ex", "add-symbol-file %s" %
self.rom_elf_path) to first normalize backslashes to forward slashes and wrap
the path in double quotes before interpolation (i.e., produce add-symbol-file
"path/with spaces/elf") so GDB accepts Windows-style paths and spaces; ensure
you still append the two list elements to gdb_args exactly as before.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
monitor/filter_exception_decoder.py (1)
1013-1031:⚠️ Potential issue | 🟠 MajorPreserve per-line stack addresses when building RISC-V stack image.
Line 1025 initializes
base_addronce, and Line 1028-Line 1029 append words sequentially without applying each line’saddroffset. Missing/out-of-order stack lines will shift memory and produce incorrect GDB backtraces.💡 Suggested fix
def _build_riscv_stack_data(self): """Parse accumulated stack memory lines into (base_addr, bytes).""" - stack_data = b"" - base_addr = None + chunks = [] + base_addr = None + end_addr = None for line in self._riscv_stack_lines: m = re.match( r"\s*([0-9a-fA-F]{8}):\s+((?:0x[0-9a-fA-F]{8}\s*)+)", line ) if not m: continue addr = int(m.group(1), 16) - if base_addr is None: + if base_addr is None or addr < base_addr: base_addr = addr words = re.findall(r"0x([0-9a-fA-F]{8})", m.group(2)) - for w in words: - stack_data += struct.pack("<I", int(w, 16)) + chunk = b"".join(struct.pack("<I", int(w, 16)) for w in words) + if not chunk: + continue + chunks.append((addr, chunk)) + chunk_end = addr + len(chunk) + if end_addr is None or chunk_end > end_addr: + end_addr = chunk_end - return base_addr or 0, stack_data + if base_addr is None or end_addr is None: + return 0, b"" + + stack_image = bytearray(end_addr - base_addr) + for addr, chunk in chunks: + off = addr - base_addr + stack_image[off:off + len(chunk)] = chunk + + return base_addr, bytes(stack_image)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 1013 - 1031, The _build_riscv_stack_data function currently treats all words as contiguous starting at the first seen address (base_addr) and concatenates them, which misaligns stacks when lines are missing or out-of-order; change it to preserve per-line addresses by: collect each line’s addr and its packed bytes (using struct.pack("<I", ...)) into a mapping or list, compute the true base_addr as the minimum addr seen, allocate a bytearray sized to cover max(addr)+len(bytes) - base_addr, and write each line’s bytes at offset (addr - base_addr) into that bytearray before returning (convert back to bytes for stack_data); update variables base_addr and stack_data accordingly in _build_riscv_stack_data so non-contiguous or out-of-order lines are placed at their correct offsets.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1013-1031: The _build_riscv_stack_data function currently treats
all words as contiguous starting at the first seen address (base_addr) and
concatenates them, which misaligns stacks when lines are missing or
out-of-order; change it to preserve per-line addresses by: collect each line’s
addr and its packed bytes (using struct.pack("<I", ...)) into a mapping or list,
compute the true base_addr as the minimum addr seen, allocate a bytearray sized
to cover max(addr)+len(bytes) - base_addr, and write each line’s bytes at offset
(addr - base_addr) into that bytearray before returning (convert back to bytes
for stack_data); update variables base_addr and stack_data accordingly in
_build_riscv_stack_data so non-contiguous or out-of-order lines are placed at
their correct offsets.
* feat: Handle RISC-V MCAUSE interrupt causes (bit 31) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
monitor/filter_exception_decoder.py (1)
1050-1066:⚠️ Potential issue | 🟠 MajorFix negative-offset corruption for out-of-order lower stack addresses.
base_addris pinned to the first seen line. If a later line starts at a lower address,offsetbecomes negative and writes land in the wrong location, corrupting the synthetic stack image.💡 Proposed fix
- stack_data = bytearray() - base_addr = None + chunks = [] + base_addr = None + end_addr = None @@ - if base_addr is None: - base_addr = addr + if base_addr is None or addr < base_addr: + base_addr = addr words = re.findall(r"0x([0-9a-fA-F]{8})", m.group(2)) - for w in words: - offset = addr - base_addr - # Pad with zeros up to the current offset if there are gaps - if offset > len(stack_data): - stack_data.extend(b"\x00" * (offset - len(stack_data))) - if offset == len(stack_data): - stack_data.extend(struct.pack("<I", int(w, 16))) - else: - # Overlapping write — overwrite existing bytes - packed = struct.pack("<I", int(w, 16)) - stack_data[offset : offset + 4] = packed - addr += 4 + chunk = b"".join(struct.pack("<I", int(w, 16)) for w in words) + if not chunk: + continue + chunks.append((addr, chunk)) + chunk_end = addr + len(chunk) + if end_addr is None or chunk_end > end_addr: + end_addr = chunk_end - return base_addr or 0, bytes(stack_data) + if base_addr is None or end_addr is None: + return 0, b"" + stack_data = bytearray(end_addr - base_addr) + for addr, chunk in chunks: + off = addr - base_addr + stack_data[off:off + len(chunk)] = chunk + return base_addr, bytes(stack_data)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 1050 - 1066, base_addr currently stays at the first seen address which makes offset negative when a later line has a lower addr; change the loop so that when addr < base_addr you prepend zero bytes to stack_data (delta = base_addr - addr; stack_data = bytearray(b"\x00"*delta) + stack_data) and set base_addr = addr before computing offset, then continue with the existing write/overwrite logic using offset, addr, stack_data and struct.pack so out-of-order lower addresses don't corrupt the synthetic stack image.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1050-1066: base_addr currently stays at the first seen address
which makes offset negative when a later line has a lower addr; change the loop
so that when addr < base_addr you prepend zero bytes to stack_data (delta =
base_addr - addr; stack_data = bytearray(b"\x00"*delta) + stack_data) and set
base_addr = addr before computing offset, then continue with the existing
write/overwrite logic using offset, addr, stack_data and struct.pack so
out-of-order lower addresses don't corrupt the synthetic stack image.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
monitor/filter_exception_decoder.py (1)
719-725: Deduplicate lookup addresses before batch decode.Repeated addresses can still inflate a single
addr2linebatch call. Deduplicating while preserving order trims subprocess work without changing decode results.♻️ Proposed refactor
if not lookups: return + lookups = list(dict.fromkeys(lookups))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 719 - 725, The loop that builds the lookups list from addr_specs can append duplicate addresses causing redundant work in addr2line; change the construction in the block that iterates addr_specs (and uses self.is_address_ignored) to deduplicate addresses while preserving order before calling the batch decoder: maintain a seen set and only append a lookup to lookups if not already seen (or use an ordered-unique approach like OrderedDict/keys), ensuring you still honor the is_ret adjustment and ignore logic in the same loop so decode results remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 719-725: The loop that builds the lookups list from addr_specs can
append duplicate addresses causing redundant work in addr2line; change the
construction in the block that iterates addr_specs (and uses
self.is_address_ignored) to deduplicate addresses while preserving order before
calling the batch decoder: maintain a seen set and only append a lookup to
lookups if not already seen (or use an ordered-unique approach like
OrderedDict/keys), ensuring you still honor the is_ret adjustment and ignore
logic in the same loop so decode results remain unchanged.
|
@coderabbitai Please do an PR where the goal and the usage of the exception decoder is explained. Do this as an extra file in markdown format. Start this task now! |
|
Tip For best results, initiate chat on the files or code changes.
Task: Create Exception Decoder Documentation PR
🧠 Learnings used💡 Use |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
monitor/filter_exception_decoder.py (1)
800-802: Consider renaming loop variable for readability.The variable
lcan be confused with1in some fonts. While thenoqa: E741comment acknowledges this, a more descriptive name likelinewould improve readability.lines = [ - l.strip() for l in raw.splitlines() if l.strip() # noqa: E741 + line.strip() for line in raw.splitlines() if line.strip() ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 800 - 802, The list comprehension assigning to lines uses the single-letter variable l which is hard to read; change it to a descriptive name (e.g., line) in the comprehension in filter_exception_decoder.py so it reads: lines = [line.strip() for line in raw.splitlines() if line.strip()] and remove the now-unnecessary noqa: E741 comment; update any tests/linting if needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 800-802: The list comprehension assigning to lines uses the
single-letter variable l which is hard to read; change it to a descriptive name
(e.g., line) in the comprehension in filter_exception_decoder.py so it reads:
lines = [line.strip() for line in raw.splitlines() if line.strip()] and remove
the now-unnecessary noqa: E741 comment; update any tests/linting if needed.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 1055-1059: The _feed_riscv_line method currently treats any single
match from REGISTER_ENTRY as valid and writes it into self._riscv_regs, which
can ingest unrelated log lines; update _feed_riscv_line to require at least two
matches (use len(reg_matches) >= 2 or check REGISTER_ENTRY match count) before
iterating and assigning to self._riscv_regs, preserving the existing int(val,
16) conversion and returning False only after a successful multi-match parse to
avoid corrupting unwind input.
- Around line 540-543: The code unconditionally dereferences pkg.path after
calling pm.get_package("tool-riscv32-esp-elf-gdb"), which can return None and
raise AttributeError; update the logic in monitor/filter_exception_decoder.py
around get_package to first check that pkg is not None (e.g., if pkg and
pkg.path:) before accessing pkg.path and setting gdb_bin, and ensure the
existing exception handling still falls back to addr2line-only behavior when the
package is missing instead of letting AttributeError propagate.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
monitor/filter_exception_decoder.py (1)
100-101: Emit a warning when ELF interval parsing falls back to fail-open.At Line 100-Line 101, parse/read failures are swallowed silently. This can hide broken ELF inputs and make decode behavior changes hard to diagnose.
♻️ Suggested adjustment
- except (FileNotFoundError, NotImplementedError, Exception): - self.intervals = [] + except (FileNotFoundError, NotImplementedError, Exception) as e: + sys.stderr.write( + "PcAddressMatcher: failed to load executable sections from %s: %s\n" + % (elf_path, e) + ) + self.intervals = []🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monitor/filter_exception_decoder.py` around lines 100 - 101, The except block that currently swallows parse/read failures (where self.intervals is set to []) should log a warning with the exception details before falling back; update the exception handler in the method that sets self.intervals to capture the exception as e and call an appropriate logger (e.g. self.logger.warning or logging.getLogger(__name__).warning) with a clear message like "ELF interval parsing failed, falling back to empty intervals: %s" and include the exception information, then preserve the existing fallback of setting self.intervals = [].
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@monitor/filter_exception_decoder.py`:
- Around line 100-101: The except block that currently swallows parse/read
failures (where self.intervals is set to []) should log a warning with the
exception details before falling back; update the exception handler in the
method that sets self.intervals to capture the exception as e and call an
appropriate logger (e.g. self.logger.warning or
logging.getLogger(__name__).warning) with a clear message like "ELF interval
parsing failed, falling back to empty intervals: %s" and include the exception
information, then preserve the existing fallback of setting self.intervals = [].
Checklist:
Summary by CodeRabbit
New Features
Enhancements
Chores