IDF monitor fails to parse coredumps from uart (IDFGH-6439) #8099
Comments
Hi, @bugadani. Thank you for reporting the issue. We'll take a look soon.
This PR fixed processing coredumps that contain partially received lines.
@bugadani Could you please provide more detailed steps to reproduce the issue? For example, a sample of the C code running on the device being monitored.
Unfortunately I can't; I only have a pretty complicated Rust project. But let me try to provide more details, and let's hope I don't get the facts terribly wrong. Before that, one thing to note: I opened this issue for esp-idf v4.3, where I made a patch similar to the one in the PR I closed, and that solved the issue for me. Right now, I'm experiencing this issue with esp-idf v4.4. For the previous version, the patch looked like this:
diff --git a/tools/idf_monitor.py b/tools/idf_monitor.py
index 490178e53a..7f8dc1ddd1 100755
--- a/tools/idf_monitor.py
+++ b/tools/idf_monitor.py
@@ -640,8 +640,14 @@ class Monitor(object):
if self._last_line_part != b'':
if self._force_line_print or (finalize_line and self._line_matcher.match(self._last_line_part.decode(errors='ignore'))):
self._force_line_print = True
+
+ if self._decode_coredumps != COREDUMP_DECODE_DISABLE:
+ if self._reading_coredump == COREDUMP_READING:
+ self._coredump_buffer += self._last_line_part.replace(b'\r', b'')
+
self._print(self._last_line_part)
self.handle_possible_pc_address_in_line(self._last_line_part)
+ self.check_coredump_trigger_after_print(self._last_line_part)
self.check_gdbstub_trigger(self._last_line_part)
# It is possible that the incomplete line cuts in half the PC
# address. A small buffer is kept and will be used the next time
Symptoms
Most of the time when my program panics and outputs a coredump, esp-idf fails to parse this coredump and prints its raw base64 representation instead. This is usually a 20-30kB dump. There are two distinct error messages I get:
When I try to decode this raw, printed buffer manually, in some of the cases (corresponding to error message 1) some of the lines are invalid base64, because data is missing from them. I suspect that in the other cases the buffer was rejected with a data length mismatch, because there the bytes were missing in such a way that the buffer still represented a valid base64 string. I know that data is missing because, when I debugged this issue, I suspected the C code of generating invalid buffers, so I printed the raw coredump buffer in a different way and observed that what I printed and what esp-idf read from the serial input were slightly different.
Possible causes
The error is probabilistic, but because of the size of my coredump, I hit it more often than not. Examining the code paths that process coredump info, we can see that in the code branch in the issue description, data can be lost without adding it to the coredump buffer. I suspect this is the root cause of this issue. This code is part of the
I believe the problematic code is hit, when a
elif event_tag == TAG_SERIAL_FLUSH:
print("Flush")
self.serial_handler.handle_serial_input(data, self.console_parser, self.coredump,
self.gdb_helper, self._line_matcher,
self.check_gdb_stub_and_run, finalize_line=True)
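The manual check described under Symptoms (decoding the printed buffer and looking for lines that are not valid base64) could be sketched roughly as follows. This is only a sketch and not part of idf_monitor: the function name `find_invalid_base64_lines` is hypothetical, and it assumes that each printed line of the dump is an independently encoded base64 chunk.

```python
import base64
import binascii


def find_invalid_base64_lines(dump_text):
    """Return (line_number, line) pairs that fail strict base64 decoding.

    Assumption: every non-empty line is its own base64 chunk.
    """
    bad = []
    for number, line in enumerate(dump_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines between chunks
        try:
            # validate=True rejects characters outside the base64 alphabet;
            # a wrong length is reported as incorrect padding.
            base64.b64decode(line, validate=True)
        except binascii.Error:
            bad.append((number, line))
    return bad
```

A line that lost a multiple of four characters would still pass this check, which is consistent with the second case above, where the dump stays valid base64 but ends up shorter than expected.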
Hi @bugadani, thank you for a detailed and clear explanation of the problem. You are right. The issue is caused by invoking the wrong branch of the code (
Please, consider increasing the time interval
Hi @sio13, indeed, increasing the timeout works well enough, thanks!
@bugadani, thank you for verifying the solution. We also found out that increasing the UART console baud rate using
I hope both of these solutions are treated as workarounds only :) This isn't important, but modifying the code isn't an issue on my end, as I already have other patches I need to apply for small fixes that aren't released or haven't been backported yet. The baud rate increase is interesting; I need to verify whether 2M works with my physical setup. It has the added benefit of making my log calls faster, as I've been able to verify that in my system logging is actually IO bound, even at 460800. Regardless, thank you for providing workarounds. Being able to see coredumps is certainly more helpful than not seeing them. :)
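As a rough illustration of why the baud rate matters for a dump of this size, the time on the wire can be estimated like this. It is a back-of-the-envelope sketch: the helper name `transfer_seconds` is hypothetical, and it assumes 8N1 framing (10 bits on the wire per byte) and treats 30 kB as the number of bytes actually sent.

```python
def transfer_seconds(num_bytes, baud_rate, bits_per_byte=10):
    """Estimate UART transfer time; 10 bits per byte assumes 8N1 framing."""
    return num_bytes * bits_per_byte / baud_rate


# A 30 kB dump at the two baud rates mentioned in this thread.
at_460800 = transfer_seconds(30_000, 460_800)    # about 0.65 s
at_2m = transfer_seconds(30_000, 2_000_000)      # 0.15 s
```

Under these assumptions the dump spends roughly four times less time on the wire at 2M than at 460800.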
@bugadani sure, the root cause was adding line breaks into the core dump data that led to issues in decoding. The fix should be merged and backported to the older versions soon. Thank you for reporting and helping with this issue.
In the current implementation, it is possible for some bytes to be missing from coredump data received over UART. The branch that processes incomplete lines (referenced below) does not update the coredump processor at all:
esp-idf/tools/idf_monitor_base/serial_handler.py
Lines 105 to 120 in 07bfc09
The conditions where this may happen:
handle_serial_input with finalize_line=True
Effect: monitor dumps a bunch of base64 encoded coredump data to the console, failing to process it due to missing bytes - either the base64 becomes invalid, or the data becomes shorter than expected.
I'm experiencing the issue with esp-idf 4.3.1, but looking at the implementation, master should still be affected.
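The failure mode described above can be illustrated with a toy model. This is a simplified sketch, not the actual serial_handler.py code: `ToyHandler` and `feed` are hypothetical names, and the model only mimics the idea of a partial-line buffer that is flushed when `finalize_line=True`.

```python
class ToyHandler:
    """Toy model of a line-based serial handler with a coredump buffer."""

    def __init__(self, keep_partial_lines):
        # keep_partial_lines=False models the reported behaviour,
        # keep_partial_lines=True models the patched behaviour.
        self.keep_partial_lines = keep_partial_lines
        self.last_line_part = b""
        self.coredump_buffer = b""

    def handle_serial_input(self, data, finalize_line=False):
        parts = (self.last_line_part + data).split(b"\n")
        self.last_line_part = parts.pop()  # trailing piece without a newline
        for line in parts:
            self.coredump_buffer += line.replace(b"\r", b"")
        if finalize_line and self.last_line_part:
            # The incomplete line is flushed to the console here. Unless it
            # is also appended to the coredump buffer, its bytes are lost.
            if self.keep_partial_lines:
                self.coredump_buffer += self.last_line_part.replace(b"\r", b"")
            self.last_line_part = b""


def feed(chunks, keep_partial_lines):
    """Feed (data, finalize_line) pairs and return the coredump buffer."""
    handler = ToyHandler(keep_partial_lines)
    for data, finalize_line in chunks:
        handler.handle_serial_input(data, finalize_line)
    return handler.coredump_buffer


# A flush arrives while the second line is only half received.
CHUNKS = [(b"AAAABBBB\r\nCCCC", False), (b"", True), (b"DDDD\r\n", False)]
```

With these chunks, `feed(CHUNKS, False)` yields `b"AAAABBBBDDDD"` (the `CCCC` part is dropped), while `feed(CHUNKS, True)` yields `b"AAAABBBBCCCCDDDD"`.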