[2.0.x] DUE debugging: Solve WDT startup delay, add traceback & crash report uses programming port baud rate #10185

Merged
Bob-the-Kuhn merged 3 commits into MarlinFirmware:bugfix-2.0.x from ejtagle:bugfix-2.0.x
Mar 24, 2018

@ejtagle
Contributor

@ejtagle ejtagle commented Mar 22, 2018

  1. The crash report sent through the Programming port now respects BAUDRATE.
  2. Fixed a huge startup delay that occurred when the WDT reset the board while the native USB port was in use.
  3. Added a traceback to the crash report. For it to work, you need to compile the project with the -funwind-tables and -mpoke-function-name compiler flags. A DUE_debug section has been added to platformio.ini to include those flags when debugging. Adding the flags increases FLASH usage by 5% - 8%, so they shouldn't be enabled by default.

The traceback reports as many stack levels as are available. It also reports the function name associated with each level. You'll need the .ELF file to demangle most of the names, or you can feed the program counter to the arm-none-eabi-addr2line utility to get a much nicer report that includes the file name & line number.

ejtagle added 2 commits March 22, 2018 03:31
…ge startup delays that happen when a WDT reset happens and we are connected through the native port
…t through the Programming port. And also shows the traceback of functions as discussed. For that latest feature to work, you need to compile the project with -funwind-tables and -mpoke-function-name compiler flags
@Bob-the-Kuhn
Contributor

Are you seeing a WDT reset loop when the SD card is inserted? This is with WATCHDOG_RESET_MANUAL disabled.

As best I can tell the current 2.0.x and the dump of the PR from late Tuesday also have it. What's really strange is the Issue log definitely shows I tested the Tuesday image with the SD card with no problems.

Maybe my Due is broken.

@ejtagle
Contributor Author

ejtagle commented Mar 22, 2018

You should enable WATCHDOG_RESET_MANUAL and compile with the required flags, and you will get the exact point in the program... ;)

@ejtagle
Contributor Author

ejtagle commented Mar 22, 2018

But I do suspect that insufficient decoupling of the power supply (hardware issue!) or a lack of debouncing on the SD detect line is the culprit... I don't have that boot loop, but I basically never remove the SD card.
For example, assume that, due to bouncing on the SD detect line, you reinitialize the SD 4 times between temperature reads. You will get a WDT reset!
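
To illustrate the bouncing scenario described above, here is a minimal simulation sketch in Python. It is not Marlin code: the `Debouncer` class, the sample timings, and the 20 ms stability window are all hypothetical and chosen only to show how a chattering detect line can look like several card insertions unless the level is required to stay stable first.

```python
class Debouncer:
    """Accept a new level only after it has been stable for `stable_ms`."""

    def __init__(self, initial, stable_ms):
        self.state = initial        # last accepted (debounced) level
        self._candidate = initial   # most recent raw level seen
        self._since_ms = 0          # time the candidate level first appeared
        self.stable_ms = stable_ms

    def update(self, raw, now_ms):
        """Feed one raw sample; return True if the debounced state changed."""
        if raw != self._candidate:
            # Level changed again: restart the stability timer.
            self._candidate = raw
            self._since_ms = now_ms
            return False
        if raw != self.state and now_ms - self._since_ms >= self.stable_ms:
            self.state = raw
            return True
        return False


def count_raw_inits(samples):
    """Count rising edges on the raw line (one SD init per edge)."""
    inits, prev = 0, False
    for _, raw in samples:
        if raw and not prev:
            inits += 1
        prev = raw
    return inits


def count_debounced_inits(samples, stable_ms):
    """Count rising edges after debouncing."""
    deb = Debouncer(initial=False, stable_ms=stable_ms)
    inits = 0
    for now_ms, raw in samples:
        if deb.update(raw, now_ms) and deb.state:
            inits += 1
    return inits


# Hypothetical (time_ms, level) samples: the contact chatters for about
# 8 ms after insertion and then stays high.
samples = [(0, True), (1, False), (2, True), (3, False),
           (5, True), (6, False), (8, True)]
samples += [(t, True) for t in range(10, 100, 10)]

if __name__ == "__main__":
    print("raw inits:", count_raw_inits(samples))
    print("debounced inits:", count_debounced_inits(samples, stable_ms=20))
```

With these made-up samples the raw line shows four rising edges, while the debounced line shows one. Whether a given board actually bounces like this would have to be checked on the hardware.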

@Bob-the-Kuhn
Contributor

One SD card works as expected, the other goes into a WDT loop when the host connects.

@ejtagle
Contributor Author

ejtagle commented Mar 22, 2018

Now you can pinpoint the exact issue and the program location. Maybe the card is taking too long to initialize or to read...

@ejtagle
Contributor Author

ejtagle commented Mar 22, 2018

When the host connects, Windows reads all the FATs and boot sectors. Something is taking too long to complete. It would be interesting to know what...

@Bob-the-Kuhn
Contributor

The new code only reports one level in this instance.

As best I can tell the SD code is having problems with a non-standard file name or directory.

@Bob-the-Kuhn
Contributor

Please change the DUE section in platformio.ini as follows. I'm lazy and don't want to look up the flags every time.

[env:DUE]
platform     = atmelsam
framework    = arduino
board        = due
build_flags  = ${common.build_flags}
lib_deps     = ${common.lib_deps}
lib_ignore   = c1921b4
src_filter   = ${common.default_src_filter}
monitor_baud = 250000

[env:DUE_USB]
platform     = atmelsam
framework    = arduino
board        = dueUSB
build_flags  = ${common.build_flags}
lib_deps     = ${common.lib_deps}
lib_ignore   = c1921b4
src_filter   = ${common.default_src_filter}
monitor_baud = 250000

[env:DUE_debug]
# Used when WATCHDOG_RESET_MANUAL is enabled
platform     = atmelsam
framework    = arduino
board        = due
build_flags  = ${common.build_flags}
  -funwind-tables
  -mpoke-function-name
lib_deps     = ${common.lib_deps}
lib_ignore   = c1921b4
src_filter   = ${common.default_src_filter}
monitor_baud = 250000

@Bob-the-Kuhn
Contributor

I have found one problem. When the host is on the USB port and I initiate a WDT it gets caught up in a WDT loop. Looks like it's hung in the USB startup code.

I need to go do some things so I'll be off line for a couple of hours & then I'll get back to debugging this. It'll probably be late tonight before I have confidence in reporting anything.

@Bob-the-Kuhn
Contributor

The WDT loop only happens when I compile WITHOUT the debug flags.

It still reports the debug info but doesn't report the routine's name.

I'm thinking that a check needs to be made to see if the unwind stuff is present or not.

@ejtagle
Contributor Author

ejtagle commented Mar 23, 2018

If you have the PC and LR of the crash report, and the associated ELF file, you can use the program arm-none-eabi-addr2line.exe (already installed with both platformio and Arduino) to translate that value to the file, the function and the source code line where the problem happened.

There is a check for the unwind information; that is why you don't get it. Unwind information is used to throw exceptions, so some stack frames that can't catch them will be missing...
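
As a rough sketch of the addr2line step described above, the Python helper below only builds the command line. The ELF path and the two addresses are hypothetical placeholders, not values from a real crash report. The flags are standard addr2line options: `-e` selects the ELF file, `-f` prints function names, `-C` demangles C++ names, and `-p` prints one readable line per address.

```python
import shlex


def addr2line_command(elf_path, addresses, tool="arm-none-eabi-addr2line"):
    """Return the argument list that resolves raw addresses against an ELF file."""
    args = [tool, "-e", elf_path, "-f", "-C", "-p"]
    for addr in addresses:
        # Accept ints or hex strings such as "0x00080A10".
        value = int(addr, 16) if isinstance(addr, str) else addr
        args.append(hex(value))
    return args


if __name__ == "__main__":
    # Hypothetical PC and LR values copied by hand from a crash report,
    # and a hypothetical path to the ELF built with the debug flags.
    cmd = addr2line_command("firmware.elf", [0x00081234, "0x00080A10"])
    print(shlex.join(cmd))
    # To actually run it (assuming the tool is on PATH):
    #   import subprocess
    #   print(subprocess.run(cmd, capture_output=True, text=True).stdout)
```

The ELF passed to the tool has to be the one from the same build that produced the crash, otherwise the addresses will map to the wrong source lines.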

@Bob-the-Kuhn
Contributor

In case it's of any value to you - here's some of the USB activity during the WDT loop.

WDT loop packets.zip

@Bob-the-Kuhn
Contributor

Bob-the-Kuhn commented Mar 23, 2018

I'm happy with the PR code as it is once platformio.ini has been updated.

It would be nice to fix the WDT loop, but it's a corner case that only occurs during unintended usage. As the doctor says: "If it hurts, don't do it. Do it the correct way."

Either way, when you're happy with the code, post a reply to Scott asking him to review and telling him we think it's ready to be merged.

I'll clean up the title & description.

@Bob-the-Kuhn Bob-the-Kuhn changed the title from "DUE: Solve huge startup delay when WDT resets board and USB native port was being used." to "[2.0.x] DUE debugging: Solve WDT startup delay, add traceback & crash report uses programming port baud rate" Mar 23, 2018
@Bob-the-Kuhn Bob-the-Kuhn added the labels "PR: Improvement", "PR: Bug Fix", and "T: HAL & APIs" (Topic related to the HAL and internal APIs.) Mar 23, 2018
@Bob-the-Kuhn
Contributor

@thinkyhead - this PR is ready to be merged if you're happy with it.

We've done a lot of testing and the last of the known glitches have been fixed.

@ejtagle
Contributor Author

ejtagle commented Mar 23, 2018

Let's merge this. I will probably then open a new one with some improvements to the backtracer, to try to make it work when no unwind tables are present. And yes, there is a way... 😊

@Bob-the-Kuhn Bob-the-Kuhn merged commit f7857ac into MarlinFirmware:bugfix-2.0.x Mar 24, 2018