DUE: Fix random watchdog resets, improve watchdog handling, debug helper, tone fix...#10152
DUE: Fix random watchdog resets, improve watchdog handling, debug helper, tone fix...#10152thinkyhead merged 1 commit intoMarlinFirmware:bugfix-2.0.xfrom ejtagle:bugfix-2.0.x
Conversation
0a2e331 to
72281c4
Compare
|
Looks good! but please watch those commit messages! I've cleaned them up for you this time and squashed commits. We have helpful guidelines for commit messages posted on the Contributing page: Basically, keep the summary short. Put the long description after a blank line. |
|
@thinkyhead :Sorry about the commit messages.. Yesterday i was too tired and english is not my native language, but i wanted to make this pull request available as soon as it was possible. It fixes bootloops for me mostly. |
|
|
|
There should be a blank line on the end of all the files. I'm getting the following compile errors on PlatformIO: I tried the following change to lines 228-229 in conf_usb.h but there's no host support on the native USB port when the SD card is disabled. FROM TO |
|
humm... udi_msc.c should be ifdefd out if no SD support is enabled. PlatformIO is playing tricks on you, probably caching previously compiled versions. Look at line 60 onwards on udi_msc.c ... There is it the ifdef |
|
udi_msc.c isn't in the files Changed list,. |
There was a large block of serial output during EEPROM startup that caused a long delay. Although the serial output uses So, the solution is to initialize the |
|
hummm.. reiniting watchdog on an ISR doesnt feel good at all.. |
|
i am pretty sure a saw a watchdogReset() on the idle() task |
|
also, there is a bug ob watchdogReset() .. gcc optimizes it out..:( |
|
@Bob-the-Kuhn : the file udi_msc.c was changed by PR #10148. And at line 60 of that file there is a The file is not compiled at all if SDSUPPORT is not defined. The error you are getting is a linker error, and it says the udi_msc.c.o: In function That means the object code contains a compiled function called udi_msc_enable. That function should be ifdefd out by the #if ENABLED(SDSUPPORT) .. So, i guess that PlatformIO is failing in dependency analysis and is reusing an older compiler module. |
That's not where it happens. The ISR provides new temperature readings. The watchdog reset happens in the main thread — when new temperature readings become available. |
|
udi_msc.c is in my copy of bugfix-2.0.x because of the way I update it. Apparently my process has a hole in it that doesn't account for deleted files. My fork of Marlin has a default branch of bugfix-2.0.x. Whenever I start a new issue I use the git desktop app to copy the default into a new branch and then I use the following git commands to update it: -Xours will make the common files the same but apparently doesn't touch any "extra" files. |
|
I don't have tracking setup. Off to do some more reading. |
|
@thinkyhead : In that case, it is perfectly fine. I think i'd prefer to initialize the watchdog as soon as possible in the boot process. If the problem is the Serial dump taking too long (4 seconds is 250000/10*4 = 100000 characters long !!!!), then the solution is to clear the watchdog at that specific point, in that specific routine. Note that at the point of |
If you have a BASH shell you can use the provided git fetch upstream
git checkout upstream/bugfix-2.0.x -b my_feature_workThen the command
If all you want to do is maintain a current copy of …or… Of course the I don't tend to use my |
It should really be initialized last. Certainly after |
|
@thinkyhead : I do understand. The main problem is that we need all the things before thermalManager.init to happen, as changing the initialization order could break anything. But i think i have an idea :) ... |
|
And if you are bothered by an extra byte of RAM, i could contribute a bitvector implementation that compacts all those bool variable[count] to something less wasteful in terms of SRAM :) |
Is it not possible to simply disable the watchdog as the very first step? Then before entering the main loop re-enable it? Adding logic to |
|
@thinkyhead Unfortunately not. On SAM3X8, the Watchdog config register is write-once. So, once you set an specific configuration you simply can´t change it anymore until the next hardware reset. And this watchdog starts preconfigured as enabled... According to the datasheet, it should start preconfigured with a 16 second timeout. |
|
As said, implemented enabling and configuring of watchdog as the first thing. This, i can confirm, prevents all the bootloops on the SAM3X based boards. |
|
Now if I could figure out (again) how to do a proper review ... |
|
You are following my lead - silly person. In HAL_timers_Due.cpp the following is useless because the counter (CV register) is read only. The My previous comment is totally wrong. HAL_timer_set_compare and HAL_timer_start need different methods of calculating the frequency arguments. Please delete the routines HAL_timer_set_count and HAL_timer_restrain_count in the file HAL_timers_Due.h. They don't work and they aren't needed. |
Marlin/src/HAL/HAL_DUE/Tone.cpp
Outdated
There was a problem hiding this comment.
I've verified duration & frequency are correct via a logic analyzer.
There was a problem hiding this comment.
Sorry, @Bob-the-Kuhn ... Yesterday i was too tired, and fell asleep... (was 2AM ...I am at GMT-3).
I do agree with all your comments.
I will review the timer code against the datasheet 👍
It is not the calculation of the frequency the problem. I think the problem is to alter the period while the timer is running. That is why I added an TC_Stop() in the HAL_timers.cpp:HAL_timer_start
There was a problem hiding this comment.
If you take a look at my Tone.cpp file, you will notice we agree 👍 ... I will remove the line
tc->TC_CHANNEL[channel].TC_CV = 0;
There was a problem hiding this comment.
And all the unused functions from HAL_timers.h
There was a problem hiding this comment.
BTW, hw reviews are always hard. Dont worry: The usual "I like it/I dislike it/do it other way, this does not make sense"... It´s fine with me 👍
There was a problem hiding this comment.
TC_Configure stops the clock to the counter so TC_Stop() isn't need. As far as I can tell none of the changes in that file are required but none of them hurt either. Your choice.
There was a problem hiding this comment.
.platformio\packages\framework-arduinosam\system\sam\libsam\source\tc.c is where all the low level TC routines are at.
|
probably TC_Stop is unneeded. But it is good practice to disable ints in NVIC before changing timer options, or you could end with an spare interrupt as soon as you enable the timer. Most of this is caused by the fact of the reprogramming of a running timer. I'd split HAL_timer_start in 2 HAL_timer_start and HAL_timer_stop.. but, this is not time critical code.. |
|
On a 2nd thought, i'd prefer to leave the TC_Stop there. It makes reading the code flow and figuring out how it works more obvious.. ;) |
Marlin/src/module/temperature.cpp
Outdated
There was a problem hiding this comment.
All this added stuff for resetting the watchdog should only be added for the platforms where it matters. Recommend adding to Conditionals_post.h something like this:
#define EARLY_WATCHDOG (ENABLED(USE_WATCHDOG) && defined(ARDUINO_ARCH_SAM))…and then wrap these additions in #if EARLY_WATCHDOG…#endif.
There was a problem hiding this comment.
I could do it. I do get your point: "Avoid wasting an extra byte of SRAM on platforms that do not need it"...
Dont believe i am happy with the current solution: It is more like a hack, but HW is forcing us to use it...
I believe that on AVR watchdog can be reprogrammed as needed, so this hack is not needed there. On LPC i am not sure...
There was a problem hiding this comment.
Somehow i also think disabling the watchdog at the setup phase is also some kind of hack: In my specific case, i am using an Arduino DUE on a modified Ramps1.4
SAM3X8E starts with all IO ports configured as inputs with pull ups. So, it is very important to reconfigure the ports as fast as possible, because otherwise, the pullups turn on the Hotend and bed heaters. Watchdog is an important safety measure at all times, not only at loop() time ;)
There was a problem hiding this comment.
Avoid wasting an extra byte of SRAM on platforms that do not need it
Well, it's also a minor matter of the general style that Marlin is coded in. The use of compiler directives to leave out unnecessary code not only keeps the binary slimmer, it also acts as a handy reference to what is really required. We can look at the code and see: ah, this is needed, but that is not needed. The reason why the #include lines at the top of the G-code handler files are wrapped in #if blocks is not just to make the compilation quicker, it also gives us a nice statement at the top of these files about what the real dependencies are. And this gives us some insight into how we might re-organize things as we refactor in the future.
Watchdog is an important safety measure at all times…
Certainly. And if we discover that terrible things can happen in the earliest stages of boot-up on certain MCUs, and we can use the watchdog to prevent those disasters, then we should. But I haven't seen any situations where –for example– all the heaters get stuck on during boot-up, and I'd be very surprised if that starts to occur in spite of Marlin's thorough pin initializations during setup. I'd be doubly-surprised if the watchdog turns out to be the solution to that kind of issue.
There was a problem hiding this comment.
I agree. Watchdog is never a solution. It is a safeguard against bugs in sw or hw where is it preferable to reset a regain control rather than losing control of the process
8960719 to
9fd1016
Compare
|
Hummm... I accidentally replaced the default configuration with my own . I'll revert the configuration.h / configuration_adv.h files to the default ones |
- Watchdog reset during SD Card initialization. - Move `DebugMonitor` to `DebugMonitor_Due.cpp`. - Since the watchdog is enabled on boot do extra resets during init. - Have `thermalManager` do watchdog reset before its ISR starts to prevent reset. - Ensure that timers are stopped before reprogramming them to address tone issues. - Improve SAM3XE reset when reflashed through the native port.
|
Rebased, squashed, combined commit messages. Will merge shortly! |
|
I wanted to check whether the issues in this thread could explain an issue I am seeing. I've written some code that is using SPI to transfer blocks from the SD card to the SPI display controller. I am doing this to play an animated logo at startup. I'm transferring 512 byte blocks and this takes about 2ms. Despite of this, the video freezes randomly and the board resets. This is despite calling either |
|
2ms is no problem, but, if transferring from SD, i am pretty confident there could be up to 1 second of delay while reading some SD card sector (SD specification says nothing about "seek" time, and while most of the time reading the next sector takes nearly no time, sometimes the SD controller could take to get the next sector such time. |
|
@ejtagle: The odd thing is that the reboots seems to happen when the code is writing the blocks to my display controller, not when reading from the SD card. By any chance do Marlin interrupts use SPI, and if so, is there a possibility of a deadlock if I were to be accessing SPI at the same time? |
|
No, SPI is not using interrupts. At some point i did consider to implement that, but the u8g lib would not allow to use interrupts (or DMA!) for the SPI, unless patching the (u8g) library... |
I think it depends on how fast the tiny disk inside the SD card is spinning. |
|
@thinkyhead : Not exactly that, but there is an explanation called "wear leveling"... While you read or while you write, there could be extra things going on on the SD controller that is embedded in the SD card. A lot of time ago, i started having problems on SD reads (on an embedded platform i was working on at that time) and after countless hours of debugging, I found out the problem was a timeout while reading "sectors" from the SD card. Increasing the timeout from 10mS to 1S solved the issue. |
|
Indeed! That's the TL;DR for my tongue-in-cheek metaphor. I learned more than I ever wanted to know about wear leveling while I was adapting Creality3D's SD-card-based |
This PR does several things: