Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds core dump support via CrashCatcher #52

Closed
wants to merge 10 commits into from

Conversation

salkinium
Copy link
Member

@salkinium salkinium commented Aug 2, 2018

This replaces the xpcc UART debug functionality on HardFault with something way more useful: post-mortem debugging via CrashDebug.

This relatively simple application just uses CrashCatcher's HexDump behavior to dump all RAM via UART, which can take a lot of time, since the memories are quite large and UART quite slow at 115200 Baud.

Since the dump can take a long time, I would implement a custom dump process similar to HexDump, but with a optional user callback to set the device into a safe configuration.

  • Custom dump process with a optional callback on entry to make device safe.
  • Dump peripherals too, if possible.

cc @rleh @strongly-typed

@salkinium salkinium force-pushed the feature/crash_catcher branch 2 times, most recently from 79bbfc8 to 002aec5 Compare August 3, 2018 00:18
@@ -15,7 +15,7 @@ import os

def init(module):
module.parent = "platform"
module.name = "fault.cortex"
module.name = "fault.minimal"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for renaming this from cortex to minimal?
This module is completely ARM Cortex-M specific.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is fault.coredump. I'd prefer to not encode the device id into the module name, since they themselves decide which devices to support. Maybe with a different platform the fault.minimal module could be implemented differently (Cortex-A, RiscV, Xtensa) but with the same functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, this is how all of modm:platform:** modules work, the modm:platform:gpio module for AVRs implements the same functionality as the one for STM32, but they don't conflict, since only one is enabled per platform.

@rleh
Copy link
Member

rleh commented Aug 3, 2018

Fault handler use cases:

  • Eurobot robot main control unit/...
    • CrashDumper dumps microcontroller state via UART into logging/debug computer. Blinking LED notifies user.
  • Eurobot robot slave motor controller/...
    • Blinking LED to notify, but maybe not visible from outside the robot -> Immediate reset, except if debugger is connected: trigger breakpoint.
    • Better:
      • Write crash information to flash memory, then reset. Software can detect previous fault during startup and do something (e.g. CAN communication).
      • Fault handler sends special CAN message, then reset.
      • Some devices may not have a accessible UART. (e.g. micro-motor)
  • Freestanding device using modm/...
    • Usecases above, or maybe just wait and toggle a failure LED?

Always useful: optional user callback to set the device into a safe configuration.

I cannot see a usecase for the "wait" mode right now.

const uint16_t *pMemory = (const uint16_t *) pvMemory;
for (size_t ii = 1; ii <= elementCount; ii++)
{
uint16_t val = *pMemory++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be pedantic, dereferencing pMemory here is undefined behaviour. According to language rules commonly referred to as "strict aliasing rules" pMemory has to point to an actual (u)int16_t object to be safe to dereference (https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing). Although every sensible compiler will probably generate the expected code, this pitfall is easy to avoid here. There is an exception to these rules for (unsigned) char and std::byte which may alias any type. If I am not overlooking something, it is possible to define pMemory as uint8_t* and to pass it to dumpMemory directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not overlooking something, it is possible to define pMemory as uint8_t* and to pass it to dumpMemory directly.

Some memories can only be read with 16- or 32-bit accesses (mostly the peripherals), hence this annoyingly duplicate code. I can't change the const void *pvMemory, since that's the callback API from CrashCatcher, but how would you do this correctly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how would you do this correctly?

To me it does not make sense at all, to apply these rules to memory-mapped IO. The rules deal with accessing C++ objects in memory through a pointer of a different type. Applying the concept of a C++ object to some piece of hardware is totally nonsensical. There is no existing C++ object in memory that the compiler knows of. Hence, the only practical solution is to leave the code as it is.

However I don't think there is any guarantee that the intended 16-bit access will actually be performed as one. I haven't looked much into the assembly output of arm compilers but on modern x86 platforms simple byte copying operations in loops often turn into fancy AVX instructions.
Using a pointer to volatile for the read would clearly state to the compiler: I want a 16 bit read and I want it now. I suppose adding volatile would not impede wanted optimizations. As the pointer itself is not volatile but the data it points to, operations on the pointer will still be optimized. I compiled a few toy examples for cortex-m4 with -Os and the resulting assembly looked almost identical regardless of whether volatile is added or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation!
Another way to get around UB could be to add a read16 function coded in assembly that includes the correct load instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it to the TODO list in the new PR.

This allows for easier sharing of common settings between different
build script generators.
This allows specifying common options like project name and build path
together and not having to duplicate them for both generators.
Adds these behaviors on hard fault:
- reset device.
- force breakpoint.
- loop forever.
- flash LED every 1s.

Removes switching to process stack, so on stack overflow, only the
first three behaviors will work.
Selecting all modules through :platform:** is not possible anymore
with the introduction of conflicting :platform:fault.* modules.
@salkinium
Copy link
Member Author

Closed in favor of #210.

@salkinium salkinium closed this May 10, 2019
@salkinium salkinium deleted the feature/crash_catcher branch May 10, 2019 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants