Option to limit the number of core dump writes to flash (IDFGH-10816) #12027

Closed
nomis opened this issue Aug 6, 2023 · 6 comments
Labels
Status: Opened Issue is new Type: Feature Request Feature request for IDF

Comments

@nomis
Contributor

nomis commented Aug 6, 2023

Is your feature request related to a problem?

I'd like to be able to enable the coredump partition, but during development there will be times when the application crashes immediately on startup. This results in a fast loop that repeatedly writes a core dump to the same area of flash.

I'm concerned about wearing out the flash when a crash results in a reboot loop.

Describe the solution you'd like.

I'd like to be able to limit the number of core dump writes to flash, so that after a core dump has been written no more writes will occur until the existing core dump is erased.

Describe alternatives you've considered.

No response

Additional context.

Currently only one core dump to flash is supported so the only valid limit (if enabled) will be 1 until storage of multiple core dumps becomes a feature.

The only change required to implement this would be to check if there is already a core dump present in esp_core_dump_to_flash() and return instead of writing the core dump.
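The check described above can be sketched in plain C as follows. This is a host-side simulation, not ESP-IDF code: the `sim_*` names and the RAM-backed partition are hypothetical, and the sketch assumes that an erased partition reads as 0xFF and that a stored dump never begins with four 0xFF bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RAM-backed stand-in for the coredump partition. */
#define SIM_PARTITION_SIZE 64u

typedef struct {
    uint8_t  data[SIM_PARTITION_SIZE];
    unsigned write_count; /* number of dumps actually written */
} sim_partition_t;

/* Erased NOR flash reads back as 0xFF. */
static void sim_partition_erase(sim_partition_t *p)
{
    assert(p != NULL);
    memset(p->data, 0xFF, sizeof(p->data));
}

/* Assumption: a dump is present if the first 4 bytes are not all 0xFF. */
static bool sim_core_dump_present(const sim_partition_t *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < sizeof(uint32_t); i++) {
        if (p->data[i] != 0xFF) {
            return true;
        }
    }
    return false;
}

/* Returns true if the dump was written, false if the write was skipped. */
static bool sim_core_dump_to_flash(sim_partition_t *p,
                                   const uint8_t *dump, size_t len)
{
    assert(p != NULL && dump != NULL);
    if (len == 0 || len > sizeof(p->data)) {
        return false; /* nothing to write, or it does not fit */
    }
    if (sim_core_dump_present(p)) {
        return false; /* keep the existing dump; no flash write */
    }
    memcpy(p->data, dump, len);
    p->write_count++;
    return true;
}
```

With a zero-initialised `sim_partition_t` that has been erased once, the first call to `sim_core_dump_to_flash()` writes the dump and every later call is skipped until `sim_partition_erase()` is called again, which is the "limit of 1" behaviour requested here.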

@nomis nomis added the Type: Feature Request Feature request for IDF label Aug 6, 2023
@espressif-bot espressif-bot added the Status: Opened Issue is new label Aug 6, 2023
@github-actions github-actions bot changed the title Option to limit the number of core dump writes to flash Option to limit the number of core dump writes to flash (IDFGH-10816) Aug 6, 2023
@chipweinberger
Contributor

flash can be written to ~50,000 times!

Don't be too concerned!

nomis added a commit to nomis/esp-idf that referenced this issue Aug 20, 2023
If there's an unattended boot loop or a crash that causes another crash on
the next boot, it needs to be possible to avoid overwriting a saved core
dump with another core dump.

Add an option to do this and skip writing core dumps if the partition isn't
empty.

Fixes espressif#12027.
@nomis
Contributor Author

nomis commented Aug 20, 2023

> flash can be written to ~50,000 times!
>
> Don't be too concerned!

You do realise that that's not a very large number?

At a rate of 1 core dump per second, the flash write cycles could be exceeded in well under a day (50,000 seconds is about 14 hours). For an unattended device stuck in a fast crash loop with nothing obviously wrong, that could happen.
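As a rough sanity check of that estimate, the time to reach a rated cycle count is just the cycle count multiplied by the time per crash. The helper below is a hypothetical illustration; the ~50,000 figure is the one quoted in this thread, not a datasheet value, and the sketch assumes every crash rewrites the same flash sectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hours until one flash region reaches its rated cycle count,
 * assuming one write of that region per crash-loop iteration. */
static double hours_to_reach_cycle_limit(uint32_t rated_cycles,
                                         double seconds_per_crash)
{
    assert(seconds_per_crash > 0.0);
    return (double)rated_cycles * seconds_per_crash / 3600.0;
}
```

For example, `hours_to_reach_cycle_limit(50000, 1.0)` gives roughly 14 hours, and a slower loop of one crash every 3 seconds gives roughly 42 hours, so an unattended device could plausibly reach the limit over a weekend.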

I've made a PR to make it possible to skip writing a core dump to flash if one already exists. A future improvement would be to support storing multiple core dumps until the partition (or partitions) becomes full but this will do for now.

@o-marshmallow
Collaborator

Hello @nomis ,

If you are worried about a crash loop that would write the coredump to flash again and again, you can change the panic behavior from ESP_SYSTEM_PANIC_PRINT_REBOOT to ESP_SYSTEM_PANIC_PRINT_HALT. The board will then not reboot but simply halt after writing the coredump.

@chipweinberger
Contributor

you can also set the ESP_SYSTEM_PANIC_REBOOT_DELAY

@nomis
Contributor Author

nomis commented Aug 21, 2023

> If you are worried about crash loop that would write the coredump to flash again and again, you can configure the panic behavior from ESP_SYSTEM_PANIC_PRINT_REBOOT to ESP_SYSTEM_PANIC_PRINT_HALT. As such, the board will not reboot but simply halt after writing the coredump.

That will be useful for development, but for production use I'd want it to reboot immediately and still be able to save a core dump, without worrying that it could do so repeatedly without anyone noticing.

@rnd-ash

rnd-ash commented Aug 21, 2023

As a potential idea (I have no idea how hard it would be to implement)... Would it be possible to store some kind of SHA of the current coredump, and when a new one is created, compare its SHA to the coredump on NVS, and only overwrite if they are different?

I ask this since, usually, if a program is crashing very early on after boot, the coredump should be 100% identical, since hardly any variance has occurred in the program's execution.
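A minimal sketch of that compare-before-overwrite idea, in plain C. Everything here is hypothetical: `stored_dump_t` stands in for whatever metadata would be kept alongside the saved dump, and FNV-1a is used only as a compact stand-in for a real SHA digest so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a; a placeholder for a real SHA digest in this sketch. */
static uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hypothetical record of the dump currently held in storage. */
typedef struct {
    bool     has_dump;
    uint32_t digest;
    unsigned write_count; /* number of dumps actually written */
} stored_dump_t;

/* Writes (here: records) the new dump only if its digest differs from
 * the stored one. Returns true if a write happened. */
static bool store_if_different(stored_dump_t *slot,
                               const uint8_t *dump, size_t len)
{
    assert(slot != NULL);
    uint32_t digest = fnv1a32(dump, len);
    if (slot->has_dump && slot->digest == digest) {
        return false; /* identical dump already stored; skip the write */
    }
    slot->has_dump = true;
    slot->digest = digest;
    slot->write_count++;
    return true;
}
```

Starting from a zero-initialised `stored_dump_t`, two identical dumps in a row cause one write, and a dump with different contents causes another. Whether dumps from consecutive early crashes really are byte-identical is an assumption that would need checking on real hardware.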

movsb pushed a commit to movsb/esp-idf that referenced this issue Dec 1, 2023
If there's an unattended boot loop or a crash that causes another crash on
the next boot, it needs to be possible to avoid overwriting a saved core
dump with another core dump.

Add an option to do this and skip writing core dumps if the partition isn't
empty.

Fixes espressif#12027.

Merges https://github.com/espressif/pull/12105