feat(unrecoverable-error): implement halting of full node execution #809
Merged
bd220e7 to d759c4e
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #809 +/- ##
==========================================
- Coverage 85.17% 85.08% -0.09%
==========================================
Files 292 293 +1
Lines 22667 22719 +52
Branches 3415 3418 +3
==========================================
+ Hits 19306 19331 +25
- Misses 2687 2706 +19
- Partials 674 682 +8
833e11f to 16b471f
msbrogli
requested changes
Oct 24, 2023
16b471f to c3577d9
Contributor
Author
@msbrogli the code changed a lot, so I just rebased it. Could you please re-review all files?
735ae0d to a57f1af
ccc49e6 to 8b342a9
8b342a9 to 4278afa
4278afa to 8967098
jansegre
reviewed
Dec 5, 2023
Member
jansegre
left a comment
We could start using this now when a storage write fails. Do you think it'd be worth including this in this PR?
8967098 to 627494a
627494a to 41cb0fb
msbrogli
requested changes
Jan 12, 2024
41cb0fb to 1aab024
Contributor
Author
@jansegre I think we can do it in a separate PR.
c4bb1f4 to 2938c08
jansegre
previously approved these changes
Jan 16, 2024
msbrogli
previously approved these changes
Mar 4, 2024
12040da to 452b16b
msbrogli
approved these changes
Mar 4, 2024
jansegre
approved these changes
Mar 5, 2024
Motivation
Currently, when an exception is raised during a consensus update, the full node marks the tx as voided with a custom marker and continues to operate. Continuing may be unsafe, though, as the database is likely in an undesired state. For example, if such an exception happens when a block is received, no subsequent block will be accepted and the full node will no longer be able to sync. If the full node is manually stopped and restarted, it starts up but remains unable to sync.
This PR's goal is instead to halt and exit the full node completely in those cases, forcing manual intervention. When an exception happens during a consensus update, the full node process exits with a non-zero exit code. The database is also guaranteed to be marked as corrupted, so the full node cannot be restarted normally; it requires either a full verification or a new database.
This PR only makes the behavior more restrictive than what's currently implemented, and it will be necessary for the Feature Activation for Transactions.
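The halting behavior described above can be illustrated with a minimal sketch. It is not the code from this PR: only the names `ExecutionManager` and `crash_and_exit()` come from the description, while `InMemoryStorage`, `mark_corrupted()`, `run_consensus_update()`, the `exit_fn` parameter, and the exit code `1` are hypothetical choices made for illustration.

```python
import sys
from typing import Callable, NoReturn


class InMemoryStorage:
    """Hypothetical stand-in for the node's storage; it only tracks a corruption flag."""

    def __init__(self) -> None:
        self.corrupted = False

    def mark_corrupted(self) -> None:  # hypothetical method name
        self.corrupted = True


class ExecutionManager:
    """Sketch of a component that halts the process on an unrecoverable error."""

    def __init__(
        self,
        storage: InMemoryStorage,
        exit_fn: Callable[[int], NoReturn] = sys.exit,  # injectable for testing (assumption)
    ) -> None:
        self._storage = storage
        self._exit_fn = exit_fn

    def crash_and_exit(self, *, reason: str) -> NoReturn:
        # Record the corruption marker first, so a plain restart can be refused later.
        self._storage.mark_corrupted()
        print(f"Critical failure, halting full node: {reason}", file=sys.stderr)
        # Exit with a non-zero code (the value 1 is an assumption).
        self._exit_fn(1)


def run_consensus_update(
    tx: str,
    update_fn: Callable[[str], None],
    execution_manager: ExecutionManager,
) -> None:
    """Hypothetical wrapper: any exception in the update halts the node instead of voiding the tx."""
    try:
        update_fn(tx)
    except Exception as exc:
        execution_manager.crash_and_exit(reason=f"consensus update failed for {tx}: {exc!r}")
```

With the default `exit_fn`, a failing update raises `SystemExit` with a non-zero code after the storage has been flagged, so the failure cannot be silently swallowed the way a voided-tx marker could.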
Acceptance Criteria
ExecutionManager with crash_and_exit() method.
EventManager and TransactionStorage for dealing with a full node crash.
Checklist
master, confirm this code is production-ready and can be included in future releases as soon as it gets merged