
fix(pipeline): clear MerkleStage checkpoints on invalid root #3973

Merged
onbjerg merged 5 commits into main from merkle/clean-checkpoint on Jul 31, 2023

Conversation

@joshieDo
Collaborator

While working on another branch, I noticed that once the merkle stage fails its validation (the failure itself was related to my branch), it can't really recover, since its checkpoint never gets cleared.

This is what happened before the logs in the image below:

  1. Run first sync
  2. Reach tip
  3. Re-run the pipeline
  4. Fails MerkleStage and unwinds AccountHashing and StorageHashing.
  5. MerkleUnwind fatal failure with process exit
  6. Run again
  7. AccountHashing
  8. StorageHashing (you can see its end in the first line of the picture).

[image: screenshot of the pipeline logs]

Proposing that we clear MerkleStage checkpoints once we hit a validation error.
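
To make the proposal concrete, here is a minimal Rust sketch of the idea. Every type and function in it (`StageId`, `StageError`, `ProgressStore`, `Outcome`, `on_stage_error`) is a hypothetical stand-in rather than reth's actual API, and unwinding to the parent of the failing block is an assumption made only for illustration. It shows nothing more than the error-handling branch dropping the merkle stage's saved progress when a validation error comes back.

```rust
use std::collections::HashMap;

/// Hypothetical stage identifiers (not reth's real `StageId`).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum StageId {
    AccountHashing,
    StorageHashing,
    MerkleExecute,
}

/// Hypothetical error type: a validation failure vs. anything fatal.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum StageError {
    Validation { bad_block: u64 },
    Fatal(String),
}

/// What the pipeline does next after handling an error.
#[derive(Debug, PartialEq)]
enum Outcome {
    Unwind { target: u64 },
}

/// Hypothetical store for the opaque in-progress state a stage saves mid-run.
#[derive(Default)]
struct ProgressStore {
    progress: HashMap<StageId, Vec<u8>>,
}

impl ProgressStore {
    fn save(&mut self, id: StageId, bytes: Vec<u8>) {
        self.progress.insert(id, bytes);
    }

    fn get(&self, id: StageId) -> Option<&Vec<u8>> {
        self.progress.get(&id)
    }

    fn clear(&mut self, id: StageId) {
        self.progress.remove(&id);
    }
}

/// Pipeline-side error handling. On a validation error from the merkle
/// stage, its saved progress is cleared so the next run starts clean
/// instead of resuming from a stale intermediate state.
fn on_stage_error(
    store: &mut ProgressStore,
    stage: StageId,
    err: StageError,
) -> Result<Outcome, StageError> {
    match err {
        StageError::Validation { bad_block } => {
            if stage == StageId::MerkleExecute {
                store.clear(stage);
            }
            // Assumption: unwind to the parent of the failing block.
            Ok(Outcome::Unwind { target: bad_block.saturating_sub(1) })
        }
        // Anything else is propagated unchanged.
        other => Err(other),
    }
}
```

Other stages keep whatever progress they saved; only the merkle entry is removed, which is the "special treatment" this change introduces.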

@joshieDo added the C-bug and A-staged-sync labels Jul 27, 2023
@joshieDo changed the title from "clear merkle checkpoint on invalid root" to "fix(pipeline): clear MerkleStage checkpoints on invalid root" Jul 27, 2023
@codecov

codecov bot commented Jul 27, 2023

Codecov Report

Merging #3973 (3e5c7d9) into main (da3bc64) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Files Changed Coverage Δ
crates/stages/src/pipeline/mod.rs 84.05% <100.00%> (+0.22%) ⬆️

... and 10 files with indirect coverage changes

Flag Coverage Δ
integration-tests 16.34% <0.00%> (-0.01%) ⬇️
unit-tests 64.34% <100.00%> (+0.02%) ⬆️

Components Coverage Δ
reth binary 26.67% <ø> (ø)
blockchain tree 83.04% <ø> (ø)
pipeline 89.83% <100.00%> (+0.01%) ⬆️
storage (db) 74.30% <ø> (ø)
trie 94.70% <ø> (ø)
txpool 46.00% <ø> (+0.60%) ⬆️
networking 77.64% <ø> (-0.07%) ⬇️
rpc 58.50% <ø> (-0.03%) ⬇️
consensus 63.51% <ø> (ø)
revm 33.10% <ø> (ø)
payload builder 6.58% <ø> (ø)
primitives 87.91% <ø> (-0.02%) ⬇️

Contributor

@rkrasiuk left a comment

nice catch! however, we should do this inside the merkle stage

@joshieDo
Collaborator Author

joshieDo commented Jul 27, 2023

we cannot commit anything from inside a stage though, and I do like that constraint

furthermore, there's already a branch to handle the validation error, so as long as it doesn't grow with many edge cases I feel like it should be fine?

edit: and we cannot drop the current transaction either
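
A rough sketch of why the cleanup ends up in the pipeline rather than in the stage, assuming the pipeline owns the database transaction and hands each stage only a mutable borrow of it. `Db`, `Tx`, `merkle_stage`, `run_pipeline` and the `merkle_progress` / `merkle_checkpoint` keys are all made up for illustration and do not mirror reth's real types. Because `commit` consumes the transaction by value, a stage holding `&mut Tx` can neither commit nor drop it, and anything it writes before returning an error is discarded along with the transaction.

```rust
use std::collections::HashMap;

/// Hypothetical key-value database with committed state only.
#[derive(Default)]
struct Db {
    committed: HashMap<String, Vec<u8>>,
}

/// Hypothetical write transaction: buffers changes until `commit`.
struct Tx<'a> {
    db: &'a mut Db,
    /// `Some(v)` is a pending write, `None` a pending delete.
    pending: HashMap<String, Option<Vec<u8>>>,
}

impl Db {
    fn begin(&mut self) -> Tx<'_> {
        Tx { db: self, pending: HashMap::new() }
    }
}

impl Tx<'_> {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.pending.insert(key.to_string(), Some(value));
    }

    fn delete(&mut self, key: &str) {
        self.pending.insert(key.to_string(), None);
    }

    /// Consumes the transaction, so only its owner can call this.
    fn commit(self) {
        let Tx { db, pending } = self;
        for (key, value) in pending {
            match value {
                Some(v) => {
                    db.committed.insert(key, v);
                }
                None => {
                    db.committed.remove(&key);
                }
            }
        }
    }
}

/// A stage only borrows the transaction; it cannot commit or drop it.
fn merkle_stage(tx: &mut Tx<'_>, root_ok: bool) -> Result<(), String> {
    if !root_ok {
        // This delete is lost if the owner discards the transaction.
        tx.delete("merkle_progress");
        return Err("invalid state root".to_string());
    }
    tx.put("merkle_checkpoint", vec![1]);
    Ok(())
}

/// The pipeline owns the transaction and decides what gets persisted.
fn run_pipeline(db: &mut Db, root_ok: bool) -> Result<(), String> {
    let mut tx = db.begin();
    match merkle_stage(&mut tx, root_ok) {
        Ok(()) => {
            tx.commit();
            Ok(())
        }
        Err(e) => {
            // Discard everything the failed stage wrote.
            drop(tx);
            // Clear the stale progress in a fresh transaction and commit it.
            let mut cleanup = db.begin();
            cleanup.delete("merkle_progress");
            cleanup.commit();
            Err(e)
        }
    }
}
```

The design choice this illustrates is the trade-off discussed above: keeping commits out of stages stays intact, at the cost of one stage-specific branch in the pipeline's error handling.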

@rkrasiuk
Contributor

ah, ye, makes sense

Collaborator

@onbjerg left a comment

Would like to mark this as a FIXME since it's technically debt we should try to solve in a better way (this is "special treatment" for merkle)

@onbjerg force-pushed the merkle/clean-checkpoint branch from ea90b5e to 6b339a3 July 31, 2023 13:19
@onbjerg enabled auto-merge July 31, 2023 13:22
@onbjerg added this pull request to the merge queue Jul 31, 2023
Merged via the queue into main with commit 0b913e2 Jul 31, 2023
@onbjerg deleted the merkle/clean-checkpoint branch July 31, 2023 13:40
Labels

A-staged-sync: Related to staged sync (pipelines and stages)
C-bug: An unexpected or incorrect behavior

3 participants