This repository was archived by the owner on Jan 22, 2025. It is now read-only.

Repair alternate versions of dead slots #9805

Merged
carllin merged 22 commits into solana-labs:master from carllin:FixReplayStage2
May 5, 2020

@carllin
Contributor

@carllin carllin commented Apr 29, 2020

Problem

Some validators were getting stuck on a slot on the main fork (invalid entry hash) if they received conflicting shreds for the same slot.

#9369

Summary of Changes

Check ClusterSlots to see whether a supermajority has completed a slot that was marked dead. If so:

  1. Dump the slot
  2. Repair the slot from a validator chosen by stake weight
  3. Replay the slot, in the hope that this version is the good one
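
The decision described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ValidatorId`, `DeadSlotAction`, `is_supermajority`, `pick_stake_weighted`, and `handle_dead_slot` are hypothetical names, the "more than 2/3 of stake" threshold is an assumption about what supermajority means here, and the repair peer is assumed to be drawn from the validators that completed the slot. The random draw is passed in as `sample` so the sketch needs no external crates.

```rust
/// Hypothetical stand-in for a validator identity (not the real key type).
type ValidatorId = &'static str;

/// What to do with a slot that was previously marked dead.
#[derive(Debug, PartialEq)]
enum DeadSlotAction {
    /// Not enough stake has completed the slot; leave it dead.
    KeepDead,
    /// Dump the local version and repair the slot from this validator.
    DumpAndRepairFrom(ValidatorId),
}

/// Assumed threshold: strictly more than 2/3 of total stake.
/// Integer arithmetic avoids floating-point rounding.
fn is_supermajority(completed_stake: u64, total_stake: u64) -> bool {
    total_stake > 0 && (completed_stake as u128) * 3 > (total_stake as u128) * 2
}

/// Picks a candidate with probability proportional to its stake.
/// `sample` is a caller-supplied random number; it is reduced modulo the
/// total candidate stake and mapped onto the cumulative stake ranges.
fn pick_stake_weighted(candidates: &[(ValidatorId, u64)], sample: u64) -> Option<ValidatorId> {
    let total: u64 = candidates.iter().map(|(_, stake)| *stake).sum();
    if total == 0 {
        return None;
    }
    let mut point = sample % total;
    for (id, stake) in candidates {
        if point < *stake {
            return Some(*id);
        }
        point -= *stake;
    }
    None
}

/// `completed_by` lists the validators (with stake) that report the dead
/// slot as completed; `total_stake` is the stake of the whole cluster.
fn handle_dead_slot(
    completed_by: &[(ValidatorId, u64)],
    total_stake: u64,
    sample: u64,
) -> DeadSlotAction {
    let completed_stake: u64 = completed_by.iter().map(|(_, stake)| *stake).sum();
    if !is_supermajority(completed_stake, total_stake) {
        return DeadSlotAction::KeepDead;
    }
    match pick_stake_weighted(completed_by, sample) {
        Some(peer) => DeadSlotAction::DumpAndRepairFrom(peer),
        None => DeadSlotAction::KeepDead,
    }
}
```

In this sketch, replay of the repaired slot happens outside `handle_dead_slot`; if the repaired version also fails, the same check could run again and draw a different peer.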

Fixes #9369

@carllin carllin added the v1.1 label Apr 29, 2020
@carllin carllin requested review from aeyakovenko and sakridge April 29, 2020 23:49
Comment thread runtime/src/accounts_db.rs Outdated
Comment thread runtime/src/bank.rs
@codecov

codecov Bot commented Apr 30, 2020

Codecov Report

Merging #9805 into master will decrease coverage by 0.0%.
The diff coverage is 86.2%.

@@           Coverage Diff            @@
##           master   #9805     +/-   ##
========================================
- Coverage    80.4%   80.3%   -0.1%     
========================================
  Files         283     283             
  Lines       64977   65564    +587     
========================================
+ Hits        52243   52689    +446     
- Misses      12734   12875    +141     

Comment thread runtime/src/accounts_index.rs
Comment thread runtime/src/accounts_db.rs Outdated
Comment thread runtime/src/accounts_db.rs
@sakridge sakridge requested a review from ryoqun April 30, 2020 22:52
@carllin carllin force-pushed the FixReplayStage2 branch from b9d6b99 to 9fa2dea Compare May 1, 2020 02:37
Comment thread core/src/replay_stage.rs
Comment thread core/src/broadcast_stage/fail_entry_verification_broadcast_run.rs
Comment thread core/src/broadcast_stage/fail_entry_verification_broadcast_run.rs
Comment thread core/src/replay_stage.rs
Comment thread ledger/src/blockstore.rs Outdated
Comment thread core/src/repair_service.rs
Comment thread runtime/src/accounts_db.rs
Comment thread local-cluster/src/cluster_tests.rs Outdated
Comment thread core/src/repair_service.rs Outdated
Comment thread core/src/repair_service.rs
Member

@aeyakovenko aeyakovenko left a comment

can you schedule something to walk me through the code?

Comment thread core/src/repair_service.rs
Comment thread ledger/src/blockstore.rs
ryoqun previously approved these changes May 5, 2020
Contributor

@ryoqun ryoqun left a comment

LGTM with nits; I did full-blown code review!

I say this is good to merge, although @aeyakovenko might have something to say according to #9805 (review)

@carllin
Contributor Author

carllin commented May 5, 2020

@ryoqun thanks for the detailed review!

@carllin carllin force-pushed the FixReplayStage2 branch from 0df5b86 to 964ef04 Compare May 5, 2020 19:17
@mergify mergify Bot dismissed ryoqun’s stale review May 5, 2020 19:17

Pull request has been modified.

@carllin carllin added the automerge (Merge this Pull Request automatically once CI passes) label May 5, 2020
@solana-grimes solana-grimes removed the automerge (Merge this Pull Request automatically once CI passes) label May 5, 2020
@solana-grimes
Contributor

💔 Unable to automerge due to CI failure

@carllin carllin merged commit 3442f36 into solana-labs:master May 5, 2020
mergify Bot pushed a commit that referenced this pull request May 5, 2020
Co-authored-by: Carl <carl@solana.com>
(cherry picked from commit 3442f36)

# Conflicts:
#	runtime/src/accounts_db.rs
solana-grimes pushed a commit that referenced this pull request May 6, 2020
carllin added a commit to carllin/solana that referenced this pull request May 7, 2020

5 participants