This repository was archived by the owner on Jan 22, 2025. It is now read-only.

v1.3 persistent tower #13961

Closed
CriesofCarrots wants to merge 9 commits into solana-labs:v1.3 from CriesofCarrots:v1.3-persistent-tower

Conversation

@CriesofCarrots
Contributor

Backport of #10718 and its follow-ups to v1.3.

ryoqun and others added 3 commits December 4, 2020 13:29
* Save/restore Tower

* Avoid unwrap()

* Rebase cleanups

* Forcibly pass test

* Correct reconciliation of votes after validator resume

* d b g

* Add more tests

* fsync and fix test

* Add test

* Fix fmt

* Debug

* Fix tests...

* save

* Clarify error message and code cleaning around it

* Move most of code out of tower save hot codepath

* Proper comment for the lack of fsync on tower

* Clean up

* Clean up

* Simpler type alias

* Manage tower-restored ancestor slots without banks

* Add comment

* Extract long code blocks...

* Add comment

* Simplify returned tuple...

* Tweak too aggressive log

* Fix typo...

* Add test

* Update comment

* Improve test to require non-empty stray restored slots

* Measure tower save and dump all tower contents

* Log adjust and add threshold related assertions

* cleanup adjust

* Properly lower stray restored slots priority...

* Rust fmt

* Fix test....

* Clarify comments a bit and add TowerError::TooNew

* Further clean-up around TowerError

* Truly create ancestors by excluding last vote slot

* Add comment for stray_restored_slots

* Add comment for stray_restored_slots

* Use BTreeSet

* Consider root_slot into post-replay adjustment

* Tweak logging

* Add test for stray_restored_ancestors

* Reorder some code

* Better names for unit tests

* Add frozen_abi to SavedTower

* Fold long lines

* Tweak stray ancestors and too old slot history

* Re-adjust error condition of too old slot history

* Test normal ancestors is checked before stray ones

* Fix conflict, update tests, adjust behavior a bit

* Fix test

* Address review comments

* Last touch!

* Immediately after creating cleaning pr

* Revert stray slots

* Revert comment...

* Report error as metrics

* Revert not to panic! and ignore unfixable test...

* Normalize lockouts.root_slot more strictly

* Add comments for panic! and more assertions

* Proper initialize root without vote account

* Clarify code and comments based on review feedback

* Fix rebase

* Further simplify based on assured tower root

* Reorder code for more readability

Co-authored-by: Michael Vines <mvines@gmail.com>
* Better tower logs for SwitchForkDecision and etc

* nits

* Update comment
ryoqun and others added 6 commits December 4, 2020 13:46
…s#12350)

* Follow up to persistent tower

* Ignore for now...

* Hard-code validator identities for easy reasoning

* Add a test for opt. conf violation without tower

* Fix compile with rust < 1.47

* Remove unused method

* More move of assert tweak to the assert pr

* Add comments

* Clean up

* Clean the test addressing various review comments

* Clean up a bit
* Various clean-ups before assert adjustment

* oops
* Fix tower/blockstore unsync due to external causes

* Add and clean up long comments

* Clean up test

* Comment about warped_slot_history

* Run test_future_tower with master-only/master-slave

* Update comments about false leader condition
)

* Discard pre hard fork persisted tower if hard-forking

* Relax config.require_tower

* Add cluster test

* nits

* Remove unnecessary check

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Co-authored-by: Carl Lin <carl@solana.com>
@codecov

codecov Bot commented Dec 4, 2020

Codecov Report

Merging #13961 (4f39ddc) into v1.3 (fe02979) will increase coverage by 0.0%.
The diff coverage is 88.0%.

@@           Coverage Diff           @@
##            v1.3   #13961    +/-   ##
=======================================
  Coverage   82.5%    82.5%            
=======================================
  Files        368      368            
  Lines      88184    89116   +932     
=======================================
+ Hits       72765    73604   +839     
- Misses     15419    15512    +93     
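The table's percentages can be checked from its own hit and line counts. The sketch below assumes the report truncates to one decimal place rather than rounding to nearest; that assumption is inferred from the numbers shown here, not taken from Codecov's documentation, and the function names are hypothetical.

```rust
/// Coverage in tenths of a percent, truncated (not rounded to nearest).
/// Integer arithmetic avoids floating-point rounding surprises.
/// Assumption: the report truncates to one decimal place.
fn coverage_tenths(hits: u64, lines: u64) -> u64 {
    assert!(lines > 0, "line count must be non-zero");
    hits * 1000 / lines
}

/// Render tenths of a percent as a string such as "82.5%".
fn format_tenths(tenths: u64) -> String {
    format!("{}.{}%", tenths / 10, tenths % 10)
}

/// Coverage change between base and head, in tenths of a percent.
fn coverage_delta_tenths(base: (u64, u64), head: (u64, u64)) -> i64 {
    coverage_tenths(head.0, head.1) as i64 - coverage_tenths(base.0, base.1) as i64
}
```

With the base counts (72765 hits of 88184 lines) and the head counts (73604 hits of 89116 lines), both sides come out at 82.5% under truncation, which matches the reported change of 0.0% even though 839 more lines are hit.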

@ryoqun
Contributor

ryoqun commented Dec 8, 2020

@CriesofCarrots Thanks for working on this in my place. Is this still needed and planned to be merged? As you're more aware of the root cause of the outage, I think this giant back-porting can be skipped, perhaps?

If needed, I'm happy to review this. At first glance, this looks mostly ok.

@CriesofCarrots
Contributor Author

> I think this giant back-porting can be skipped, perhaps?

Yes, especially given the plans for v1.4, I think it can be skipped. @sakridge can you confirm?

@mvines mvines closed this Dec 16, 2020
@CriesofCarrots CriesofCarrots deleted the v1.3-persistent-tower branch February 25, 2021 22:55