feat|refactor(header/sync): network head determination #990

Wondertan · 2022-08-07T09:25:49Z

Context

In #978 we started adding the Head method to Syncer. However, it was not that straightforward: the deeper we looked, the more issues we found in Syncer related to the new Head method. So it was decided to extract a separate preparation PR that ensures the new Head is safe to use from multiple routines and returns as recent a header as possible.

The previous version of Syncer struggled with two issues:

- On Syncer's start, synchronization didn't begin right away; it waited for a gossiped header to trigger sync and set a sync target (only when the subjective head was not expired)
  - This is why a Node could wait up to one block time before it started syncing
- There was no way to request the most recent objective header of the network
  - I.e. if a user wanted to request the latest possible state, they could not do so other than by waiting for the full sync to finish

Changes

The new reimplementation fixes these two problems and improves code readability and docs. Mainly, it splits the existing trustedHead into two methods, subjectiveHead and networkHead. The networkHead is supposed to be used by the future Head.
networkHead now relies on the latest known header's timestamp and the block time to determine the header's recency. If the header is not recent, we request a newer one from the trusted peer(s), assuming they are always synced.

In fact, two of our bootstrappers have fallen out of sync, creating syncing issues.

Besides that, three more multithreading-related issues were fixed in separate fix commits, with supporting tests.

Other

Tested sync manually (though with some struggles due to broken bootstrappers).
As always, review CBC and check out locally to see the full picture.

TODO

- Double check that the node correctly determines recency, as the DA network might be lagging one block behind
- Do we need time drift?
- Telemetry

codecov-commenter · 2022-08-07T09:31:44Z

Codecov Report

Merging #990 (44ab8a2) into main (4618e2b) will increase coverage by 0.34%.
The diff coverage is 63.35%.

@@            Coverage Diff             @@
##             main     #990      +/-   ##
==========================================
+ Coverage   56.58%   56.92%   +0.34%     
==========================================
  Files         135      136       +1     
  Lines        8994     9063      +69     
==========================================
+ Hits         5089     5159      +70     
+ Misses       3369     3366       -3     
- Partials      536      538       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| params/network.go | `76.92% <ø> (ø)` | |
| header/sync/sync_head.go | `58.77% <58.77%> (ø)` | |
| header/sync/sync.go | `69.50% <62.50%> (+4.00%)` | ⬆️ |
| node/services/service.go | `80.83% <72.72%> (-0.93%)` | ⬇️ |
| header/header.go | `50.00% <100.00%> (+3.22%)` | ⬆️ |
| header/sync/ranges.go | `82.47% <100.00%> (ø)` | |
| header/verify.go | `82.22% <100.00%> (+0.82%)` | ⬆️ |
| fraud/pb/proof.pb.go | `36.61% <0.00%> (+1.89%)` | ⬆️ |

... and 2 more

Wondertan · 2022-08-07T11:33:38Z

Tests failed because Syncer.Start now requires an initialized Store. Fixed now.

renaynay

Initial thoughts here -- will do a deeper review after our call.

Want to discuss objectiveHead method.

Also, we should consider splitting out process/validation-related functions into a separate file (maybe head.go) and leaving just the sync loop logic in sync.go.

Wondertan · 2022-08-08T17:39:41Z

@renaynay, @distractedm1nd, pls give another read to the docs. I think it's a different level now.

Wondertan · 2022-08-08T17:49:57Z

We should not merge this one until we downgrade our network to 0.34. Testing recency on Mamaki is impossible because block times are bigger than they should be. Also, this change does not bring any value without stable block times.

distractedm1nd

docs are great

renaynay

testing this against arabica and everything looks fine so far -- the problem is not enough blocks have been produced to be able to observe any larger range syncs.

I think we should each (@Wondertan and I) review this PR again with fresh eyes and then we can likely merge.

Wondertan · 2022-09-09T12:29:30Z

> testing this against arabica and everything looks fine so far -- the problem is not enough blocks have been produced to be able to observe any larger range syncs.

@renaynay, How do you test? Do you compare that the node's instant head is equal to the core's one? Do you request core's one directly from the validator?

liamsi

The PR looks generally good. I think it is critical that we clearly document the assumptions around the subjective head, the network head, and trusted peers, and make any implicit assumptions very explicit to the user.

renaynay

Tested on all node types -- works very well. Thank you!

renaynay

morge

@renaynay

* feat(params|header/sync): dirty integration of block time into Syncer
  Co-authored-by: Rene <[email protected]>
* feat(header): new utility funcs for the ExtendedHeader in preparation for Syncer improvements
  Co-authored-by: Rene <[email protected]>
* feat|refactor(header/sync): revision of Syncer's objective head determination logic
  The previous version of Syncer struggled with two issues:
  * On Syncer's start, synchronization didn't start and waited for a gossiped header to trigger sync and set a sync target (only when the subjective head was not expired)
    * This is why Node could wait up to one block time to start syncing
  * There was no way to request the most recent objective header of the network
    * I.e. if the user wanted to request the latest possible state, they were not able to do that besides waiting for full sync to finish.

  The new reimplementation fixes these two problems and improves code readability and docs. Mainly, it splits the existing `trustedHead` into two methods, `subjectiveHead` and `objectiveHead`. The latter now relies on the latest known header timestamp and block time to determine its recency. If the header is not recent, we request it from the trusted peer(s), assuming it's always synced.
* docs(header/sync): add TODO for potential optimization
* refactor(header/sync): rework Syncer lifecycling
  Mainly, allow Start to error so that subsequent Stop does not panic. Also make the lifecycle logic less confusing and less error-prone
* fix(node): do not fail the Start for Syncer if it is not initialized, so that Node tests do not fail
* fix(header/sync): use RWLock for sync ranges
  Going further, there will be multiple readers that should not block each other
* fix(header/sync): ensure objective head is requested only once when many at any moment
* chore(header/sync): cleanup syncing code and update the tests to use WaitSync
* chore(header/sync): documentation, logging and code disposition improvements
  * Terminology change from the 'objective head' to the 'network head' consistent over docs and logs
  * More logs for unhappy cases + more information for existing logs
  * Extraction of head retrieval logic into a separate file
* Apply docs suggestions from @renaynay and @liamsi
  Co-authored-by: rene <[email protected]> Co-authored-by: Ismail Khoffi <[email protected]>
* Update header/header.go
  Co-authored-by: rene <[email protected]>

Co-authored-by: Rene <[email protected]>
Co-authored-by: rene <[email protected]>
Co-authored-by: Ismail Khoffi <[email protected]>

Wondertan closed this Aug 7, 2022

Wondertan reopened this Aug 7, 2022

Wondertan force-pushed the hlib/syncer-head branch 2 times, most recently from 2d627be to a0bc2e4 Compare August 7, 2022 09:27

Wondertan added the kind:improvement label Aug 7, 2022

Wondertan changed the title ~~feat(params|header/sync): dirty intergration of block time into Syncer~~ feat|refactor(header/sync): revision of Syncer's objective head deteremination logic Aug 7, 2022

Wondertan force-pushed the hlib/syncer-head branch from 033c815 to 31c39c1 Compare August 7, 2022 11:19

Wondertan added the area:header Extended header label Aug 7, 2022

Wondertan self-assigned this Aug 7, 2022

Wondertan marked this pull request as ready for review August 7, 2022 11:21

Wondertan requested review from liamsi, renaynay, vgonkivs and distractedm1nd as code owners August 7, 2022 11:21

Wondertan mentioned this pull request Aug 7, 2022

header: Extract Head method into separate Head interface, make Syncer implement it #978

Closed

Wondertan force-pushed the hlib/syncer-head branch from db8f693 to 674ccfd Compare August 8, 2022 09:49

distractedm1nd reviewed Aug 8, 2022

renaynay reviewed Aug 8, 2022

Wondertan force-pushed the hlib/syncer-head branch from ddeedce to a77d103 Compare August 8, 2022 17:30

Wondertan changed the title ~~feat|refactor(header/sync): revision of Syncer's objective head deteremination logic~~ feat|refactor(header/sync): network head deteremination Aug 8, 2022

Wondertan force-pushed the hlib/syncer-head branch from a77d103 to 53da495 Compare August 8, 2022 17:37

Wondertan force-pushed the hlib/syncer-head branch from 53da495 to bd3f3a7 Compare August 8, 2022 17:41

Wondertan changed the title ~~feat|refactor(header/sync): network head deteremination~~ feat|refactor(header/sync): network head determination Aug 8, 2022

distractedm1nd previously approved these changes Aug 9, 2022

Wondertan commented Aug 9, 2022

Wondertan force-pushed the hlib/syncer-head branch from bd3f3a7 to 5e3c233 Compare September 8, 2022 11:54

renaynay reviewed Sep 9, 2022

renaynay dismissed distractedm1nd’s stale review via 5e3c233 September 9, 2022 09:47

liamsi reviewed Sep 12, 2022

renaynay previously approved these changes Sep 12, 2022

Wondertan dismissed renaynay’s stale review via 1f714a9 September 13, 2022 10:35

Wondertan and others added 12 commits September 13, 2022 12:36

feat(params|header/sync): dirty integration of block time into Syncer

5671647

Co-authored-by: Rene <[email protected]>

feat(header): new utility funcs for the ExtendedHeader in preparation…

1daa176

… for Syncer improvements Co-authored-by: Rene <[email protected]>

docs(header/sync): add TODO for potential optimization

a68dfa8

refactor(header/sync): rework Syncer lifecycling

6e154b3

Mainly, allow Start to error so that subsequent Stop does not panic. Also make the lifecycle logic less confusing and less error-prone

fix(node): do not fail the Start for Syncer if it is not initialized,…

b5b8ee5

… so that Node tests do not fail

fix(header/sync): use RWLock for sync ranges

8d81fc0

Going further, there will be multiple readers that should not block each other

fix(header/sync): ensure objective head is requested only once when m…

2a46c3f

…any at any moment

chore(header/sync): cleanup syncing code and update the tests to use …

8661675

…WaitSync

Apply docs suggestions from @renaynay and @liamsi

be9a0a7

Co-authored-by: rene <[email protected]> Co-authored-by: Ismail Khoffi <[email protected]>

Update header/header.go

44ab8a2

Co-authored-by: rene <[email protected]>

Wondertan force-pushed the hlib/syncer-head branch from a06d114 to 44ab8a2 Compare September 13, 2022 10:36

renaynay approved these changes Sep 13, 2022

Bidon15 approved these changes Sep 13, 2022

renaynay merged commit 00d80c4 into main Sep 13, 2022

renaynay deleted the hlib/syncer-head branch September 13, 2022 11:28
