incentives: cache top online accounts and use when building AbsentParticipationAccounts #6085
My first approach was to make it a field in the |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## feature/heartbeats #6085 +/- ##
======================================================
+ Coverage   56.22%   56.27%   +0.05%
======================================================
  Files         494      494
  Lines       69954    70040     +86
======================================================
+ Hits        39330    39416     +86
+ Misses      27947    27944      -3
- Partials     2677     2680      +3
…break TestAbsenteeChecks
Co-authored-by: John Jannotti <jannotti@gmail.com>
| // test same scenario on double ledger | ||
| t.Run("DoubleLedger", func(t *testing.T) { | ||
| m := newDoubleLedgerAcctModel(t, protocol.ConsensusFuture, true) | ||
| m := newDoubleLedgerAcctModel(t, protocol.ConsensusV39, true) // TODO simulate heartbeats |
Why not keep this on future?
It fails because heartbeats aren't implemented yet and proposers aren't being set, so the big accounts are challenged and kicked offline, and the stake numbers no longer match the test expectations. I could have fixed this by making all the test accounts show up as proposers often enough to avoid suspension, but I thought it would be better to wait until heartbeats are implemented and see whether the tests then pass without as much modification.
| // lookup retrieves agreement data about an address, querying the ledger if necessary. | ||
| lookupAgreement(basics.Address) (basics.OnlineAccountData, error) | ||
| onlineStake() (basics.MicroAlgos, error) | ||
| knockOfflineCandidates() (map[basics.Address]basics.OnlineAccountData, error) |
Maybe a nit: should we call this "top online accounts" or something similar? It's very clear from the comments that's what we are requesting; this is more a debate over whether the name should reflect what it's sourced from or the use case we have for it at the moment.
It is a potentially stale list of top online accounts: if new accounts came online in the last 256 rounds (since the last state proof), they wouldn't appear. So the word "candidates" was intended to make it seem a little less definitive that this is the complete list of top online accounts for the round. I'm happy to pick any other name; I wasn't particularly happy with this one.
This is already being used in a method JJ called "generateKnockOfflineAccountsList" in #5757, which is where the "knockOffline" part came from.
| a := require.New(fixtures.SynchronizedTest(t)) | ||
|
|
||
| consensusParams := getDefaultStateProofConsensusParams() | ||
| consensusParams.Payouts = config.ProposerPayoutRules{} // TODO re-enable payouts when nodes aren't suspended |
I guess I should make an issue to address the "update this test once heartbeats are implemented" TODOs in this PR
jannotti
left a comment
I think we will not have the top online cache before the first state proof, right? Maybe it would make sense to seed it during genesis (since the online accounts are listed out for us in the genesis file, I think). That could avoid special cases in the tests.
| func (eval *BlockEvaluator) endOfBlock() error { | ||
| // When generating a block, participating addresses are passed to prevent a | ||
| // proposer from suspending itself. | ||
| func (eval *BlockEvaluator) endOfBlock(participating ...basics.Address) error { |
Why ...basics.Address instead of []basics.Address? I assume callers always have a slice, as opposed to call sites with, say, 5 explicit arguments.
It's true, this is just me optimizing for a smaller diff by not changing the other endOfBlock callers. The idea is to pass a slice; I can change it.
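For readers weighing the two signatures discussed here, a minimal sketch of the trade-off follows. The `Address` type and both function names are hypothetical stand-ins for illustration, not the actual evaluator code.

```go
package main

import "fmt"

// Address is a hypothetical stand-in for basics.Address.
type Address string

// endOfBlockVariadic mirrors the variadic shape in the diff: existing callers
// that pass no addresses keep compiling unchanged.
func endOfBlockVariadic(participating ...Address) int {
	return len(participating)
}

// endOfBlockSlice is the alternative: every caller must pass a slice (or nil).
func endOfBlockSlice(participating []Address) int {
	return len(participating)
}

func main() {
	hosted := []Address{"A", "B"}

	fmt.Println(endOfBlockVariadic())          // old call sites need no change
	fmt.Println(endOfBlockVariadic(hosted...)) // an existing slice must be spread
	fmt.Println(endOfBlockSlice(nil))          // old call sites must now pass nil
	fmt.Println(endOfBlockSlice(hosted))
}
```

The variadic form keeps the diff small, while the slice form makes the argument explicit at every call site.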
| IncentiveEligible bool // currently unused below, but may be needed in the future | ||
| } | ||
| candidates := make(map[basics.Address]candidateData) | ||
| partAddrs := util.MakeSet(participating...) |
Do we do anything else with this slice? Maybe we should push the Set type up through the callers, so that it is built as a Set when it is first created to pass to endOfBlock?
It's used in GenerateBlock while making a map of end-of-block account state for participating addresses, to include in the UnfinishedBlock ... if we pushed it up to GenerateBlock then it could protect against looking up the same participating address twice, if duplicate addresses were passed to GenerateBlock.
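To illustrate the deduplication point, here is a small sketch of building the set once at the top of the call chain. The `Set` type, `MakeSet`, and `lookupAll` below are hypothetical and written for this example; they are not the project's util package.

```go
package main

import "fmt"

// Address is a hypothetical stand-in for basics.Address.
type Address string

// Set is a minimal generic set, assumed here for illustration only.
type Set[T comparable] map[T]struct{}

// MakeSet builds a set from its arguments, dropping duplicates.
func MakeSet[T comparable](elems ...T) Set[T] {
	s := make(Set[T], len(elems))
	for _, e := range elems {
		s[e] = struct{}{}
	}
	return s
}

// Contains reports whether e is a member of the set.
func (s Set[T]) Contains(e T) bool {
	_, ok := s[e]
	return ok
}

// lookupAll calls lookup exactly once per distinct address.
func lookupAll(addrs Set[Address], lookup func(Address)) {
	for a := range addrs {
		lookup(a)
	}
}

func main() {
	// "A" is passed twice, as a caller with duplicate addresses might do.
	participating := MakeSet[Address]("A", "B", "A")

	lookups := 0
	lookupAll(participating, func(Address) { lookups++ })

	fmt.Println(len(participating), lookups, participating.Contains("B"))
}
```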
| if maxSuspensions > 0 { | ||
| knockOfflineCandidates, err := eval.state.knockOfflineCandidates() | ||
| if err != nil { | ||
| // Log an error and keep going; generating lists of absent and expired |
So this implies some nodes can "choose" not to search for absent/expired accounts.
Yes. When generating a block, the proposer is not required to put any accounts in the {Absent,Expired}ParticipationAccounts block headers, but if accounts are in those lists, validation rules require that they are actually absent or expired.
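The asymmetry described in this reply (generation is best-effort, validation is strict) can be sketched as follows. This is a simplified model under stated assumptions: `ledgerView`, `proposeAbsent`, and `validateAbsent` are hypothetical names, and the real absence rule is reduced to a boolean per account.

```go
package main

import "fmt"

// Address is a hypothetical stand-in for basics.Address.
type Address string

// ledgerView maps each account to whether it is actually absent.
// This boolean replaces the real absence rule for illustration.
type ledgerView map[Address]bool

// proposeAbsent lists the candidates that are absent. A proposer could
// return any subset of this list, including an empty one.
func proposeAbsent(view ledgerView, candidates []Address) []Address {
	var out []Address
	for _, a := range candidates {
		if view[a] {
			out = append(out, a)
		}
	}
	return out
}

// validateAbsent rejects a header that names a non-absent account.
// It does not require the list to be complete.
func validateAbsent(view ledgerView, listed []Address) error {
	for _, a := range listed {
		if !view[a] {
			return fmt.Errorf("account %s listed as absent but is not", a)
		}
	}
	return nil
}

func main() {
	view := ledgerView{"A": true, "B": false, "C": true}

	fmt.Println(validateAbsent(view, nil))                                          // empty list is valid
	fmt.Println(validateAbsent(view, proposeAbsent(view, []Address{"A", "B", "C"}))) // full list is valid
	fmt.Println(validateAbsent(view, []Address{"B"}))                               // wrong entry is rejected
}
```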
|
|
||
| // Now, check these candidate accounts to see if they are expired or absent. | ||
| for accountAddr, acctData := range candidates { | ||
| if acctData.MicroAlgosWithRewards.IsZero() { |
Does a zero balance imply the account is closed 100% of the time?
Yes, that's correct. My understanding is that currently the only way to have a zero balance at the end of the round is if the account has been closed.
| // | ||
| // This function is passed a list of participating addresses so a node will not | ||
| // propose a block that suspends or expires itself. | ||
| func (eval *BlockEvaluator) generateKnockOfflineAccountsList(participating []basics.Address) { |
participating is really "participating accounts excluding any I host"
here, the "participating" argument is the accounts that the node hosts.
gmalouf
left a comment
I'm good in general, a few small comments.
| vb := l.endBlock(t, blkEval) | ||
| vb := l.endBlock(t, blkEval, recvAddr) | ||
| blkEval = l.nextBlock(t) | ||
| //require.Empty(t, vb.Block().ExpiredParticipationAccounts) |
why is this added commented out?
The test sets up a bunch of participating accounts that are separate from the ones I'm interested in, and they do expire now (before, they didn't because they weren't noticed). I was working on updating this in a separate branch.
| challenge := byte(0) | ||
| for i := uint64(0); i < uint64(1210); i++ { // A bit past one grace period (200) past challenge at 1000. | ||
| vb := l.endBlock(t, blkEval) | ||
| for i := uint64(0); i < uint64(1200); i++ { // Just before first suspension at 1171 |
Would this not go past first suspension - why 1200?
It's based on the values set for certain accounts when initializing LastHeartbeat/LastProposed earlier in the test.
| } | ||
|
|
||
| st := txn.Sign(keys[0]) | ||
| err = eval.Transaction(st, transactions.ApplyData{}) |
Why remove all of these eval.Transaction calls?
you no longer need to send transactions to cause GenerateBlock/BlockEvaluator to "notice" an account is expired or not participating
| } | ||
|
|
||
| // fetch fresh data up to this round from online account cache. These accounts should all | ||
| // be in cache, as long as proto.StateProofTopVoters < onlineAccountsCacheMaxSize. |
This feels like a condition to call out in the consensus file.
added TestOnlineAccountsCacheSizeBiggerThanStateProofTopVoters
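A rough sketch of the kind of invariant such a test could assert is shown below. The struct, the version names, and all numeric values are placeholders chosen for illustration; they are not taken from the consensus file or from the test that was added.

```go
package main

import "fmt"

// consensusParams holds only the field relevant to this check (hypothetical).
type consensusParams struct {
	StateProofTopVoters uint64
}

// checkCacheCoversTopVoters returns an error if any protocol version asks for
// at least as many top voters as the online accounts cache can hold.
func checkCacheCoversTopVoters(protos map[string]consensusParams, cacheSize uint64) error {
	for name, p := range protos {
		if p.StateProofTopVoters >= cacheSize {
			return fmt.Errorf("%s: StateProofTopVoters %d is not below cache size %d",
				name, p.StateProofTopVoters, cacheSize)
		}
	}
	return nil
}

func main() {
	// Placeholder versions and sizes, for illustration only.
	protos := map[string]consensusParams{
		"vA": {StateProofTopVoters: 100},
		"vB": {StateProofTopVoters: 200},
	}
	fmt.Println(checkCacheCoversTopVoters(protos, 300)) // invariant holds
	fmt.Println(checkCacheCoversTopVoters(protos, 200)) // "vB" violates it
}
```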
|
Merging this into
|
| // key material deleted. If it is only suspended, the key material will remain. | ||
| func (eval *BlockEvaluator) generateKnockOfflineAccountsList() { | ||
| // | ||
| // Different nodes may propose different lists of addresses based on node state. |
| candidates[accountAddr] = candidateData{ | ||
| VoteLastValid: acctData.VoteLastValid, | ||
| VoteID: acctData.VoteID, | ||
| Status: basics.Online, // from lookupOnlineAccountData, which only returns online accounts |
I guess we need a test to enforce the knockOfflineCandidates -> lookupOnlineAccountData control flow.
| for addr := range voters.AddrToPos { | ||
| data, err := l.acctsOnline.lookupOnlineAccountData(rnd, addr) | ||
| if err != nil { | ||
| continue // skip missing / not online accounts |
Why would voters ever return a non-online account?
The voters tracker only calculates the top N every 256 rounds, so if an address cached from the last state proof interval has since been closed/deleted, the lookup for the current round can return an error here.
I should add a comment and write a test exercising this case, realizing it is kind of complicated now after writing it out
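The case described in this thread can be modelled with a short sketch: a list of addresses captured at the last state proof interval is refreshed against current data, and addresses whose lookup fails are skipped. All names and types below are simplified, hypothetical stand-ins for the ledger code.

```go
package main

import (
	"errors"
	"fmt"
)

// Address and OnlineAccountData are simplified stand-ins for the basics types.
type Address string

type OnlineAccountData struct {
	Stake uint64
}

var errNotOnline = errors.New("account not online")

// lookupFn models a per-round lookup that fails for closed or offline accounts.
type lookupFn func(rnd uint64, addr Address) (OnlineAccountData, error)

// knockOfflineCandidates refreshes a possibly stale top-N address list with
// data for round rnd, skipping any address whose lookup returns an error.
func knockOfflineCandidates(rnd uint64, topAddrs []Address, lookup lookupFn) map[Address]OnlineAccountData {
	out := make(map[Address]OnlineAccountData, len(topAddrs))
	for _, addr := range topAddrs {
		data, err := lookup(rnd, addr)
		if err != nil {
			continue // closed or no longer online since the list was built
		}
		out[addr] = data
	}
	return out
}

func main() {
	// "B" was in the top-N list at the last interval but has since closed.
	stale := []Address{"A", "B", "C"}
	current := map[Address]OnlineAccountData{"A": {Stake: 10}, "C": {Stake: 7}}

	lookup := func(_ uint64, addr Address) (OnlineAccountData, error) {
		d, ok := current[addr]
		if !ok {
			return OnlineAccountData{}, errNotOnline
		}
		return d, nil
	}

	fmt.Println(len(knockOfflineCandidates(300, stale, lookup)))
}
```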
Summary
In #5757 a mechanism was introduced to suspend "absentee" accounts that don't participate (by making a proposal, or heartbeat as in #5799), by adding a block header AbsentParticipationAccounts, similar to ExpiredParticipationAccounts.
Currently, the list is generated by considering any account touched by a transaction in the current block, since this data is readily available at endOfBlock(). This PR adds a periodically-updated cache of top online accounts to the ledger, to find additional online accounts not mentioned in the current block.
All of these tracked addresses will now be checked for absentee or expired status each round. To get a recent list of top online accounts, this PR uses recent work done by the votersTracker and state proof worker. (Every 256 rounds, the state proof system performs a TopOnlineAccounts query.) This adds access to the votersTracker to fetch the most recent list of top online addresses, and for each address looks up the latest round's data from the online account cache.
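As a rough illustration of the two candidate sources described above, the sketch below merges accounts touched in the block with the cached top online accounts and then applies a simplified expiry check. The types, the precedence given to block-touched data, and the expiry rule (`VoteLastValid` below the current round) are assumptions made for this example, not the evaluator's actual logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Address and candidateData are simplified, hypothetical stand-ins.
type Address string

type candidateData struct {
	VoteLastValid uint64
}

// mergeCandidates combines accounts touched in the current block with the
// cached top online accounts. Block-touched data wins on conflict (assumed).
func mergeCandidates(touched, cachedTop map[Address]candidateData) map[Address]candidateData {
	out := make(map[Address]candidateData, len(touched)+len(cachedTop))
	for a, d := range touched {
		out[a] = d
	}
	for a, d := range cachedTop {
		if _, ok := out[a]; !ok {
			out[a] = d
		}
	}
	return out
}

// expired returns, in sorted order, the candidates whose voting keys stopped
// being valid before the current round (a simplified rule).
func expired(candidates map[Address]candidateData, current uint64) []Address {
	var out []Address
	for a, d := range candidates {
		if d.VoteLastValid < current {
			out = append(out, a)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	touched := map[Address]candidateData{"A": {VoteLastValid: 500}}
	cachedTop := map[Address]candidateData{
		"A": {VoteLastValid: 100}, // stale entry, overridden by touched data
		"B": {VoteLastValid: 90},  // never touched in this block
	}
	all := mergeCandidates(touched, cachedTop)
	fmt.Println(len(all), expired(all, 200))
}
```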
LastProposed and LastHeartbeat are added to the online accounts table's DB representation in this PR. This also fixes an issue introduced in #5965 where uses of ledgercore.OnlineAccountData (which didn't have LastHeartbeat/LastProposed fields) were replaced by basics.OnlineAccountData (which did) and ended up with those fields not being set in a couple of conversions from AccountData.
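The class of bug mentioned here, where a conversion silently leaves newly added fields at their zero value, can be shown with a small sketch. The structs below are heavily simplified and hypothetical; they only mimic the shape of the AccountData to OnlineAccountData conversion.

```go
package main

import "fmt"

// AccountData and OnlineAccountData are simplified stand-ins, not the real
// definitions in the ledger packages.
type AccountData struct {
	MicroAlgos    uint64
	VoteLastValid uint64
	LastProposed  uint64
	LastHeartbeat uint64
}

type OnlineAccountData struct {
	MicroAlgos    uint64
	VoteLastValid uint64
	LastProposed  uint64
	LastHeartbeat uint64
}

// toOnlineIncomplete models the problem: the target type gained two fields,
// but the conversion was not updated, so they stay zero.
func toOnlineIncomplete(ad AccountData) OnlineAccountData {
	return OnlineAccountData{
		MicroAlgos:    ad.MicroAlgos,
		VoteLastValid: ad.VoteLastValid,
	}
}

// toOnline copies every field, including the participation timestamps.
func toOnline(ad AccountData) OnlineAccountData {
	return OnlineAccountData{
		MicroAlgos:    ad.MicroAlgos,
		VoteLastValid: ad.VoteLastValid,
		LastProposed:  ad.LastProposed,
		LastHeartbeat: ad.LastHeartbeat,
	}
}

func main() {
	ad := AccountData{MicroAlgos: 5, VoteLastValid: 1000, LastProposed: 42, LastHeartbeat: 40}
	fmt.Printf("%+v\n", toOnlineIncomplete(ad))
	fmt.Printf("%+v\n", toOnline(ad))
}
```

With the incomplete conversion, an account that proposed recently would look as if it had never proposed, which matters once those fields drive absentee checks.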
Test Plan
Update test/e2e-go/features/incentives/suspension_test.go (TODO: return later after heartbeats)