Implementer's guide: Assignments counting procedure#1930
Implementer's guide: Assignments counting procedure#1930burdges wants to merge 8 commits intoparitytech:masterfrom
Conversation
We'v e two purposes for this, so best to mention them for clarity.
Actually check_approval should be merged into tranches_to_approve I think.
| * return `n_tranches` | ||
| 1. First, set `t := session_info.needed_approvals`. Set the base tranche `l=0`. | ||
| 2. Take assignments from tranches `l..` until we have at least `t` assignments. Let `k` denote the highest tranche taken. Count the number `assigned` of assignments taken in tranches `0..k`. If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. | ||
| 3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent. This condition can be reverted if some no-shows turnning up eventually. |
There was a problem hiding this comment.
We could discuss omitting this check for now if anyone is worried about it adding complexity. We're approaching breaks in underlying security assumptions if assigned gets so large anyways.
roadmap/implementers-guide/src/node/approval/approval-voting.md
Outdated
Show resolved
Hide resolved
| #### `import_checked_assignment` | ||
| * Load the candidate in question and access the `approval_entry` for the block hash the cert references. | ||
| * Ensure the validator index is not part of the backing group for the candidate. | ||
| * Ensure the validator index is not part of the backing group for the candidate. We count late backing votes via the backing system. |
There was a problem hiding this comment.
Remove so this represents the current goal, or use future/conditional tense to represent later plans?
| * return `n_tranches` | ||
| 1. First, set `t := session_info.needed_approvals`. Set the base tranche `l=0`. | ||
| 2. Take assignments from tranches `l..` until we have at least `t` assignments. Let `k` denote the highest tranche taken. Count the number `assigned` of assignments taken in tranches `0..k`. If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. | ||
| 3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent. This condition can be reverted if some no-shows turnning up eventually. |
There was a problem hiding this comment.
If
assigned > 128then return the candidate as unfinalizable
This should say to return ALL.
advise block production to build another fork from the inclusion relay parent
Omit this - it's underspecified at the moment. Maybe add a TODO and a GitHub issue for blacklisting those forks.
There was a problem hiding this comment.
We should remove this entirely I think and just create an issue.
There was a problem hiding this comment.
| 1. First, set `t := session_info.needed_approvals`. Set the base tranche `l=0`. | ||
| 2. Take assignments from tranches `l..` until we have at least `t` assignments. Let `k` denote the highest tranche taken. Count the number `assigned` of assignments taken in tranches `0..k`. If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. | ||
| 3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent. This condition can be reverted if some no-shows turnning up eventually. | ||
| 4. Count the number `noshows` of no-shows in tranches `l..k`. If `noshows` is zero then return success with `n_tranches := k`. Of course this happens early and does not indicate final termination, as we may later return `ALL` after more no-shows, but if all these assigned checkers vote valid then we are done. |
There was a problem hiding this comment.
Of course this happens early and does not indicate final termination, as we may later return
ALLafter more no-shows, but if all these assigned checkers vote valid then we are done.
This sounds like logic beyond the scope of the function - check_approval is the part that checks approval.
There was a problem hiding this comment.
It's a remark that this function is non-monotonic. It can increase, and then decrease (and return approved once it gets merged with check_approval again).
There was a problem hiding this comment.
Can you change the language a bit? Right now it sounds sort of like the function can return more than once, whereas what we actually want to get across is that it is non-monotonic.
| 2. Take assignments from tranches `l..` until we have at least `t` assignments. Let `k` denote the highest tranche taken. Count the number `assigned` of assignments taken in tranches `0..k`. If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. | ||
| 3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent. This condition can be reverted if some no-shows turnning up eventually. | ||
| 4. Count the number `noshows` of no-shows in tranches `l..k`. If `noshows` is zero then return success with `n_tranches := k`. Of course this happens early and does not indicate final termination, as we may later return `ALL` after more no-shows, but if all these assigned checkers vote valid then we are done. | ||
| 5. For each no-show in `noshows`, we require both another checker and another tranche, which ever means more tranches. Take assignments from at least `no_shows` subsequent tranches and then if we have not yet covered all noshows then continue taking tranches until we do cover all no-shows. e.g. if there are 2 no-shows, we might only need to take 1 additional tranche with >= 2 assignments. Or we might need to take 3 tranches, where one is empty and the other two have 1 assignment each. |
There was a problem hiding this comment.
We should address the "we run out of tranches to take, having not received any assignments past a certain point" case.
There was a problem hiding this comment.
That was 3 which I just removed. If we run out of tranches without 3 then every validator is assigned to check the block. Yes I suppose we should say so. If we run out of tranches with 3 then we've decided this block sucks and we're going to fork the chain or something.
There was a problem hiding this comment.
We need to know how many tranches to take, but remember that this is from the perspective of a single node executing these algorithms, and this node may simply have not received those assignments yet.
| #### `check_approval(block_entry, approval_entry, n_tranches) -> bool` | ||
| * If `n_tranches` is ALL, return false | ||
| * Otherwise, if all validators in `n_tranches` have approved, return `true`. If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`. | ||
| * Otherwise, if all validators in `n_tranches` have either approved or been replaced as a no-show, then return `true`. If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`. |
There was a problem hiding this comment.
This function doesn't have information to understand replacement. This function is invoked after the replacement procedure has selected extra tranches. I could see it gaining an extra parameter for tolerated_missing_approvals. We would also need to alter tranches_to_approve to return a (n_tranches, n_replaced) tuple, and supply tolerated_missing_approvals = n_replaced in the call site of check_approval. Then we would not approve if there are any missing approval votes in those tranches beyond the no-shows we've already accounted for.
There was a problem hiding this comment.
As I mentioned elsewhere, I needed tranches_to_approve and check_approval to be the same function to have everything in one place. You cannot count tranches for no-shows without knowing the current approved votes either.
There was a problem hiding this comment.
They aren't the same function in this PR, though. So they would need to be unified
This reverts commit 5ca8d5b.
We're under a DoS attack in this case, or maybe a bad parachain, so we shoiuld never finalize the block until no-shows actually respond. It's hard to explain telling block production to take another path right here though.
| #### ` assignees_status(approval_entry) -> AssigneeStatus` | ||
| * Summarise our view of this approval entry's run by iterating over assignment and approval vote records | ||
| 1. First, set `needed := session_info.needed_approvals`. Set the base tranche `l=0`. | ||
| 2. Take assignments from tranches `l..` until we have at least `needed` assignments or hit our timeouts. Let `tranches` denote the highest tranche taken (plus one). |
There was a problem hiding this comment.
What does "hit our timeouts" mean?
| 1. First, set `needed := session_info.needed_approvals`. Set the base tranche `l=0`. | ||
| 2. Take assignments from tranches `l..` until we have at least `needed` assignments or hit our timeouts. Let `tranches` denote the highest tranche taken (plus one). | ||
| 3. Count the number `assigned` of assignments taken in tranches `0..tranches`. If `assigned < needed` then return a special value `PENDING` which indicates we wait for more assignments. | ||
| 4. Count the number `approvals` in tranches `0..tranches`. Also, count the number of `noshows` of no-shows in tranches `l..tranches`. If `noshows` is zero then return `DONE(approvals,assigned,needed,tranches)`. Of course, this indicates potential approval only if `approvals == assigned` and we hide the assignment timeout for `tranches`. In fact, this is an over simplification since we care about arrival times, so counting tranches does not suffice here. If `approvals < assigned` then more no-shows could occur on future invokations, returning us to `PEMNDING`. |
There was a problem hiding this comment.
I can't merge over-simplification into the implementer's guide!
roadmap/implementers-guide/src/node/approval/approval-voting.md
Outdated
Show resolved
Hide resolved
| #### `check_approval(block_entry, approval_entry, n_tranches) -> bool` | ||
| * If `n_tranches` is ALL, return false | ||
| * Otherwise, if all validators in `n_tranches` have either approved or been replaced as a no-show, then return `true`. If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`. | ||
| * Invoke `assignees_status` and then check if `approvals == assigned == needed` or if `needed >= num_validators` then we ask that `3 approvals > 2 num_validators`. |
There was a problem hiding this comment.
The 3 * approvals > 2 * num_validators thing is new? Is num_validators the total number of validators?
There was a problem hiding this comment.
I think that's just an integer-safe expression of the inequality approvals/num_validators > 2/3.
There was a problem hiding this comment.
Yes, it's just not something that was described in the previous writeup. But Jeff & I spoke about this yesterday in more detail. It's an escape hatch for if we seem to be requesting a very large amount of assignments
There was a problem hiding this comment.
We'll do something better here later, but some 2/3 condition is simple and suffices for an MVP I think. I also like this "worse" condition for initial testnets because if we've some bug that causes a checker explosion then this make it show up more in testing.
|
I've updated the description to far more closely resemble tracker.rs in https://github.com/paritytech/polkadot/pull/1558/files which I also updated for the new no show scheme. I'm worried this new description is a "bit too efficient" and while the text I removed was vague it maybe communicated important aspects of the goal. The gap is discussed in db69ad9#diff-588e17953264507ee591a633edc92d8dd229f9da717eef8a65245396ab1e12d4R249 |
|
#1972 is better |
Alters the assignments counting procedure from #1691 to take at least one tranche per no-show. This changes a few things:
I needed to write
check_approvalandtranches_to_approveas basically one methodadvance_assignee_statusin the pseduo-code #1558 so they'll turn out similar here I think, but right now this suffices I guess.@rphmeier There is another "iterating VRFs for no-shows" scheme that's stronger under attack by a powerful DoS adversary: We could cover no-shows, not by using additional tranches, but by running fresh VRFs whose story is the no-show, meaning this second (third) order VRF input is the VRF output of the no-show, and it returns a tranche.
In this, adversary knows literally nothing about the no-show checks until they happen, which makes DoS attacks almost worthless, excepting in conjunction with relay chain equivocation of course. Yet, there are several serious down sides for this iterating VRFs scheme:
tranches_to_approvelooks complex, but it's a single loop for which we can write extensive unit tests, while this iterating VRFs requires more complex tests that cover the gossip messages, etc.I'm therefore proposing the "simpler" scheme in this PR instead of iterating VRFs for no-shows.