Implementer's guide: Assignments counting procedure by burdges · Pull Request #1930 · paritytech/polkadot

burdges · 2020-11-08T20:58:42Z

Alters the assignments counting procedure from #1691 to take at least one tranche per no-show. This changes a few things:

We're now forced to keep the expected number of assignees per tranche small, like 1-2, except for tranche zero. We planned this anyway because it prevents an adversary from brute-forcing their bad assignees into lower tranches than many honest assignees.
We consume more of the assignees' resources, but only slightly more and only under niche scenarios.
We're much less vulnerable to an adversary with a nearly unlimited capability to DoS honest assignees: imagine they wait until they have many bad assignees in low tranches and then DoS all the honest ones, making them no-shows. We previously counted these bad assignees towards replacing the honest DoSed no-shows, but now we add an expected 1-2 honest checkers per tranche, i.e. per no-show.

I needed to write check_approval and tranches_to_approve as basically one method, advance_assignee_status, in the pseudo-code in #1558, so I think they'll turn out similar here, but right now this suffices, I guess.

@rphmeier There is another "iterating VRFs for no-shows" scheme that's stronger under attack by a powerful DoS adversary: We could cover no-shows, not by using additional tranches, but by running fresh VRFs whose story is the no-show, meaning this second (third) order VRF input is the VRF output of the no-show, and it returns a tranche.

In this scheme, the adversary knows literally nothing about the no-show checks until they happen, which makes DoS attacks almost worthless, except in conjunction with relay chain equivocation of course. Yet there are several serious downsides to this iterating VRFs scheme:

1. We'd replace honest no-shows very slowly because we have no analog of the fat zeroth tranche here, so a single no-show might delay finality for minutes!
2. We'd complicate the code with yet more gossip messages, etc. I know tranches_to_approve looks complex, but it's a single loop for which we can write extensive unit tests, while iterating VRFs requires more complex tests that cover the gossip messages, etc.

We might address 1 with a VRF bidding scheme where anyone with low results sends a message, but this makes 2 much worse. And it worsens our relay chain equivocation situation somewhat.

I'm therefore proposing the "simpler" scheme in this PR instead of iterating VRFs for no-shows.

We've two purposes for this, so best to mention them for clarity.

Actually check_approval should be merged into tranches_to_approve I think.

burdges · 2020-11-09T11:05:39Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

-    * return `n_tranches`
+    1. First, set `t := session_info.needed_approvals`.  Set the base tranche `l=0`.
+    2. Take assignments from tranches `l..` until we have at least `t` assignments.  Let `k` denote the highest tranche taken.  Count the number `assigned` of assignments taken in tranches `0..k`.  If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. 
+    3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent.  This condition can be reverted if some no-shows turn up eventually.


We could discuss omitting this check for now if anyone is worried about it adding complexity. We're approaching breaks in underlying security assumptions if assigned gets so large anyways.

rphmeier · 2020-11-09T16:01:37Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

 #### `import_checked_assignment`
  * Load the candidate in question and access the `approval_entry` for the block hash the cert references.
-  * Ensure the validator index is not part of the backing group for the candidate.
+  * Ensure the validator index is not part of the backing group for the candidate.  We count late backing votes via the backing system.


We don't, at the moment

Remove so this represents the current goal, or use future/conditional tense to represent later plans?

I suggest removal

rphmeier · 2020-11-09T16:03:42Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

-    * return `n_tranches`
+    1. First, set `t := session_info.needed_approvals`.  Set the base tranche `l=0`.
+    2. Take assignments from tranches `l..` until we have at least `t` assignments.  Let `k` denote the highest tranche taken.  Count the number `assigned` of assignments taken in tranches `0..k`.  If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. 
+    3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent.  This condition can be reverted if some no-shows turn up eventually.


> If `assigned > 128` then return the candidate as unfinalizable

This should say to return ALL.

> advise block production to build another fork from the inclusion relay parent

Omit this - it's underspecified at the moment. Maybe add a TODO and a GitHub issue for blacklisting those forks.

We should remove this entirely I think and just create an issue.

https://github.com/w3f/research-security-issues/issues/47

rphmeier · 2020-11-09T16:04:53Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

+    1. First, set `t := session_info.needed_approvals`.  Set the base tranche `l=0`.
+    2. Take assignments from tranches `l..` until we have at least `t` assignments.  Let `k` denote the highest tranche taken.  Count the number `assigned` of assignments taken in tranches `0..k`.  If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. 
+    3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent.  This condition can be reverted if some no-shows turn up eventually.
+    4. Count the number `noshows` of no-shows in tranches `l..k`.  If `noshows` is zero then return success with `n_tranches := k`.  Of course this happens early and does not indicate final termination, as we may later return `ALL` after more no-shows, but if all these assigned checkers vote valid then we are done.


> Of course this happens early and does not indicate final termination, as we may later return `ALL` after more no-shows, but if all these assigned checkers vote valid then we are done.

This sounds like logic beyond the scope of the function - check_approval is the part that checks approval.

It's a remark that this function is non-monotonic. It can increase, and then decrease (and return approved once it gets merged with check_approval again).

Can you change the language a bit? Right now it sounds sort of like the function can return more than once, whereas what we actually want to get across is that it is non-monotonic.

rphmeier · 2020-11-09T16:06:02Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

+    2. Take assignments from tranches `l..` until we have at least `t` assignments.  Let `k` denote the highest tranche taken.  Count the number `assigned` of assignments taken in tranches `0..k`.  If `assigned < t` then return a special value `ALL` which indicates we wait for more assignments. 
+    3. If `assigned > 128` then return the candidate as unfinalizable and advise block production to build another fork from the inclusion relay parent.  This condition can be reverted if some no-shows turn up eventually.
+    4. Count the number `noshows` of no-shows in tranches `l..k`.  If `noshows` is zero then return success with `n_tranches := k`.  Of course this happens early and does not indicate final termination, as we may later return `ALL` after more no-shows, but if all these assigned checkers vote valid then we are done.
+    5. For each no-show in `noshows`, we require both another checker and another tranche, whichever means more tranches.  Take assignments from at least `noshows` subsequent tranches and then if we have not yet covered all no-shows then continue taking tranches until we do cover all no-shows.  e.g. if there are 2 no-shows, we might only need to take 1 additional tranche with >= 2 assignments. Or we might need to take 3 tranches, where one is empty and the other two have 1 assignment each.


We should address the "we run out of tranches to take, having not received any assignments past a certain point" case.

That was 3 which I just removed. If we run out of tranches without 3 then every validator is assigned to check the block. Yes I suppose we should say so. If we run out of tranches with 3 then we've decided this block sucks and we're going to fork the chain or something.

We need to know how many tranches to take, but remember that this is from the perspective of a single node executing these algorithms, and this node may simply have not received those assignments yet.
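A minimal sketch of the counting loop in steps 1-5 might help here, under some loud assumptions: the `Assignment` and `RequiredTranches` types are hypothetical stand-ins and not types from the guide, the `assigned > 128` case is left out, and the loop follows the "both another checker and another tranche" reading, so every no-show pulls in at least one further tranche. Running out of locally known tranches returns `All`, since this node may simply not have received those assignments yet.

```rust
/// Hypothetical per-node record of one assignment (illustration only).
#[derive(Clone, Copy)]
struct Assignment {
    /// True once this node considers the assignee a no-show.
    no_show: bool,
}

/// Hypothetical return type: `All` means "wait for more assignments".
#[derive(Debug, PartialEq)]
enum RequiredTranches {
    All,
    /// Tranches `0..=k` are enough for now.
    Exact(usize),
}

/// `tranches[i]` holds the assignments this node has seen in tranche `i`.
fn tranches_to_approve(
    tranches: &[Vec<Assignment>],
    needed_approvals: usize,
) -> RequiredTranches {
    let mut base = 0usize; // the base tranche `l`
    let mut needed = needed_approvals; // checkers still to be found
    let mut min_tranches = 0usize; // lower bound on tranches taken this round
    loop {
        let mut assigned = 0usize;
        let mut no_shows = 0usize;
        let mut taken = 0usize;
        let mut k = base;
        // Take tranches until we have enough checkers AND enough tranches.
        while assigned < needed || taken < min_tranches {
            let tranche = match tranches.get(base + taken) {
                Some(t) => t,
                // Ran out of tranches we know about: wait for more.
                None => return RequiredTranches::All,
            };
            assigned += tranche.len();
            no_shows += tranche.iter().filter(|a| a.no_show).count();
            k = base + taken;
            taken += 1;
        }
        if no_shows == 0 {
            return RequiredTranches::Exact(k);
        }
        // Cover each no-show with one more checker and one more tranche.
        base = k + 1;
        needed = no_shows;
        min_tranches = no_shows;
    }
}
```

With `needed_approvals = 2`, two no-shows in tranche zero, and later tranches holding 1, 0, and 1 assignments, this sketch returns `Exact(3)`. The result is non-monotonic across invocations: a later no-show in the taken tranches can push it back to `All`.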

rphmeier · 2020-11-09T16:09:20Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

 #### `check_approval(block_entry, approval_entry, n_tranches) -> bool`
  * If `n_tranches` is ALL, return false
-  * Otherwise, if all validators in `n_tranches` have approved, return `true`. If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`.
+  * Otherwise, if all validators in `n_tranches` have either approved or been replaced as a no-show, then return `true`.  If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`.


This function doesn't have information to understand replacement. This function is invoked after the replacement procedure has selected extra tranches. I could see it gaining an extra parameter for tolerated_missing_approvals. We would also need to alter tranches_to_approve to return a (n_tranches, n_replaced) tuple, and supply tolerated_missing_approvals = n_replaced in the call site of check_approval. Then we would not approve if there are any missing approval votes in those tranches beyond the no-shows we've already accounted for.

As I mentioned elsewhere, I needed tranches_to_approve and check_approval to be the same function to have everything in one place. You cannot count tranches for no-shows without knowing the current approved votes either.

They aren't the same function in this PR, though. So they would need to be unified
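To make the `tolerated_missing_approvals` suggestion concrete, here is a rough sketch, assuming a hypothetical `Vote` enum and a flattened per-tranche view that are not part of the guide. `n_tranches` is modelled as `Option<usize>`, with `None` standing in for `ALL`, and the caller is assumed to pass `n_replaced` from `tranches_to_approve` as the tolerance.

```rust
/// Hypothetical vote state of one assigned validator (illustration only).
#[derive(Clone, Copy, PartialEq)]
enum Vote {
    Approved,
    Missing,
}

/// `tranches[i]` lists the vote state of every validator assigned in tranche `i`.
/// `n_tranches` is `None` for `ALL`, or `Some(k)` meaning tranches `0..=k`.
fn check_approval(
    tranches: &[Vec<Vote>],
    n_tranches: Option<usize>,
    tolerated_missing_approvals: usize,
) -> bool {
    let k = match n_tranches {
        Some(k) => k,
        // `ALL`: we are still waiting for assignments, so never approve.
        None => return false,
    };
    // Count assigned validators in the selected tranches that have not approved.
    let missing = tranches
        .iter()
        .take(k + 1)
        .flatten()
        .filter(|v| **v == Vote::Missing)
        .count();
    // Approve only if every missing vote is a no-show that was already replaced.
    missing <= tolerated_missing_approvals
}
```

A call site would then look like `check_approval(&tranches, Some(n_tranches), n_replaced)`. This still keeps the two functions separate, so it does not address the point that counting tranches for no-shows needs the current approval votes.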

This reverts commit 5ca8d5b.

We're under a DoS attack in this case, or maybe a bad parachain, so we should never finalize the block until no-shows actually respond. It's hard to explain telling block production to take another path right here though.

rphmeier · 2020-11-10T16:36:19Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

+#### `assignees_status(approval_entry) -> AssigneeStatus`
+  * Summarise our view of this approval entry's run by iterating over assignment and approval vote records 
+    1. First, set `needed := session_info.needed_approvals`.  Set the base tranche `l=0`.
+    2. Take assignments from tranches `l..` until we have at least `needed` assignments or hit our timeouts.  Let `tranches` denote the highest tranche taken (plus one).  


What does "hit our timeouts" mean?

rphmeier · 2020-11-10T16:36:55Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

+    1. First, set `needed := session_info.needed_approvals`.  Set the base tranche `l=0`.
+    2. Take assignments from tranches `l..` until we have at least `needed` assignments or hit our timeouts.  Let `tranches` denote the highest tranche taken (plus one).  
+    3. Count the number `assigned` of assignments taken in tranches `0..tranches`.  If `assigned < needed` then return a special value `PENDING` which indicates we wait for more assignments. 
+    4. Count the number `approvals` in tranches `0..tranches`.  Also, count the number `noshows` of no-shows in tranches `l..tranches`.  If `noshows` is zero then return `DONE(approvals,assigned,needed,tranches)`.  Of course, this indicates potential approval only if `approvals == assigned` and we hide the assignment timeout for `tranches`.  In fact, this is an oversimplification since we care about arrival times, so counting tranches does not suffice here.  If `approvals < assigned` then more no-shows could occur on future invocations, returning us to `PENDING`.


I can't merge over-simplification into the implementer's guide!

rphmeier · 2020-11-10T16:38:32Z

roadmap/implementers-guide/src/node/approval/approval-voting.md

 #### `check_approval(block_entry, approval_entry, n_tranches) -> bool`
-  * If `n_tranches` is ALL, return false
-  * Otherwise, if all validators in `n_tranches` have either approved or been replaced as a no-show, then return `true`.  If any validator in these tranches has not yet approved but is not yet considered a no-show, return `false`.
+  * Invoke `assignees_status` and then check if `approvals == assigned == needed` or if `needed >= num_validators` then we ask that `3 * approvals > 2 * num_validators`.


The 3 * approvals > 2 * num_validators thing is new? Is num_validators the total number of validators?

I think that's just an integer-safe expression of the inequality approvals/num_validators > 2/3.

Yes, it's just not something that was described in the previous writeup. But Jeff & I spoke about this yesterday in more detail. It's an escape hatch for when we seem to be requesting a very large number of assignments.

We'll do something better here later, but some 2/3 condition is simple and suffices for an MVP I think. I also like this "worse" condition for initial testnets because if we have some bug that causes a checker explosion then this makes it show up more in testing.
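For reference, the integer-safe form of the 2/3 check could be sketched as below. The function name `supermajority_approved` is made up for illustration, and widening to `u64` is just one way to rule out overflow in the multiplications.

```rust
/// Integer-only test of `approvals / num_validators > 2/3`.
/// Hypothetical helper; avoids both division and floating point.
fn supermajority_approved(approvals: u32, num_validators: u32) -> bool {
    // Cross-multiply so the comparison stays exact for all inputs.
    3 * u64::from(approvals) > 2 * u64::from(num_validators)
}
```

The inequality is strict, so exactly two thirds does not pass: 2 of 3 approvals fails, and with 100 validators 66 approvals fail while 67 pass.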

burdges · 2020-11-16T13:46:02Z

I've updated the description to far more closely resemble tracker.rs in https://github.com/paritytech/polkadot/pull/1558/files which I also updated for the new no show scheme.

I'm worried this new description is a "bit too efficient" and while the text I removed was vague it maybe communicated important aspects of the goal. The gap is discussed in db69ad9#diff-588e17953264507ee591a633edc92d8dd229f9da717eef8a65245396ab1e12d4R249

burdges · 2020-11-28T09:48:18Z

#1972 is better

burdges added 3 commits November 8, 2020 12:13

ApprovedAncestor warants more explination

5ca8d5b

Remark on late backing votes

fadd668

Try to describe tranches_to_approve

635ad3b

burdges added 4 commits November 9, 2020 17:20

Revert "ApprovedAncestor warants more explination"

f541273

Rmove excessive assignments case.

59b7a69

Clarify what happens without the early abort trisk.

560194d

assignees_status

98a84fa

Try writing it like the code

db69ad9

rphmeier mentioned this pull request Nov 18, 2020

change approval voting counting procedure #1972

Merged

burdges closed this Nov 28, 2020
