This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@dvc94ch
Contributor

@dvc94ch dvc94ch commented Jul 5, 2019

TODO:

  • use session store in babe
  • use session store in grandpa
  • lru to remove old keys
  • work with offchain workers api
  • move some cli args to node
  • update im-online
  • clean up mess
    cc @joepetrowski

@dvc94ch dvc94ch added the A0-please_review Pull request needs code review. label Jul 5, 2019
@dvc94ch dvc94ch requested a review from rphmeier July 5, 2019 10:41
@rphmeier
Contributor

rphmeier commented Jul 5, 2019

This suffers from the problem I mentioned earlier, which is that users are never going to set their key exactly at the end of a session. It will always be a bit before or after. If all nodes set their new keys a bit before the end of the session, no nodes will author blocks, and the new session will never start until someone intervenes manually.

So we probably need it to author on both keys for some time.

@gavofyork
Member

So the way this should work:

  • User calls an RPC at some point before the session to generate a new key. The key is stored and indexed in the local DB.
  • User registers the new key for their next session (actually, with session key buffering, it will kick in one session later, but that doesn't really matter here).
  • Whenever a new block is to be authored or finalised or whatever, the key to be used is not fixed. Rather, the node checks all current session keys (i.e. keys that would be valid to sign with in this context) and cross-references them with the indexed list of keys that we control. If and only if there is one, then that key is used. If there isn't, then it's assumed that we're no longer validating and we don't bother signing anything.
  • Keys should be kept in the key database along with a least-recently-used (LRU) timestamp and a boolean that gets flagged when the key is replaced, i.e. when a new key is generated. Keys that have been unused for over 24 hours and have been replaced should be removed from the index.

This should be trivial for Babe and Aura. It might require a little help from @rphmeier for Grandpa.
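
In pseudo-Rust, that database plus the selection and pruning logic could look roughly like the sketch below. `LocalKeyStore`, `StoredKey`, and the raw byte-vector key representation are illustrative stand-ins, not the real Substrate keystore types:

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// A locally generated session key as kept in the node's key database.
struct StoredKey {
    /// Opaque private key material (placeholder for illustration).
    secret: Vec<u8>,
    /// Last time this key was used to sign in any context.
    last_used: SystemTime,
    /// Flagged once a newer key has been generated to replace this one.
    replaced: bool,
}

/// In-memory stand-in for the indexed local key database.
struct LocalKeyStore {
    /// Keys indexed by their public key bytes.
    keys: HashMap<Vec<u8>, StoredKey>,
}

impl LocalKeyStore {
    /// Cross-reference the keys valid in the current session with the keys
    /// we control; return a signing key only if such a key exists.
    fn signing_key(&mut self, current_session_keys: &[Vec<u8>]) -> Option<&StoredKey> {
        let public = current_session_keys
            .iter()
            .find(|p| self.keys.contains_key(p.as_slice()))?
            .clone();
        let key = self.keys.get_mut(&public)?;
        key.last_used = SystemTime::now();
        Some(&*key)
    }

    /// Drop keys that have been replaced and unused for more than 24 hours.
    fn prune(&mut self) {
        let cutoff = Duration::from_secs(24 * 60 * 60);
        let now = SystemTime::now();
        self.keys.retain(|_, k| {
            let idle = now.duration_since(k.last_used).unwrap_or_default();
            !(k.replaced && idle > cutoff)
        });
    }
}
```

`prune` would then be run periodically (e.g. once per session) to enforce the 24-hour rule above.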

@dvc94ch dvc94ch force-pushed the dvc-rpc-set-key branch from aee2520 to aac6f32 Compare July 7, 2019 10:25
@dvc94ch
Contributor Author

dvc94ch commented Jul 7, 2019

in BABE we probably need to start a new authoring future for every key that owns a slot, because slot ownership is associated w/ a VRF proof

Can a validator be assigned more than one slot? In the session module only one key can be active per validator. I'm just taking the first slot that we have a key for.

@rphmeier
Contributor

rphmeier commented Jul 7, 2019

@dvc94ch yep, a validator can have more than one slot. What you should do instead is take the first key that owns the slot, if any. So do the VRF proof on all keys in the current validator set, and if there's only one, author on the slot with that key.

@dvc94ch dvc94ch changed the title Adds set_key rpc call and changes aura to use the latest key. Implement session key handover Jul 7, 2019
Contributor

@rphmeier rphmeier left a comment

Should take a different approach in finality-grandpa

@dvc94ch
Contributor Author

dvc94ch commented Jul 8, 2019

Yeah, the main goal was to get it to build. I'm studying the Aura paper you sent me a link to and expect to get through the BABE and GRANDPA papers by tomorrow.

@rphmeier
Contributor

rphmeier commented Jul 8, 2019

@dvc94ch OK, great. In the meantime I've checked the logic in Aura and BABE and it seems right.

@dvc94ch
Contributor Author

dvc94ch commented Jul 8, 2019

So I updated BABE and GRANDPA according to your comments. I think BABE is correct now, and GRANDPA selects a key before creating an environment. I'm not sure whether the process for selecting the key is correct: it takes the first key that is in the VoterSet, if there is one.
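
For illustration only, that selection amounts to something like the following sketch; `AuthorityId`, the plain `HashMap` standing in for the session store, and `select_local_key` are hypothetical simplifications rather than the types used in the PR:

```rust
use std::collections::HashMap;

/// Illustrative stand-ins: the slice plays the role of the VoterSet (the
/// authority ids GRANDPA currently accepts votes from) and the map plays
/// the role of the local session key store.
type AuthorityId = Vec<u8>;
type SecretKey = Vec<u8>;

/// Take the first authority in the voter set for which we hold a key,
/// returning its index in the set together with the key.
fn select_local_key<'a>(
    voter_set: &[AuthorityId],
    keystore: &'a HashMap<AuthorityId, SecretKey>,
) -> Option<(usize, &'a SecretKey)> {
    voter_set
        .iter()
        .enumerate()
        .find_map(|(index, id)| keystore.get(id).map(|key| (index, key)))
}
```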

@dvc94ch dvc94ch requested a review from rphmeier July 8, 2019 14:48
Contributor

@rphmeier rphmeier left a comment

Some more grumbles & Qs

Contributor

@mxinden mxinden left a comment

Two questions:

@dvc94ch dvc94ch added A4-gotissues and removed A0-please_review Pull request needs code review. labels Jul 9, 2019
@dvc94ch
Contributor Author

dvc94ch commented Jul 21, 2019

Since the AuthorityKeyProvider requires the Aura API and the GRANDPA API, it has to be implemented in the node; otherwise we would introduce dependencies on Aura/GRANDPA in the service.

@gavofyork
Member

#3150 merged.

}

impl<Block, Client> GrandpaKeyProvider<Block, Client>
where
Member

move to end of line above

Member

And the same for further instances of this style.

use super::KeyProviderId;

/// Aura key provider.
pub const AURA: KeyProviderId = 10;
Member

Er... no.

core shouldn't be hard-coding consensus-algorithm-related stuff in here.

@burdges

burdges commented Jul 25, 2019

@rphmeier :

yep, a validator can have more than one slot. What you should do instead is take the first key that owns the slot, if any. So do the VRF proof on all keys in the current validator set, and if there's only one, author on the slot with that key

Actually this sounds confusing and/or harmful. In any slot, there should only be one babe or grandpa key from which the network accepts signatures.

I think Gav's description sounds correct #3034 (comment), with the caveat that only one current babe or grandpa key is actually valid. At the least, BABE needs the transaction that registers session key changes to be finalized in an epoch before the epoch switch.

We should likely include a counter in the session key, which I missed before. If I register a session certificate with a lower counter value than the one currently set, then the transaction is dropped. If I register a session certificate with a higher counter value, then the transaction sets the session certificate for the appropriate future slot and epoch, with babe requiring some waiting time and grandpa maybe not so much.

We're happy if registering a new session certificate makes block production with babe impossible for a couple of epochs, in which case only one current session key exists.
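
A minimal sketch of that counter rule, with `SessionCertificate`, `Registration`, and `register` as purely hypothetical names (the behaviour for equal counters is not specified above; they are treated as stale here):

```rust
/// Hypothetical on-chain record of a validator's registered session key.
struct SessionCertificate {
    /// Monotonically increasing counter chosen by the validator.
    counter: u64,
    /// The session public key being registered.
    session_key: Vec<u8>,
}

/// Outcome of submitting a new certificate.
enum Registration {
    /// Counter not higher than the current one: the transaction is dropped.
    Dropped,
    /// Accepted: the key change is scheduled for a future epoch (babe) or
    /// takes effect with less delay (grandpa).
    Scheduled(SessionCertificate),
}

/// Apply the counter rule: only a strictly higher counter replaces the
/// currently registered certificate; equal counters are treated as stale.
fn register(current: &SessionCertificate, incoming: SessionCertificate) -> Registration {
    if incoming.counter <= current.counter {
        Registration::Dropped
    } else {
        Registration::Scheduled(incoming)
    }
}
```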

I want this workflow for validator operators: Joe, our validator operator, wishes to migrate validator hardware. First, Joe brings up his new validator, which generates a new session keypair. Second, Joe updates his session certificate by signing the new session public key with his controller key and posting the change-session-certificate transaction. Once this transaction finalizes, it establishes the exact epoch and slot at which Joe's old session key becomes invalid and his new session key becomes valid, for babe and grandpa respectively. Third, Joe's old validator harmlessly stops signing once it discovers it does not control any valid session key. Fourth, Joe's new validator begins running grandpa and then babe with its new session key.

In this workflow, no two machines ever know the same session private key, which makes equivocation impossible and even permits restricting those keys to SGX enclaves. We thus avoid all the needless slashing Cosmos has done so far. ;)

) -> Option<Box<dyn offchain::OffchainKey>> {
let authorities = self.client
.runtime_api()
.grandpa_authorities(at)
Contributor

This won't give us the active authority set, because GRANDPA authority set changes are only enacted on finality. What we need to do is figure out the highest finalized ancestor of `at` (or `at` itself) and query the runtime state at that block.
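
A sketch of the intended lookup, assuming a hypothetical `ChainInfo` trait rather than the real client API, and assuming `at` sits on a chain that contains the finalized head:

```rust
/// Hypothetical view of the chain; the real Substrate client exposes this
/// information through different interfaces.
trait ChainInfo<Hash> {
    /// Hash of the latest finalized block.
    fn finalized_hash(&self) -> Hash;
    /// True if `ancestor` is an ancestor of (or equal to) `block`.
    fn is_ancestor(&self, ancestor: &Hash, block: &Hash) -> bool;
}

/// Pick the block whose runtime state we should query for the GRANDPA
/// authority set: `at` itself if it is already finalized, otherwise the
/// finalized head (its highest finalized ancestor, assuming `at` descends
/// from the finalized chain).
fn authority_query_block<Hash: Clone, C: ChainInfo<Hash>>(client: &C, at: &Hash) -> Hash {
    let finalized = client.finalized_hash();
    if client.is_ancestor(at, &finalized) {
        // `at` is at or below the finalized head, so it is finalized itself.
        at.clone()
    } else {
        finalized
    }
}
```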

session_store.get_key(public).map(|key| (index, key))
})
.collect::<Vec<_>>();
let (index, key) = if keys.len() == 1 {
Contributor

What if there is more than one key? That should never happen, but we need to figure out what to do in that case.

Contributor Author

@rphmeier said that can be handled in a follow-up, maybe by someone more familiar with babe...

pub key: Option<String>,

/// Enable validator mode
/// Shortcut for `--aura` and `--grandpa-voter`.
Contributor

Should this switch on --babe instead?

Contributor Author

These flags should be moved to the node and not be in core.

This was referenced Jul 28, 2019
@dvc94ch dvc94ch closed this Aug 5, 2019
@rphmeier
Contributor

rphmeier commented Aug 6, 2019

@burdges

In any slot, there should only be one babe or grandpa key from which the network accepts signatures.

BABE slots can definitely be won by more than one key, just due to the probabilistic nature of the VRF. How likely this is depends on parameterization. I should have clarified:

"Do the vrf-proof on all keys in the current validator set that you control and if there's only one do we author on the slot with that key."

This is literally the same as what Gavin described above and is orthogonal to the migration scheme you mentioned.
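
As a sketch of that clarified rule, with `wins_slot` standing in for the real VRF evaluation against the epoch randomness and threshold:

```rust
/// Illustrative stand-in for a session keypair held in the local keystore.
struct SessionKey;

/// Hypothetical VRF check: does this key's VRF output for `slot` fall under
/// the winning threshold? The real BABE code evaluates the VRF over the
/// epoch randomness and slot number and compares it against a threshold.
fn wins_slot(_key: &SessionKey, _slot: u64) -> bool {
    unimplemented!()
}

/// Evaluate the VRF with every key we control from the current validator
/// set, and author only if exactly one of them wins the slot.
fn claim_slot(our_keys: &[SessionKey], slot: u64) -> Option<&SessionKey> {
    let mut winners = our_keys.iter().filter(|key| wins_slot(key, slot));
    match (winners.next(), winners.next()) {
        (Some(only), None) => Some(only),
        _ => None,
    }
}
```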

@gnunicorn gnunicorn modified the milestones: 2.1, Polkadot Mar 4, 2020