Implement session key handover #3034
Conversation
This suffers from the problem I mentioned earlier, which is that users are never going to set their key exactly at the end of a session; it will always be a bit before or after. If all nodes set their new keys a bit before the end of the session, no nodes will author blocks, and the new session will never start until someone intervenes manually. So we probably need to author on both keys for some time.
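For illustration only, a minimal sketch of the "author on both keys for some time" idea; the `authoring_keys` helper and the plain `Key` type are hypothetical and not part of this PR:

```rust
// Placeholder types; not the real Substrate API.
type Key = [u8; 32];

/// During the handover window we keep authoring with either the outgoing or
/// the incoming session key, so block production never stalls around the
/// session boundary.
fn authoring_keys(old_key: Option<Key>, new_key: Option<Key>, in_handover_window: bool) -> Vec<Key> {
    match (old_key, new_key, in_handover_window) {
        // Inside the window: accept both keys.
        (Some(old), Some(new), true) => vec![old, new],
        // Outside the window: only the currently active key counts.
        (_, Some(new), false) => vec![new],
        (Some(old), None, _) => vec![old],
        (None, Some(new), _) => vec![new],
        (None, None, _) => vec![],
    }
}
```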
So the way this should work:
This should be trivial for Babe and Aura. It might require a little help from @rphmeier for Grandpa.
Can a validator be assigned more than one slot? In the session module, only one key can be active per validator. I'm just taking the first slot that we have a key for.
@dvc94ch yep, a validator can have more than one slot. What you should rather do is take the first key that owns the slot, if any. So do the VRF proof on all keys in the current validator set and, if there's only one, we author on the slot with that key.
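To make the intended logic concrete, here is a hedged sketch with placeholder types (not the real BABE API): evaluate the VRF with every locally held key in the current validator set and author only when exactly one of them wins the slot.

```rust
// Hedged sketch, not the real BABE implementation: `AuthorityId`, `Keystore`,
// `has_key` and `vrf_wins_slot` are placeholders used only to illustrate the
// selection logic described above.
type AuthorityId = [u8; 32];

struct Keystore;

impl Keystore {
    /// Do we hold the private key for this authority?
    fn has_key(&self, _id: &AuthorityId) -> bool {
        false
    }
}

/// Placeholder for the VRF evaluation against the slot threshold.
fn vrf_wins_slot(_slot: u64, _id: &AuthorityId, _keys: &Keystore) -> bool {
    false
}

/// Evaluate the VRF with every key we control in the current validator set,
/// and author only when exactly one of them wins the slot.
fn claim_slot(slot: u64, authorities: &[AuthorityId], keys: &Keystore) -> Option<(usize, AuthorityId)> {
    let winners: Vec<(usize, AuthorityId)> = authorities
        .iter()
        .enumerate()
        .filter(|(_, id)| keys.has_key(id))
        .filter(|(_, id)| vrf_wins_slot(slot, id, keys))
        .map(|(index, id)| (index, *id))
        .collect();

    match winners.as_slice() {
        // Exactly one of our keys won: author with it.
        [(index, id)] => Some((*index, *id)),
        // None of our keys won, or more than one did (handled in a follow-up).
        _ => None,
    }
}
```

The branch where more than one key wins is exactly the open question discussed further down in the review.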
rphmeier left a comment
Should take a different approach in finality-grandpa
Yeah, the main goal was to get it to build. I'm studying the Aura paper you sent me a link to and expect to get through the BABE and GRANDPA papers by tomorrow.
@dvc94ch OK, great. In the meantime I've checked the logic in Aura and BABE and it seems right.
So I updated babe and grandpa according to your comments. I think babe is correct now, and grandpa selects a key before creating an environment. I'm not sure if the process to select the key is correct. It takes the first key that is in the
rphmeier left a comment
Some more grumbles & Qs
mxinden left a comment
Two questions:
Since the AuthorityKeyProvider requires the Aura API and the Grandpa API, it has to be implemented in the node. Otherwise we would introduce dependencies on aura/grandpa in the service.
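As a rough illustration of that separation (the trait shape below is invented, not the actual AuthorityKeyProvider from this PR), the node can implement a single provider over both consensus engines while core only ever sees the trait:

```rust
// Illustration only: the trait shape is made up. The point is that the node
// crate already depends on both the Aura and the GRANDPA runtime APIs, so it
// can implement one key provider over both, keeping core/service free of
// consensus-specific dependencies.
pub trait KeyProvider {
    /// Return the raw session key for the given consensus engine, if we hold one.
    fn key_for(&self, engine: &str) -> Option<Vec<u8>>;
}

pub struct NodeKeyProvider {
    aura_key: Option<Vec<u8>>,
    grandpa_key: Option<Vec<u8>>,
}

impl KeyProvider for NodeKeyProvider {
    fn key_for(&self, engine: &str) -> Option<Vec<u8>> {
        match engine {
            "aura" => self.aura_key.clone(),
            "grandpa" => self.grandpa_key.clone(),
            _ => None,
        }
    }
}
```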
#3150 merged.
}

impl<Block, Client> GrandpaKeyProvider<Block, Client>
where
Move `where` to the end of the line above.
And the same for further instances of this style.
use super::KeyProviderId;

/// Aura key provider.
pub const AURA: KeyProviderId = 10;
Er... no.
core shouldn't be hard-coding consensus-algorithm-related stuff in here.
Actually this sounds confusing and/or harmful. In any slot, there should only be one BABE or GRANDPA key from which the network accepts signatures. I think Gav's description #3034 (comment) sounds correct, with the caveat that only one current BABE or GRANDPA key is valid. At least BABE needs the transaction that registers session key changes to be finalized in an epoch before the epoch switch.

We should likely include a counter in the session key, which I missed before. If I register a session certificate with a lower counter value than the one currently set, the transaction is dropped. If I register a session certificate with a higher counter value, the transaction sets the session certificate for the appropriate future slot and epoch, with BABE requiring some waiting time, and GRANDPA maybe not so much. We're happy if registering a new session certificate makes block production with BABE impossible for a couple of epochs, in which case only one current session key exists.

I want this workflow for validator operators: Joe, our validator operator, wishes to migrate validator hardware.

1. Joe brings up his new validator, which generates a new session keypair.
2. Joe updates his session certificate by signing the new session public key with his controller key and posting the change-session-certificate transaction. Once finalized, this transaction establishes the exact epoch and slot at which Joe's old session key becomes invalid and his new session key becomes valid, for BABE and GRANDPA respectively.
3. Joe's old validator stops signing harmlessly once it discovers it does not control any valid session key.
4. Joe's new validator begins running GRANDPA and then BABE with its new session key.

In this workflow, no two machines ever know the same session private key, which makes equivocation impossible and even permits restricting those keys to SGX enclaves. We thus avoid all the needless slashing done by Cosmos so far. ;)
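A rough sketch of the counter rule described above; every name here (`SessionCertificate`, `SessionModule`, `register_session_certificate`) is hypothetical and only illustrates the ordering check, not a real runtime module:

```rust
// Rough sketch of the counter rule: a certificate is accepted only if its
// counter is strictly higher than the one currently registered.
use std::collections::HashMap;

type AccountId = [u8; 32];

#[derive(Clone)]
struct SessionCertificate {
    counter: u64,
    session_key: [u8; 32],
}

#[derive(Default)]
struct SessionModule {
    /// Latest accepted certificate per controller account.
    certificates: HashMap<AccountId, SessionCertificate>,
}

impl SessionModule {
    /// Accept a new certificate only if its counter is strictly higher than
    /// the currently registered one; otherwise the transaction is dropped.
    fn register_session_certificate(
        &mut self,
        controller: AccountId,
        cert: SessionCertificate,
    ) -> Result<(), &'static str> {
        if let Some(current) = self.certificates.get(&controller) {
            if cert.counter <= current.counter {
                return Err("stale session certificate: counter did not increase");
            }
        }
        // The new key only becomes valid at a future epoch/slot boundary:
        // BABE needs the registration finalized before the epoch switch,
        // GRANDPA can likely switch with less waiting.
        self.certificates.insert(controller, cert);
        Ok(())
    }
}
```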
) -> Option<Box<dyn offchain::OffchainKey>> {
    let authorities = self.client
        .runtime_api()
        .grandpa_authorities(at)
This won't give us the active authority set, because GRANDPA authority set changes are only enacted on finality. What we need to do is figure out what the highest finalized ancestor of `at` is (or `at` itself), and query the runtime state at that block.
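A hedged sketch of that approach, using a made-up `ChainInfo` trait in place of the real client API:

```rust
// Walk from `at` towards genesis until we hit a block that is already
// finalized; the GRANDPA authority set is then queried at that block
// instead of at `at` directly. `ChainInfo` is a stand-in, not the real API.
trait ChainInfo {
    type Hash: Copy + Eq;
    fn is_finalized(&self, hash: Self::Hash) -> bool;
    fn parent(&self, hash: Self::Hash) -> Option<Self::Hash>;
}

/// Return `at` itself if it is finalized, otherwise its highest finalized ancestor.
fn best_finalized_ancestor<C: ChainInfo>(chain: &C, at: C::Hash) -> Option<C::Hash> {
    let mut current = at;
    loop {
        if chain.is_finalized(current) {
            return Some(current);
        }
        current = chain.parent(current)?;
    }
}
```

The call to `grandpa_authorities` would then be made at the block returned here rather than at `at`.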
    session_store.get_key(public).map(|key| (index, key))
})
.collect::<Vec<_>>();
let (index, key) = if keys.len() == 1 {
What if there is more than one key? That should never happen, but we need to figure out what to do in that case.
@rphmeier said that can be handled in a follow-up, maybe by someone more familiar with BABE...
pub key: Option<String>,

/// Enable validator mode
/// Shortcut for `--aura` and `--grandpa-voter`.
Should this switch on `--babe` instead?
These flags should be moved to the node and not be in core.
BABE slots can definitely be won by more than one key, just due to the probabilistic nature of the VRF. How likely this is depends on the parameterization. I should have clarified: "Do the VRF proof on all keys in the current validator set that you control and, if there's only one, we author on the slot with that key." This is literally the same as what Gavin described above and is orthogonal to the migration scheme you mentioned.
TODO:
cc @joepetrowski