Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@gavofyork
Copy link
Member

@gavofyork gavofyork commented Jul 26, 2019

This PR is turning into a major refactoring over the crypto and key management system, which seems both long overdue and somewhat inescapable.

I'll want to integrate the general changes in (but not merge) #3034 and possibly #3193 together with #3137.

The main issues that this PR aims to solve are:

  • Independent subsystems that happened to use the same crypto had to share the same session keys (and key-store division).
  • Assumption that the twin consensus systems of block production (Aura/Babe) and finality (Grandpa) each required keys (which dirtied a lot of Substrate, placing APIs like fn fg_authority_key() all the way through the codebase). Though they do now, they won't forever and it shouldn't be an assumption that we bake in to internal APIs.
  • Keys passed in through CLI are sticky; their existence enabled the consensus system workers and force them to use these keys alone. Rather whatever key in the key store that happens to be an authority should be used at any given time.

TODO:

  • Refactor ImOnline:

    • Should get its own key and not piggy-back on other session keys.
  • Dynamic key selection:

    • Enable Babe and Grandpa to be able to determine current authority key (if there is one).
    • Dynamically select the correct authoring key to use in each subsystem.
    • Use --validator to enable active validation process; remove any validator key-management stuff from the CLI (--key). (A second --worker arg can enable off-chain workers.)
    • Remove all static-key selection code (--key all the way down to key-choice).
    • Remove authority_key/fg_authority_key APIs completely.
    • Make --password work (should accept a manually-entered password and use it for everything in the keystore).
    • Introduce RPCs for session rotation, using code from Implement session key handover #3034 if it makes sense. There should be a single RPC author_rotateKeys which creates a new set of session keys, saves them to the key-store and returns their public keys, so that a rotation-transaction can be constructed. There should be no pollution of the APIs with session keys: the rotation API should be wholly independent of each of the individual keys. This may mean new APIs in the OpaqueKeys trait.
  • Refactor Client:

    • Create a basic Client trait that provides all the basic APIs and has Block as an associated item.

I would also like to solve the fact that service construction is highly monolithic and polluted by consensus-module-specific code. Really it should just be a case of specifying the block production and finality systems and letting it do the rest. Code for set-up, take-down, RPC and key-store integration should be provided through traits that the consensus modules (Babe, Aura, Grandpa and eventually Rho and PoW) implement within their module's scope. There shouldn't be any consensus-module-specific logic dirtying up service.rs. That might be a different PR, though.

  • Refactor service:
    • Move out any consensus-module-specific code, and hardwired assumptions that they each have "authority keys", from service.rs into the respective consensus modules. Provide a single API endpoint (i.e. a type that implements some ConsensusSystem trait and have that as the only resource that service needs know of.

This also adds:

  • Add Call type to extensible transactions.
  • Add swap to allow two storage items in the same map to be swapped.

@gavofyork gavofyork requested a review from kianenigma July 26, 2019 02:31
@gavofyork gavofyork added the A0-please_review Pull request needs code review. label Jul 26, 2019
@kianenigma
Copy link
Contributor

This could potentially deprecate the info. Both the inner and outer Calls implement GetDispatchInfo which can be used to get the same struct info being passed down from executive to signed extension.

Just needs some test-fixing as well, with which I could help if needed.

@gavofyork
Copy link
Member Author

yeah that'd be good. can also merge Resource and BlockExhausted

@gavofyork gavofyork added A2-insubstantial Pull request requires no code review (e.g., a sub-repository hash update). A3-in_progress Pull request is in progress. No review needed at this stage. and removed A0-please_review Pull request needs code review. A2-insubstantial Pull request requires no code review (e.g., a sub-repository hash update). labels Jul 27, 2019
@gavofyork gavofyork changed the title Add Call type to extensible transactions. Refactor key management Jul 28, 2019
@kianenigma
Copy link
Contributor

Signed-Extension-related tests should work fine now; I have the logic needed to throw out info from extension API (as mentioned, totally redundant with the new call parameter) packed in one commit but not sure if you want me to add it into this PR or not (given the big changes planned)?

#[derive(Clone, Copy, PartialEq, Eq, Encode, Decode)]
#[cfg_attr(feature = "std", derive(Debug))]
#[repr(u32)]
pub enum Kind {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason to avoid using an enum is that we will never be able to exhaustively define all crypto types. This change makes it impossible to introduce custom crypto without requiring an upstream pull request

Copy link
Member Author

@gavofyork gavofyork Jul 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it would be otherwise trivial to add new kinds of crypto. Realistically, that crypto would need extern and/or off-chain crypto support which would necessitate alterations to core. There could be a crypto Kind of User, which would basically mean "core doesn't know anything about it". In any case, this would not be needed any time soon so can go in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should core need to know anything about any of the kinds of crypto?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly for offchain worker stuff. Hopefully we can remove this.

@rphmeier
Copy link
Contributor

rphmeier commented Jul 28, 2019

It would also be helpful to ensure that the model does not make this less possible: #3063 (comment) . Ideally the slashing subsystems don't have to make any assumption about what kind of crypto is used for transactions, and we should make sure that this PR doesn't introduce anything in that direction.

One reason that I've wanted KeyTypeId to exist is that it allows the runtime to make pretty much all crypto-based decisions. If we have a general keystore keyed by KeyTypeId, the runtime can tell us what kind of crypto to use and we will simply search our keystore for it. This enables Substrate to become extensible with user-crypto, as long as the user code adds support for registering a new kind of key.

@gavofyork
Copy link
Member Author

gavofyork commented Jul 29, 2019

@kianenigma If it can be applied here, then I don't see why not.

@rphmeier The store shouldn't be searched for keys of a certain type of crypto, it should be keys from a certain subsystem. If two subsystems happen to use the same underlying crypto, then it should not entail them sharing the same keys. This appears to be a recurring point and it was a Big Problem with the previous design.

Kind/CryptoType is only used by core to identify keys so that type-based crypto operations work in the offline-worker API. As I point out in the code, by using KeyTypeId as an application identifier, we can fully support user key extensibility without the problem of having to trespass on other subsystems' divisions of the key store.

@gavofyork gavofyork added this to the 2.0-k milestone Jul 29, 2019
/// Key type for controlling an account in a Substrate runtime, built-in.
pub const ACCOUNT: KeyTypeId = *b"acco";
/// Key type for Aura module, built-in.
pub const AURA: KeyTypeId = *b"acco";
Copy link
Contributor

@dvc94ch dvc94ch Jul 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, should be *b"aura"

@kianenigma kianenigma mentioned this pull request Jul 29, 2019
@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 29, 2019

So does the keystore have to persist keys to disk or not? I was under the impression that that was not the case, since the only keys in the keystore are session keys which should only be stored in memory according to a discussion on riot.

@bkchr
Copy link
Member

bkchr commented Jul 29, 2019

So does the keystore have to persist keys to disk or not? I was under the impression that that was not the case, since the only keys in the keystore are session keys which should only be stored in memory according to a discussion on riot.

When you generate new session keys and only know the public key, what will you do when your node crashes? Let's say the node crashes and you already send the transaction to change the session key. You could not participate with this session key, as you have lost it. Sounds like we should persist them?

@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 29, 2019

The previous thinking was that the client generates/persists the session keys and configures them via rpc.

@bkchr
Copy link
Member

bkchr commented Jul 29, 2019

Introduce RPCs for session rotation, using code from #3034 if it makes sense. There should be a single RPC author_rotateKeys which creates a new set of session keys, saves them to the key-store and returns their public keys, so that a rotation-transaction can be constructed. There should be no pollution of the APIs with session keys: the rotation API should be wholly independent of each of the individual keys. This may mean new APIs in the OpaqueKeys trait.

Seems not to be wanted anymore.

@gavofyork
Copy link
Member Author

Seems not to be wanted anymore.

Oh? Why?

@bkchr
Copy link
Member

bkchr commented Jul 29, 2019

Maybe I misunderstood @dvc94ch, but didn't he talked about sending the private key via RPC? Or is client the substrate-client? If yes, why should substrate-client persist the keys not using keystore?

As I understood your comment, you call author_rotateKeys and it generates new session key pairs. These pairs are persisted using the keystore and as answer of the RPC you get the public keys of the generated keys. So, the private will never leave the node.

@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 29, 2019

I read the PR text, I was just wondering why the change. I believe @joepetrowski and @gavofyork used to be of the opinion that offloading key management to the client is a good idea for security reasons, although I'm not really sure why that was.

@bkchr
Copy link
Member

bkchr commented Jul 29, 2019

I still don't know what you referring to when you talk about "client"^^

@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 29, 2019

Wallet if you prefer...

@tomaka
Copy link
Contributor

tomaka commented Jul 29, 2019

I would also like to solve the fact that service construction is highly monolithic and polluted by consensus-module-specific code. Really it should just be a case of specifying the block production and finality systems and letting it do the rest. Code for set-up, take-down, RPC and key-store integration should be provided through traits that the consensus modules (Babe, Aura, Grandpa and eventually Rho and PoW) implement within their module's scope. There shouldn't be any consensus-module-specific logic dirtying up service.rs. That might be a different PR, though.

I'm slowly working on this, but considering the amount of changes I'm waiting for all PRs that touch the service to land before actually doing that.

@gavofyork
Copy link
Member Author

the idea is that both models are supported (user pushing private keys via RPC to the node and having the user pull public keys from node over RPC).

gavofyork and others added 3 commits July 30, 2019 00:53
This will eventually allow for dynamic determination of authority
keys and avoid having to set them directly on CLI.
Experiment with modular consensus API.
+ HeaderBackend<<Self::Client as BlockOf>::Type>;

/// Initialize the consensus service.
fn initialize(client: &'a Self::Client, keystore: Arc<Store>) -> Self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a good reason why we should have service objects at all.

The design of the pluggable consensus system was intended to avoid over-assumption about what resources a consensus system might need access to, and I'm afraid that's exactly what this trait function is -- it's omitting at a bare minimum network and event-loop access.

The primitive of "some thing that can be driven to completion" (i.e. a Future) is much less fragile.
I understand the motivation of wanting to avoid users having to write any consensus-specific initialization in their service files, but it should be 1 line per consensus engine at most, if we instead go the route of consensus crates exposing an initialize function (without fixed type signature!).

That is much more anti-fragile compared to a bunch of trait machinery/macro magic trying to create an idealized declarative system that will almost certainly have to be ripped up later.

Copy link
Contributor

@tomaka tomaka Jul 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, considering that this PR is pressing, let's please not introduce API changes that aren't strictly necessary, because discussions around such API changes shouldn't get rushed.

Copy link
Member Author

@gavofyork gavofyork Aug 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was really just a draft design to get people thinking, I'm not familiar enough with the details of the overall consensus framework to do much more. That said...

My concern is that it actually is "1 line per consensus engine at most".

Right now it's a load of incoherent boilerplate babble that chain creators can neither write nor understand. Enforcing a coherent API that clearly provides the functions and hooks that a consensus service may use in a single place would dramatically improve devex.

Right now as a developer looking to create a new consensus system in Substrate, I'm faced with an incoherent mashup of randomly assorted crates that I'm expected to magically understand which I should be using where, how they get initialised, referenced and driven.

Over-generalising might technically enable everything, but unless you take care to provide an appropriate environment and API rails, you'll end up with nothing.

@gavofyork
Copy link
Member Author

Closed in favour of #3296

@gavofyork gavofyork closed this Aug 3, 2019
@Demi-Marie Demi-Marie deleted the gav-ext-tx-call branch August 19, 2019 17:36
@gnunicorn gnunicorn modified the milestones: 2.1, Polkadot Mar 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A3-in_progress Pull request is in progress. No review needed at this stage.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants