Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

RPC metadata with types #11181

Closed
ascjones opened this issue Apr 7, 2022 · 22 comments
Closed

RPC metadata with types #11181

ascjones opened this issue Apr 7, 2022 · 22 comments
Labels
Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase.

Comments

@ascjones
Copy link
Contributor

ascjones commented Apr 7, 2022

Currently all client implementations must maintain manual mappings for available RPC methods.

We should provide metadata for the available RPC calls, together with scale-info type information for the parameters and return types. The metadata for an RPC call could take the following form:

struct RpcMetadata<T: scale_info::form::Form> {
    name: T::String,
    args: Vec<(T::String, T::Type)>,
    return_type: T::Type,
}

So this metadata would need to be generated at the site of the RPC definition, and then collected together with other RPC definitions and made available via an RPC call itself.

Raised in this SE question: https://substrate.stackexchange.com/questions/1631/rpc-full-type-specification-from-metadata. Related: paritytech/subxt#416

@ascjones ascjones added the Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. label Apr 7, 2022
@bkchr
Copy link
Member

bkchr commented Apr 7, 2022

I'm still not convinced that we need this. We already had this discussion once here: #10065

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

At least a full definition of the input/output types are very useful. The issue is that RPCs are the one area where people still need to define their own types and inject it into SDK's - so if you change Header (there are a couple of teams), on the runtime layer you are all ok (assuming you have something that exposes that), however on the RPC layer you still need to go and inject the specific types.

It applies to each RPC, for instance the latest addition, dev_getBlockStats also needed new types to be added to stuff like the JS API

This is the one area where, as soon as you customize types or adding new endpoints, it requires code updates somewhere. (Your own chain, it gets added to your published JS type bundle, on the stock-standard Substrate/Polkadot/Kusama chains, this means you need to update your JS API version.

Where I do agree is that it is not as disruptive as runtime types and changes (I actually remember the days before we had metadata and it was absolute chaos), however it doesn't mean that it would not remove some real painpoints out there, not just for me, but actual chain devs as well.

@bkchr
Copy link
Member

bkchr commented Apr 7, 2022

I mean I'm not against having some proper documentation, this is for sure required. @tomaka's approach here looks really good.

This is the one area where, as soon as you customize types or adding new endpoints, it requires code updates somewhere. (Your own chain, it gets added to your published JS type bundle, on the stock-standard Substrate/Polkadot/Kusama chains, this means you need to update your JS API version.

While I get this, it would still need the "end user dev" to change its code to use the new RPC method. We could cut out the middle man, aka you, to add these new rpc methods. However, I really hope that with the new RPC interface description the core RPCs don't get changed that often. If someone defines their own RPC for their chain, adding support for it in their own libs is probably not that hard?

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

This feels very strongly like solving an XY problem.

Adding metadata to the RPC calls that can be automatically parsed is cool if your objective is to just query data through an RPC call and display the response to the user.

If you want to actually programmatically interpret the response (which is arguably what the JSON-RPC API is about), then this metadata is useless. You can't for example write a program that automatically understands what the fields in a header are. You need to assume for example that this header contains a block number and a parent hash. In other words, generating metadata for the RPC calls would only duplicate the information that you need to put in the program that calls the functions.

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

The argument doesn't quite hold - it is like saying that exposing storage types via metadata is also useless, i.e. you still need to know the fields to interpret it so you may as well not define the types. And yes, what is very true is that when there are changes, you need updates to the code using it. This is where this metadata actually goes a massive way to tell developer what changes there are. Real-world example:

A good example is the runtime metadata - when an upgrade happens, you can point your type generator to the metadata and it will (a) pickup the changes compared to last & (b) generate interfaces for you. (In my specific sphere, these are TypeScript definitions that are generated). This basically allows you to already see statically where you need to make changes since invalid access to anything changed gets complained about up front.

I certainly do know really well that the changes on this layer (additions to RPCs, not actual type changes) are quite easy to detect and adjust to - there are warnings for instance on my side of the world that will complain when it finds something unknown. And in this case the PRs are always simple, so it doesn't add massive overhead - it normally gets juggled between coffee and getting a kid ready for school, so it doesn't need 200% superhuman focus. It certainly is also "relatively stable" in terms of types, but because we manually force it to be that way, aka don't touch the RPC external interfaces.

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

What, for example, if the RpcMetadata struct that Andrew suggest above is modified?
You could add some meta-metadata that contains metadata about RpcMetadata, but then this meta-metadata could be modified as well. If you add a meta-meta-metadata, then it can be modified as well.
At some point you need a stable foundation that is standard and is the same everywhere and whose definition is duplicated everywhere, and I think we all agree that if we add this RpcMetadata then it should never have any breaking change.

My point, however, is that this foundation shouldn't be RpcMetadata, as proposed here, but instead the API of the JSON-RPC functions themselves.

The API that I've put up at https://paritytech.github.io/json-rpc-interface-spec/ is designed to be the same for every Substrate-based chain, and doesn't requiring interpreting any SCALE-encoded piece of data in order to be used, thus rendering RpcMetadata useless.

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

The modification of the RpcMetadata struct, like with Metadata is alongside a version bump. The rpc_methods already provides it, but in this case I would suggest an endpoint in the same fashion as getMetadata(), i.e. no re-use although the version can be changed, since you need to convey information.

  • in the runtime world, when the description structure changes, the version is bumped
  • in the RPC world, when the description structure changes, the version is bumped

As soon as you get SCALE data returned, you still need to know what type it is to decode it. What the rpc_metadata does is supply those types. If the interface does change along the way to "other methods" and they are on feature parity, then sure, but for instance in the dev_getBlockStats example would still like to know how to decode what is returned. (Storage is very different, since the types are available)

In your example, for instance I see functions that return of both Header and Block. For these the structure is important. As is types in those structures. We say for instance "String containing an opaque value representing the operation" so need to know how to do the opaque.

Is BlockNumber a u32 or a u64? Are there extra fields that are not in the Substrate Header? So the types remain. This is what the RPC metadata is supposed to address - the type definitions. It is no less important than on the runtime knowing these, although it is certainly less likely to change.

@dvdplm
Copy link
Contributor

dvdplm commented Apr 7, 2022

The API is the stable foundation, I don't think anyone here is arguing otherwise. Adding a machine readable and versioned description of that API's arguments and return types seems like a useful addition to me and I don't see a contradiction between the two.
I'm more sensible to @bkchr's argument of "do we really need this?"

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

There's no contradiction between the two, I'm also defending the point of view that this is a useless addition that adds more complexity again and again.

To me this proposal is a XY problem. The X problem is "the JSON-RPC API has flaws in its design", and this is properly solved by having a better JSON-RPC API, not by adding RPC metadata.

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

In your example, for instance I see functions that return of both Header and Block. For these the structure is important. As is types in those structures. We say for instance "String containing an opaque value representing the operation" so need to know how to do the opaque.

This is a string representing the operation, similar to a subscriptionId. It must not be interpreted by the client in any way.

Is BlockNumber a u32 or a u64? Are there extra fields that are not in the Substrate Header? So the types remain. This is what the RPC metadata is supposed to address - the type definitions. It is no less important than on the runtime knowing these, although it is certainly less likely to change.

In my new API you never need to know the block number in order to use the API. For example you can follow storage items without any problem without having to decode a header at any point.

If you need to know the block number, the BlockNumber type and the format of Header are extractable (or if they're not, they should be) from the on-chain metadata. And you don't need to know anything about the chain in order to obtain the on-chain metadata.

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

As a more meta remark, I think that proposals need to be better argued, presenting some context and motivation, rather than saying "hey I want to do X, what do you think?". One should explain why a change is desired, what its drawbacks are, and what the possible alternatives are and why they're not better than the proposed change.

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

Ok. So I still want the header - I need to access it to get the parent hash and logs as an example. So I cannot get away from decoding it. Any API that doesn’t give me headers or blocks doesn’t quite expose the chain :)

You are only thinking of storage. It is a solved problem since we have the types in metadata. I still want to follow blocks and headers and get other responses and know how to decide them.

Headers and not available in the metadata in the runtime, neither is the block structure. As you rightly said it should be in the metadata - and that is exactly what we are solving here in this issue.

We are exposing all types that are not in the metadata and is used in RPCs in metadata. That is the intent of this issue.

@ascjones
Copy link
Contributor Author

ascjones commented Apr 7, 2022

As a more meta remark, I think that proposals need to be better argued, presenting some context and motivation

FWIW I am agnostic on this feature, I didn't spend a lot of time thinking about it (hence the underspecified issue). I have received multiple requests for this, so worth putting an issue to point people at if nothing else, and to spark the debate.

I think this might be useful, but the question is would it be useful enough to be worth the costs (implementation, complexity etc.).

One pro would be to avoid having to copy rpc request/response types e.g. https://github.com/paritytech/cargo-contract/blob/3db4734c1a056f8834f0b0e35a77a4ef334fc79f/src/cmd/extrinsics/call.rs#L164. OTOH they may not change that often so 🤷

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

Ok. So I still want the header - I need to access it to get the parent hash and logs as an example. So I cannot get away from decoding it. Any API that doesn’t give me headers or blocks doesn’t quite expose the chain :)

The new API gives you the parent hash of each block when it notifies it: https://paritytech.github.io/json-rpc-interface-spec/api/chainHead_unstable_follow.html#newblock

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

I'm not fighting your specification, I'm saying I still need to decode the encoded values inside it and have a specification for it.

If I request and receive a Header, I need to know what is in it and how to decode it - these are not fixed, chain add fields to their own specification. When I get a block body, the same thing. If I request data from a contract RPC, I need to know what is inside it and how to decode it. Unless I want to still continue answering the same question because a structure changed on an RPC call.... incidentally the same link Andrew pointed to above.

The RPCs have an insane amount of types that go in and out, not just storage, blocks or headers. A sample - ExtrinsicStatus (it is an enum, it is not fixed), EpochAuthorship, BeefySignedCommitment, Header, SignedBlock, ContractCallRequest, ContractExecResult, ContractInstantiateResult, CodeUploadRequest, CodeUploadResult, BlockStats, CreatedBlock, Justification, EncodedFinalityProofs, ReportedRoundStates, JustificationNotification, MmrLeafProof, StorageKind, RuntimeDispatchInfo, FeeDetails, ... there are more, just grabbed the first one that fit on a screen.

I certainly welcome a well-specified RPC interface, and quite happy to hardcode those if it returns data, it works. The only comment you will have from me on any replacements RPCs is that they return the same goodies, if they do and it works better, great.

But types returned and that needs to go in... different story altogether. It needs to have a definition.

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

If you take for example all the contracts-related JSON-RPC calls, they do nothing but call in the runtime. In other words, they're literally nothing more but shortcuts for the state_call function. And the list of runtime calls and the format of their parameter and return value, including the ones related to contracts, is already in the on-chain metadata.

Don't you find this design completely insane? We're adding RPC shortcuts that add no value, and then we realize that the shortcuts are unusable, so we add even more features to make the shortcuts usable? Can we not instead get rid of 2/3rds of the JSON-RPC functions and have a coherent well-defined API that doesn't require even more metadata?

Right now if the runtime upgrades and modifies the contract-related type definitions, the on-chain metadata will be updated, but the JSON-RPC function APIs will not and will start returning errors. Isn't that an obvious sign of a critical flaw in the design?

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

I love a coherent API - where I can map the calls to be encoded and the responses to be decoded.

RpcCallRequest does not appear in the metadata (you can check here for all types extracted on Substrate master) since -

(a) it is not used in storage
(b) it is not used in an event
(c) it is not used in an extrinsic
(d) it is not used in an error
(e) ... it is only used on RPC, these are not in metadata when only a-d is specified

Point (e) is what I would like fixed and what this issue is all about :) This applies to any RPC interface there, new or old (probably easier in old since it is sprawling, but quicker in new when kept in mind from the start).

@tomaka
Copy link
Contributor

tomaka commented Apr 7, 2022

RpcCallRequest is just the list of parameters of ContractsApi_call, right?
If so, then they are in the metadata that you've linked.

Again, RpcCallRequest is just part of a shortcut for state_call. You could simply directly call state_call with the list of parameters.

@jacogr
Copy link
Contributor

jacogr commented Apr 7, 2022

You are linking to a call which takes discrete params, the RPC interface takes a struct. So basically in this case the SDK developers should to map the runtime call and create and own struct out of it to pass to the defined RPC? This means I need to have an in-depth understanding of the RPC myself to re-implement it.

I'm not suggesting the RPC should be there - I'm suggesting that I cannot and really should not make my own stuff up here. If that is the exposed interface, all good that is the way it should be called as documented.

TL;DR 100% in agreement here that if there is a state call available since it mimics and that call is metadata specified, it doesn't really make sense to duplicate it again with a specific RPC. If there is an RPC however, it needs to be specified.

PS: The sp_api::decl_runtime_apis! is not exposed in the runtime metadata, only extrinsics are call-like (these obviously don't have a return type). You can check searching for instance for get_storage. The call & instantiate have equivalent extrinsics.

@seunlanlege
Copy link
Contributor

Having rpc metadata greatly simplifies the lives of people who directly use substrate.

Here i am writing a relayer and having to have a tab open of my custom rpc methods, when intellisense + subxt generated types would've made this a breeze.

the old jsonrpc actually supported this by generating a client interface, i'm not sure if the new one does.

@xlc
Copy link
Contributor

xlc commented Aug 12, 2022

I think this one could be closed in favor of #11648
Implementation-wise it is more easier to annotate runtime APIs compare to custom RPCs.

@ascjones
Copy link
Contributor Author

Agreed

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase.
Projects
None yet
Development

No branches or pull requests

7 participants