[Task]: endpoint for account state retrieval (FPI) #485
Comments
We don't need to do this for all public accounts - only for ones we are interested in involving in our transactions. But we do have to stay in sync with them. Overall, I think the endpoint could look something like this:
message GetAccountStateRequest {
// ID of the account for which we'd like to retrieve the state.
AccountId account_id = 1;
// Keys of storage maps for which we'd like to get values.
repeated StorageMapKey storage_map_keys = 2;
}
message StorageMapKey {
// Index of the storage slot containing the storage map.
uint32 slot_index = 1;
// Key for which we want to request the value.
Digest key = 2;
}
message GetAccountStateResponse {
// Block number at which the state of the account was returned.
fixed32 block_num = 1;
// Account header consisting of account_id, vault_root, storage_root, code_root, and nonce.
AccountHeader header = 2;
// Authentication path from the account_root for the block header to the account.
MerklePath account_proof = 3;
/// Values of all account storage slots (max 256).
repeated Digest storage_slots = 4;
// A list of key-value pairs (and their corresponding proofs) for the requested keys.
repeated StorageMapItem map_items = 5;
}
message StorageMapItem {
// Index of the storage slot containing the storage map.
uint32 slot_index = 1;
// Opening containing key, value, and a proof attesting that the key opens to the value.
SmtOpening opening = 2;
}
Some open questions:
|
I propose the following interface:
import "smt.proto";
import "merkle.proto";
message GetAccountStateRequest {
// List of account state requests.
repeated AccountStateRequest account_state_requests = 1;
}
message AccountStateRequest {
// ID of the account for which we'd like to retrieve the state.
account.AccountId account_id = 1;
// Keys of storage maps for which we'd like to get values.
repeated StorageMapKey storage_map_keys = 2;
// List of asset IDs (asset keys in vault SMT) for which we'd like to get values.
// If empty, all assets will be returned.
repeated digest.Digest asset_ids = 3;
}
message StorageMapKey {
// Index of the storage slot containing the storage map.
uint32 slot_index = 1;
// Key for which we want to request the value.
digest.Digest key = 2;
}
message GetAccountStateResponse {
// Block number at which the state of the account was returned.
fixed32 block_num = 1;
// List of account state infos for the requested account keys.
repeated AccountStateResponse account_state_infos = 2;
}
message AccountStateResponse {
// Account header consisting of account_id, vault_root, storage_root, code_root, and nonce.
/*account.*/AccountHeader header = 1;
// Authentication path from the account_root for the block header to the account.
merkle.MerklePath account_proof = 2;
/// Values of all account storage slots (max 255).
repeated digest.Digest storage_slots = 3;
// A list of key-value pairs (and their corresponding proofs) for the requested keys.
repeated StorageMapItem map_items = 4;
// A list of assets (and their corresponding proofs) for the requested asset IDs.
repeated smt.SmtOpening assets = 5;
}
message AccountHeader {
// Account ID.
account.AccountId account_id = 1;
// Vault root hash.
digest.Digest vault_root = 2;
// Storage root hash.
digest.Digest storage_root = 3;
// Code root hash.
digest.Digest code_root = 4;
// Account nonce.
uint32 nonce = 5;
}
message StorageMapItem {
// Index of the storage slot containing the storage map.
uint32 slot_index = 1;
// Opening containing key, value, and a proof attesting that the key opens to the value.
smt.SmtOpening opening = 2;
}
I took into account all the questions below:
For each requested account, I added an optional list of asset IDs. The response contains the information necessary to recreate the vault SMT with only the requested assets.
In the proposed solution we are able to request multiple accounts in a single request.
I think it's enough to return just
Two things are unclear to me:
What do you think, @bobbinth, @Mirko-von-Leipzig? |
The interface looks sensible to me.
I think one of the discussed options was that we always return all scalar values by default, and that non-scalar data must be requested by the interface you've defined. Alternatively we need additional params; but I think just always sending all of it is fine. This really depends on the usage patterns.
I don't think we should default to returning all assets. This would end poorly for accounts with many assets (are these bounded in some way?). These sorts of endpoints very quickly become DoS vectors - even by well-intentioned callers. We will likely have to limit the query amounts at some stage in the future. So we should definitely not begin with Is it possible for a caller to not know about all assets/keys? i.e. is there a use case for "send me all of it" so that the user can learn about unknown data? |
This would be meaningful only for fungible assets - so, not sure we should do it this way. What I would probably do for now is add an optional flag - something like
As @Mirko-von-Leipzig mentioned, we should always return the full list of scalar values for all storage slots. In the worst case, this would be 8KB of data - but in most cases it will probably be less than a few hundred bytes. I've also realized that for storage slots, we also need to return slot types. So, the response would look something like this:
message AccountStateResponse {
// Account header consisting of account_id, vault_root, storage_root, code_root, and nonce.
account.AccountHeader header = 1;
// Authentication path from the account_root for the block header to the account.
merkle.MerklePath account_proof = 2;
/// Values of all account storage slots (max 255).
StorageHeader storage_header = 3;
// A list of key-value pairs (and their corresponding proofs) for the requested keys.
repeated StorageMapItem map_items = 4;
// An optional list of all assets in the account.
repeated bytes assets = 5;
}
In the above, storage slots are defined as a storage header - which would be similar to the StorageHeader struct we recently introduced in
Also, for assets I used just bytes, because describing them in protobuf format may be tricky: we have both fungible and non-fungible assets.
Totally agreed. We'll need to think of ways to handle accounts with lots of storage and many assets. The main issue is how to return all desired data to the client without locking the database for too long while keeping the data consistent. This is all due to the fact that we keep only the latest account states, and so, we can't return data "in chunks" against some old state. I don't have a great solution to this yet. |
If we store the account update diffs then we can do this by paginating. Some Ethereum clients do their initial state sync like this, and starknet has similar plans for p2p. Each page we return is a chunk of contiguous storage. Because it's a contiguous chunk, the proof of the entire chunk consists of a proof of the left-most and right-most element. The chunks can originate from different blocks. Once all chunks are received, we also need the account storage diff from the earliest to the latest block in the set. We apply these diffs and now we should be at the state of the latest block. If we don't want to store all the account diffs (I'm still not familiar with what we do/don't store), we could do it on an ad hoc basis. As in, when the node receives an account sync request it starts storing the updates to that account specifically, until the request chain completes or we determine lack of interest due to a timeout between requests. This is fairly complex to implement though. It also requires the receiver to understand how to build the merkle trees. Storage proofs are merkle trees, right? |
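The chunk-plus-diff reconstruction described above can be sketched roughly as follows. This is a toy model, not the node's implementation: storage is a plain dict of integers, each diff holds absolute new values (with `None` marking a removed key), proofs are left out entirely, and all names (`apply_diff`, `rebuild_latest`, `DIFFS`, `CHUNKS`) are hypothetical. Under the absolute-value assumption, replaying every diff after the earliest chunk's block is safe even for chunks read at a later block, because re-applying a diff that a chunk already reflects changes nothing.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[int, int]            # storage key -> value (toy integers)
Diff = Dict[int, Optional[int]]   # key -> new value; None means the key was removed


def apply_diff(state: State, diff: Diff) -> None:
    """Apply one block's diff to the state in place."""
    for key, value in diff.items():
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value


def rebuild_latest(chunks: List[Tuple[int, State]],
                   diffs: Dict[int, Diff],
                   latest_block: int) -> State:
    """Merge chunks read at different blocks, then replay diffs up to latest_block."""
    earliest = min(block for block, _ in chunks)
    merged: State = {}
    for _, chunk in chunks:
        merged.update(chunk)
    # Replay every diff after the earliest chunk's block, in block order.
    for block in range(earliest + 1, latest_block + 1):
        apply_diff(merged, diffs.get(block, {}))
    return merged


# Toy history (hypothetical data): block 0 holds {1: 10, 5: 50, 9: 90}.
DIFFS: Dict[int, Diff] = {1: {5: 55}, 2: {9: None, 7: 70}, 3: {1: 11}}
CHUNKS: List[Tuple[int, State]] = [
    (1, {1: 10}),          # keys in [0, 5) as of block 1
    (2, {5: 55, 7: 70}),   # keys in [5, 10) as of block 2
]
latest_state = rebuild_latest(CHUNKS, DIFFS, latest_block=3)
```

In a real client the boundary proofs for each chunk would be verified before merging, and the rebuilt storage root would be checked against the account header of the latest block.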
We store the diffs, but currently not super efficiently. We can make it much more efficient though (I thought there was an issue for that, but I can't find it now).
I think we can assume that if a user wants to retrieve storage, they either know exactly what they need (e.g., they know the keys for the maps), or they need the whole thing. In the "whole thing" case, we don't actually need to return the Merkle paths - the user will be able to instantiate account storage from raw data.
This could work, but seems complicated from the user's standpoint. Would be awesome to simplify it somehow - but if we are dealing with a truly big state (e.g., 100 MB), not sure if there is a simple solution. |
The proofs are protection against bad chunk data in a p2p context; so if the caller 100% trusts us then no need for the proofs; or they can do the final check at the very end.
If we're storing the diffs then we can also paginate the data at a given block height by aggregating the diffs backwards from the latest. However this can get expensive. |
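As a rough illustration of aggregating diffs backwards, the sketch below rebuilds the state at an older block from the latest state. It assumes something the thread does not confirm: that each stored diff records the old value next to the new one, so it can be undone. The names `state_at` and `HISTORY` are hypothetical, and storage is again a toy dict.

```python
from typing import Dict, Optional, Tuple

State = Dict[int, int]
# key -> (old value, new value); None means "absent" on that side of the change.
ReversibleDiff = Dict[int, Tuple[Optional[int], Optional[int]]]


def state_at(latest_state: State,
             latest_block: int,
             diffs: Dict[int, ReversibleDiff],
             target_block: int) -> State:
    """Undo diffs from latest_block down to target_block + 1."""
    state = dict(latest_state)  # work on a copy, never on the live state
    for block in range(latest_block, target_block, -1):
        for key, (old, _new) in diffs.get(block, {}).items():
            if old is None:
                state.pop(key, None)   # key did not exist before this block
            else:
                state[key] = old       # restore the previous value
    return state


# Toy history (hypothetical data): block 0 holds {1: 10, 5: 50, 9: 90}.
HISTORY: Dict[int, ReversibleDiff] = {
    1: {5: (50, 55)},
    2: {9: (90, None), 7: (None, 70)},
    3: {1: (10, 11)},
}
state_at_block_1 = state_at({1: 11, 5: 55, 7: 70}, 3, HISTORY, target_block=1)
```

The cost grows with the number of blocks between the target and the tip, which is where the expense mentioned above comes from.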
I think we can limit the use case here. We only need to get this info close to the chain tip - so, maybe we say that this endpoint works as long as you want data within 100 blocks of the chain tip. This way, there won't ever be more than 100 deltas to apply to whatever diff we get. |
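A minimal sketch of that window check, assuming the bound is inclusive and that requests for blocks ahead of the tip are rejected. The constant and function names are made up for illustration.

```python
MAX_BLOCK_LAG = 100  # hypothetical limit taken from the "100 blocks" suggestion


def is_within_window(chain_tip: int, requested_block: int,
                     max_lag: int = MAX_BLOCK_LAG) -> bool:
    """Return True if requested_block is at most max_lag blocks behind the tip."""
    lag = chain_tip - requested_block
    return 0 <= lag <= max_lag
```

With a tip at block 1000, block 900 is accepted (exactly 100 deltas to apply: blocks 901 through 1000), while block 899 and block 1001 are rejected.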
I was thinking more about the ideas described in #290 (comment) (i.e., having dedicated tables to store account deltas). |
Resolved by #506 |
Node's components affected by this task
What should be done?
Support FPI (foreign procedure invocation) in the node by adding a new endpoint which outputs the state of a given public account.
FPI is the ability to invoke procedures from other accounts. This requires knowledge of that account's state to perform, so we need to provide access to this information.
Currently the only way to access random public account data is by staying in sync with the chain via diffs. This is impractical as it requires syncing with all public accounts (iiuc) and staying in sync. This endpoint will allow fetching the account state once-off.
How should it be done?
rpc and store. What data to include is still up for discussion but a starting point might be to always include the flat state items and to only include the map items specified by the caller.
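The starting point described above (always return the flat slot values, return map items only for keys the caller names) might look roughly like this. It is a sketch under simplifying assumptions: proofs are omitted, a key missing from a map is returned with a `None` value, and the types `Account`, `AccountStateInfo` and the function `build_account_state` are hypothetical rather than part of the node's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Account:
    # One value per storage slot; for a map slot this would be the map root.
    slots: List[bytes]
    # Contents of the storage maps, keyed by slot index.
    maps: Dict[int, Dict[bytes, bytes]] = field(default_factory=dict)


@dataclass
class AccountStateInfo:
    storage_slots: List[bytes]
    map_items: List[Tuple[int, bytes, Optional[bytes]]]


def build_account_state(account: Account,
                        storage_map_keys: List[Tuple[int, bytes]]) -> AccountStateInfo:
    """Return all flat slot values plus only the requested map entries."""
    items: List[Tuple[int, bytes, Optional[bytes]]] = []
    for slot_index, key in storage_map_keys:
        storage_map = account.maps.get(slot_index)
        if storage_map is None:
            raise ValueError(f"slot {slot_index} does not contain a storage map")
        items.append((slot_index, key, storage_map.get(key)))
    return AccountStateInfo(storage_slots=list(account.slots), map_items=items)


example_account = Account(
    slots=[b"scalar-value", b"map-root"],
    maps={1: {b"k1": b"v1", b"k2": b"v2"}},
)
example_info = build_account_state(example_account, [(1, b"k1")])
```

Here `example_info` carries both slot values but only the `k1` entry of the map in slot 1; the `k2` entry is never sent because the caller did not ask for it.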
When is this task done?
New endpoint has been added.
Additional context
See miden-base#847 for the original foreign procedure invocation context.