Batch eth_getStorageAt #23925

Closed
FeurJak opened this issue Nov 17, 2021 · 29 comments

Comments

@FeurJak

FeurJak commented Nov 17, 2021

Is there an available method to do a batch eth_getStorageAt to retrieve multiple storage variables? I already know the exact indexes, but it's inefficient for me to make a single RPC call per variable.

@FeurJak
Author

FeurJak commented Nov 17, 2021

Okay, just discovered eth_getProof, but it seems to take quite a bit of time just to get 3 variables (~1.2ms).
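
For reference, eth_getProof (EIP-1186) takes an address, a list of storage keys and a block, so a single call can cover several slots; the node also builds a Merkle proof per slot, which is a likely source of the extra cost. A minimal sketch of the request and of pulling the values out of the reply follows. The helper names and the sample reply are made up for illustration.

```javascript
// Hypothetical helper: pad a slot index to the 32-byte hex form used for storage keys.
function toSlotKey(index) {
  return '0x' + BigInt(index).toString(16).padStart(64, '0');
}

// Hypothetical helper: one eth_getProof request covering several slots of one account.
function buildGetProofRequest(address, slotIndexes, block = 'latest', id = 1) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'eth_getProof',
    params: [address, slotIndexes.map(toSlotKey), block],
  };
}

// Collect key -> value pairs from the `result` object of an eth_getProof reply.
function storageValuesFromProof(result) {
  const values = {};
  for (const entry of result.storageProof) {
    values[entry.key] = entry.value;
  }
  return values;
}

const proofReq = buildGetProofRequest('0xD3909f56dFb3B6F11Fc0d82Ef6022C6485EB0AE0', [0, 1, 2]);
console.log(JSON.stringify(proofReq, null, 2));

// Made-up reply, trimmed to the fields used above.
const sampleProofResult = {
  storageProof: [
    { key: toSlotKey(0), value: '0x2a', proof: [] },
    { key: toSlotKey(1), value: '0x0', proof: [] },
  ],
};
console.log(storageValuesFromProof(sampleProofResult));
```

If the proofs themselves are not needed, the reply carries more data than the use case requires.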

@fjl
Contributor

fjl commented Nov 18, 2021

You can also use a JSON-RPC batch request. While this may still incur some internal overhead for each storage slot, it will at least save the overhead of sending the requests.
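
A minimal sketch of what such a batch could look like: a JSON array with one eth_getStorageAt request per slot, sent in a single message. The helper name is hypothetical, and the address and keys are the Goerli example values used elsewhere in this thread.

```javascript
// Hypothetical helper: one eth_getStorageAt request per slot, to be sent as one JSON-RPC batch.
function buildStorageBatch(address, slots, block = 'latest') {
  return slots.map((slot, i) => ({
    jsonrpc: '2.0',
    id: i, // unique per entry so each reply can be matched to its request
    method: 'eth_getStorageAt',
    params: [address, slot, block],
  }));
}

const batchAddress = '0xD3909f56dFb3B6F11Fc0d82Ef6022C6485EB0AE0';
const batchSlots = [
  '0x0000000000000000000000000000000000000000000000000000000000000003',
  '0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c8',
  '0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c9',
];

const batch = buildStorageBatch(batchAddress, batchSlots);
// The serialized array is the body of a single HTTP POST or WebSocket message.
console.log(JSON.stringify(batch, null, 2));
```

From Go, the go-ethereum rpc package offers the same thing through Client.BatchCallContext, which takes a slice of rpc.BatchElem.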

@FeurJak
Author

FeurJak commented Nov 18, 2021

> You can also use a JSON-RPC batch request. While this may still incur some internal overhead for each storage slot, it will at least save the overhead of sending the requests.

I'm using Go and calling "eth_getStorageAt" via an RPC client (i.e. rpclient.CallContext). I can write my own batch request, but I believe these requests are still handled one by one internally.

My understanding is that a new stateDB and stateObj have to be created for every request, so I'm looking to create a new function (e.g. getBatchStorageAt, GetBatchState, GetBatchCommitedState...) which will return an array of storage variables based on an array of indexes for one address.

@fjl
Contributor

fjl commented Nov 18, 2021

I understand you are talking about creating a new method here, just trying to avoid it. Let's explore the existing alternatives before adding a new RPC method.

If the performance of RPC batch is not enough for you, you can also get multiple slots using eth_call. In geth, this method accepts a third parameter containing 'state overrides' (see documentation). To get any number of storage slots, you could encode your request as an EVM program and run it in the context of your target contract.
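
One way this could look, as a sketch only: override the target contract's code with a tiny program that treats calldata as a list of 32-byte slot keys, SLOADs each one and returns the values in the same order. The bytecode below is hand-assembled for illustration and has not been run against a node; the helper names are hypothetical.

```javascript
// Illustrative EVM program: for each 32-byte word in calldata, load that
// storage slot and write the value to memory at the same offset.
const program = [
  '6000', // PUSH1 0        i = 0
  '5b',   // JUMPDEST       loop start (pc 0x02)
  '80',   // DUP1
  '36',   // CALLDATASIZE
  '11',   // GT             calldatasize > i ?
  '15',   // ISZERO
  '6015', // PUSH1 0x15
  '57',   // JUMPI          leave the loop once i >= calldatasize
  '80',   // DUP1
  '35',   // CALLDATALOAD   key = calldata[i .. i+32]
  '54',   // SLOAD
  '81',   // DUP2
  '52',   // MSTORE         memory[i .. i+32] = value
  '6020', // PUSH1 32
  '01',   // ADD            i += 32
  '6002', // PUSH1 0x02
  '56',   // JUMP           back to loop start
  '5b',   // JUMPDEST       exit (pc 0x15)
  '36',   // CALLDATASIZE
  '6000', // PUSH1 0
  'f3',   // RETURN         memory[0 .. calldatasize]
];
const readerCode = '0x' + program.join('');

// Hypothetical helper: eth_call with the contract's code overridden by readerCode.
function buildSlotReaderCall(address, slots, block = 'latest', id = 1) {
  const data = '0x' + slots.map((s) => s.slice(2).padStart(64, '0')).join('');
  return {
    jsonrpc: '2.0',
    id,
    method: 'eth_call',
    params: [{ to: address, data }, block, { [address]: { code: readerCode } }],
  };
}

// Split the returned hex blob into one 32-byte word per requested slot.
function splitWords(hex) {
  const body = hex.slice(2);
  const words = [];
  for (let i = 0; i < body.length; i += 64) words.push('0x' + body.slice(i, i + 64));
  return words;
}

const readerCall = buildSlotReaderCall('0xD3909f56dFb3B6F11Fc0d82Ef6022C6485EB0AE0', ['0x3', '0x4']);
console.log(JSON.stringify(readerCall, null, 2));
```

Only the code is overridden, so the call still sees the contract's real storage. Each SLOAD costs gas, so very long slot lists may run into the node's RPC gas cap.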

@fjl
Contributor

fjl commented Nov 18, 2021

> My understanding is that a new stateDB and stateObj has to be created for every request

This step should be kind of cheap because a lot of things are cached internally. I'd prefer you try the RPC batch first and check if the performance is good enough for your use case.

@FeurJak
Author

FeurJak commented Nov 19, 2021

> > My understanding is that a new stateDB and stateObj has to be created for every request

> This step should be kind of cheap because a lot of things are cached internally. I'd prefer you try the RPC batch first and check if the performance is good enough for your use case.

Fair point, I'll measure the performance of the already available methods before going forward with creating new ones.

With eth_call, I'm not too clear about the stateDiff format as it's not given in the examples. Would you know what the format for that parameter is? (i.e. "stateDiff": {"0x0000000000000000000000000000000000000001": {"code": "0x6080604052600760005260206000f3fe"}} ) ?
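
As far as the geth documentation describes it, the override object (third eth_call parameter) is keyed by account address, and each entry may carry balance, nonce, code, state (replace all storage) or stateDiff (patch individual slots). stateDiff itself maps 32-byte slot keys to 32-byte values, so code sits next to stateDiff rather than inside it. A small sketch with a hypothetical helper and made-up values:

```javascript
// Hypothetical helper: build the state-override object for a single account.
// Only the fields that are actually passed in are included.
function buildStateOverride(address, fields) {
  const allowed = ['balance', 'nonce', 'code', 'state', 'stateDiff'];
  const account = {};
  for (const name of allowed) {
    if (fields[name] !== undefined) account[name] = fields[name];
  }
  return { [address]: account };
}

const overrideAddress = '0x0000000000000000000000000000000000000001';
const slot3 = '0x' + '3'.padStart(64, '0'); // storage slot 3
const valueOne = '0x' + '1'.padStart(64, '0'); // 32-byte value 1

// Patch slot 3 of the account and leave the rest of its storage untouched.
const override = buildStateOverride(overrideAddress, {
  stateDiff: { [slot3]: valueOne },
});
console.log(JSON.stringify(override, null, 2));
```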

@georgercarder
Contributor

@FeurJak, could your batching needs be met with their GraphQL API?

@s1na
Contributor

s1na commented Jan 19, 2022

This is the best GraphQL query I could come up with for this issue:

query slots($blocknum: Long!, $addr: Address!, $key0: Bytes32!, $key1: Bytes32!, $key2: Bytes32!) {
  slot0: block(number: $blocknum) {
    account(address: $addr) { storage(slot: $key0) }
  },
  slot1: block(number: $blocknum) {
    account(address: $addr) { storage(slot: $key1) }
  },
  slot2: block(number: $blocknum) {
    account(address: $addr) { storage(slot: $key2) }
  }
}

With the variables being something like (example is a random contract from Goerli):

{
  "blocknum": "6004069",
  "addr": "0xD3909f56dFb3B6F11Fc0d82Ef6022C6485EB0AE0",
  "key0": "0x0000000000000000000000000000000000000000000000000000000000000003",
  "key1": "0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c8",
  "key2": "0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c9"
}

This could've been re-written to become a bit shorter had graphql supported parameterized fragments (graphql/graphql-spec#204). Right now this is what we're stuck with which sucks a bit if you have a dynamic and variable list of slots you want to query each time. You'll have to do some string juggling to construct the query.
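
The string juggling could be sketched roughly like this: generate one aliased field and one $keyN variable per slot. The function name is hypothetical; the query shape and the sample values follow the example above.

```javascript
// Hypothetical helper: build the aliased query plus its variables for any number of slots.
function buildSlotsQuery(blocknum, addr, keys) {
  if (keys.length === 0) throw new Error('need at least one storage key');
  const decls = keys.map((_, i) => `$key${i}: Bytes32!`).join(', ');
  const fields = keys
    .map((_, i) =>
      `  slot${i}: block(number: $blocknum) {\n` +
      `    account(address: $addr) { storage(slot: $key${i}) }\n` +
      `  }`)
    .join('\n');
  const query = `query slots($blocknum: Long!, $addr: Address!, ${decls}) {\n${fields}\n}`;

  // Variables mirror the declarations: blocknum, addr, key0, key1, ...
  const variables = { blocknum, addr };
  keys.forEach((key, i) => { variables[`key${i}`] = key; });
  return { query, variables };
}

const slotsRequest = buildSlotsQuery('6004069', '0xD3909f56dFb3B6F11Fc0d82Ef6022C6485EB0AE0', [
  '0x0000000000000000000000000000000000000000000000000000000000000003',
  '0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c8',
  '0x7461cd233bf72e6c1fab8f2ffafce9cd48c96e1b422444c78aeae2cdd60843c9',
]);
console.log(slotsRequest.query);
// The pair would be POSTed to the node's GraphQL endpoint as { query, variables }.
console.log(JSON.stringify(slotsRequest.variables, null, 2));
```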

I'm going to try that next and compare its performance with doing several eth_getStorageAt calls.

@s1na
Contributor

s1na commented Feb 1, 2022

Surprisingly, for large numbers of storage slots GraphQL can be slower than sending a series of separate JSON-RPC requests. I wrote a script which gets the prestate for a block, turns that into an access-list, and then first does a series of eth_getStorageAt calls and then fills a query such as the one above with all the storage slots. Here are the results for a few mainnet blocks (on a live node, so the data is noisy). Note the slow GraphQL response when the number of storage slots exceeds 1000:

$ node index.js 14121860
Access-list has 215 accounts with 1430 storage slots
json-rpc requests took 571ms
graphql request took 1214ms

$ node index.js 14121859
Access-list has 48 accounts with 358 storage slots
json-rpc requests took 188ms
graphql request took 73ms

$ node index.js 14121858
Access-list has 92 accounts with 680 storage slots
json-rpc requests took 310ms
graphql request took 233ms

$ node index.js 14121857
Access-list has 219 accounts with 1718 storage slots
json-rpc requests took 749ms
graphql request took 4901ms

$ node index.js 14121856
Access-list has 152 accounts with 1544 storage slots
json-rpc requests took 622ms
graphql request took 1134ms

$ node index.js 14121855
Access-list has 13 accounts with 109 storage slots
json-rpc requests took 72ms
graphql request took 15ms

@FeurJak
Author

FeurJak commented Feb 1, 2022

Sorry for not replying to you guys, I had no idea that this thread was still alive.

Unfortunately I can't go down the GraphQL route for my application.
I ended up writing my own eth_GetBatchStorageAt in /internal/ethapi/api.go and state.GetStatesFromStorage(addresses, keys) in core/state/statedb.go:

func (s *PublicBlockChainAPI) GetBatchStorageAt(ctx context.Context, addresses []string, keys []string, blockNrOrHash rpc.BlockNumberOrHash) ([]common.Hash, error) {
	state, _, err := s.b.StateAndHeaderByNumberOrHash(ctx, blockNrOrHash)
	if state == nil || err != nil {
		return nil, err
	}
	res := state.GetStatesFromStorage(addresses, keys)
	return res[:], state.Error()
}

func (s *StateDB) GetStatesFromStorage(addrs []string, hashes []string) []common.Hash {
	var _res []common.Hash
	for _, _addr := range addrs {
		stateObject := s.getStateObject(common.HexToAddress(_addr))
		if stateObject != nil {
			_res = append(_res, stateObject.GetStatesFromStorage(s.db, hashes)...)
		} else {
			for i := 0; i < len(hashes); i++ {
				_res = append(_res, common.HexToHash(""))
			}
		}
	}
	return _res
}

Current solution is very quick.

@s1na
Contributor

s1na commented Feb 3, 2022

@FeurJak I'm trying to find the fastest way to get a batch of storage values without having to add a new method to the API and I kinda hijacked your thread to share the results. Sorry about that :) Feel free to mute the thread if you're not interested in this anymore.

By splitting the storage slots and doing multiple GraphQL queries (say each with 100 slots) I managed to gain a 3-5x speedup compared to JSON-RPC. I updated the gist with the new script if anyone's interested. 3-5x is still not the speed-up I was hoping for so I'll keep the thread open for now in case I find another optimization.
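
The splitting step can be sketched as a plain chunking helper; each chunk would then become its own GraphQL query. The helper name and the sample data are made up, and the chunk size of 100 is just the figure mentioned above.

```javascript
// Hypothetical helper: split a list into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// 250 made-up slot keys, split into groups of 100.
const manySlots = Array.from({ length: 250 }, (_, i) => '0x' + i.toString(16).padStart(64, '0'));
const slotChunks = chunk(manySlots, 100);
console.log(slotChunks.map((c) => c.length)); // [ 100, 100, 50 ]
```

The chunks are independent of each other, so the resulting queries could also be sent concurrently.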

@FeurJak
Author

FeurJak commented Feb 3, 2022

@s1na I'm very much interested! I think batch storage lookup would be needed by many others, and GraphQL looks to be a viable solution without adding a new method to the API.

@FeurJak
Author

FeurJak commented Feb 3, 2022

@s1na I am curious as to why it would provide a 3-5x speedup compared to JSON-RPC though.

@fjl
Contributor

fjl commented Feb 4, 2022

@s1na Did you find out where the slowdown in big queries comes from? I find it surprising that this splitting of queries is necessary.

@s1na
Contributor

s1na commented Feb 4, 2022

@fjl almost the whole time is spent here: https://github.com/graph-gophers/graphql-go/blob/eae31ca73eb3473c544710955d1dbebc22605bfe/graphql.go#L212. First loop of the Validation function.

@FeurJak well GraphQL has turned out to be much faster than JSON-RPC in many cases, e.g. ~100x in an experiment I did for fetching receipts for a range of blocks. Mostly because you're packing in a large number of requests in one query and avoiding all the network latency.

Edit: created an issue upstream: graph-gophers/graphql-go#499

@fjl
Contributor

fjl commented Feb 4, 2022

Thanks Sina! Great find!

@s1na
Contributor

s1na commented Feb 16, 2022

So it turns out one of the specified validation rules (OverlappingFieldsCanBeMerged) has quadratic complexity, which causes my query to blow up. So I modified the query to minimize the number of selection fields. Now it looks like this:

query slots {
  block {
    account0: account(address: "0x...") {
      slot0: storage (slot: "0x..."),
      slot1: storage (slot: "0x...")
    },
    account1: account(address: "0x...") {
      slot0: storage (slot: "0x...")
    }
  }
}
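
A sketch of how a query of this shape could be generated from an access list. The function name and the input shape (an object mapping each address to its list of slot keys) are assumptions for illustration, not the script from the gist.

```javascript
// Hypothetical helper: one `block` selection, one aliased `account` per address,
// one aliased `storage` field per slot. Accounts without slots are skipped,
// since an empty selection set would not be valid GraphQL.
function buildNestedQuery(accessList) {
  const accounts = Object.entries(accessList)
    .filter(([, slots]) => slots.length > 0)
    .map(([address, slots], a) => {
      const fields = slots
        .map((slot, s) => `      slot${s}: storage(slot: "${slot}")`)
        .join('\n');
      return `    account${a}: account(address: "${address}") {\n${fields}\n    }`;
    });
  return `query slots {\n  block {\n${accounts.join('\n')}\n  }\n}`;
}

// Made-up addresses and keys.
const addrA = '0x' + '11'.repeat(20);
const addrB = '0x' + '22'.repeat(20);
const addrC = '0x' + '33'.repeat(20);
const keyA = '0x' + '0'.repeat(63) + '3';
const keyB = '0x' + '0'.repeat(63) + '8';

const nestedQuery = buildNestedQuery({ [addrA]: [keyA, keyB], [addrB]: [keyA], [addrC]: [] });
console.log(nestedQuery);
```

Commas between fields are optional in GraphQL, so the generated query leaves them out.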

In the meantime I also prototyped a JSON-RPC batch request. You can find the implementation for both in the same gist. Now to some results:

$ node index.js 14132672
Access-list has 245 accounts with 1896 storage slots
json-rpc requests took 656ms
16 graphql request took 88ms
New graphql request took 39ms
Batch json-rpc req took 82ms

$ node index.js 14132673
Access-list has 134 accounts with 1681 storage slots
json-rpc requests took 581ms
11 graphql request took 76ms
New graphql request took 56ms
Batch json-rpc req took 52ms

$ node index.js 14132679
Access-list has 163 accounts with 1160 storage slots
json-rpc requests took 436ms
11 graphql request took 61ms
New graphql request took 22ms
Batch json-rpc req took 34ms

$ node index.js 14132676
Access-list has 209 accounts with 1730 storage slots
json-rpc requests took 580ms
15 graphql request took 104ms
New graphql request took 31ms
Batch json-rpc req took 50ms

$ node index.js 14132681
Access-list has 156 accounts with 1253 storage slots
json-rpc requests took 452ms
10 graphql request took 61ms
New graphql request took 65ms
Batch json-rpc req took 39ms

For anyone needing to fetch many storage slots I recommend first the new GraphQL query, and then the JSON-RPC batch request until there's an update from graphql-go upstream about this validation rule complexity.

@FeurJak
Author

FeurJak commented Feb 17, 2022

Awesome work s1na, I'll be sure to spread this around to those who have the same issue.

@berktaylan

Sorry for bumping this, but I want to get the reserves (getReserves function) for 10k Uniswap LPs, per block. What is the fastest way to do this? Does anyone have an idea or recommendation?

Thanks.

@FeurJak
Author

FeurJak commented May 4, 2022

@berktaylan You can index the storage keys for the reserves of the 10k LPs and read directly from storage. You can read my blog post (go through tutorials 1 to 3): https://medium.com/@fejleuros/tutorial-1-part-1-350694af2632

@FeurJak
Author

FeurJak commented May 4, 2022

@berktaylan assuming the 10k LPs are all from Uniswap and are V2 LPs, I believe the storage key for the reserves is 0x08, so you can use the GraphQL query that s1na wrote to do a batch read of storage.
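
A raw read of that slot returns one packed 32-byte word. Assuming the usual Solidity packing for the pair's `uint112 reserve0; uint112 reserve1; uint32 blockTimestampLast;` declaration (first variable in the lowest-order bits), it could be decoded roughly as below. The function name and the sample numbers are made up.

```javascript
const MASK_112 = (1n << 112n) - 1n;

// Hypothetical helper: unpack a reserves word read with eth_getStorageAt.
// Assumed layout: bits 0..111 reserve0, bits 112..223 reserve1, bits 224..255 timestamp.
function decodeReserves(word) {
  const v = BigInt(word);
  return {
    reserve0: v & MASK_112,
    reserve1: (v >> 112n) & MASK_112,
    blockTimestampLast: Number(v >> 224n),
  };
}

// Build a sample word from known parts, then decode it again.
const sampleR0 = 123456789n;
const sampleR1 = 987654321n;
const sampleTs = 1651622400;
const sampleWord =
  '0x' + ((BigInt(sampleTs) << 224n) | (sampleR1 << 112n) | sampleR0).toString(16).padStart(64, '0');

const decodedReserves = decodeReserves(sampleWord);
console.log(decodedReserves);
```

The reserves are kept as BigInt because a uint112 does not fit into a JavaScript number without losing precision.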

FeurJak closed this as completed May 4, 2022

@berktaylan

@FeurJak hi,

So I tried with 5k LPs (yes, they are V2). With the batch request it's 250 ms minimum and sometimes randomly 1k ms.

With GraphQL it's 1k to 2k ms.

I'm planning to go with the batch request and aiming for 10k LPs. Any tricks to make my batch requests faster?

Thanks.

@FeurJak
Author

FeurJak commented May 5, 2022

@berktaylan which "batch request" method are you referring to?

@berktaylan

berktaylan commented May 5, 2022

@FeurJak

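// lpv: array of LP addresses without the 0x prefix; wssend: an open WebSocket to the node.
const reqs = [];
for (let i = 0; i < lpv.length; i++) {
  reqs.push({
    method: 'eth_getStorageAt',
    params: [`0x${lpv[i]}`, '0x08', 'latest'],
    id: i, // unique id per request so each reply can be matched to its LP
    jsonrpc: '2.0',
  });
}

// Send the whole batch as a single message.
wssend.send(JSON.stringify(reqs));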

Not sure if I'm doing it correctly. Our goal was to run it through the db cache instead of the EVM, right? And I think eth_getStorageAt reads from the cache/db, not the EVM?
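
One detail worth noting for batches like this: JSON-RPC 2.0 allows a server to return the replies of a batch in any order, so giving each request its own id lets the replies be matched up safely. A minimal sketch with a hypothetical helper and made-up data:

```javascript
// Hypothetical helper: pair each request with its reply by id, whatever the reply order.
function matchBatchReplies(requests, replies) {
  const byId = new Map(replies.map((reply) => [reply.id, reply]));
  return requests.map((req) => {
    const reply = byId.get(req.id);
    return {
      address: req.params[0],
      value: reply && reply.error === undefined ? reply.result : null,
      error: reply ? reply.error : { message: 'no reply for this id' },
    };
  });
}

// Two made-up requests and their replies, deliberately out of order.
const sampleRequests = [
  { jsonrpc: '2.0', id: 0, method: 'eth_getStorageAt', params: ['0x' + 'aa'.repeat(20), '0x08', 'latest'] },
  { jsonrpc: '2.0', id: 1, method: 'eth_getStorageAt', params: ['0x' + 'bb'.repeat(20), '0x08', 'latest'] },
];
const sampleReplies = [
  { jsonrpc: '2.0', id: 1, result: '0x' + '22'.repeat(32) },
  { jsonrpc: '2.0', id: 0, result: '0x' + '11'.repeat(32) },
];

const matched = matchBatchReplies(sampleRequests, sampleReplies);
console.log(matched);
```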

@FeurJak
Author

FeurJak commented May 5, 2022

Yes that's correct, eth_getStorageAt reads from the latest state (cached or fetched from disk), not the EVM. But it needs to fetch the latest block state first, and I believe this process is halted while the node is syncing a new state. Maybe this is why you randomly get 1k ms.

Without modifying geth you won't really get any better speed.

@s1na
Contributor

s1na commented May 6, 2022

> with graphql its 1k to 2k ms.

I'm surprised it's slower by so much. Which query is that, this one?

@berktaylan

berktaylan commented May 6, 2022

> > with graphql its 1k to 2k ms.

> I'm surprised its slower by so much. Which query is that, this one?

Yes, like this one, I think it's the same:

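// request and gql are assumed to come from the graphql-request package.
// Build one aliased account selection per LP, all reading slot 0x08.
const reqs = [];
for (let i = 0; i < lpv.length; i++) {
  reqs.push(`account${i}: account(address: "0x${lpv[i]}") {
    slot0: storage(slot: "0x0000000000000000000000000000000000000000000000000000000000000008")
  }`);
}

const query = gql`query slots { block { ${reqs.join(',\n')} } }`;

// Send the query once, after the loop, and time the round trip.
const timex = performance.now();
request('http://127.0.0.1:8545/graphql', query, { number: 'latest' })
  .then((res) => { console.log(performance.now() - timex); })
  .catch((err) => { console.log(err); });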

@tinoh9

tinoh9 commented Oct 31, 2022

> For anyone needing to fetch many storage slots I recommend first the new GraphQL query, and then the JSON-RPC batch request until there's an update from graphql-go upstream about this validation rule complexity.

Hi @s1na, I wonder how the performance of the JSON-RPC batch compares with the new GraphQL query nowadays?

@s1na
Contributor

s1na commented Oct 31, 2022

@tinoh9 I don't believe things have changed much since that benchmark. But the gist I posted contains the scripts I used. Feel free to compare them for yourself.
