Skip to content
This repository was archived by the owner on Nov 6, 2020. It is now read-only.

Implement EIP234 block_hash for eth_getLogs#9256

Merged
sorpaas merged 15 commits into
masterfrom
sp-eip234
Aug 13, 2018
Merged

Implement EIP234 block_hash for eth_getLogs#9256
sorpaas merged 15 commits into
masterfrom
sp-eip234

Conversation

@sorpaas
Copy link
Copy Markdown
Collaborator

@sorpaas sorpaas commented Jul 31, 2018

rel #9251

If block_hash and from_block/to_block present, return error. This also changes eth_getLogs to return error if any of block_hash/from_block/to_block cannot be found.

@sorpaas sorpaas added A0-pleasereview 🤓 Pull request needs code review. M6-rpcapi 📣 RPC API. labels Jul 31, 2018
@sorpaas sorpaas added this to the 2.1 milestone Jul 31, 2018
@MicahZoltu
Copy link
Copy Markdown

If the blockhash supplied doesn't match any known block, will it return the error specified in EIP-234?

@fabioberger
Copy link
Copy Markdown

fabioberger commented Jul 31, 2018

Thanks for this @sorpaas!

The part of EIP 234 @MicahZoltu is referring to is:

If no block exists with that hash then an error should be returned with a code of -32000, a message of "Block not found." and a data of "0xbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0c"

Where 0xbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0c is substituted with the blockHash of the request.

BlockNumber::Latest | BlockNumber::Pending => BlockId::Latest,
};

let (from_block, to_block) = match self.block_hash {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it would be good to return a RPC error in case block_hash and from_block &/ to_block is specified.

@sorpaas sorpaas added A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. and removed A0-pleasereview 🤓 Pull request needs code review. labels Jul 31, 2018
@sorpaas
Copy link
Copy Markdown
Collaborator Author

sorpaas commented Jul 31, 2018

@MicahZoltu The current implementation will return empty vector. I think this may be just better align with the rest of eth_getLogs definition? If you specify a non-existent fromBlock or toBlock, we would also return empty vector right now.

@MicahZoltu
Copy link
Copy Markdown

@sorpaas Returning an empty array not only is against EIP-234, but it is also a massive source of problems for dapps right now. Augur is all but broken because of this behavior and the devs are grasping at some pretty crazy solutions to try to work around it while waiting for EIP-234 to be implemented.

You are correct that it aligns with previous behavior, but the previous behavior is not good behavior. Also, Geth has implemented EIP-234 and they return an error if blockhash does not exist so returning an error would be in line with Geth behavior.


For some context, the reason the current behavior is so problematic is because when a user requests logs for some block and they get back an empty array they cannot differentiate between "the block you requested has no logs" and "we don't know about the block you requested". This problem is magnified greatly when a load balanced backend like Infura is used, where each of your requests go to a different node. For example, you may get notification that block 0xabcd is the new head, but then when you ask for logs in block 0xabcd the node you speak to hasn't yet received block 0xabcd, so it returns an empty array.

Even without load balanced backends, this is still a problem because reorgs could occur such that you receive notification about a block, but by the time you ask for logs for that block it has been reorged out so you get an empty array. You then wait to poll for a new head for a while and by the time you poll for new head the block has been reorged back in. In this scenario, if the reorged block is pruned from the node when reorged out (so not available to fetch logs) you will incorrectly think that the block held no logs.

@sorpaas
Copy link
Copy Markdown
Collaborator Author

sorpaas commented Jul 31, 2018

@MicahZoltu Sounds good! Will change this.

Given that this is a developer's pain point, I wonder whether we would also consider to change fromBlock and toBlock's behavior to be return error if it is non-existent? It's also really easy for any user to fallback to the old return-empty-vector behavior (by casting undefined/error to empty array). cc @5chdn @tomusdrw

@tomusdrw
Copy link
Copy Markdown
Collaborator

@sorpaas Makes perfect sense for me, let's expose the errors and mark that as a breaking change.

@tomusdrw tomusdrw added the B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. label Jul 31, 2018
sorpaas added 2 commits August 1, 2018 11:51
…ck is found

This also changes PollFilter to store the EthFilter type, instead of the jsonrpc one, saving repeated conversion.
Use the old behavior (unwrap_or_default) for anywhere else.
@sorpaas sorpaas added A0-pleasereview 🤓 Pull request needs code review. and removed A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. labels Aug 1, 2018
@MicahZoltu
Copy link
Copy Markdown

IMO, "fixing" this (return an error instead of empty if block is missing) makes sense for block by number as well. I can't come up with a use-case where you would want an empty array instead of an error.

I tend to advise people against using the block number mechanism in general for anything live. If you are trying to backfill all of history, then a number range works fine and isn't likely to cause you any problems, but you really should stop ~10 blocks back from head (beyond reasonable reorg distance) and then switch to something more robust like fetching blocks by hash (or currently, since that doesn't exist, do really complicated things to verify what you are getting near head is what you think).

@ghost
Copy link
Copy Markdown

ghost commented Aug 3, 2018

FWIW, the Forecast Foundation has an active 250 REP bounty on this implementation once it's merged. Whoever is responsible for the implementation, please ping me (@tk) on the Augur Discord or (@tomkysar) on Twitter to arrange payment.

Comment thread rpc/src/v1/types/filter.rs Outdated
(BlockId::Hash(hash), BlockId::Hash(hash))
},
None => (self.from_block.map_or_else(|| BlockId::Latest, &num_to_id),
self.to_block.map_or_else(|| BlockId::Latest, &num_to_id)),
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indentation displays correctly on my editor. I think why it looks weird here is probably because the diff's + mark pushed tabs one level deeper.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me wrap that to a new line anyway.

@avonian
Copy link
Copy Markdown

avonian commented Aug 3, 2018

seems to me like that reward needs to go to @sorpaas :)

...once its merged that is

@andresilva
Copy link
Copy Markdown
Contributor

@sorpaas Please rebase/merge master and make sure to include the fix from #9285 when resolving conflicts.

Copy link
Copy Markdown
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Just a question regarding the returned error. Also can we add a test for this in mocked/eth.rs?

Comment thread rpc/src/v1/helpers/errors.rs Outdated
Error {
code: ErrorCode::ServerError(codes::UNSUPPORTED_REQUEST),
message: "One of the block specified in filter (fromBlock, toBlock or blockHash) cannot be found".into(),
data: None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EIP234 requires that data include the hash of the requested block. Since we're also returning an error on to/from what do you think we should do? Is it really useful to return the requested hash if not found?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's equally useful to return the from/to block, since there're two and the caller might want to know which. Let me fix this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts on this:

  1. If a user asks for blocks 6,000,000 to 7,000,000 but the highest block is 6,500,000, should we interpret that to mean the user wants "all blocks from 6,000,000 to latest"?
  2. If we consider it an error to ask for more blocks than latest, can the error data be the lowest block number that wasn't present? So in the above example the error data would be 6,500,001?

I haven't put a whole lot of thought into the above, @epheph may have some feelings since he has experienced a lot of the pain around this. My gut tells me that it should return an error because otherwise the user doesn't know which blocks had no logs and which blocks were missing, so it doesn't know what to ask for again. This is especially problematic when dealing with load balanced servers like Infura where each request may be routed differently and thus requests may go to a server that is behind (thus missing blocks that another had near head).

Typing all that out makes me reasonably confident that an error is in fact correct behavior. In that case, letting the user know what blocks are missing (rather than just "some blocks in the range") could prove useful to them, especially if they are attempting to fetch a very large range.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MicahZoltu Yes, the current behavior would return error with the error data pointing to block 7,000,000. :)

I think returning 6,500,001 may be too costly on the client side. It needs to (at least partially) iterate over the block range.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error when your range exceeds the available blocks, even by a single block, is the best action. Returning that 6,500,001 is where that cut-off is would be nice for troubleshooting (and more correct), but not an absolutely requirement. None of the applications I've worked on would have taken any other action besides re-requesting the entire batch that was originally requested.

(It's possible the app could read up to the available block and then wait for more to become available, but requesting large ranges of blocks is usually done for blocks significantly in the past. I believe this error will occur most when you are asking for small numbers of blocks very close to head)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah like you said, I can't think of a use-case where we may query a non-existent fromBlock or toBlock. If a query doesn't care about it, it can just set toBlock to latest or null.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have cases where we query potentially-non-existent blocks. When using a pool of eth nodes (such as Infura), one might receive a websocket notification of newHead block X (or when getBlockByNumber("latest") ), then in a follow-up query to getLogs from that block, it might hit a node that does not yet have that block. In this case, latest probably isn't good because latest is likely to have changed. My view of the purpose of EIP-234 is to remove the race condition that exists between a blockNumber and a blockHash.

I don't mind getting an error back if the block doesn't exist, i just don't want to get back something I can't distinguish from a successful fetch of no logs.

//
// If we are sure the block does not exist (where val > best_block_number), then return error. Note that we
// don't need to care about pending blocks here because RPC query sets pending back to latest (or handled
// pending logs themselves).
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current use cases are mostly fine. We have cases where we set fromBlock to a nonexistent future block (in filter_changes), but we actually just expect empty result in that case. I didn't find any case where we set fromBlock to a current block, but toBlock to a nonexistent block.

But please do note that this is a breaking change.

Copy link
Copy Markdown
Contributor

@andresilva andresilva Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to validate that from <= to as well? I don't think we validate that on the RPC layer.

Copy link
Copy Markdown
Collaborator

@dvdplm dvdplm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with some minor grumbles

fn logs(&self, filter: Filter) -> Vec<LocalizedLogEntry> {
fn logs(&self, filter: Filter) -> Result<Vec<LocalizedLogEntry>, BlockId> {
match self.error_on_logs.read().as_ref() {
Some(id) => return Err(id.clone()),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to return both the block ID and the block hash?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This BlockId contains BlockId::Number and BlockId::Hash, so it may be fine here -- the tester can choose which type of BlockId it wants to test. :)

Comment thread rpc/src/v1/helpers/errors.rs Outdated

pub fn filter_block_not_found(id: BlockId) -> Error {
Error {
code: ErrorCode::ServerError(codes::UNSUPPORTED_REQUEST),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is UNSUPPORTED_REQUEST the correct code here? Is that the -32000 from the EIP? If yes, maybe add a comment to that effect.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use UNSUPPORTED_REQUEST for other params-invalid errors, like the above filter_not_found. So I think it may be fine. :)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you know best! It sounds a little like HTTP 501 Not Implemented but is more like HTTP 400 Bad Request, hence my confusion.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread rpc/src/v1/helpers/errors.rs Outdated
pub fn filter_block_not_found(id: BlockId) -> Error {
Error {
code: ErrorCode::ServerError(codes::UNSUPPORTED_REQUEST),
message: "One of the block specified in filter (fromBlock, toBlock or blockHash) cannot be found".into(),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "One of the blocks…"

Comment thread rpc/src/v1/tests/mocked/eth.rs Outdated
let tester = EthTester::default();
tester.client.set_error_on_logs(Some(BlockId::Hash(H256::from([5u8].as_ref()))));
let request = r#"{"jsonrpc": "2.0", "method": "eth_getLogs", "params": [{"limit":1,"blockHash":"0x0000000000000000000000000000000000000000000000000000000000000000"}], "id": 1}"#;
let response = r#"{"jsonrpc":"2.0","error":{"code":-32000,"message":"One of the block specified in filter (fromBlock, toBlock or blockHash) cannot be found","data":"0x0500000000000000000000000000000000000000000000000000000000000000"},"id":1}"#;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same typo here: "blockS"

let num_to_id = |num| match num {
BlockNumber::Num(n) => BlockId::Number(n),
BlockNumber::Earliest => BlockId::Earliest,
BlockNumber::Latest | BlockNumber::Pending => BlockId::Latest,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's relevant here, but if BlockNumber::Pending is deprecated, maybe we should add a match arm here and print a warn! about the deprecation? I've seen other places in the code where we do this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still use that -- it's only that we don't deal with it for eth_getLogs. For eth_filter namespace, pending logs are added on top of the logs returned by Client::logs if we specify BlockNumber::Latest.

Copy link
Copy Markdown
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM. Just one more question regarding errors/validation.

//
// If we are sure the block does not exist (where val > best_block_number), then return error. Note that we
// don't need to care about pending blocks here because RPC query sets pending back to latest (or handled
// pending logs themselves).
Copy link
Copy Markdown
Contributor

@andresilva andresilva Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to validate that from <= to as well? I don't think we validate that on the RPC layer.

loop {
let header = match chain.block_header_data(&current_hash) {
Some(val) => val,
None => return Err(BlockId::Hash(current_hash)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From RPC-side you'll input block numbers for from/to but in this case you would get back an error that is a block hash. I don't think it's very problematic since this error should be rare, since we do range validation before starting the loop. We could coerce the blockid error depending on the type used in filter, i.e. if it uses block numbers return error using block number, otherwise use hash. But maybe this is added complexity for a case that is rare (those blocks should be on your local db and it shouldn't fail).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code only deals with the case where if either from_block or to_block is a block hash. If both of them are numbers (which is the case for RPC's fromBlock and toBlock, then we won't go into this else branch (https://github.com/paritytech/parity-ethereum/pull/9256/files/4b45f9ac913f7ef9d7e3d3cda22a2d0c28dc23ea#diff-40c7c65265e511d07158f62712f69573R1850). If user inputs block numbers, he/she can only get back block numbers as errors. :)

@sorpaas
Copy link
Copy Markdown
Collaborator Author

sorpaas commented Aug 7, 2018

(Github's interface is probably broken. Cannot comment on your first review message.)

Do we want to validate that from <= to as well? I don't think we validate that on the RPC layer.

Yeah in that case currently the bloom filter's iterator would just return empty result. I'm going to change this to Err(to_block).

Copy link
Copy Markdown
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sorpaas sorpaas merged commit a6df452 into master Aug 13, 2018
@sorpaas sorpaas deleted the sp-eip234 branch August 13, 2018 07:47
// If from is greater than to, then the current bloom filter behavior is to just return empty
// result. There's no point to continue here.
if from > to {
return Err(filter.to_block.clone());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error returned from the RPC is "One of the blocks specified in filter (fromBlock, toBlock or blockHash) cannot be found", which seems confusing because the from and to blocks can be found, they are just out of order.

@ghost
Copy link
Copy Markdown

ghost commented Aug 19, 2018

Following up here - happy to see this has been merged!

Who here is the claimer of the bounty? Please get in touch, and we can pay you out :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A0-pleasereview 🤓 Pull request needs code review. B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. M6-rpcapi 📣 RPC API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants