Skip to content
This repository was archived by the owner on Nov 6, 2020. It is now read-only.

Remove accounts bloom#11589

Merged
vorot93 merged 10 commits into
masterfrom
dp/chore/remove-accounts-bloom
Apr 22, 2020
Merged

Remove accounts bloom#11589
vorot93 merged 10 commits into
masterfrom
dp/chore/remove-accounts-bloom

Conversation

@dvdplm
Copy link
Copy Markdown
Collaborator

@dvdplm dvdplm commented Mar 29, 2020

Remove the accounts existence bloom filter.

The bloom has numerous problems:

  1. does not scale as the number of accounts increase; resizing the bloom takes several hours
  2. does not handle reorgs
  3. is one-size-fits all, which means that small chains still need to run with a bloom filter sized for ETH, i.e. able to handle 100M accounts (and counting) at a non-negligible runtime cost (120+Mb ram, 7 hash functions for each lookup etc)

For this reason this PR removes the bloom filter entirely. See https://github.com/openethereum/openethereum/pull/11581 for more details and benchmarks.

Remove the accounts existence bloom filter. Used to compare performance of running with no bloom to running with an optimized bloom (see https://github.com/openethereum/openethereum/pull/11581).
@dvdplm dvdplm self-assigned this Mar 29, 2020
@dvdplm dvdplm added the A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. label Mar 29, 2020
dvdplm added 5 commits March 31, 2020 22:34
* master: (25 commits)
  Update .gitmodules (#11628)
  ethcore/res: activate ecip-1088 phoenix on classic (#11598)
  Upgrade parity-common deps to latest (#11620)
  Fix Goerli syncing (#11604)
  deps: switch to upstream ctrlc (#11617)
  Deduplicating crate dependencies (part 3 of n) (#11614)
  Deduplicating crate dependencies (part 2 of n, `slab`) (#11613)
  Actually save ENR on creation and modification (#11602)
  Activate POSDAO on xDai chain and update bootnodes (#11610)
  Activate on-chain randomness in POA Core (#11609)
  Deduplicating crate dependencies (part 1 of n) (#11606)
  Update enodes for POA Sokol (#11611)
  Remove .git folder from dogerignore file so vergen library can get build date and commit hash in the binary generatio vergen library can get build date and commit hash in the binary generation (#11608)
  Reduced gas cost for static calls made to precompiles EIP2046/1352 (#11583)
  [easy] `ethcore-bloom-journal` was renamed to `accounts-bloom` (#11605)
  Use serde_json to export hardcoded sync (#11601)
  Node Discovery v4 ENR Extension (EIP-868) (#11540)
  Fix compile warnings (#11595)
  Update version to 3.0.0-alpha.1 (#11592)
  ethcore/res: bump canon fork hash for mordor and kotti testnets (#11584)
  ...
Comment thread ethcore/db/src/db.rs Outdated
Comment thread ethcore/db/src/db.rs Outdated
Comment thread parity/blockchain.rs Outdated
Comment on lines +418 to +420
// Sometimes when importing a small number of blocks (~10) the report seems
// inaccurate, so we sleep a little before getting the report.
std::thread::sleep(Duration::from_secs(1));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?

Copy link
Copy Markdown
Collaborator Author

@dvdplm dvdplm Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I have no idea wtf. If you try importing 5-10 blocks the report is all messed up, claiming wrong number of blocks were imported. I tossed in this in and moved on without investigating further.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping we could figure this out before the merge.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly apologies for adding this change that has nothing to do with the rest of the PR. I'll remove it.

Secondly, I was not overly worried about it because the import is nondeterministic right? Some (user configurable) verifier threads are going to process the import queue and if the number is small maybe there's a race on the ClientReport so giving it a little more time, I thought, was "fine"?

Comment on lines +185 to +191
for (n, (k,_)) in db.iter(COL_ACCOUNT_BLOOM).enumerate() {
batch.delete(COL_ACCOUNT_BLOOM, &k);
if n > 0 && n % 10_000 == 0 {
info!(target: "migration", " Account Bloom entries queued for deletion: {}", n);
}
}
batch.delete(COL_ACCOUNT_BLOOM, ACCOUNT_BLOOM_HASHCOUNT_KEY);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could use the new (not published yet) kvdb api to simplify this paritytech/parity-common#368

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This migration is really quick to run. We can tweak this when we get around to updating the kvdb-*s.

Co-Authored-By: Andronik Ordian <write@reusable.software>
@dvdplm dvdplm marked this pull request as ready for review April 14, 2020 11:32
@dvdplm dvdplm added A0-pleasereview 🤓 Pull request needs code review. and removed A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. labels Apr 14, 2020
Comment thread ethcore/snapshot/src/lib.rs Outdated
dvdplm added 2 commits April 14, 2020 13:37
…h/parity-ethereum into dp/chore/remove-accounts-bloom

* 'dp/chore/remove-accounts-bloom' of github.com:paritytech/parity-ethereum:
  Update ethcore/db/src/db.rs
@dvdplm
Copy link
Copy Markdown
Collaborator Author

dvdplm commented Apr 16, 2020

One concern that was raised about removing the bloom filter was it might hurt performance in older parts of the chain severely. I tried this by syncing eth from genesis up to block 2.7M.

No blooms:

./openethereum-no-blooms --chain=eth --no-warp --no-jsonrpc --cache-size-db=10000 --pruning=archive
2020-04-14 13:26:42 UTC Syncing    #8386 0xee35…342c
…
2020-04-14 18:59:52 UTC Syncing #2708504 0x1804…c303     2.20 blk/s   12.6 tx/s    2.1 Mgas/s

Took 5h33m10s.

With bloom:

./parities/parity-master-april-10 --chain=eth --no-warp --no-jsonrpc --cache-size-db=10000 --pruning=archive
2020-04-16 14:06:52 UTC Syncing    #1905 0x44c0…dd4e
…
2020-04-16 20:33:58 UTC Syncing #2708504 0x1804…c303     2.00 blk/s   11.2 tx/s    1.7 Mgas/s

Took 06h27m06s

The big difference was something I did not expect.

@dvdplm dvdplm requested a review from ordian April 20, 2020 12:09
Comment thread parity/blockchain.rs Outdated
Comment on lines +418 to +420
// Sometimes when importing a small number of blocks (~10) the report seems
// inaccurate, so we sleep a little before getting the report.
std::thread::sleep(Duration::from_secs(1));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping we could figure this out before the merge.

Copy link
Copy Markdown

@vorot93 vorot93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vorot93 vorot93 added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Apr 22, 2020
@vorot93 vorot93 merged commit c85300c into master Apr 22, 2020
@ordian ordian deleted the dp/chore/remove-accounts-bloom branch April 22, 2020 08:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A8-looksgood 🦄 Pull request is reviewed well.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants