Accounts bloom #2357
could we put the source for the bloom filter in this repo?

@rphmeier it's a fork of someone else's crate

can be made against master now.
pub const ACCOUNT_BLOOM_SPACE: usize = 1048576;
pub const DEFAULT_ACCOUNT_PRESET: usize = 1000000;

pub const ACCOUNT_BLOOM_SPACE_COLUMN: &'static[u8] = b"accounts_bloom";
Should be called ACCOUNT_BLOOM_SPACE_KEY; a column is something else in RocksDB terms.
let account_trie = try!(TrieDB::new(state_db.as_hashdb(), &state_root).map_err(|e| Error::Custom(format!("Cannot open trie: {:?}", e))));
for (ref account_key, _) in account_trie.iter() {
let account_key_hash = H256::from_slice(&account_key);
bloom.set(account_key_hash.sha3().as_slice());
account_key_hash is an address hash already, no need to sha3() it here
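To illustrate the point, here is a minimal sketch of a bloom filter that takes an already-hashed 32-byte key and derives its bit positions straight from the key bytes, with no second hashing pass. `ToyBloom`, `read_u64` and the double-hashing scheme are assumptions made for illustration; they are not the API of the bloom crate used in this PR, which may hash its input internally.

```rust
/// Toy bloom filter keyed by 32-byte hashes (hypothetical, not the crate's API).
pub struct ToyBloom {
    bits: Vec<u64>,
    hash_count: u32,
}

/// Read the first 8 bytes of `bytes` as a little-endian u64.
fn read_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

impl ToyBloom {
    /// Allocate at least `bit_count` bits, rounded up to a multiple of 64.
    pub fn new(bit_count: usize, hash_count: u32) -> Self {
        assert!(bit_count > 0 && hash_count > 0);
        ToyBloom { bits: vec![0u64; (bit_count + 63) / 64], hash_count }
    }

    /// Derive the bit positions directly from the hash bytes (double hashing).
    /// The key is assumed to be uniformly distributed already, so it is not re-hashed.
    fn positions(&self, key_hash: &[u8; 32]) -> Vec<usize> {
        let total_bits = (self.bits.len() * 64) as u64;
        let h1 = read_u64(&key_hash[0..8]);
        let h2 = read_u64(&key_hash[8..16]) | 1; // force an odd step
        (0..self.hash_count as u64)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % total_bits) as usize)
            .collect()
    }

    /// Mark the key as present.
    pub fn set(&mut self, key_hash: &[u8; 32]) {
        for pos in self.positions(key_hash) {
            self.bits[pos / 64] |= 1u64 << (pos % 64);
        }
    }

    /// Returns false only if the key was definitely never set.
    pub fn check(&self, key_hash: &[u8; 32]) -> bool {
        self.positions(key_hash)
            .into_iter()
            .all(|pos| self.bits[pos / 64] & (1u64 << (pos % 64)) != 0)
    }
}
```

Whatever the real filter does internally, the same form of the key has to be used when setting and when checking, otherwise lookups will miss.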
try!(db.write(batch));

trace!(target: "migration", "Finished bloom update");
println!("Done.");
}

let bloom = Bloom::from_parts(&bloom_parts, hash_count as u32);
trace!(target: "account_bloom", "Bloom is {:?} full, hash functions number = {:?}", bloom.how_full(), hash_count);
"hash functions number" -> "hash function count" or "number of hash functions"
trace!(target: "migration", "Generated {} bloom updates", bloom_journal.entries.len());

let batch = DBTransaction::new(&db);
Unsure whether this can happen concurrently.
It would be nice if @rphmeier reviewed this.

Becomes a non-issue if you switch the strategy to the one described in my other comment.
DBTransaction exists entirely outside of the rocksdb layer so it's safe itself, and then db.write may race against any other writes. Afaik the migrations are single-threaded so it probably shouldn't.
return Ok(())
}

println!("Adding accounts bloom (one-time upgrade). Please don't close parity.");
use info! instead of println!
"Please don't close" should be irrelevant. The whole upgrade should be one transaction or a process that can be canceled

All migration messages are displayed using println!
It is indeed in one transaction, so I will remove this plea

@rphmeier should sign off, too.
rphmeier left a comment
A couple of issues with the migration. I haven't looked at the bloom filter crate itself, but the API calls look right.
// in-place upgrades that do nothing when called repeatedly
fn run_inplace_upgrades(path: &Path) -> Result<(), Error> {
try!(migrations::upgrade_account_bloom(path));
Ok(())
I don't think this strategy will work once we add more migrations on top of this one, since we need this account bloom migration to happen before others which may affect it.
Why not change Migration to supply Arc<Database> as source, and copy everything over for each column, and then additionally if column == COL_STATE, you also do the same stuff as upgrade_account_bloom would?
I think this would also require backporting my pre_columns fixes from master as consolidated db migrations are currently broken.

it's perfectly safe to run this migration at any time on the upgraded database - it just won't do anything
any further upgrades should just go after this line
i don't want to introduce any new logic to the migration in beta and in this pr
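The "does nothing when called repeatedly" property can be sketched roughly as follows: the upgrade looks for a marker key first and returns early if it is already there. This is only an illustration under assumptions. `Store` is an in-memory `HashMap` standing in for the real database, `upgrade_once` is a hypothetical name, and only the key name `account_hash_count` comes from the diff.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the key-value database (assumption for this sketch).
pub type Store = HashMap<Vec<u8>, Vec<u8>>;

/// Marker key; its presence means the upgrade has already run.
pub const HASH_COUNT_KEY: &[u8] = b"account_hash_count";

/// Runs the upgrade at most once per store.
/// Returns true if work was done, false if the store was already upgraded.
pub fn upgrade_once(store: &mut Store, hash_count: u32) -> bool {
    if store.contains_key(HASH_COUNT_KEY) {
        // Already upgraded: leave the store untouched.
        return false;
    }

    // Stage every write first, then apply them together, mimicking a single batch.
    let mut batch: Vec<(Vec<u8>, Vec<u8>)> = Vec::new();
    batch.push((HASH_COUNT_KEY.to_vec(), hash_count.to_le_bytes().to_vec()));
    // (bloom bit updates would be staged here as well)

    for (key, value) in batch {
        store.insert(key, value);
    }
    true
}
```

Calling `upgrade_once` a second time on the same store returns `false` and leaves the stored hash count as it was, which is what makes it safe to call on every startup.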
//! Bloom upgrade

use client::{DB_COL_EXTRA, DB_COL_HEADERS, DB_NO_OF_COLUMNS, DB_COL_STATE};
use state_db::{ACCOUNT_BLOOM_SPACE, DEFAULT_ACCOUNT_PRESET, StateDB};
all constants should be reproduced in the migration implementation for backwards compatibility -- there is no guarantee that they will remain the same as they were at the database version you're migrating.

strictly speaking it's not a migration, it's an in-place upgrade
it's guaranteed that the columns are the same, otherwise all other logic would instantly fail
Ok(Some(acc)) => AccountEntry::Cached(Account::from_rlp(acc)),
Ok(None) => AccountEntry::Missing,
Err(e) => panic!("Potential DB corruption encountered: {}", e),
let maybe_acc = if self.db.check_account_bloom(a) {
this check can be saved just by caching the value of the previous query
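One possible reading of this suggestion, sketched under assumptions: remember the outcome of each lookup (including misses), so a repeated query for the same address consults neither the bloom nor the database. Everything here is hypothetical. `Accounts` is not a type from the PR, addresses are plain `u64`s, and a `HashSet` stands in for the bloom filter, so unlike a real bloom it has no false positives.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical account store with a per-query cache in front of the bloom check.
pub struct Accounts {
    cache: HashMap<u64, Option<String>>, // earlier results, including misses
    maybe_present: HashSet<u64>,         // stand-in for the account bloom
    backing: HashMap<u64, String>,       // stand-in for the state database
    pub bloom_checks: usize,
    pub db_reads: usize,
}

impl Accounts {
    pub fn new(entries: &[(u64, &str)]) -> Self {
        let backing: HashMap<u64, String> = entries
            .iter()
            .map(|(addr, name)| (*addr, name.to_string()))
            .collect();
        let maybe_present = backing.keys().copied().collect();
        Accounts {
            cache: HashMap::new(),
            maybe_present,
            backing,
            bloom_checks: 0,
            db_reads: 0,
        }
    }

    pub fn get(&mut self, addr: u64) -> Option<String> {
        // A cached answer skips both the bloom check and the database read.
        if let Some(hit) = self.cache.get(&addr) {
            return hit.clone();
        }
        self.bloom_checks += 1;
        let result = if self.maybe_present.contains(&addr) {
            self.db_reads += 1;
            self.backing.get(&addr).cloned()
        } else {
            None // the filter says "definitely absent", so no database read
        };
        self.cache.insert(addr, result.clone());
        result
    }
}
```

The two public counters exist only so the effect is observable: a second `get` for the same address leaves both unchanged.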
pub const ACCOUNT_BLOOM_SPACE: usize = 1048576;
pub const DEFAULT_ACCOUNT_PRESET: usize = 1000000;

pub const ACCOUNT_BLOOM_HASHCOUNT_KEY: &'static[u8] = b"account_hash_count";

.expect("Low-level database error");

hash_count_entry.is_some()
}
this function is only used for the migration; maybe it should be implemented there? (especially as it relies on external constants)
on top of @arkpar #2308
still missing:
snapshot restoration bloom update
migration