Canonical state cache#2308
Conversation
|
(note that this is against |
| self.interest.insert(EventSet::writable()); | ||
| io.update_registration(self.token).ok(); | ||
| } | ||
| io.update_registration(self.token).ok(); |
There was a problem hiding this comment.
This is a fix from the master branch. Prevents peer disconnects in some cases.
| /// Drop this object and return the underlieing database. | ||
| fn drain(self) -> Box<JournalDB> { self.block.state.drop().1 } | ||
| fn drain(mut self) -> StateDB { | ||
| self.block.state.commit_cache(); |
There was a problem hiding this comment.
if commit_cache is now necessary to be called as part of LockedBlock's general maintenance, what guarantees are there that the rest of the codebase is using it correctly?
There was a problem hiding this comment.
this is now removed
gavofyork
left a comment
There was a problem hiding this comment.
this is a pretty substantial change to the internals; i'd like to see the new model of usage documented properly.
|
so this introduces a secondary cache for account data in |
gavofyork
left a comment
There was a problem hiding this comment.
looks pretty good in general.
a bit of docs on usage, particularly for State::commit_cache and some argument as to why it will get called when needed (e.g. in drain) would be nice to have.
| e.remove(); | ||
| } | ||
| }, | ||
| _ => () |
| Ok(()) | ||
| } | ||
|
|
||
| pub fn commit_cache(&mut self) { |
There was a problem hiding this comment.
why the separate function?
There was a problem hiding this comment.
Made if private. Called automatically on drop.
Whitespace and optional symbols
|
Added in 25d66f4: |
| &mut Some(ref mut acc) => not_default(acc), | ||
| slot @ &mut None => *slot = Some(default()), | ||
| &mut AccountEntry::Cached(ref mut acc) => not_default(acc), | ||
| slot @ _ => *slot = AccountEntry::Cached(default()), |
There was a problem hiding this comment.
why not just slot =>...?
gavofyork
left a comment
There was a problem hiding this comment.
a few questions and some additional docs would be good.
| match self.storage_overlay.borrow_mut().entry(key) { | ||
| Entry::Occupied(ref mut entry) if entry.get().1 != value => { | ||
| entry.insert((Filth::Dirty, value)); | ||
| match self.storage_changes.entry(key) { |
There was a problem hiding this comment.
Does this work properly if the storage was previously cached? What about if the change is later back to the original value?
There was a problem hiding this comment.
storage_changes will hold the same value as storage_cache but there's nothing wrong with that. Everything will be merged properly on commit_storage
There was a problem hiding this comment.
what if the two hold different values? is there a proof that this can never happen?
There was a problem hiding this comment.
It may happen and is supported. storage_cache reflects the contents of the disk database, storage_changes keeps changes over this contents. I'd prefer caching and tracking changes to be split into different structs, so that only the state can have changes. This is left for a future PR.
| // Overlay on trie-backed storage - tuple is (<clean>, <value>). | ||
| storage_overlay: RefCell<HashMap<H256, (Filth, H256)>>, | ||
| // Cache of trie-backed storage | ||
| storage_cache: RefCell<LruCache<H256, H256>>, |
There was a problem hiding this comment.
Should be made explicit in the struct docs about the meaning (and precedence) of these members. Particularly what happens when the same key appears in both.
| our_cache.insert(k.clone() , v.clone()); //TODO: cloning should not be required here | ||
| } | ||
| *self = other; | ||
| self.storage_cache = RefCell::new(our_cache); |
There was a problem hiding this comment.
seems a little over-complex in terms of assignments; why do our_cache and updated_cache need pulling out only to be replaced here? why reassign self?
| } | ||
| } | ||
|
|
||
| fn clone_basic(&self) -> AccountEntry { |
There was a problem hiding this comment.
maybe add some docs for this function - difficult to immediately see what it's trying to do
| } | ||
| } | ||
|
|
||
| fn clone_dirty(&self) -> Option<AccountEntry> { |
There was a problem hiding this comment.
maybe add some docs for this function - difficult to immediately see what it's trying to do
| // check local cache first | ||
| if let Some(value) = self.cache.borrow().get(address).and_then(|a| | ||
| match *a { | ||
| AccountEntry::Cached(ref account) => account.modified_storage_at(key), |
There was a problem hiding this comment.
why only modified_storage_at (don't we want to see the cached storage value?)
There was a problem hiding this comment.
Storage key search and update now works like this:
- If there's an entry for the account in the local cache check for the key and return it if found.
- If there's an entry for the account in the global cache check for the key or load it into that account.
- If account is missing in the global cache load it into the local cache and cache the key there.
There was a problem hiding this comment.
refactored to reflect algorithm mentioned above
|
Added docs |
gavofyork
left a comment
There was a problem hiding this comment.
Generally looks good. Quite a few questions though.
|
|
||
| /// Get cached storage value if any. Returns `None` if the | ||
| /// key is not in the cache. | ||
| pub fn cached_storage_at(&self, key: &H256) -> Option<H256> { |
There was a problem hiding this comment.
Just call this from storage_at instead of the copy'n'paste?
| if let Err(e) = res { | ||
| warn!("Encountered potential DB corruption: {}", e); | ||
| } | ||
| self.storage_cache.borrow_mut().insert(k, v); |
There was a problem hiding this comment.
i guess there's no issue here with DB commits being pending while storage_cache is already updated? (db gets written out when t is dropped, right?)
There was a problem hiding this comment.
This is just the local cache. Changes to the shared canonical state cache are applied along with the database write.
| balance: self.balance.clone(), | ||
| nonce: self.nonce.clone(), | ||
| storage_root: self.storage_root.clone(), | ||
| storage_cache: Self::empty_storage_cache(), |
There was a problem hiding this comment.
why not copy the cache? waste of memory?
There was a problem hiding this comment.
Exactly. And waste of time copying it around all the time. Storage queries will be cached in the shared Account anyway
| pub fn clone_dirty(&self) -> Account { | ||
| let mut account = self.clone_basic(); | ||
| account.storage_changes = self.storage_changes.clone(); | ||
| account.code_cache = self.code_cache.clone(); |
There was a problem hiding this comment.
is code_cache a cache of the database's current code entries? or does it contain dirty changes? if changes, then should probably be renamed code_changes, if a simple cache like storage_cache then surely if should be assigned only in clone_all?
There was a problem hiding this comment.
There's no distinction currently. I'll add it.
| let mut cache = self.storage_cache.borrow_mut(); | ||
| for (k, v) in other.storage_cache.into_inner().into_iter() { | ||
| cache.insert(k.clone() , v.clone()); //TODO: cloning should not be required here | ||
| } |
There was a problem hiding this comment.
shouldn't we be doing something with other's storage_changes? asserting it's empty, perhaps? if it isn't always empty, it's probably safest to require the call-site to explicitly drop that data first, since dropping potentially important changes implicitly doesn't seem particularly anti-fragile.
There was a problem hiding this comment.
there's already an assert for this at the beginning of the function
| }, | ||
| AccountEntry::Missing => { | ||
| self.db.cache_account(address, None); | ||
| } |
There was a problem hiding this comment.
missing , (or superfluous one above and below - i good with either rule)
| account.cache_code(&AccountDB::from_hash(self.db.as_hashdb(), addr_hash)); | ||
| fn update_account_cache(require: RequireCache, account: &mut Account, address: &Address, db: &HashDB) { | ||
| match require { | ||
| RequireCache::None => (), |
| /// State database abstraction. | ||
| /// Manages shared global state cache. | ||
| /// A clone of `StateDB` may be created as canonical or not. | ||
| /// For canonical clones cache changes are accumulated and applied |
There was a problem hiding this comment.
is this additional caching write-through? does it leave the state and block dbs in inconsistent states when there's an unexpected exit during a run of canon blocks?
There was a problem hiding this comment.
It is write-through, writes are not affected by this PR
| let mut cache = self.account_cache.lock(); | ||
| for (address, account) in self.cache_overlay.drain(..) { | ||
| if let Some(&mut Some(ref mut existing)) = cache.accounts.get_mut(&address) { | ||
| if let Some(new) = account { |
There was a problem hiding this comment.
what would account == None mean?
There was a problem hiding this comment.
account is known to be missing in the DB
| } | ||
|
|
||
| /// Apply pending cache changes. | ||
| fn commit_cache(&mut self) { |
There was a problem hiding this comment.
this doesn't write through to the DB - is that not an issue on unexpected shutdown?
There was a problem hiding this comment.
this only updated in-memory cache, writes are unaffected
|
looks good - be good to get those minor doc points and DRY in prior to merge. |
|
Pushed documenation and syle changes |
Still left to do: