Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

Check for incorrect account hash value on snapshot ingest (v2)#7559

Merged
ryoqun merged 3 commits intosolana-labs:masterfrom
ryoqun:check-account-hashes-in-snapshot
Dec 20, 2019
Merged

Check for incorrect account hash value on snapshot ingest (v2)#7559
ryoqun merged 3 commits intosolana-labs:masterfrom
ryoqun:check-account-hashes-in-snapshot

Conversation

@ryoqun
Copy link
Copy Markdown
Contributor

@ryoqun ryoqun commented Dec 19, 2019

(This PR is based on @sakridge 's #7427)

Problem

No checking on snapshot ingestion that data which is hashed matches the hash value in the append vec.

Summary of Changes

  • Preparatory refactorings (which @ryoqun did) :
    • This PR makes the verification function return an enum, not bool to make it easy to unit test for individual bad cases. I think this is a desired prudent approach here because these are corner cases which is rarely touched on the production so it's quite possible for bit rot and silent false-{positive, negative}s.
    • In turn, because of the new enum, I did some naming clarification for logically clear naming for each member of the enum because otherwise the new and radical naming would be stranger to the other parts of code or the traditional naming based on the status-quo would be too confusing.
  • Re-hash the account data and check for differences (which original @sakridge 's PR did)
    • Additionally, This PR does some modifications to the original logic. Namely, making the added check actually work, bailing out early to avoid wasted following calculation and thereby fixing low security risks if the new check failed.

Naming changes

1. internal_state => bank_hash at the AccountDB layer:

The word internal_state is originated from bank module's Bank::{hash,verify_hash}_internal_state, and used only here. The names for banks are fine as it is, because it is calculating (=hash-ing, verb) the state of Bank, which is internal from the other layers, and verify-ing the calculation (=hash, noun) itself.

But the internal_state isn't clear-cut fit for AccountsDB. The same word is used for different things. AccountsDB calls its returning BankHash its "internal_state" while bank calls its returning Hash including AccountDB's BankHash its "internal_state". It's confusing and AccountsDB should just use the bank_hash name. Because of following section's reasoning, I chose it over the other candidate (= slot_hash).

In my opinion, using exactly same names for different layers is a bit confusing unless they are perfectly delegating work and forwarding args and concepts or they are well-defined concepts across different layers and components. These are not true for the internal_state.

Finally by introducing bank_hash, it cleanly align with the account.hash (= account_hash) in the new enum.

2. slot_hashes => bank_hashes at the AccountsDB API

Sadly, SlotHashes/slot_hashes are already taken and established as a sysvar. While both components use the same word "slot hashes", yet its definitions are different. Bank/AccountsDB's slot_hash is one of inputs for the sysvar's "slot hash". It's confusing at best. So, just use the straight naming derived from type name of BankHash here.

I avoided renaming the AccontsDB::slot_hashes field to AccountsDB::bank_hashes as it will introduce too noise for this PR, which is already a bit sizable. I'll do a quick dumb-replace PR after this PR.

Part of #7167

Comment thread sdk/src/hash.rs
use std::mem;
use std::str::FromStr;

pub const HASH_BYTES: usize = 32;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A tiny nicety. This aligns with BANK_HASH_BYTES in bank_hash.rs.

self.store_with_hashes(slot_id, accounts, &hashes);
}

fn store_with_hashes(&self, slot_id: Slot, accounts: &[(&Pubkey, &Account)], hashes: &[Hash]) {
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted into new fn purely for the purpose of unit-testing.

/// the accounts
min_num_stores: usize,

/// slot to BankHash and a status flag to indicate if the hash has been initialized or not
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit aggressive, but delete this comment as it's outdated and now adds no additional information by the comment.

/// distribute the accounts across storage lists
pub next_id: AtomicUsize,

/// write version
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit aggressive here, but remove this comment as it adds no additional information to the commented code itself. use fewer comments and remove otherwise as it's too common for engineers to forget to maintain them up-to-date (including me!) as this is the case.

/// Keeps tracks of index into AppendVec on a per slot basis
pub accounts_index: RwLock<AccountsIndex<AccountInfo>>,

/// Account storage
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this as the same reason as this.

}

#[derive(Debug)]
pub enum BankHashVerificatonError {
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the naming here is very clear after all the hustle of naming changes of this PR. :)

let hash = BankHash::from_hash(&Self::hash_account(slot, &account, pubkey));
let hash = Self::hash_account(slot, &account, pubkey);
if hash != account.hash {
*mismatch_found = true;
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally assigned false, but this didn't work because the Default::default() of bool is false... So I inverted the boolean value.

*mismatch_found = true;
}
if *mismatch_found {
return;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added by me, bail out early as the PR description explains.

}

pub fn verify_hash_internal_state(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
let mut hash_state = BankHash::default();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to its use site and renamed.

}

#[test]
fn test_verify_bank_hash() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the coverage report from the CI will reveal, these tests does the 100% coverage. :p

Comment thread runtime/src/accounts_db.rs Outdated
let ancestors = vec![(some_slot, 0)].into_iter().collect();

let accounts = &[(&key, &account)];
// update AccountsDB's hash state but discard real account hashes
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, hash state should be written as bank hash to reflect new naming...

Comment thread runtime/src/accounts.rs
}
}

pub fn hash_internal_state(&self, slot_id: Slot) -> BankHash {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed from verb_object to noun_preposition style here, as it's rather a getter fn, not doing actual calculating here.

Comment thread runtime/src/accounts.rs
pub fn verify_hash_internal_state(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
self.accounts_db.verify_hash_internal_state(slot, ancestors)
pub fn verify_bank_hash(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
self.accounts_db.verify_bank_hash(slot, ancestors).is_ok()
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very layer boundary of "Bank<=>AccountsDB", we discard detailed error and encapsulate the result into a bool.

Comment thread runtime/src/bank.rs
self.rc
.accounts
.verify_hash_internal_state(self.slot(), &self.ancestors)
.verify_bank_hash(self.slot(), &self.ancestors)
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this place, we semantically translate verify-ing Bank's internal_state into AccountDB's bank_hash. So outwards rename propagation from AccountsDB ends here. :)

@ryoqun ryoqun requested a review from sakridge December 19, 2019 05:26
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 19, 2019

Codecov Report

Merging #7559 into master will decrease coverage by 12.2%.
The diff coverage is 82.6%.

@@            Coverage Diff            @@
##           master   #7559      +/-   ##
=========================================
- Coverage    77.7%   65.5%   -12.3%     
=========================================
  Files         244     245       +1     
  Lines       52151   61966    +9815     
=========================================
+ Hits        40556   40601      +45     
- Misses      11595   21365    +9770

sakridge
sakridge previously approved these changes Dec 19, 2019
Copy link
Copy Markdown
Contributor

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm; thanks @ryoqun

@mergify mergify Bot dismissed sakridge’s stale review December 19, 2019 23:58

Pull request has been modified.

@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Dec 20, 2019

lgtm; thanks @ryoqun

Thanks for quick review! I'm merging this with a passing CI build with this really tiny commit after the LGTM: 2eb146f.

@ryoqun ryoqun merged commit 3c361eb into solana-labs:master Dec 20, 2019
@ryoqun ryoqun changed the title Check account hashes in snapshot Check for incorrect hash value on snapshot ingest (v2) Dec 20, 2019
@ryoqun ryoqun changed the title Check for incorrect hash value on snapshot ingest (v2) Check for incorrect account hash value on snapshot ingest (v2) Feb 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants