validator: remove optional remote accounts hash consistency check#31279
validator: remove optional remote accounts hash consistency check#31279t-nelson merged 1 commit intosolana-labs:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #31279 +/- ##
=======================================
Coverage 81.5% 81.5%
=======================================
Files 733 733
Lines 207009 206941 -68
=======================================
+ Hits 168731 168735 +4
+ Misses 38278 38206 -72 |
There was a problem hiding this comment.
I'm OK with removing the --halt-on-trusted-validators-accounts-hash-mismatch CLI flag.
Without the flag, if a node calculates the accounts hash incorrectly, it'll find out later (basically whenever the account is next used, which has a maximum of rent collection duration). Maybe this is fine? I'm guessing most validators do not have this flag set anyway, so there's no change of behavior for them.
When the Epoch Accounts Hash feature is enabled, then the accounts hash will be part of consensus directly, since it'll be part of the bank hash once per epoch. That'll be the proper way to ensure safety for the whole cluster.
One interesting possibility is w.r.t. snapshot download in bootstrap. If a known validator calculates the accounts hash wrong due to a disk issue and an accounts storage file is bad, then it would be possible for a new validator to download this bad snapshot with the bad account. Again, it'll find out once that account is accessed next.
@HaoranYi, requesting your review here too, since you've recently been interacting with this code. Specifically around the accounts_hash_fault_injector. Do you rely on --halt-on-trusted-validators-accounts-hash-mismatch for any testing? If not, can we also remove ``accounts_hash_fault_injector`? (that would be for a different PR)
|
No, we don't rely on this cli argument for the fault injection test. |
steviez
left a comment
There was a problem hiding this comment.
Given Brook's insight about accounts hashes becoming part of consensus, that makes me feel better about ripping this check out altogether.
I thought about whether keeping a warning in place would be useful, but I don't think it would be.
- Suppose a node
Nis running with this flag for a set of known validators{K1, K2, ..., Kn} - If one of the known validators
Kideviates,Nwould get a warning. - But,
N's operator can't do anything to fixKidirectly, so seemingly not super helpful
| if matches.is_present("halt_on_known_validators_accounts_hash_mismatch") { | ||
| validator_config.halt_on_known_validators_accounts_hash_mismatch = true; | ||
| warn!("the `--halt-on-known-validators-accounts-hash-mismatch` argument is deprecated. please remove it from the command line"); |
There was a problem hiding this comment.
Checkout this struct and following for deprecated args. It
- Allows for consistent warning messages across deprecated args
- Gets deprecated arg handling out of the way of actual logic
But unfortunately, not immediately obvious to move stuff there unless you're already aware of it.
Line 1636 in 04bbf3b
|
I think we still want this - I just had one minor request and looks like things need a merge resolution now |
steviez
left a comment
There was a problem hiding this comment.
LGTM.
Looking at this again after a few weeks, I still think ripping this out is the right move. If this flag had wide adoption, a single node experiencing a bug or fault could cause a domino effect.
Additionally, we can't know if the node that we're checking against or we deviated on a slot, yet, this code makes only our node panic. Hypothetically, we could do some sampling of N nodes, but to make this robust we're basically trying to implement a stripped down consensus. Better to kill this altogether and let the feature Brooks previously mentioned (accounts hash becoming part of consensus) take effect.

Problem
old debug code lying around doing old debug code things
Summary of Changes
remove it