Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

Don't reset credits_observed due to stale voters#13836

Merged
ryoqun merged 3 commits intosolana-labs:masterfrom
ryoqun:really-correct-credits-observed-rewriting
Nov 30, 2020
Merged

Don't reset credits_observed due to stale voters#13836
ryoqun merged 3 commits intosolana-labs:masterfrom
ryoqun:really-correct-credits-observed-rewriting

Conversation

@ryoqun
Copy link
Copy Markdown
Contributor

@ryoqun ryoqun commented Nov 26, 2020

Problem

#13680 resets stake account's credits_observed to 0 (returned from calculate_points_and_credits()) for stale vote accounts. That makes the stake account is incorrectly eligible for inflation rewords:

let (points, credits_observed) = self.calculate_points_and_credits(
vote_state,
stake_history,
inflation_point_calc_tracer,
fix_stake_deactivate,
);
// Drive credits_observed forward unconditionally when rewards are disabled
if point_value.rewards == 0 && fix_stake_deactivate {
return Some((0, 0, credits_observed));
}

For fun fact, at the next epoch, credits_observed will reset back to vote's credits. Then, yet another next epoch, it'll be again be 0. and so on.

So, when the inflation is activated at the odd-number after the stake_program_v2 activation, our inflation would be skewed. On the other hand, it won't be skewed at the even-number after the activation.

Summary of Changes

  • return current credits_observed value from calculate_points_and_credits()

todo

  • write a test
  • think about gating?

Fixes #

@ryoqun ryoqun marked this pull request as draft November 26, 2020 22:48
@ryoqun ryoqun marked this pull request as ready for review November 26, 2020 22:48
@ryoqun ryoqun marked this pull request as draft November 26, 2020 22:49
// if there is no newer credits since observed, return no point
if new_vote_state.credits() <= self.credits_observed {
return (0, 0);
if fix_stake_deactivate {
Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Nov 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testnet, we can freeride on this fix_stake_deactivate flag (backed by already activated stake_program_v2) only after inflation is started.
For mainnet-beta, we are still safe to edit this.

Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, we don't need this fix_stake_deactivate flag at all. That's because we can prove we never use the second item in this returned tuple statically if fix_stake_deactivate = false; this becomes relevant only with fix_stake_deactivate = false. Any I'm adding this for consistency and just be in the safe side.

Comment thread programs/stake/src/stake_state.rs
@codecov
Copy link
Copy Markdown

codecov Bot commented Nov 26, 2020

Codecov Report

Merging #13836 (b265662) into master (6048342) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##           master   #13836   +/-   ##
=======================================
  Coverage    82.0%    82.0%           
=======================================
  Files         381      381           
  Lines       93220    93232   +12     
=======================================
+ Hits        76530    76541   +11     
- Misses      16690    16691    +1     

) -> (u128, u64) {
// if there is no newer credits since observed, return no point
if new_vote_state.credits() <= self.credits_observed {
return (0, 0);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, returning (_, 0) was never a good idea to begin with.... Semantically, credits_observed shouldn't be 0 in all cases. Although, it had not been used up to #13680, I really tempted to fix this at #13325. In retrospect, I really wish I just did it right at the time.......

Copy link
Copy Markdown
Contributor Author

@ryoqun ryoqun Nov 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lessons learnt: never violate invariants (= credits_observed can't be 0) (introduced at #10914, https://github.com/solana-labs/solana/pull/10914/files#r531390890) nevertheless at-the-time code was irrelevant of the violation.

it surely will bite future maintainers like this. (hah, past me bit current me...)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dang... so obvious in retrospect! How'd I miss that 🤦‍♂️

@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Nov 27, 2020

Combined with #13837, I confirmed this pr fixes this stale stake state (pun intended).

$ cat tds-pre-rewrite9.csv | grep -E 'earned_|21yEAinmJfFWFx6QfYcPygvGqUj2vYtQZsrDf6dTNHAN' 
account,owner,old_balance,new_balance,data_size,delegation,delegation_owner,effective_stake,delegated_stake,rent_exempt_reserve,activation_epoch,deactivation_epoch,earned_epochs,epoch,epoch_credits,epoch_points,epoch_stake,old_credits_observed,new_credits_observed,base_rewards,stake_rewards,vote_rewards,commission,cluster_rewards,cluster_points
21yEAinmJfFWFx6QfYcPygvGqUj2vYtQZsrDf6dTNHAN,Stake11111111111111111111111111111111111111,25000000000,25000000000,200,A2Q7aAanJe9vETdYoYdS191YZGoMZ17CGJJuUEZmHX9,Vote111111111111111111111111111111111111111,24997717120,24997717120,2282880,95,N/A,0,N/A,N/A,N/A,N/A,4021838,0,0,0,0,100,N/A,N/A
==> 2020-11-27 12:17:31.166366184|0
$ cat tds-pre-rewrite10.csv | grep -E 'earned_|21yEAinmJfFWFx6QfYcPygvGqUj2vYtQZsrDf6dTNHAN' 
account,owner,old_balance,new_balance,data_size,delegation,delegation_owner,effective_stake,delegated_stake,rent_exempt_reserve,activation_epoch,deactivation_epoch,earned_epochs,epoch,epoch_credits,epoch_points,epoch_stake,old_credits_observed,new_credits_observed,base_rewards,stake_rewards,vote_rewards,commission,cluster_rewards,cluster_points
21yEAinmJfFWFx6QfYcPygvGqUj2vYtQZsrDf6dTNHAN,Stake11111111111111111111111111111111111111,25000000000,25000000000,200,A2Q7aAanJe9vETdYoYdS191YZGoMZ17CGJJuUEZmHX9,Vote111111111111111111111111111111111111111,24997717120,24997717120,2282880,95,N/A,0,N/A,N/A,N/A,N/A,4021838,4021838,0,0,0,100,N/A,N/A

@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Nov 27, 2020

sad, ci passed without chaining the test while I definitely changed implementation code....

The diff coverage is 100.0%.

vanity metrics here. ;)

@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Nov 27, 2020

well, I'm regarding that it's very rare for these critical bugs to slip through my eyes. So I went to solo postmortem on this incident.

In short, I think this was just a fluke. To my best, I had several protection to notice this but all of it failed at once to my surprise.

I guess the likelihood is like once in a year... I don't have clear counter-measures against this kind of unnoticed bugs. Alternatively, I think I should be more careful. (too obvious)....

Anyway, concluding this as rather a fluke, I believe there should be no more other similar critical bugs due to my lack of general competence.

As a detail, I failed to notice this bug at these individual opportunities:

@t-nelson
Copy link
Copy Markdown
Contributor

No need to beat yourself @ryoqun! I was all over that code for #13680 and #13712 and managed to miss it as well. The staking logic is major 🤕

@ryoqun ryoqun marked this pull request as ready for review November 28, 2020 12:46
@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Nov 28, 2020

No need to beat yourself @ryoqun! I was all over that code for #13680 and #13712 and managed to miss it as well. The staking logic is major face_with_head_bandage

thanks for saying so. well, I don't have that bitter feeling anymore. I just wanted to learn something from this oversight. ;)

Copy link
Copy Markdown
Contributor

@t-nelson t-nelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just a comment typo nit.

Nice job catching this one!

Comment thread programs/stake/src/stake_state.rs Outdated
)
);

// credis_observed isn't reset for stale vote accounts while no rewards
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// credis_observed isn't reset for stale vote accounts while no rewards
// credits_observed isn't reset for stale vote accounts while no rewards

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a suggestion for this comment (which I found confusing):
credits_observed remains at previous level when vote_state credits are not advancing and inflation is disabled

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

admittedly, I had some difficulty to describe this.. Thank for helping writing. :)

CriesofCarrots
CriesofCarrots previously approved these changes Nov 29, 2020
Copy link
Copy Markdown
Contributor

@CriesofCarrots CriesofCarrots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm too! One suggestion that I think makes that comment a little more clear.

Comment thread programs/stake/src/stake_state.rs Outdated
)
);

// credis_observed isn't reset for stale vote accounts while no rewards
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a suggestion for this comment (which I found confusing):
credits_observed remains at previous level when vote_state credits are not advancing and inflation is disabled

@mvines
Copy link
Copy Markdown
Contributor

mvines commented Nov 30, 2020

This needs a v1.4 label right? And just to confirm, we're safe to land this ungated?

@ryoqun ryoqun added v1.3 and removed v1.3 labels Nov 30, 2020
@ryoqun
Copy link
Copy Markdown
Contributor Author

ryoqun commented Nov 30, 2020

This needs a v1.4 label right?

Yeah, I added one. Assuming stake program v2 isn't activated on mainnet-beta with v1.3, we can skip v1.3

And just to confirm, we're safe to land this ungated?

Yeah, that should be so. I'll check this now.

=> EDIT: Yeah, I don't need this; I double-checked this again with testnet/mainnet ledgers to justify the lack of gating.

@mergify mergify Bot dismissed CriesofCarrots’s stale review November 30, 2020 10:07

Pull request has been modified.

@ryoqun ryoqun merged commit e81c2c8 into solana-labs:master Nov 30, 2020
mergify Bot pushed a commit that referenced this pull request Nov 30, 2020
* Don't reset credits_observed due to stale voters

* Add tests

* Fix comment

(cherry picked from commit e81c2c8)
mergify Bot added a commit that referenced this pull request Nov 30, 2020
* Don't reset credits_observed due to stale voters

* Add tests

* Fix comment

(cherry picked from commit e81c2c8)

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants