Fix wrong ref counting #1358

Merged
eskimor merged 3 commits into master from rk-fix-wrong-refcounting
Sep 1, 2023

Conversation

eskimor
Member

@eskimor eskimor commented Sep 1, 2023

in case of multiple inserts at same height.

More tests and sanity checking coming, but this is definitely wrong and also very likely the culprit we saw on Kusama.

@eskimor eskimor added the T0-node This PR/Issue is related to the topic “node”. label Sep 1, 2023
Comment on lines +138 to +145
    if self
        .candidates_by_block_number
        .entry(block_number)
        .or_default()
-       .insert(candidate_hash);
+       .insert(candidate_hash)
    {
        self.candidates.insert(candidate_hash);
    }
Member

How is this different from the previous logic? I don't really understand; it should have worked for multiple blocks per height previously. This `if` will fail only when the same `CandidateHash` is inserted again, which shouldn't happen, but even then it would be the same as before.

Member

Oh, it's because of ref counting. nvm

Member Author

No, because inserts into `candidates` are ref counted. Therefore, if we insert at the same height multiple times, we would increase the refcount each time. But in cleanup we would only decrement it once, hence we would never clean it up.

Demonstration: The modified test fails without that change (pruning does not happen).

In case of forks there can easily be the same candidate included twice at the same height.
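
The leak described above can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: `RefCounted`, `Tracker`, `prune_up_to`, and the `u32` stand-ins for the hash and block number types are hypothetical and are not the actual polkadot-sdk code. The sketch only mirrors the shape of the diff: a ref-counted `candidates` set next to a per-height `HashSet`.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

type CandidateHash = u32; // stand-in for the real hash type (assumption)
type BlockNumber = u32;

/// Hypothetical ref-counted set: an entry stays until every insert
/// has been matched by a remove.
#[derive(Default)]
struct RefCounted(HashMap<CandidateHash, usize>);

impl RefCounted {
    fn insert(&mut self, c: CandidateHash) {
        *self.0.entry(c).or_insert(0) += 1;
    }
    fn remove(&mut self, c: &CandidateHash) {
        if let Some(n) = self.0.get_mut(c) {
            *n -= 1;
            if *n == 0 {
                self.0.remove(c);
            }
        }
    }
    fn contains(&self, c: &CandidateHash) -> bool {
        self.0.contains_key(c)
    }
}

#[derive(Default)]
struct Tracker {
    candidates: RefCounted,
    by_block_number: BTreeMap<BlockNumber, HashSet<CandidateHash>>,
}

impl Tracker {
    /// Old behaviour: bump the refcount on every call.
    fn insert_buggy(&mut self, n: BlockNumber, c: CandidateHash) {
        self.candidates.insert(c);
        self.by_block_number.entry(n).or_default().insert(c);
    }
    /// Fixed behaviour: bump only when the per-height set actually grew.
    fn insert_fixed(&mut self, n: BlockNumber, c: CandidateHash) {
        if self.by_block_number.entry(n).or_default().insert(c) {
            self.candidates.insert(c);
        }
    }
    /// Drop all heights <= `height`, decrementing once per set entry.
    fn prune_up_to(&mut self, height: BlockNumber) {
        let keep = self.by_block_number.split_off(&(height + 1));
        let pruned = std::mem::replace(&mut self.by_block_number, keep);
        for c in pruned.values().flatten() {
            self.candidates.remove(c);
        }
    }
}

/// Inserts the same candidate twice at one height, prunes that height,
/// and reports whether the candidate is still tracked afterwards.
fn leaks_after_prune(use_fix: bool) -> bool {
    let mut t = Tracker::default();
    for _ in 0..2 {
        if use_fix { t.insert_fixed(10, 7) } else { t.insert_buggy(10, 7) }
    }
    t.prune_up_to(10);
    t.candidates.contains(&7)
}
```

In this sketch, `leaks_after_prune(false)` leaves the candidate behind because the refcount went to 2 while pruning decrements it only once, whereas `leaks_after_prune(true)` cleans it up.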

Member

But this will only happen if we insert the same candidate (hash), not different candidates at the same height, no? How is this possible?

Member

I guess if there are multiple forks at the same height including the same candidate.

Member Author

Yes, same hash, same height.

     - B
A<
     - B'

A candidate gets backed in A, afterwards we get a fork. Bitfields will be imported in both forks. Result: same candidate included at the same height.
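
The fork scenario can be sketched as follows, under stated assumptions: the `Inclusion` struct, the string block labels, and the concrete heights are made up for illustration and do not come from the PR. The point is only that two inclusion events from sibling blocks collapse into one entry once they are keyed by block number.

```rust
use std::collections::{BTreeMap, HashSet};

type BlockNumber = u32;
type CandidateHash = &'static str; // stand-in for the real hash type (assumption)

/// One "candidate included" observation made while importing a block.
#[allow(dead_code)]
struct Inclusion {
    block: &'static str, // which fork block reported it (illustrative only)
    number: BlockNumber,
    candidate: CandidateHash,
}

/// Returns (number of insert calls, number of distinct per-height entries).
fn count_inclusions(events: &[Inclusion]) -> (usize, usize) {
    let mut by_number: BTreeMap<BlockNumber, HashSet<CandidateHash>> = BTreeMap::new();
    for e in events {
        by_number.entry(e.number).or_default().insert(e.candidate);
    }
    let distinct: usize = by_number.values().map(HashSet::len).sum();
    (events.len(), distinct)
}

/// A sits at height 1; B and B' are competing children at height 2,
/// and both include the candidate that was backed in A.
fn fork_example() -> (usize, usize) {
    let events = [
        Inclusion { block: "B", number: 2, candidate: "c" },
        Inclusion { block: "B'", number: 2, candidate: "c" },
    ];
    count_inclusions(&events)
}
```

Here `fork_example()` sees two insert calls but only one distinct entry at height 2, which is the mismatch that an unconditional refcount increment would turn into a leak.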

Contributor

This is really nasty. Good catch!

@eskimor eskimor enabled auto-merge (squash) September 1, 2023 18:08
@eskimor eskimor merged commit 23a2b7b into master Sep 1, 2023
103 of 108 checks passed
@eskimor eskimor deleted the rk-fix-wrong-refcounting branch September 1, 2023 18:56
ordian added a commit that referenced this pull request Sep 7, 2023
* master: (25 commits)
  Markdown linter (#1309)
  Update `fmt` file and some authors (#1379)
  Bump the known_good_semver group with 1 update (#1375)
  Bump proc-macro-warning from 0.4.1 to 0.4.2 (#1376)
  feat: add futures api to `TransactionPool` (#1348)
  Ensure cumulus/bridges is ignored by formatter and run it (#1369)
  substrate: chain-spec paths corrected in zombienet tests (#1362)
  contracts: Update to wasmi 0.31 (#1350)
  [improve docs]: Template pallet (#1280)
  [xcm-emulator] Unignore cumulus integration tests (#1247)
  Fix wrong ref counting (#1358)
  Use cached session index to obtain executor params (#1190)
  fix typos (#1339)
  Use bandersnatch-vrfs with locked dependencies ref (#1342)
  Bump bs58 from 0.4.0 to 0.5.0 (#1293)
  Contracts: `seal0::balance` should return the free balance (#1254)
  Logs: add extra debug log for negative rep changes (#1205)
  Added short-benchmarks for cumulus (#1183)
  [xcm-emulator] Improve hygiene and clean up (#1301)
  Bump the known_good_semver group with 1 update (#1347)
  ...
Daanvdplas pushed a commit that referenced this pull request Sep 11, 2023
* Fix wrong ref counting

in case of multiple inserts at same height.

* More warn

---------

Co-authored-by: eskimor <[email protected]>
@Polkadot-Forum

This pull request has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/stalled-parachains-on-kusama-post-mortem/3998/1

serban300 pushed a commit to serban300/polkadot-sdk that referenced this pull request Apr 8, 2024
serban300 pushed a commit to serban300/polkadot-sdk that referenced this pull request Apr 9, 2024
serban300 pushed a commit to serban300/polkadot-sdk that referenced this pull request Apr 10, 2024
bkchr pushed a commit that referenced this pull request Apr 10, 2024
Labels
T0-node This PR/Issue is related to the topic “node”.
5 participants