feat(pruner): respect batch size per run#4246

Merged
shekhirin merged 25 commits into main from alexey/pruner-batch-size
Aug 23, 2023

Conversation

@shekhirin
Member

@shekhirin shekhirin commented Aug 17, 2023

Problem

Previously, batches inside the pruner were used only to report progress; the pruning itself was done in one go: one large MDBX transaction deleted data from tables according to the configured prune modes.

This approach had one big downside: MDBX can't efficiently delete large chunks of data. This is due to how MDBX deletions work: pages previously occupied by the data aren't deleted outright but are instead placed in a "freelist" data structure, which subsequent insertions use to minimize allocation of new pages. The problem arises when a lot of data is deleted in one go: the freelist grows too large, and to insert e.g. a new large entry into the Transactions table, MDBX first needs to scan the freelist for overflow pages. That scan can take a long time, and we observed it with Optimism transactions that needed 30 overflow pages to insert.
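The scan cost can be illustrated with a toy model (this is not MDBX's actual data structure, just a sketch of the idea): deleted pages land on a freelist, and inserting a value that needs `n` contiguous overflow pages forces a scan whose cost grows with the freelist's length.

```rust
/// Toy model of a freelist (hypothetical; MDBX's real page reclamation
/// is more involved): page numbers freed by deletions, kept sorted.
struct Freelist {
    free_pages: Vec<u64>,
}

impl Freelist {
    /// Look for `n` consecutive free pages, returning the index of the
    /// run's start. The linear scan over `free_pages` is the work that
    /// balloons when one huge delete dumps millions of pages here.
    fn find_contiguous(&self, n: usize) -> Option<usize> {
        if n == 0 || self.free_pages.len() < n {
            return None;
        }
        for start in 0..=(self.free_pages.len() - n) {
            let window = &self.free_pages[start..start + n];
            // A window of n consecutive page numbers spans exactly n - 1.
            if window[n - 1] - window[0] == (n as u64) - 1 {
                return Some(start);
            }
        }
        None
    }
}
```

Deleting in small batches keeps `free_pages` short, so each scan stays cheap.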

Solution

Delete only a small portion of data in each pruner run, allowing data from new blocks to fill the gaps in the database, so the freelist doesn't grow too much and cause the issue described above. This works by reusing the BatchSizes struct, but now using it to limit the number of rows to delete per prune part.
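The batch-limit idea can be sketched against an in-memory map (a minimal illustration, not reth's actual MDBX provider; `Table` and `prune_batch` here are hypothetical stand-ins):

```rust
use std::collections::BTreeMap;

/// Stand-in for a database table keyed by block number (hypothetical;
/// the real pruner works against reth's MDBX provider, not this map).
type Table = BTreeMap<u64, String>;

/// Delete at most `batch_size` rows with keys <= `to_block`.
/// Returns (rows_deleted, done), where `done` means nothing is left to
/// prune up to `to_block` -- so the caller resumes on a later run
/// instead of holding one giant delete transaction open.
fn prune_batch(table: &mut Table, to_block: u64, batch_size: usize) -> (usize, bool) {
    // Collect at most `batch_size` prunable keys in ascending order.
    let keys: Vec<u64> = table
        .range(..=to_block)
        .map(|(k, _)| *k)
        .take(batch_size)
        .collect();
    let deleted = keys.len();
    for k in keys {
        table.remove(&k);
    }
    // Done once no key <= to_block remains.
    let done = table.range(..=to_block).next().is_none();
    (deleted, done)
}
```

Each call stands in for one pruner run; between runs, live sync inserts new data into the space the previous batch freed.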

In practice, this means that:

  1. To transition a node from archive to full, we will need to gradually prune a lot of data (account/storage changesets and history take the most), which will take a long time. In practice, it becomes more feasible to just re-sync as a full node.
  2. To prune the TxSenders table (which is the only table that's populated during the pipeline sync even if the prune mode says to prune it) up to the specified block, the pruner will need to run a certain number of times, allowing the live sync to insert new data between pruner runs. If we run the pruner every 5 blocks, prune 1000 transaction senders per run, and have 2 billion transaction senders to prune, it will take 2_000_000_000 transaction senders / 1000 transaction senders per run = 2_000_000 pruner runs, and 2_000_000 pruner runs * (5 blocks * 12 seconds per block) = 120_000_000 seconds ≈ 1,389 days.
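The estimate in point 2 can be written out as a back-of-the-envelope calculation (the constants are the numbers from the text; `pruner_runs` and `total_seconds` are illustrative helpers, not reth APIs):

```rust
// Numbers from the example above: 2B senders to prune, 1000 per run,
// one pruner run every 5 blocks, 12-second block time.
const SENDERS_TO_PRUNE: u64 = 2_000_000_000;
const SENDERS_PER_RUN: u64 = 1_000;
const BLOCKS_PER_RUN: u64 = 5;
const SECONDS_PER_BLOCK: u64 = 12;

/// Number of pruner runs needed, rounding up so a final partial batch
/// still counts as a run.
fn pruner_runs(total: u64, per_run: u64) -> u64 {
    (total + per_run - 1) / per_run
}

/// Wall-clock seconds: one run happens every BLOCKS_PER_RUN blocks.
fn total_seconds(runs: u64) -> u64 {
    runs * BLOCKS_PER_RUN * SECONDS_PER_BLOCK
}
```

With these inputs, `pruner_runs` gives 2,000,000 runs and `total_seconds` gives 120,000,000 seconds, i.e. roughly 1,389 days.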

@codecov

codecov bot commented Aug 17, 2023

Codecov Report

Merging #4246 (407bdcc) into main (1343644) will increase coverage by 0.07%.
The diff coverage is 81.66%.


Files Changed Coverage Δ
crates/consensus/beacon/src/engine/prune.rs 94.11% <ø> (ø)
crates/primitives/src/prune/checkpoint.rs 100.00% <ø> (ø)
crates/revm/src/executor.rs 90.28% <16.66%> (-0.45%) ⬇️
...tes/storage/provider/src/providers/database/mod.rs 35.29% <25.00%> (-0.50%) ⬇️
crates/prune/src/pruner.rs 83.52% <81.40%> (+4.88%) ⬆️
crates/interfaces/src/test_utils/generators.rs 98.20% <100.00%> (+2.72%) ⬆️
crates/primitives/src/prune/mod.rs 87.17% <100.00%> (+50.33%) ⬆️
crates/stages/src/stages/sender_recovery.rs 92.50% <100.00%> (+0.04%) ⬆️
crates/stages/src/stages/tx_lookup.rs 96.38% <100.00%> (+0.02%) ⬆️
...torage/provider/src/providers/database/provider.rs 79.71% <100.00%> (-0.12%) ⬇️

... and 13 files with indirect coverage changes

Flag Coverage Δ
integration-tests 16.73% <0.00%> (-0.09%) ⬇️
unit-tests 63.85% <81.66%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
reth binary 26.11% <ø> (ø)
blockchain tree 82.56% <ø> (ø)
pipeline 90.07% <100.00%> (+<0.01%) ⬆️
storage (db) 74.71% <84.61%> (-0.06%) ⬇️
trie 94.84% <ø> (-0.04%) ⬇️
txpool 47.94% <ø> (-0.53%) ⬇️
networking 77.47% <ø> (-0.09%) ⬇️
rpc 58.80% <ø> (-0.01%) ⬇️
consensus 63.53% <ø> (ø)
revm 31.97% <16.66%> (-0.04%) ⬇️
payload builder 6.78% <ø> (ø)
primitives 86.31% <100.00%> (+0.21%) ⬆️

@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from 8d90ebb to d838d1c on August 18, 2023 14:54
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from d838d1c to adffcf6 on August 18, 2023 14:57
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch 2 times, most recently from c26026c to 128a293 on August 18, 2023 16:24
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from 128a293 to 54a67f6 on August 18, 2023 17:10
@shekhirin shekhirin marked this pull request as ready for review August 18, 2023 17:19
/// Returns number of rows pruned.
pub fn prune_table_with_range_in_batches<T: Table>(
/// Returns number of total unique keys and total rows pruned.
pub fn prune_table_with_range<T: Table>(
Collaborator

nit: doc what the bool represents

Collaborator

@joshieDo joshieDo left a comment


logic seems good, let's test this bad boy on mainnet

@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from a560441 to 4939ab9 on August 21, 2023 11:46
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from 45d801b to bedb5a4 on August 22, 2023 10:11
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from 6511021 to 1ac798b on August 22, 2023 12:12
@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from bdf64ed to b941ed4 on August 22, 2023 12:59
.modes
.contract_logs_filter
.lowest_block_with_distance(tip_block_number, pruned_block)?
.lowest_block_with_distance(tip_block_number, initial_last_pruned_block)?
Member Author


Is this correct? Shouldn't it be last_pruned_block?

Collaborator

@joshieDo joshieDo Aug 23, 2023


This should be correct: this argument works as a lower bound.

If we set it to last_pruned_block, something like the following would happen:

example:

  • (distance(200), addrA)
  • the contract log pruner goes until distance(128).
  • last_pruned_block would be the block at distance(128)

This would mean we'd never go through the blocks between distance(200) -> distance(128) for the tip of this particular run. However, future runs with higher tips would check and clean them up.

@shekhirin shekhirin force-pushed the alexey/pruner-batch-size branch from d4dcc1f to 407bdcc on August 23, 2023 17:06
@shekhirin shekhirin added this pull request to the merge queue Aug 23, 2023
Merged via the queue into main with commit 312cf72 Aug 23, 2023
@shekhirin shekhirin deleted the alexey/pruner-batch-size branch August 23, 2023 17:34