Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add experimental x-trie-log subcommand for one-off backlog prune #6188

Closed
wants to merge 2 commits into from

Conversation

siladu
Copy link
Contributor

@siladu siladu commented Nov 20, 2023

PR description

Part of #5390

TODO

  • Test on a throwaway mainnet node: Add experimental x-trie-log subcommand for one-off backlog prune #6188 (comment)
    • took 8m 48s to prune 293458 trie logs (18 GB) on ~5 week old node
  • Test on one of our oldest nodes to understand duration - my guess is it will be in the order of minutes, hopefully less than 1 hour.
  • If it's really slow, could optimise by loading all trie logs once and reuse for count as well as prune (or just prune instead of count).

To install on a node

cd /opt/besu
sudo wget -O 6188.tar.gz https://output.circle-artifacts.com/output/job/a7a06df9-cf33-4115-89d1-413c86facd71/artifacts/0/distributions/besu-23.10.3-SNAPSHOT.tar.gz
sudo tar zxf 6188.tar.gz --transform s/besu-23.10.3-SNAPSHOT/6188/
time sudo /opt/besu/6188/bin/besu --config-file=/etc/besu/config.toml storage x-trie-log count

Usage and examples

$BESU storage x-trie-log -h
Usage: besu storage x-trie-log [-hV] [COMMAND]
Manipulate trie logs
  -h, --help      Show this help message and exit.
  -V, --version   Print version information and exit.
Commands:
  count  This command counts all the trie logs
  prune  This command prunes all trie log layers below the retention threshold,
           including orphaned trie logs.
$BESU --genesis-file=genesis.json --data-path=besu --data-storage-format=BONSAI storage x-trie-log count
...
Counting trie logs...
trieLog count: 512
 - canonical count: 512
 - fork count: 0
 - orphaned count: 0
$BESU --genesis-file=genesis.json --data-path=besu --data-storage-format=BONSAI storage x-trie-log prune
$BESU --genesis-file=genesis.json --data-path=besu --data-storage-format=BONSAI --Xbonsai-trie-log-retention-threshold=512 --Xbonsai-trie-log-pruning-limit=513 storage x-trie-log prune
...
Counting trie logs before prune...
trieLog count: 615
 - canonical count: 512
 - fork count: 0
 - orphaned count: 103

Total to prune = 615 (total) - 512 (retention threshold) =
=> 103

Estimated number of batches = max(103 (total to prune) / 513 (batch size), 1) = 1

Pruning batch 1
-----------------
2023-11-20 19:45:19.639+10:00 | main | INFO  | TrieLogPruner | Loading first 513 trie logs from database...
2023-11-20 19:45:19.649+10:00 | main | DEBUG | TrieLogPruner | Pruned 91 orphaned trie logs from database...
2023-11-20 19:45:19.649+10:00 | main | INFO  | TrieLogPruner | Loaded 422 trie logs from database
2023-11-20 19:45:19.657+10:00 | main | DEBUG | TrieLogPruner | pruned 0 trie logs from 0 blocks
Number pruned in batch = 91
Running total number pruned =
=> 91 of 103

Pruning batch 2
-----------------
2023-11-20 19:45:19.657+10:00 | main | INFO  | TrieLogPruner | Loading first 513 trie logs from database...
2023-11-20 19:45:19.665+10:00 | main | DEBUG | TrieLogPruner | Pruned 12 orphaned trie logs from database...
2023-11-20 19:45:19.665+10:00 | main | INFO  | TrieLogPruner | Loaded 501 trie logs from database
2023-11-20 19:45:19.666+10:00 | main | DEBUG | TrieLogPruner | pruned 0 trie logs from 0 blocks
Number pruned in batch = 12
Running total number pruned =
=> 103 of 103

Trie log prune complete!

Counting trie logs after prune...
trieLog count: 512
 - canonical count: 512
 - fork count: 0
 - orphaned count: 0

Copy link

github-actions bot commented Nov 20, 2023

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

Signed-off-by: Simon Dudley <[email protected]>
@siladu siladu marked this pull request as draft November 20, 2023 11:04
@siladu siladu changed the title Add x-trie-log subcommand for one-off backlog prune Add experimental x-trie-log subcommand for one-off backlog prune Nov 20, 2023
@siladu siladu added the TeamGroot GH issues worked on by Groot Team label Nov 20, 2023
@siladu siladu mentioned this pull request Nov 21, 2023
@siladu
Copy link
Contributor Author

siladu commented Nov 21, 2023

[email protected]:/opt/besu$ time sudo /opt/besu/6188/bin/besu --config-file=/etc/besu/config.toml storage x-trie-log count
...
Counting trie logs...
trieLog count: 293970
 - canonical count: 293098
 - fork count: 872
 - orphaned count: 0

real	1m35.013s


[email protected]:/opt/besu$ time  /opt/besu/6000/bin/besu --config-file=/etc/besu/config.toml operator x-rocksdb usage
...
****** Column family 'TRIE_LOG_STORAGE' size: 15 GiB ******
Number of live snapshots : 0
Number of keys : 293970
Total size of SST Files : 18 GiB
Size of live SST Filess : 18 GiB
Column family size (with getColumnFamilyMetaData) : 18 GiB
[email protected]:/opt/besu$ time sudo /opt/besu/6188/bin/besu --config-file=/etc/besu/config.toml storage x-trie-log prune
...
Counting trie logs before prune...
trieLog count: 293970
 - canonical count: 293098
 - fork count: 872
 - orphaned count: 0

Total to prune = 293970 (total) - 512 (retention threshold) =
=> 293458

Estimated number of batches = max(293458 (total to prune) / 30000 (batch size), 1) = 9

Pruning batch 1
-----------------
2023-11-21 02:55:35.495+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:55:43.237+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:55:43.238+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:55:43.448+00:00 | main | DEBUG | TrieLogPruner | pruned 29952 trie logs from 29948 blocks
Number pruned in batch = 29952
Running total number pruned =
=> 29952 of 293458

Pruning batch 2
-----------------
2023-11-21 02:55:43.449+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:55:54.019+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:55:54.020+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:55:54.168+00:00 | main | DEBUG | TrieLogPruner | pruned 29894 trie logs from 29885 blocks
Number pruned in batch = 29894
Running total number pruned =
=> 59846 of 293458

Pruning batch 3
-----------------
2023-11-21 02:55:54.168+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:56:09.171+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:56:09.176+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:56:09.310+00:00 | main | DEBUG | TrieLogPruner | pruned 29838 trie logs from 29834 blocks
Number pruned in batch = 29838
Running total number pruned =
=> 89684 of 293458

Pruning batch 4
-----------------
2023-11-21 02:56:09.311+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:56:35.866+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:56:35.872+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:56:36.005+00:00 | main | DEBUG | TrieLogPruner | pruned 29789 trie logs from 29782 blocks
Number pruned in batch = 29789
Running total number pruned =
=> 119473 of 293458

Pruning batch 5
-----------------
2023-11-21 02:56:36.005+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:57:09.467+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:57:09.469+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:57:09.592+00:00 | main | DEBUG | TrieLogPruner | pruned 29740 trie logs from 29727 blocks
Number pruned in batch = 29740
Running total number pruned =
=> 149213 of 293458

Pruning batch 6
-----------------
2023-11-21 02:57:09.593+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:57:46.739+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:57:46.744+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:57:46.867+00:00 | main | DEBUG | TrieLogPruner | pruned 29687 trie logs from 29675 blocks
Number pruned in batch = 29687
Running total number pruned =
=> 178900 of 293458

Pruning batch 7
-----------------
2023-11-21 02:57:46.868+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:58:33.087+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:58:33.091+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:58:33.217+00:00 | main | DEBUG | TrieLogPruner | pruned 29629 trie logs from 29615 blocks
Number pruned in batch = 29629
Running total number pruned =
=> 208529 of 293458

Pruning batch 8
-----------------
2023-11-21 02:58:33.218+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 02:59:27.971+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 02:59:27.975+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 02:59:28.123+00:00 | main | DEBUG | TrieLogPruner | pruned 29584 trie logs from 29576 blocks
Number pruned in batch = 29584
Running total number pruned =
=> 238113 of 293458

Pruning batch 9
-----------------
2023-11-21 02:59:28.124+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 03:00:30.754+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 03:00:30.759+00:00 | main | INFO  | TrieLogPruner | Loaded 30000 trie logs from database
2023-11-21 03:00:30.888+00:00 | main | DEBUG | TrieLogPruner | pruned 29538 trie logs from 29530 blocks
Number pruned in batch = 29538
Running total number pruned =
=> 267651 of 293458

Pruning batch 10
-----------------
2023-11-21 03:00:30.889+00:00 | main | INFO  | TrieLogPruner | Loading first 30000 trie logs from database...
2023-11-21 03:01:40.510+00:00 | main | DEBUG | TrieLogPruner | Pruned 0 orphaned trie logs from database...
2023-11-21 03:01:40.514+00:00 | main | INFO  | TrieLogPruner | Loaded 26319 trie logs from database
2023-11-21 03:01:40.627+00:00 | main | DEBUG | TrieLogPruner | pruned 25807 trie logs from 25796 blocks
Number pruned in batch = 25807
Running total number pruned =
=> 293458 of 293458

Trie log prune complete!

Counting trie logs after prune...
trieLog count: 512
 - canonical count: 512
 - fork count: 0
 - orphaned count: 0

real	8m48.266s


[email protected]:/opt/besu$ time sudo /opt/besu/6000/bin/besu --config-file=/etc/besu/config.toml operator x-rocksdb usage

****** Column family 'TRIE_LOG_STORAGE'  size: 207 MiB ******
Number of live snapshots : 0
Number of keys : 512
Total size of SST Files : 18 GiB
Size of live SST Filess : 18 GiB
Column family size (with getColumnFamilyMetaData) : 218 MiB

@gfukushima
Copy link
Contributor

gfukushima commented Dec 11, 2023

My last run on this with a medium-running node (3.5 months) gave this results:

Running total number pruned =
=> 1700328 of 1700330
No progress in 5 batches, exiting

real 195m11.516s

DB was reduced by 64GB
Before:

| Column Family                  | Keys            | Column Size  | SST Files Size  |
| TRIE_LOG_STORAGE               | 1670744         | 49 GiB       | 77 GiB          |

After:

| Column Family                  | Keys            | Column Size  | SST Files Size  |
| TRIE_LOG_STORAGE               | 77              | 11 GiB       | 13 GiB          |

Comment on lines +145 to +149
rootWorldStateStorage
.streamTrieLogKeys(limit)
.map(Bytes32::wrap)
.map(Hash::wrap)
.forEach(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing that most users of this subcommand are going to want to prune pretty close to head. Meaning there will be far more trielogs deleted than retained. This implementation is safe, but probably going to be pretty slow in most cases.

IMO we could/should add an alternate implementation that checks that the number of retained trielogs is below a certain reasonable threshold. We copy those trielogs to an alternate column family, truncate the trielog family, and move the trielogs back into the newly truncated CF. That should make this operation markedly faster for what I suspect will be the typical use case (offline pruning a huge number of trielogs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, I would start from the blockchain storage and get the latest 'x' hashes that we want to retain rather than streaming all of the keys in the trielog CF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TeamGroot GH issues worked on by Groot Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants