Full node takes up more space than archive node #7238

Open
chenqping opened this issue Jun 18, 2024 · 10 comments
Labels
bonsai forest non mainnet (private networks) not related to mainnet features - covers privacy, permissioning, IBFT2, QBFT question

Comments

@chenqping

chenqping commented Jun 18, 2024

Description

We deployed two nodes for high availability: one archive node and one full node (snap sync, tried with both Forest and Bonsai). The archive node's data directory takes up only 3.9 GB, while the full node takes up 8.7 GB with Forest and 5.6 GB with Bonsai. This is surprising, since we expected a full node to use less space.

Acceptance Criteria

  • The full node should take up less space than the archive node

Steps to Reproduce (Bug)

  1. Configure an archive node and a full node, with the full node using snap sync and the Bonsai format
  2. Let the full node sync from the archive node, while the archive node syncs from external validators (we may later change both to sync directly from the external validators)
  3. Run du on each data directory
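
The size comparison in step 3 can be sketched as a small shell helper. This is only an illustration: the function names `dir_size_kb` and `compare_dirs` and the example paths are hypothetical and are not part of Besu.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two data directories; not part of Besu.

# Print the disk usage of one directory in kilobytes.
dir_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Print both sizes and the difference (second minus first) in kilobytes.
compare_dirs() {
  first_kb=$(dir_size_kb "$1")
  second_kb=$(dir_size_kb "$2")
  echo "$1: ${first_kb} KB"
  echo "$2: ${second_kb} KB"
  echo "difference: $((second_kb - first_kb)) KB"
}

# Example call with placeholder paths (uncomment and adjust):
# compare_dirs /data/besu-archive /data/besu-full
```

Running du against the whole data directory includes everything stored there, not only the database, so the per-column breakdown requested later in the thread gives a finer picture.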

Expected behavior: The full node data directory should be smaller
Actual behavior: The full node data directory is bigger
Frequency: always

Versions (Add all that apply)

  • Software version: 23.10.1
  • Java version: 17.0.6
  • OS Name & Version: red hat enterprise 8.9
  • Kernel Version: 4.18.0-513.11.1.el8_9.x86_64

Additional Information (Add any of the following or anything else that may be relevant)

  • Besu setup info - QBFT consensus
  • System info - memory, CPU
@non-fungible-nelson
Contributor

Hi there - can you try updating the nodes and seeing if anything changes? We have made some improvements to the database in subsequent versions.

My hunch is that over time, the Archive node will become the larger of the two. Full nodes keep some extra data around, such as caches, to help block processing performance. That overhead will not grow linearly over time, but the Archive node's data will.

@matkt might also have some insight into this, and also commands you can run to give the size of your database, perhaps.

@non-fungible-nelson non-fungible-nelson added question forest bonsai non mainnet (private networks) not related to mainnet features - covers privacy, permissioning, IBFT2, QBFT labels Jun 24, 2024
@matkt
Contributor

matkt commented Jun 25, 2024

Could you share your configuration (flags, etc.) for each Bonsai test?

@matkt
Contributor

matkt commented Jun 25, 2024

Could you also run

./bin/besu --data-path=/data/besu storage rocksdb usage

so that we have more information about your database at each step?
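
To collect that report for both nodes, the command above can be wrapped in a small loop. This is a sketch under assumptions: `BESU_BIN`, the function names, and the report file names are hypothetical, and only the `storage rocksdb usage` invocation itself comes from this thread. It does not check whether the node is running; whether the subcommand can open a database that is in use is not covered here.

```shell
#!/bin/sh
# BESU_BIN is a placeholder for the path to your Besu launcher.
BESU_BIN="${BESU_BIN:-./bin/besu}"

# Run the usage subcommand quoted above against one data directory.
rocksdb_usage() {
  "$BESU_BIN" --data-path="$1" storage rocksdb usage
}

# Save one report per data directory, named after the directory.
save_usage_reports() {
  for data_path in "$@"; do
    report="usage-$(basename "$data_path").txt"
    rocksdb_usage "$data_path" > "$report"
    echo "wrote $report"
  done
}

# Example call with placeholder paths (uncomment and adjust):
# save_usage_reports /data/besu-archive /data/besu-full
```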

@chenqping
Author

chenqping commented Jun 28, 2024

@matkt

> could you also run
>
> ./bin/besu --data-path=/data/besu storage rocksdb usage
>
> in order to have more info on your database for each step

Hi, I have uploaded the usage snapshots from the two nodes, along with the configuration.
Full node: [screenshot: full-node-storage]
Archive node: [screenshot: archive-node-storage]

#Sync
sync-mode="X_SNAP"
data-storage-format="BONSAI"
#bonsai-historical-block-limit=256
fast-sync-min-peers=1

@chenqping
Author

chenqping commented Jun 28, 2024

> Hi there - can you try updating the nodes and seeing if anything changes? We have made some improvements to the database in subsequent versions.
>
> My hunch is that over time, the Archive node will absolutely be larger. We keep more data around in Full nodes to help with block processing performance like caches. Over time, this will not increase linearly, but the Archive node will.
>
> @matkt might also have some insight into this, and also commands you can run to give the size of your database, perhaps.

Hi @non-fungible-nelson, which version should we update to? And do you have any formula or ratio for estimating full node storage versus archive node storage?

@matkt
Contributor

matkt commented Jun 28, 2024

> @matkt
>
> > could you also run
> >
> > ./bin/besu --data-path=/data/besu storage rocksdb usage
> >
> > in order to have more info on your database for each step
>
> Hi, upload the snapshots from the two nodes, and configuration
> full node full-node-storage
> archive node archive-node-storage
>
> #Sync
> sync-mode="X_SNAP"
> data-storage-format="BONSAI"
> #bonsai-historical-block-limit=256
> fast-sync-min-peers=1

Your screenshot seems to be invalid. The full node doesn't have any state; only the blockchain is saved. The archive node has the columns of a Forest node, and its size seems really small.
Is your node syncing?

@matkt
Contributor

matkt commented Jun 28, 2024

It would be good if you could share the logs from when your Bonsai nodes start, to make sure you have the right configuration.

@chenqping
Author

chenqping commented Jun 28, 2024

> it will be nice to share your logs when your bonsai nodes are starting to be sure you have the good configuration

Hi matkt, thanks for responding. According to the eth_syncing API call, the archive node returns false, although it is in fact continuously importing blocks from an external source. The full node (which follows the archive node) shows that it is always syncing, with start, current, and highest values. So in our scenario, the archive node is always ahead of the full node.

Here is the full node log. We have log rolling configured, so this is the current log file:
besu.log
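
For reference, the eth_syncing check described above can be sketched with curl. The URL is an assumption (8545 is Besu's default HTTP RPC port, and the HTTP RPC service must be enabled on the node); the function names are hypothetical. Under the standard Ethereum JSON-RPC API, the result is `false` when the node does not consider itself syncing, or an object with `startingBlock`, `currentBlock`, and `highestBlock` otherwise.

```shell
#!/bin/sh
# RPC_URL is a placeholder; adjust host and port to your node.
RPC_URL="${RPC_URL:-http://127.0.0.1:8545}"

# Build the JSON-RPC request body for eth_syncing.
eth_syncing_payload() {
  printf '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
}

# POST the request to a node (first argument overrides RPC_URL).
eth_syncing() {
  curl -s -X POST -H "Content-Type: application/json" \
    --data "$(eth_syncing_payload)" "${1:-$RPC_URL}"
}

# Example call (uncomment and adjust):
# eth_syncing http://127.0.0.1:8545
```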

@matkt
Contributor

matkt commented Jun 28, 2024

Thanks, but I need more logs. When you restart your node, you should see something like this:

####################################################################################################
#                                                                                                  #
# Besu version 24.6.0                                                                              #
#                                                                                                  #
# Configuration:                                                                                   #
# Network: Mainnet                                                                                 #
# Network Id: 1                                                                                    #
# Data storage: Bonsai                                                                             #
# Sync mode: Checkpoint                                                                            #
# RPC HTTP APIs: FLEET,TRACE,ADMIN,DEBUG,NET,ETH,WEB3,TXPOOL                                       #
# RPC HTTP port: 8545                                                                              #
# Engine APIs: ENGINE,ETH                                                                          #
# Engine port: 8551                                                                                #
# Engine JWT: /etc/jwt-secret.hex                                                                  #
# Using LAYERED transaction pool implementation                                                    #
# Using STACKED worldstate update mode                                                             #
# Limit trie logs enabled: retention: 512; prune window: 30000                                     #
#                                                                                                  #
# Host:                                                                                            #
# Java: openjdk-java-21                                                                            #
# Maximum heap size: 3.90 GB                                                                       #
# OS: linux-x86_64                                                                                 #
# glibc: 2.35                                                                                      #
# jemalloc: 5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756                                      #
# Total memory: 15.60 GB                                                                           #
# CPU cores: 4                                                                                     #
#                                                                                                  #
# Plugin Registration Summary:                                                                     #
####################################################################################################

Also, according to the log you shared, your node isn't syncing at all.

Are you running a QBFT network? If you want to use snap sync with a QBFT network, there is a PR to enable that: #7140

For the moment, you can use fast sync if you want to sync quickly.
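
As a rough illustration of that suggestion, the sketch below prints a Besu command line that selects fast sync while keeping the Bonsai format and minimum peer count from the configuration shared earlier. It only prints the command rather than launching a node. The function name and the `BESU_BIN` variable are hypothetical, and the flags a private QBFT network also needs (genesis file, bootnodes, and so on) are deliberately left out.

```shell
#!/bin/sh
# Print a Besu command line that selects fast sync with Bonsai storage.
# Sketch only: network-specific flags are omitted, and the data path
# is passed in as the first argument.
fast_sync_command() {
  besu_bin="${BESU_BIN:-./bin/besu}"
  echo "$besu_bin" \
    --data-path="$1" \
    --sync-mode=FAST \
    --data-storage-format=BONSAI \
    --fast-sync-min-peers=1
}

# Example call with a placeholder path (uncomment and adjust):
# fast_sync_command /data/besu
```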

@chenqping
Author

chenqping commented Jul 2, 2024

Hi @matkt, sorry for the late update. As mentioned, we used the two nodes above to follow a block-producing network of 4 QBFT nodes. One is an archive node, and the other uses the snap sync Bonsai configuration. We want to compare an archive node with a full node on a private chain, to see how much storage a full node can save versus an archive node. I have attached the full node start log for diagnosis, thanks. I also tried fast sync; it did sync quickly, but its storage was also a little higher than the archive node's. Also, what is the difference between fast sync and snap sync? Thanks!
full node start log.txt

3 participants