Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/trie: persist TrieJournal to journal file instead of kv database #2341

Merged
merged 15 commits into from
Apr 15, 2024

Conversation

jingjunLi
Copy link
Contributor

@jingjunLi jingjunLi commented Mar 26, 2024

Description

This PR provides another way to persist TrieJournal by write it into a log file instead of a kv database.

Use --journalFile flag to enable/disable the journal file feature.

Rationale

Persisting TrieJournal to KV database will cause problems below

Changes:

Data Format

Journal Version
Disk Root
Disk Layer
-- **Layer Length**
-- Root
-- State ID
-- Buffer Nodes
-- **Layer Checksum**
Diff Layer
-- **Layer Length**
-- Root
-- Block
-- Trie Nodes
-- Journal Accounts
-- **Layer Checksum**

Journal Path: geth/chaindata/ancient/state.journal
Checksum: SHA256

Testing

Function test

  • Normal Scenario:
    • When starting up for the first time, if the journal does not exist, blocks are appended normally.
    • Upon receiving a kill -15 signal, the journal is persisted with a size consistent with the file.
    • After the normal startup, blocks are appended normally.

image

The loaded diff layer matches the persistence saved as described above.
  • Abnormal Scenario: Manually truncate a portion of the snap file (geth/chaindata/ancient/state.journal) and restart.
    • When the number of blocks appended is less than 460,000 ("Rewound to block with state" number=0), the blockchain rewinds to block 0 and resumes appending blocks once completed.
    • When the number of blocks appended is greater than 460,000, the blockchain rewinds to a position slightly behind the head number, and resumes appending blocks once completed.
      image

Performance test

Scenario: Using TestJournal, each layer has 4K and 8K account operations.

WriteJournal LoadJournal Dirty State Size
New Journal WAL 6.149s 12.496s 1.19GiB
Old KVDB 6.341s 12.340s 1.21GiB
New Journal WAL 11.724s 20.525s 1.85GiB
Old KVDB 11.107s 19.415s 1.85GiB

Conclusion: The use of Journal WAL and storing trieJournalKey directly into kvdb results in nearly identical WAL write times during shutdown and WAL read/parsing times during startup.

@jingjunLi jingjunLi changed the title Fix size out of bounds core/state: replace writing TrieJournal to the database with a journal file Mar 26, 2024
@jingjunLi jingjunLi force-pushed the fix_size_out_of_bounds branch 3 times, most recently from 30d1328 to a63a4b9 Compare April 1, 2024 08:50
@jingjunLi jingjunLi requested a review from zzzckck as a code owner April 1, 2024 08:50
@jingjunLi jingjunLi closed this Apr 2, 2024
@jingjunLi jingjunLi reopened this Apr 7, 2024
cmd/geth/pruneblock_test.go Outdated Show resolved Hide resolved
@fynnss fynnss changed the title core/state: replace writing TrieJournal to the database with a journal file core: persist TrieJournal to journal file instead of kv database Apr 8, 2024
@fynnss fynnss changed the title core: persist TrieJournal to journal file instead of kv database core/trie: persist TrieJournal to journal file instead of kv database Apr 8, 2024
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
cmd/utils/flags.go Outdated Show resolved Hide resolved
cmd/utils/flags.go Outdated Show resolved Hide resolved
eth/backend.go Outdated Show resolved Hide resolved
@zzzckck
Copy link
Collaborator

zzzckck commented Apr 9, 2024

The solution is ok for me, it would bring some advantages:
1.avoid the 2GB large KV limit of PebbleDB, as TrieJournal could > 2GB
2.avoid set large content into the PebbleDB, which would be deleted on restart, additional DBCompact cost. Log file is much more efficient.
I was a little bit concerned about the code intrusion, but I think the change is not too much and is acceptable.

It can be merged once there are >2 approval.

cmd/utils/flags.go Outdated Show resolved Hide resolved
eth/ethconfig/config.go Outdated Show resolved Hide resolved
triedb/pathdb/database.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
RenRick
RenRick previously approved these changes Apr 9, 2024
Copy link

@RenRick RenRick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

triedb/pathdb/journal.go Outdated Show resolved Hide resolved
triedb/pathdb/journal.go Outdated Show resolved Hide resolved
@jingjunLi jingjunLi added the r4r ready for review label Apr 11, 2024
Copy link
Contributor

@fynnss fynnss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@RenRick RenRick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zzzckck zzzckck merged commit 3a6e3c6 into bnb-chain:develop Apr 15, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
r4r ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants