feat(static-file): incremental changeset offset storage #21593
Closed
gakonst wants to merge 1 commit into
…cture

Currently, changeset offsets (Vec<ChangesetOffset>) are stored inline in SegmentHeader and fully rewritten on every commit. For segments with 500k+ blocks, this means ~8MB rewritten per commit even when appending a single block.

This PR adds the infrastructure for incremental changeset offset storage:

1. ChangesetOffsetsMeta: Lightweight metadata struct (len + version) that will replace the inline Vec in SegmentHeader
2. ChangesetOffsetWriter: Append-only writer for the new .csoff sidecar file
3. ChangesetOffsetReader: O(1) random-access reader using fixed 16-byte records
4. CHANGESET_OFFSETS_FILE_EXTENSION constant for the new file type
5. Design doc explaining the migration strategy and crash consistency

Performance impact:
- Append: O(total_blocks) -> O(1) (16 bytes per block)
- Commit overhead: ~8MB for 500k blocks -> ~100 bytes (header only)
- Prune: O(remaining_blocks) -> O(1) (len update only)

Follow-up PRs will:
- Integrate the sidecar into SegmentHeader (replace Vec with Meta)
- Update StaticFileProviderRW commit/prune paths
- Add migration logic for existing segments
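The append-only writer and O(1) reader described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `OffsetRecord` type, its two little-endian `u64` fields, and the `append_record`/`read_record` helpers are all assumptions chosen to fit the stated 16-byte record size. Crash consistency, which the design doc covers, is left out.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Size of one sidecar record in bytes (assumed: two little-endian u64 values).
const RECORD_SIZE: u64 = 16;

/// Hypothetical record layout: a starting offset plus a change count.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OffsetRecord {
    offset: u64,
    num_changes: u64,
}

impl OffsetRecord {
    /// Encodes the record into exactly 16 bytes.
    fn to_bytes(self) -> [u8; 16] {
        let mut buf = [0u8; 16];
        buf[..8].copy_from_slice(&self.offset.to_le_bytes());
        buf[8..].copy_from_slice(&self.num_changes.to_le_bytes());
        buf
    }

    /// Decodes a record from exactly 16 bytes.
    fn from_bytes(buf: [u8; 16]) -> Self {
        let mut lo = [0u8; 8];
        let mut hi = [0u8; 8];
        lo.copy_from_slice(&buf[..8]);
        hi.copy_from_slice(&buf[8..]);
        Self { offset: u64::from_le_bytes(lo), num_changes: u64::from_le_bytes(hi) }
    }
}

/// Appends one record to the end of the sidecar file.
/// The work is independent of how many records already exist.
fn append_record(path: &Path, rec: OffsetRecord) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(&rec.to_bytes())?;
    file.sync_data()
}

/// Reads the record at `index` by seeking directly to `index * RECORD_SIZE`.
fn read_record(path: &Path, index: u64) -> io::Result<OffsetRecord> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(index * RECORD_SIZE))?;
    let mut buf = [0u8; 16];
    file.read_exact(&mut buf)?;
    Ok(OffsetRecord::from_bytes(buf))
}
```

Because every record has the same size, the position of block `i` is simple arithmetic, so no index structure is needed. Reading past the last record fails with an end-of-file error rather than returning stale data.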
CI Fix Blocked
The base commit (feat: sparse trie as cache #21583) introduced inconsistencies in
Options:
Closing this draft PR until the base is stable. Will re-open with a clean implementation.
Closing due to inconsistencies in base commit. Will re-open after fixing upstream issues.
Summary
Replace inline Vec<ChangesetOffset> in SegmentHeader with a separate .csoff sidecar file for incremental append/prune operations.

Problem
Previously, changeset offsets were stored as Vec<ChangesetOffset> in SegmentHeader and fully rewritten on every commit. For segments with 500k+ blocks, this meant ~8MB rewritten per commit even when appending a single block.

Solution
- .csoff sidecar file (fixed 16-byte records)
- SegmentHeader now stores only changeset_offsets_len: u64 (count)

Performance Impact
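The ~8MB figure from the Problem statement can be sanity-checked against the 16-byte record size. The helpers below are hypothetical and assume that each offset entry costs about 16 bytes whether it is stored inline or in the sidecar, which matches the numbers quoted above but is not confirmed by the PR.

```rust
/// Bytes per offset record, as stated in the PR description.
const RECORD_SIZE: u64 = 16;

/// Bytes written when the whole offset list is rewritten on commit
/// (assumes roughly RECORD_SIZE bytes per entry in the inline encoding).
fn full_rewrite_bytes(total_blocks: u64) -> u64 {
    total_blocks * RECORD_SIZE
}

/// Bytes written when only the records for newly added blocks are appended.
fn append_bytes(new_blocks: u64) -> u64 {
    new_blocks * RECORD_SIZE
}
```

Under that assumption, a 500k-block segment rewrites 500,000 × 16 = 8,000,000 bytes per commit, while appending a single block to the sidecar writes 16 bytes.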
Changes

reth-static-file-types
- ChangesetOffsetsWriter/ChangesetOffsetsReader for sidecar I/O
- changeset_offsets: Vec with changeset_offsets_len: u64 in SegmentHeader

reth-nippy-jar
- changeset_offsets_path() for .csoff file paths
- delete() to clean up sidecar files

reth-provider
- ChangesetOffsetsWriter to StaticFileProviderRW
- read_changeset_offset()/read_changeset_offsets() to StaticFileJarProvider
- manager.rs to read offsets from sidecar file
- truncate_changesets to truncate sidecar file

Breaking Change
This changes the static file format for changeset segments. Existing changeset static files are not backwards compatible and must be regenerated.
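With fixed-size records, pruning the sidecar reduces to shrinking the file to a multiple of the record size, which is what the truncate_changesets change listed under reth-provider points at. The sketch below shows the idea with a hypothetical `truncate_records` helper; it is not the function from the PR, and it does not update the count stored in the header.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Bytes per sidecar record, as stated in the PR description.
const RECORD_SIZE: u64 = 16;

/// Keeps at most the first `keep` records and drops the rest by shrinking
/// the file. Returns the number of records remaining afterwards.
fn truncate_records(path: &Path, keep: u64) -> io::Result<u64> {
    let file = OpenOptions::new().write(true).open(path)?;
    // Number of whole records currently stored in the file.
    let current = file.metadata()?.len() / RECORD_SIZE;
    // Never grow the file: clamp the request to what exists.
    let new_len = keep.min(current);
    file.set_len(new_len * RECORD_SIZE)?;
    file.sync_all()?;
    Ok(new_len)
}
```

No surviving record is read or rewritten, so the cost does not depend on how many blocks remain in the segment.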
Testing
cargo test -p reth-static-file-types
cargo check -p reth-provider

cc @joshieDo @mattsse