Skip to content

perf(storage): Static file provider delete_segment should use unwind_queue logic#21880

Closed
yongkangc wants to merge 10 commits intomainfrom
reth-343/delete-segment-unwind-queue
Closed

perf(storage): Static file provider delete_segment should use unwind_queue logic#21880
yongkangc wants to merge 10 commits intomainfrom
reth-343/delete-segment-unwind-queue

Conversation

@yongkangc
Copy link
Member

@yongkangc yongkangc commented Feb 6, 2026

Summary
delete_segment and delete_segment_below_block previously deleted static file jars immediately during the pruner run, before DatabaseProvider::commit().

This bypassed the unwind commit ordering (DB first, then static files) that ensures crash consistency, a crash after jar deletion but before DB commit left the database referencing jars that no longer existed on disk.

Solution:
This PR defers jar deletions to commit() time, matching the existing prune_on_commit pattern. has_unwind_queued() now returns true when deletions are queued, routing DatabaseProvider::commit() through the unwind path (DB commits first, then static files are deleted).

A single initialize_index() call replaces the per-jar reindex loop.

Expected Impact
Crash-consistent commit ordering for segment deletions, and O(1) reindexing instead of O(n) per-jar reindex during pruning.

Resolves RETH-343

@yongkangc yongkangc added the A-static-files Related to static files label Feb 6, 2026
@yongkangc yongkangc self-assigned this Feb 6, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Feb 6, 2026
yongkangc

This comment was marked as spam.

Make delete_segment and delete_segment_below_block in the static file
provider use an unwind queue pattern for consistency with the hot/cold
separation architecture. Deletions are now queued and only executed
when commit() is called, matching the existing prune_on_commit pattern.

- Add delete_queue to StaticFileProviderInner for deferred jar deletions
- Refactor delete_jar into delete_jar + delete_jar_no_reindex
- delete_segment/delete_segment_below_block now queue instead of
  immediately deleting
- commit() drains the delete queue, resets affected writers, executes
  deletions, and rebuilds the index once
- has_unwind_queued() checks the delete queue
- finalize() errors if delete queue is non-empty
- Add reset(segment) to StaticFileWriters to clear stale writer handles
- Add 3 tests for deferred deletion behavior

Amp-Thread-ID: https://ampcode.com/threads/T-019c30ef-1a42-75c9-b61d-81e062acc279
- Use highest expected range end (BTreeMap last key) instead of
  index.max_block for highest jar exclusion in delete_segment_below_block,
  fixing incorrect deletion of partially-filled highest jars
- Replace existing queue entries for the same segment on re-queue,
  preventing duplicate deletions if called twice before commit

Amp-Thread-ID: https://ampcode.com/threads/T-019c30ef-1a42-75c9-b61d-81e062acc279
…ntics, add 6 tests

- Extract collect_headers_and_queue() to deduplicate header loading
  between delete_segment and delete_segment_below_block
- Use scoped blocks instead of drop(indexes) for index reads
- Rename commit result variable to first_delete_err with comment
  clarifying reindex failure takes priority
- Use take_while iterator instead of manual loop in
  delete_segment_below_block
- Add tests: finalize guard, empty segment (both APIs), partial
  highest jar preservation, delete_below+delete_segment interaction,
  commit with empty queue noop

Amp-Thread-ID: https://ampcode.com/threads/T-019c3135-f7cb-7168-bc87-28a27f381a5a
- Failed jar deletions are re-enqueued for retry on next commit instead
  of being permanently lost.
- Rename has_delete_queued to has_queued_deletions.
- Remove obvious doc comments, trim test comments.
- Add test_write_after_full_deletion and
  test_delete_segment_then_delete_below_block.

Amp-Thread-ID: https://ampcode.com/threads/T-019c314c-645a-7311-905b-52de28ceaefb
If delete_jar_no_reindex fails midway through the deletion loop,
remaining operations are re-inserted into the queue so the next
commit() retries them. initialize_index() now always runs regardless
of deletion success to keep the in-memory index consistent.

Amp-Thread-ID: https://ampcode.com/threads/T-019c3170-799e-757d-ab53-6c1439d32885
@yongkangc yongkangc force-pushed the reth-343/delete-segment-unwind-queue branch from b1801ab to c9cdc25 Compare February 6, 2026 06:07
@github-actions
Copy link
Contributor

github-actions bot commented Feb 6, 2026

⚠️ Changelog not found.

A changelog entry is required before merging. We've generated a suggested changelog based on your changes:

Preview
---
reth-provider: minor
---

Added deferred deletion for static file jars with transactional commit semantics. Jar deletions requested via `delete_segment` and `delete_segment_below_block` are now queued and executed atomically during `StaticFileWriter::commit`, preventing inconsistencies between database state and on-disk files. Failed deletions are automatically re-queued for retry on subsequent commits.

Add changelog to commit this to your branch.

@yongkangc yongkangc marked this pull request as ready for review February 6, 2026 06:09
@yongkangc yongkangc requested a review from Rjected February 6, 2026 06:09
@yongkangc yongkangc changed the title feat(storage): defer delete_segment operations until commit perf(storage): Static file provider delete_segment should use unwind_queue logic Feb 6, 2026
- Add docstrings for queue_delete, has_queued_deletions, take_delete_queue,
  queue_delete_raw, delete_jar_no_reindex, reset, commit, has_unwind_queued,
  and finalize
- Add step-by-step section comments inside commit() for readability
- Rename max_block -> expected_end in delete_segment_below_block to clarify
  that the comparison uses expected block range ends (BTreeMap keys), not
  actual block numbers, preventing confusion about partially-filled jars
- Add comment explaining why highest jar exclusion is correct for
  partially-filled jars

Amp-Thread-ID: https://ampcode.com/threads/T-019c3194-a863-7159-9eed-8263c91452ce
- Add error semantics doc to commit() clarifying that DB changes are
  already durable when this method runs, and callers must not roll back.
- Clarify delete_segment_below_block docs: explicitly document that the
  highest jar is always preserved even when its blocks are below the
  threshold, and point to delete_segment for full removal.
- Add test for requeue-on-failure path: sabotages a jar's config file
  to force deletion failure, verifies unprocessed ops are requeued and
  has_unwind_queued remains true across retries.

Amp-Thread-ID: https://ampcode.com/threads/T-019c31f9-d89c-72bd-99bc-7937843012b4
@yongkangc yongkangc closed this Feb 11, 2026
@github-project-automation github-project-automation bot moved this from Backlog to Done in Reth Tracker Feb 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-static-files Related to static files

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

1 participant