
feat(cli): add reth db migrate-v2 for v1→v2 storage migration #23422

Merged
klkvr merged 11 commits into main from klkvr/db-migrate-v2 on Apr 18, 2026
decofe (Member) commented Apr 9, 2026

Adds reth db migrate-v2 — an offline CLI command that migrates a v1 (MDBX-only) database to v2 (static files + RocksDB hybrid).

Motivation

Teams running v1 archive/pruned nodes need a way to migrate to v2 storage without re-syncing from scratch.

Changes

  • crates/cli/commands/src/db/migrate_v2.rs: migration logic
  • crates/cli/commands/src/db/mod.rs: wire subcommand, pipeline rebuild, compaction

Approach

Only migrate data that cannot be recomputed (changesets + receipts), then clear everything that can be recomputed and run the pipeline to rebuild it.

Migration phases

  1. Preflight — verify v1 settings, check changeset static file targets are empty
  2. AccountChangeSets → static files (all blocks first..=tip, including empty)
  3. StorageChangeSets → static files (all blocks first..=tip, including empty)
  4. Receipts → static files (reuses Segment::copy_to_static_files; skipped if receipts_log_filter pruning is enabled)
  5. Flip StorageSettings to v2
  6. Clear recomputable tables — TransactionSenders, TransactionHashNumbers, AccountsHistory, StoragesHistory, PlainAccountState, PlainStorageState, AccountsTrie, StoragesTrie, plus migrated changeset MDBX tables
  7. Reset stage checkpoints — SenderRecovery, TransactionLookup, IndexAccountHistory, IndexStorageHistory, MerkleExecute, MerkleUnwind → 0
  8. Run pipeline: DefaultStages with noop downloaders/consensus/evm and max_block=tip. Stages already at tip (Headers, Bodies, Execution) no-op; stages with reset checkpoints rebuild their tables
  9. Compact MDBX via mdbx_env_copy + swap
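
Phases 2 and 3 write an entry for every block in first..=tip, including blocks that have no changes. A minimal sketch of that gap-filling step, assuming the changes are available as a sparse map keyed by block number. `Change` and `changesets_per_block` are hypothetical names for illustration, not reth types, and the actual command presumably streams into static file writers rather than building a `Vec`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one changeset entry (not a reth type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Change {
    pub address: u64,
    pub old_value: u64,
}

/// Produces one entry for every block in `first..=tip`.
/// Blocks missing from `sparse` get an empty list, so the output has no gaps.
pub fn changesets_per_block(
    sparse: &BTreeMap<u64, Vec<Change>>,
    first: u64,
    tip: u64,
) -> Vec<(u64, Vec<Change>)> {
    (first..=tip)
        .map(|block| (block, sparse.get(&block).cloned().unwrap_or_default()))
        .collect()
}
```

Calling `changesets_per_block(&sparse, 10, 12)` with changes recorded only for block 11 yields three entries, with blocks 10 and 12 empty.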

Pruned node support

Uses PruneCheckpoints per segment to find the first unpruned block (AccountHistory, StorageHistory, Receipts). Static file writers start at the correct block via get_writer(first_block, segment).
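
A rough sketch of how a first unpruned block could be derived from a prune checkpoint. The struct below is a simplified, hypothetical stand-in rather than reth's own checkpoint type, and it assumes the checkpoint records the highest pruned block, so migration starts one block later (or at 0 when nothing was pruned).

```rust
/// Hypothetical, simplified prune checkpoint (not the reth type).
#[derive(Debug, Clone, Copy)]
pub struct PruneCheckpoint {
    /// Highest block that has been pruned for this segment, if any.
    pub block_number: Option<u64>,
}

/// Returns the first block whose data is expected to still be present.
/// Assumption: everything up to and including `block_number` was pruned.
pub fn first_unpruned_block(checkpoint: Option<PruneCheckpoint>) -> u64 {
    checkpoint
        .and_then(|c| c.block_number)
        .map_or(0, |pruned| pruned.saturating_add(1))
}
```

The result would then be the `first_block` passed to the segment's writer.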

Safety

  • All migration providers use disable_long_read_transaction_safety()
  • Metadata update happens before clearing — interrupted clear still leaves a valid v2 database
  • Receipts with receipts_log_filter pruning stay in MDBX
  • MDBX compaction uses rename-swap with backup for crash safety
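
The rename-swap with backup mentioned in the last bullet can be illustrated with plain `std::fs` calls. This is a sketch under stated assumptions, not the code from this PR: the function name and the three paths are hypothetical, and all paths are assumed to sit on the same filesystem so that each rename only moves a directory entry.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Swaps `compacted` into the place of `live`, keeping the old file at
/// `backup` until the swap has succeeded.
pub fn swap_with_backup(live: &Path, compacted: &Path, backup: &Path) -> io::Result<()> {
    // Step 1: move the live file aside. A crash after this point still
    // leaves the original data at `backup`.
    fs::rename(live, backup)?;
    // Step 2: move the compacted copy into place, restoring the backup
    // if that fails.
    if let Err(err) = fs::rename(compacted, live) {
        fs::rename(backup, live)?;
        return Err(err);
    }
    // Step 3: the swap succeeded, so the backup is no longer needed.
    fs::remove_file(backup)
}
```

At every step at least one complete copy of the database file exists on disk, which is the property the backup is meant to provide.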

Prompted by: klkvr

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
decofe and others added 8 commits April 9, 2026 14:38
Adds PlainAccountState and PlainStorageState to the --prune-mdbx
table list (superseded by HashedAccounts/HashedStorages in v2).

Runs mdbx_env_copy with MDBX_CP_COMPACT after pruning to reclaim
freed space, then swaps the compacted copy in after dropping the
DB handle.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
Static files may not start at block 0 on pruned nodes. Use
cursor.first() to find the actual first available block, then
get_writer(first_block, segment) instead of latest_writer(segment).
Also call ensure_at_block(tip) for TransactionSenders to fill
trailing empty blocks.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
- Remove --prune-mdbx flag: pruning migrated tables (including
  PlainAccountState/PlainStorageState) and compaction are now
  always performed as part of the migration.
- Skip receipt migration when receipts_log_filter pruning is
  enabled (receipts must stay in MDBX for log filter queries).
- Remove Command fields — it's now a unit struct.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
On pruned nodes, BlockBodyIndices may not start at block 0. Use
cursor.first() to find the first available block, same as the
other segment migrations.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
BlockBodyIndices is never pruned, so cursor.first() always returns
block 0. Use PruneCheckpoints instead to find the first unpruned
block per segment (SenderRecovery, AccountHistory, StorageHistory,
Receipts).

Also clear AccountsTrie and StoragesTrie tables and reset the
MerkleExecute stage checkpoint to 0 so the trie is rebuilt on
next startup.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
… the rest

Simplify migrate-v2 to only copy data that can't be recomputed:
- AccountChangeSets → static files
- StorageChangeSets → static files
- Receipts → static files (unless log filter pruning)

Then clear all recomputable tables and reset their stage
checkpoints to 0. The pipeline will rebuild on next startup:
- TransactionSenders (SenderRecovery)
- TransactionHashNumbers (TransactionLookup)
- AccountsHistory / StoragesHistory (IndexAccountHistory / IndexStorageHistory)
- AccountsTrie / StoragesTrie (MerkleExecute)
- PlainAccountState / PlainStorageState (no longer needed in v2)

Also use disable_long_read_transaction_safety() on all migration
providers to avoid tripping the long-tx watchdog.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
Build a DefaultStages pipeline with noop downloaders/consensus/evm
and max_block=tip, then run it after migration. Stages already at
tip (Headers, Bodies, Execution) will no-op; stages with reset
checkpoints (SenderRecovery, TransactionLookup, IndexAccountHistory,
IndexStorageHistory, MerkleExecute) will execute and rebuild their
tables.

Compact MDBX after the pipeline finishes.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
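
The no-op versus rebuild behavior described in the commit above comes down to comparing each stage's checkpoint with max_block. A simplified sketch, assuming stage names are plain strings and checkpoints are block numbers; `stages_to_run` is a hypothetical helper and does not use reth's pipeline or stage types.

```rust
/// Returns, in pipeline order, the stages whose checkpoint is below
/// `max_block` and which therefore still have work to do.
pub fn stages_to_run<'a>(checkpoints: &[(&'a str, u64)], max_block: u64) -> Vec<&'a str> {
    checkpoints
        .iter()
        // A stage already at (or past) the target block has nothing to do.
        .filter(|(_, checkpoint)| *checkpoint < max_block)
        .map(|(name, _)| *name)
        .collect()
}
```

With Headers, Bodies, and Execution at the tip and the reset stages at 0, only the reset stages are returned.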
…peline

Move all logic (pipeline + compaction) into migrate_v2::Command.
mod.rs is now a thin wrapper: init env → execute → reopen → pipeline.

Compaction now runs BEFORE the pipeline so the pipeline operates
on the smaller compacted database.

Flow: migrate data → flip v2 → clear tables → compact MDBX →
swap → reopen → run pipeline to rebuild.

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d725d-8f9f-740f-abae-cea809eeb511
emmajam (Member) commented Apr 10, 2026

nice!

github-project-automation (bot) moved this from Backlog to In Progress in Reth Tracker on Apr 15, 2026
shekhirin requested a review from gakonst as a code owner on April 15, 2026 17:21
klkvr added this pull request to the merge queue on Apr 18, 2026
Merged via the queue into main with commit 03a308d on Apr 18, 2026
34 checks passed
klkvr deleted the klkvr/db-migrate-v2 branch on April 18, 2026 18:21
github-project-automation (bot) moved this from In Progress to Done in Reth Tracker on Apr 18, 2026
crazywriter1 pushed a commit to crazywriter1/tempo that referenced this pull request Apr 23, 2026
Automated nightly update of reth dependencies from `paradigmxyz/reth`
main branch.

## Upstream reth changes


[`98ebc34...7839f3d`](paradigmxyz/reth@98ebc34...7839f3d)

🔗 Amp thread:
https://ampcode.com/threads/T-019db884-a7c3-738b-8f38-cc04f8942d8a
**Engine**
- Suppress persistence during payload building
([#23618](paradigmxyz/reth#23618))
- Align Amsterdam endpoint validation
([#23625](paradigmxyz/reth#23625))
- Revert [#23541](paradigmxyz/reth#23541) and
[#23578](paradigmxyz/reth#23578)
([#23646](paradigmxyz/reth#23646))
- Let consensus impls control which errors are transient
([#23668](paradigmxyz/reth#23668))
- Configure invalid header cache hit eviction
([#23670](paradigmxyz/reth#23670))

**Perf**
- Relax executor reset thresholds for re-execute
([#23617](paradigmxyz/reth#23617))
- Replace `BTreeMap` with `imbl::OrdMap` in `BestTransactions`
([#23621](paradigmxyz/reth#23621))
- Avoid reopening `.csoff` on every changeset lookup
([#23687](paradigmxyz/reth#23687))
- Disable read tx timeout during re-execute
([#23680](paradigmxyz/reth#23680))

**P2P / Net**
- Add snap/2 wire helpers and messages
([#23611](paradigmxyz/reth#23611))
- Optionally fetch BAL with full blocks
([#23629](paradigmxyz/reth#23629))
- Discv5 enabled by default
([#23686](paradigmxyz/reth#23686))

**DB**
- Add `reth db migrate-v2` for v1→v2 storage migration
([#23422](paradigmxyz/reth#23422))
- Detect and warn about ZFS
([#23685](paradigmxyz/reth#23685))

**BAL**
- Scaffold BAL store abstraction
([#23596](paradigmxyz/reth#23596))
- Enable BAL building in ethereum payload
([#23597](paradigmxyz/reth#23597))
- Add parallelization and batch IO flags
([#23663](paradigmxyz/reth#23663))

**Refactor**
- Make `WorkerPool` lazy by default
([#23627](paradigmxyz/reth#23627))
- Encapsulate state fetching in db provider
([#23656](paradigmxyz/reth#23656))
- Remove `TrieNodeProvider`
([#23658](paradigmxyz/reth#23658))
- Unify opaque consensus error helpers
([#23669](paradigmxyz/reth#23669))

**Payload**
- Add gas limit and slot number to `BlockOrPayload`
([#23624](paradigmxyz/reth#23624),
[#23626](paradigmxyz/reth#23626))

**Bench**
- Add CLI flag to fetch balances by default; require local benchmark
data ([#23655](paradigmxyz/reth#23655),
[#23679](paradigmxyz/reth#23679))

**Deps**
- Bump alloy crates to 2.0.1
([#23677](paradigmxyz/reth#23677)),
rustls-webpki
([#23681](paradigmxyz/reth#23681)), weekly
`cargo update`
([#23628](paradigmxyz/reth#23628))

**Testing**
- Remove unsafe `env::set_var(RUST_LOG)` from tests
([#23672](paradigmxyz/reth#23672))
- Address nightly clippy warnings
([#23630](paradigmxyz/reth#23630))

## Migrations

🔗 Amp thread:
https://ampcode.com/threads/T-019db884-dc46-71f5-a823-00c3a16191d4
- **Reth dependency bump**: All `reth-*` git dependencies updated from
rev `98ebc34` to `7839f3d`
- **Alloy version bump**: `alloy-*` crates updated from `2.0.0` to
`2.0.1`; `alloy-evm` changed from `0.33.2` to `0.33.0`
- **`ConsensusError::Other` → `ConsensusError::msg`**: All
`ConsensusError::Other(...)` calls migrated to
`ConsensusError::msg(...)`, which accepts `&str`/`impl Display` directly
instead of requiring `String` (removes `.to_string()` calls for string
literals)
- **`deny.toml` license exceptions**: Added MPL-2.0 exceptions for
`bitmaps`, `imbl`, and `imbl-sized-chunks` (new transitive dependencies)

[GitHub
Workflow](https://github.com/tempoxyz/tempo/actions/runs/24816009191)

---------

Co-authored-by: Alexey Shekhirin <github@shekhirin.com>
sieniven pushed a commit to okx/reth that referenced this pull request Apr 28, 2026
…digmxyz#23422)

Co-authored-by: Arsenii Kulikov <62447812+klkvr@users.noreply.github.com>
Co-authored-by: klkvr <klkvrr@gmail.com>
Co-authored-by: Alexey Shekhirin <github@shekhirin.com>