Simplify and harden re-org handling (single writer path in mapping-sync) #1820

Merged: librelois merged 7 commits into master from elois/refactor-reorgs on Feb 19, 2026

librelois (Member) commented Feb 13, 2026

Goal

Simplify and harden re-org handling by moving canonical block-number mapping updates to a single writer path in mapping-sync, while keeping RPC reads non-mutating and preserving non-null eth_getBlockByNumber("latest") behavior.

What Changed

  • Centralized canonical block-number reconciliation in mapping-sync:

    • Added client/mapping-sync/src/kv/canonical_reconciler.rs as the single path that updates canonical number-to-hash mappings.
    • Re-org handling now reconciles a computed canonical window around the affected range.
    • Background repair batches now scan backward from the latest finalized/cursor so recent canonical blocks are repaired first.
    • Cursor updates, pointer advancement, and reconciliation stats are consolidated for monotonic/idempotent behavior.
  • Removed opportunistic mapping writes outside that path:

    • sync_block no longer conditionally writes number mappings.
    • RPC number-resolution no longer repairs/stomps mappings as a side effect.
  • Kept RPC reads non-mutating and resilient:

    • resolve_canonical_substrate_hash_by_number is read-only.
    • latest resolution now uses latest indexed block with bounded fallback to nearest readable canonical ancestor (LATEST_READABLE_SCAN_LIMIT = 128) plus a cached readable-latest hint.
    • eth_getBlockByNumber / eth_getBlockByHash still return rich blocks when tx statuses are temporarily missing (status slots are filled with None), which avoids transient null responses.
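
The status-slot behavior in the last bullet can be sketched as follows. This is a minimal illustration, not the code in client/rpc/src/eth/block.rs: `TxStatus` and `status_slots` are hypothetical names, and treating a count mismatch the same as missing statuses is an assumption made for the example.

```rust
/// Hypothetical stand-in for an indexed transaction status record.
#[derive(Debug, Clone, PartialEq)]
pub struct TxStatus {
    pub transaction_index: u32,
}

/// Returns one slot per transaction in the block.
///
/// When statuses are not indexed yet (`None`), or their count does not match
/// the block (an assumption of this sketch), every slot is `None`, so the
/// caller can still build a block response instead of returning null.
pub fn status_slots(
    tx_count: usize,
    statuses: Option<Vec<TxStatus>>,
) -> Vec<Option<TxStatus>> {
    match statuses {
        Some(found) if found.len() == tx_count => found.into_iter().map(Some).collect(),
        _ => vec![None; tx_count],
    }
}
```

Calling `status_slots(2, None)` yields two empty slots, so a block with two transactions can still be rendered while the statuses catch up.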

Tests

  • Rust:

    • Added reconciliation tests for idempotency, monotonic latest-pointer behavior, and backward-priority batch repair.
    • Added tests for latest-readable selection logic and read-only canonical-hash resolution behavior.
  • TS:

    • ts-tests/tests/test-latest-block-consistency.ts: stress polling + re-org storm coverage to assert non-null latest/explicit block queries and eventual convergence.
    • ts-tests/tests/test-fee-history.ts: added bounded wait helper to handle short cache lag and reduce flakiness.

Reviewer Notes

  • Intentional architecture change: mapping-sync is now the only writer for canonical number mappings.
  • RPC paths are intentionally side-effect free.
  • Main areas to review:
    • Reconciliation window/cursor boundary handling.
    • Behavior under sustained lag/re-org pressure.
    • Any downstream assumptions that latest always equals immediate tip during indexing lag.
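
For reviewers looking at window and cursor boundaries, the properties described above (single writer, idempotent writes, backward scan, monotonic latest pointer) can be sketched with an in-memory model. All names here (`MappingStore`, `Stats`, `reconcile_window`) are hypothetical and the store is a plain `BTreeMap`; the actual logic lives in canonical_reconciler.rs and may differ.

```rust
use std::collections::BTreeMap;

/// Shortened block hash, for illustration only.
pub type Hash = [u8; 4];

/// Hypothetical in-memory stand-in for the number-to-hash mapping store.
#[derive(Debug, Default)]
pub struct MappingStore {
    pub number_to_hash: BTreeMap<u64, Hash>,
    pub latest_pointer: u64,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct Stats {
    pub scanned: u64,
    pub updated: u64,
}

/// Reconciles the inclusive window `[start, end]` against the canonical chain,
/// walking backward from `end` so the most recent blocks are repaired first.
pub fn reconcile_window(
    store: &mut MappingStore,
    canonical: &BTreeMap<u64, Hash>,
    start: u64,
    end: u64,
) -> Stats {
    let mut stats = Stats::default();
    let mut highest_mapped: Option<u64> = None;
    for number in (start..=end).rev() {
        stats.scanned += 1;
        if let Some(hash) = canonical.get(&number) {
            // Write only when the stored mapping differs, which keeps a
            // repeated run over the same window free of writes (idempotent).
            if store.number_to_hash.get(&number) != Some(hash) {
                store.number_to_hash.insert(number, *hash);
                stats.updated += 1;
            }
            highest_mapped = highest_mapped.max(Some(number));
        }
    }
    // The latest pointer only ever moves forward (monotonic).
    if let Some(highest) = highest_mapped {
        store.latest_pointer = store.latest_pointer.max(highest);
    }
    stats
}
```

Running the function twice over the same window should report updates on the first pass and none on the second, and reconciling a lower window afterwards should leave `latest_pointer` unchanged.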

Compatibility / Ops Impact

  • No public API schema changes.
  • Behavior improves under lag/re-orgs (fewer null responses / fewer transient inconsistencies).
  • No operator migration steps required.

coderabbitai bot commented Feb 13, 2026

Walkthrough

This PR introduces a canonical reconciliation engine for the KV mapping-sync layer that aligns frontier's canonical state with on-chain state, refactors block synchronization to use this engine, improves RPC block resolution with cached readable hash detection, and updates integration tests to handle asynchronous block availability more robustly.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Canonical Reconciliation Engine: client/mapping-sync/src/kv/canonical_reconciler.rs, client/mapping-sync/src/kv/mod.rs | New module with reconciliation logic: ReconcileWindow, ReconcileStats types and three entry points (build_reconcile_window, reconcile_reorg_window, reconcile_from_cursor_batch). Supports ascending/descending scans, cursor-based progress tracking, and invariant validation. sync_block signature simplified to remove write_number_mapping parameter. |
| Reconciliation Integration: client/mapping-sync/src/kv/worker.rs | Updated to call canonical_reconciler::reconcile_from_cursor_batch instead of prior repair function; adjusted logging context from "mapping-sync" to "reconcile". |
| Block RPC Improvements: client/rpc/src/eth/block.rs | Added helper functions status_slots_or_missing and rich_block_or_none to refactor repeated block construction patterns; simplifies control flow in block_by_hash and block_by_number. |
| Cached Readable Hash Resolution: client/rpc/src/eth/mod.rs | Introduced find_readable_hash_from_number_desc helper and per-instance cache for latest readable substrate hash. Reworked latest_indexed_hash_with_block to perform bounded and exhaustive searches for readable hashes; removed storage_override parameter from resolve_canonical_substrate_hash_by_number. |
| Test Infrastructure Updates: ts-tests/tests/test-contract-methods.ts, ts-tests/tests/test-fee-history.ts, ts-tests/tests/test-latest-block-consistency.ts, ts-tests/tests/test-receipt-consistency.ts, ts-tests/tests/test-subscription.ts, ts-tests/tests/test-transaction-version.ts | Added polling helpers (waitForFeeHistory, waitForReceipt, waitForTxPoolPendingAtLeast, waitForTransactionSeen) and relaxed strict assertions on block/transaction consistency; replaced direct RPC calls with polling loops to handle asynchronous block availability; updated to async/await patterns. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant KVBackend as KV Backend
    participant Reconciler as Canonical Reconciler
    participant Storage as Storage Override
    participant Frontier as Frontier DB

    Client->>Reconciler: build_reconcile_window(reorg_info, new_best_hash)
    Reconciler->>KVBackend: query headers & reorg state
    KVBackend-->>Reconciler: header data
    Reconciler-->>Client: ReconcileWindow {start, end}

    Client->>Reconciler: reconcile_reorg_window(window, sync_from)
    loop For each block in range
        Reconciler->>KVBackend: get block header
        KVBackend-->>Reconciler: header
        Reconciler->>Storage: get canonical mapping
        Storage-->>Reconciler: mapping
        Reconciler->>Frontier: update canonical hash
        Frontier-->>Reconciler: ack
    end
    Reconciler->>KVBackend: update_repair_cursor(strategy)
    KVBackend-->>Reconciler: cursor updated
    Reconciler->>Reconciler: validate_latest_pointer_invariant()
    Reconciler-->>Client: ReconcileStats {scanned, updated, lag_blocks}
sequenceDiagram
    participant Eth as Eth RPC Handler
    participant Cache as Readable Hash Cache
    participant Client as Substrate Client
    participant Backend as FC Backend

    Eth->>Eth: latest_indexed_hash_with_block()
    Eth->>Cache: check cached hash
    alt Cache hit & valid
        Cache-->>Eth: cached hash
    else Cache miss or invalid
        Eth->>Client: get latest indexed block
        Client-->>Eth: latest block number
        Eth->>Eth: find_readable_hash_from_number_desc(bounded range)
        loop Descending block scan
            Eth->>Backend: is_readable(block_number)
            Backend-->>Eth: readable: bool
        end
        Eth->>Cache: update cache with readable hash
        Cache-->>Eth: ack
    end
    Eth-->>Eth: return hash with metadata
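
The second diagram can be approximated by the sketch below: a bounded descending scan from the latest indexed block, with a cached hint as a fallback. The constant mirrors LATEST_READABLE_SCAN_LIMIT from the PR description; `find_readable_desc`, `LatestReadable`, and the exact ordering of scan versus cache lookup are assumptions for illustration, not the code in client/rpc/src/eth/mod.rs.

```rust
use std::sync::Mutex;

/// Mirrors the limit named in the PR description.
pub const LATEST_READABLE_SCAN_LIMIT: u64 = 128;

/// Scans downward from `latest_indexed`, visiting at most `limit` block
/// numbers, and returns the first number that `is_readable` accepts.
pub fn find_readable_desc(
    latest_indexed: u64,
    limit: u64,
    is_readable: impl Fn(u64) -> bool,
) -> Option<u64> {
    if limit == 0 {
        return None;
    }
    let lowest = latest_indexed.saturating_sub(limit - 1);
    (lowest..=latest_indexed)
        .rev()
        .find(|number| is_readable(*number))
}

/// Hypothetical per-instance cache holding the last readable block number.
#[derive(Debug, Default)]
pub struct LatestReadable {
    hint: Mutex<Option<u64>>,
}

impl LatestReadable {
    /// Resolves "latest" without writing to any mapping: try the bounded scan
    /// first, then fall back to the cached hint if it is still readable.
    pub fn resolve(
        &self,
        latest_indexed: u64,
        is_readable: impl Fn(u64) -> bool,
    ) -> Option<u64> {
        let mut hint = self.hint.lock().unwrap();
        match find_readable_desc(latest_indexed, LATEST_READABLE_SCAN_LIMIT, &is_readable) {
            Some(number) => {
                *hint = Some(number);
                Some(number)
            }
            None => (*hint).filter(|number| *number <= latest_indexed && is_readable(*number)),
        }
    }
}
```

The bound keeps the worst-case cost of a `latest` query fixed during long indexing lag, while the hint gives a non-null answer when nothing in the scanned range is readable yet.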

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • sorpaas
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: centralizing re-org handling into a single writer path in mapping-sync with simplified and hardened behavior. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, explaining the architectural change, removed/modified functions, and behavioral improvements across multiple files. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into master |


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
ts-tests/tests/test-subscription.ts (1)

181-219: Consider applying similar timeout pattern to other subscription tests.

Other tests in this file (e.g., this one and tests at lines 221, 357, 393, 432, 474, 519) still use the manual dataResolve pattern without timeout protection. If the subscription fails to emit data, these tests would hang until the global test timeout.

Additionally, several tests use async function (done) (lines 112, 181, 221, etc.) which mixes Mocha's promise handling with callback handling—an anti-pattern that can cause subtle issues.

Consider applying the same dataPromise pattern with timeout and error handling for consistency and robustness across all subscription tests.

ts-tests/tests/test-receipt-consistency.ts (1)

31-42: Consider documenting the side-effect of block creation in waitForReceipt.

The helper calls createAndFinalizeBlockNowait between polling attempts, which creates additional blocks while waiting for a receipt. This is intentional to ensure receipts become available, but a brief comment explaining this behavior would improve clarity for future maintainers.

📝 Suggested documentation
 	async function waitForReceipt(txHash: string, timeoutMs = 10000) {
+		// Creates blocks between polls to ensure the transaction gets included
+		// if not already in a block.
 		const start = Date.now();
 		while (Date.now() - start < timeoutMs) {

@librelois librelois changed the title from "Simplify and harden re-org handling" to "Simplify and harden re-org handling (single writer path in mapping-sync)" Feb 13, 2026
@librelois librelois marked this pull request as ready for review February 13, 2026 17:17
@librelois librelois requested a review from sorpaas as a code owner February 13, 2026 17:17
type RuntimeStorageOverride = ();
}

const LATEST_READABLE_SCAN_LIMIT: u64 = 128;
Collaborator

Shouldn't users be able to easily change this? Maybe adding an extra (optional) param for it (in EthConfiguration)?
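
One possible shape for such an optional parameter is sketched below. It is purely hypothetical: `EthRpcOptions` and `latest_readable_scan_limit` are names invented for this example and are not fields of the existing EthConfiguration; the default of 128 matches the constant quoted above.

```rust
/// Default taken from the constant quoted in this review thread.
pub const DEFAULT_LATEST_READABLE_SCAN_LIMIT: u64 = 128;

/// Hypothetical options struct; not the project's actual configuration type.
#[derive(Debug, Default, Clone)]
pub struct EthRpcOptions {
    /// Optional override for how many blocks the `latest` fallback may scan.
    pub latest_readable_scan_limit: Option<u64>,
}

impl EthRpcOptions {
    /// Returns the configured limit, or the built-in default when unset.
    pub fn scan_limit(&self) -> u64 {
        self.latest_readable_scan_limit
            .unwrap_or(DEFAULT_LATEST_READABLE_SCAN_LIMIT)
    }
}
```

Keeping the field optional would leave existing deployments on the current default while allowing an override where indexing lag is typically larger.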

@librelois librelois added this pull request to the merge queue Feb 19, 2026
github-merge-queue bot pushed a commit that referenced this pull request Feb 19, 2026
…nc) (#1820)

* rework re-orgs handling

* rustfmt

* clippy

* Batch reconciliation should scan backward

* On slow runners fee-history cache lags slightly behind block production

* keep block queries non-null on missing statuses; bound + cache latest fallback

* Harden RPC/latest consistency and stabilize flaky ts-tests under indexing lag
@librelois librelois removed this pull request from the merge queue due to a manual request Feb 19, 2026
@librelois librelois merged commit f82aaf2 into master Feb 19, 2026
7 checks passed
@librelois librelois deleted the elois/refactor-reorgs branch February 19, 2026 15:36
librelois added a commit to moonbeam-foundation/frontier that referenced this pull request Feb 19, 2026
…nc) (polkadot-evm#1820)

* rework re-orgs handling

* rustfmt

* clippy

* Batch reconciliation should scan backward

* On slow runners fee-history cache lags slightly behind block production

* keep block queries non-null on missing statuses; bound + cache latest fallback

* Harden RPC/latest consistency and stabilize flaky ts-tests under indexing lag
librelois added a commit to moonbeam-foundation/moonbeam that referenced this pull request Feb 20, 2026
…ntier#1824 (#3677)

* update frontier pin

* Configure AllowUnprotectedTxs to false

* Fix compile error

* Temporary allow unprotected txs to not break tests

* fix dev tests

* fix tracing tests

* fix coderabbit suggestion
arturgontijo pushed a commit to moonbeam-foundation/moonbeam that referenced this pull request Feb 23, 2026
…ntier#1824 (#3677)

* update frontier pin

* Configure AllowUnprotectedTxs to false

* Fix compile error

* Temporary allow unprotected txs to not break tests

* fix dev tests

* fix tracing tests

* fix coderabbit suggestion