Add workflows#2

Merged
chengwenxi merged 4 commits into main from workflows
Jan 4, 2026

Conversation

@chengwenxi
Contributor

No description provided.

@chengwenxi chengwenxi merged commit 56e575b into main Jan 4, 2026
7 checks passed
@chengwenxi chengwenxi deleted the workflows branch January 4, 2026 12:05
panos-xyz added a commit that referenced this pull request May 9, 2026
* feat(reference-index): add morph-reference-index crate with MDBX storage layer

- 4-table schema: ReferenceIndex, BlockReferenceIndex, IndexedBlocks, IndexMeta
- chain identity validation on DB open (chain_id, genesis_hash, schema_version)
- BackfillState (NotStarted/InProgress/Complete) + is_ready AtomicBool
- writer: write_block / delete_block (three-table atomic)
- reader: prefix cursor query with is_ready + lag threshold guards
- backfill: jade_first_block binary search + batched backfill with crash recovery
- reconcile: canonical hash check (offline reorg detection) + suffix gap fill
- metrics helpers for lag, progress, state, readiness
- 17 unit tests all passing
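
The `jade_first_block` binary search mentioned in the backfill bullet could look roughly like the sketch below. It assumes fork activation is monotonic in block number (once active, always active); `first_active_block` and the `is_active` closure are hypothetical names for illustration, not the crate's API.

```rust
/// Returns the lowest block in `0..=head` for which `is_active` holds,
/// or `None` if the fork is not yet active at `head`.
/// Assumption: `is_active` is monotonic (false..false, true..true).
fn first_active_block(head: u64, is_active: impl Fn(u64) -> bool) -> Option<u64> {
    if !is_active(head) {
        return None;
    }
    let (mut lo, mut hi) = (0u64, head);
    // Invariant: is_active(hi) is true; every block below `lo` is inactive.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_active(mid) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    Some(lo)
}
```

Returning `None` when the fork is inactive at the head corresponds to the "sentinel" case discussed in later commits, where there is nothing to backfill yet.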

* feat(rpc): wire morph_getTransactionHashesByReference RPC + ExEx

Implements the RPC namespace, ExEx, and node integration that back the new
reference index storage layer added in the previous commit.

- morph-rpc: new morph_ namespace with getTransactionHashesByReference
  - returns -32000 "reference index initializing" before Task A completes
  - returns -32000 "reference index is behind" when lag exceeds threshold
  - validates limit/offset bounds with -32602
- morph-node: ReferenceIndexControl (watch-channel-coordinated) wires the
  startup indexing task (Task A) with the ExEx (Task B)
  - Task A: maybe_reset_jade_sentinel -> backfill -> reconcile ->
    is_ready=true -> startup FinishedHeight
  - Task B: drains notifications from node launch, gates writes on is_ready,
    gap-fills from main DB on first is_ready notification, processes
    ChainCommitted/Reverted/Reorged with three-table atomic writes, sends
    FinishedHeight(BlockNumHash)
- MorphAddOns: optional reference_index control; when present spawns Task A
  via task_executor and registers the morph_ RPC handler
- bin/morph-reth: opens reference index DB under <datadir>/morph/reference_index,
  installs the ExEx, and injects the control into MorphAddOns
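
The two state-gating responses described for `getTransactionHashesByReference` can be sketched as a single check run before any query. This is only an illustration under assumed names: `QueryGate`, `check_query_gate`, and the `max_lag` parameter are hypothetical, and the real handler's types are not shown in this commit message.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Why a query is refused (both map to -32000 per the commit message).
#[derive(Debug, PartialEq, Eq)]
enum QueryGate {
    /// Task A has not finished yet: "reference index initializing".
    Initializing,
    /// Index is too far behind the chain head: "reference index is behind".
    Behind { lag: u64 },
}

/// Checks readiness first, then the lag threshold.
fn check_query_gate(
    is_ready: &AtomicBool,
    head: u64,
    indexed_to: u64,
    max_lag: u64, // hypothetical threshold parameter
) -> Result<(), QueryGate> {
    if !is_ready.load(Ordering::Acquire) {
        return Err(QueryGate::Initializing);
    }
    let lag = head.saturating_sub(indexed_to);
    if lag > max_lag {
        return Err(QueryGate::Behind { lag });
    }
    Ok(())
}
```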

* chore(reference-index): fix clippy lints (collapsible_if, redundant_clone)

* chore: cargo fmt

* test(reference-index): add integration tests for backfill + query

Three node-level integration tests verify the reference index storage layer:
- finds single morph_tx by reference after backfill
- paginates results across multiple blocks correctly
- returns empty for unrelated reference keys

Tests run backfill + reconcile directly against the node's provider
(no ExEx required), which is sufficient to validate the query path.

* fix(reference-index): fix 5 adversarial review findings

Finding 4 (Critical): NoHash → WithHash for all historical block reads
  - backfill.rs, reconcile.rs, exex/reference_index.rs fill_gap
  - reth docs say TransactionVariant::NoHash produces invalid tx hashes;
    write_block stores tx_hash in index keys, so NoHash caused silent data
    corruption in all historical indexing paths

Finding 3 (Moderate): reconcile check_start clamped to indexed_from
  - sentinel path (jade not yet active): backfill sets indexed_to=head but
    writes no IndexedBlocks, so reconcile's scan would see None != canonical
    and trigger spurious fork_height at check_start
  - fix: take max(indexed_from, indexed_to - depth) as check_start
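
As a rough illustration of the clamp, assuming plain `u64` block numbers (the function name is hypothetical, and the saturating subtraction is an added safety assumption for `depth > indexed_to`):

```rust
/// First block that reconcile should compare against the canonical chain.
/// Never reaches below `indexed_from`, where no IndexedBlocks entries exist.
fn reconcile_check_start(indexed_from: u64, indexed_to: u64, depth: u64) -> u64 {
    indexed_from.max(indexed_to.saturating_sub(depth))
}
```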

Finding 1 (Critical): Task A startup failure upgraded to fatal
  - was: error! log and continue, leaving RPC stuck at "initializing" forever
  - now: panic propagated into spawn_critical → node shutdown on failure

Finding 5 (Moderate): first-ready gap fill is now delete-then-write
  - fill_gap renamed to fill_gap_idempotent, deletes stale entries first
  - also handles ChainReverted notification as first is_ready notification

Finding 8 (Nit): RPC method marked blocking for MDBX I/O
  - #[method(name = ..., blocking)] so jsonrpsee dispatches on blocking pool

Also: clarify write_block doc — not idempotent, caller must delete_block first

* fix(reference-index): fix 3 more confirmed bugs

Finding F2 (High): InProgress resume clamped to jade_first_block_number
  - If a crash happens between the InProgress write and the first batch commit,
    indexed_to is still 0; resume now uses max(indexed_to+1, jade_first)
    instead of plain indexed_to+1 to avoid re-indexing pre-Jade blocks
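
A minimal sketch of the resume rule, with a hypothetical function name and plain `u64` block numbers:

```rust
/// Block at which an interrupted backfill resumes.
/// With indexed_to still 0 after a crash, this yields `jade_first`
/// rather than block 1, so pre-Jade blocks are never re-indexed.
fn backfill_resume_block(indexed_to: u64, jade_first: u64) -> u64 {
    (indexed_to + 1).max(jade_first)
}
```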

Finding F3 (High): RPC offset/limit use alloy_serde::quantity::opt
  - geth sends hex-encoded quantities ("0x0", "0x64") on the wire;
    plain u64 serde deserializer rejected them silently
  - matches spec constraint #2: geth compatibility first
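
To make the wire format concrete, here is a std-only sketch of decoding a hex quantity such as `"0x64"`. It is not a reimplementation of `alloy_serde::quantity::opt` (which the fix actually uses); `parse_quantity` is a hypothetical helper that only shows why a plain integer deserializer does not accept these strings.

```rust
/// Parses a 0x-prefixed hex quantity ("0x0", "0x64") into a u64.
/// Illustration only: stricter or looser than the real deserializer may be.
fn parse_quantity(s: &str) -> Result<u64, String> {
    let digits = s
        .strip_prefix("0x")
        .ok_or_else(|| format!("missing 0x prefix: {s}"))?;
    // Reject empty input and anything that is not a hex digit (e.g. a sign).
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("invalid hex quantity: {s}"));
    }
    u64::from_str_radix(digits, 16).map_err(|e| format!("invalid hex quantity {s}: {e}"))
}
```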

Finding F5 (Medium): paired snapshot validation implemented
  - ReferenceIndexDb::validate_paired_snapshot(main_block_hash_fn)
    checks snapshot_block_hash/number against the main DB provider
    and returns descriptive errors on mismatch or ahead-of-DB cases
  - called from run_startup_indexing before backfill starts, after
    provider is available (matching spec timing requirement)

* fix(reference-index): first-ready ChainReverted must fill gap up to parent

Scenario:
  - reconcile finishes with indexed_to = H
  - is_ready=false drain window: ExEx drains ChainCommitted{H+1}, {H+2}
    but writes nothing
  - is_ready=true, first_ready=true
  - first notification is ChainReverted { old: [H+2] }
  - revert_start = H+2, parent = H+1

Old code:
  first_ready branch only filled when revert_start <= indexed_to (false here),
  so nothing happened. handle_notification then ran delete_block(H+2) as a
  no-op and set indexed_to = parent = H+1 -- but H+1 had never been written.
  Result: indexed_to is ahead of what is actually indexed, future reconcile
  sees phantom hash mismatch on H+1.

Fix:
  In first_ready ChainReverted branch, compute parent = revert_start - 1.
  If parent > indexed_to, drain window committed canonical blocks
  (indexed_to+1..=parent) that will SURVIVE this revert; backfill them now
  via fill_gap_idempotent. handle_notification then rolls back to parent
  consistently.

  parent <= indexed_to case (revert overlaps already-indexed range) is
  already correct without special handling.
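
The range computation in the fix can be sketched as below. `first_ready_revert_gap` is a hypothetical name; the real code passes the resulting range to `fill_gap_idempotent`, which is not reproduced here.

```rust
use std::ops::RangeInclusive;

/// Blocks that were committed during the drain window and survive the revert,
/// i.e. `indexed_to+1..=parent`, or `None` when the revert overlaps the
/// already-indexed range and no gap fill is needed.
fn first_ready_revert_gap(indexed_to: u64, revert_start: u64) -> Option<RangeInclusive<u64>> {
    // parent = revert_start - 1; a revert starting at block 0 has no parent.
    let parent = revert_start.checked_sub(1)?;
    if parent > indexed_to {
        Some(indexed_to + 1..=parent)
    } else {
        None
    }
}
```

With the scenario above and H = 100, `first_ready_revert_gap(100, 102)` yields `101..=101`: block H+1 is backfilled before the revert of H+2 is handled.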

* fix(reference-index): final sweep — reconcile None skip + monotonic FinishedHeight

Finding 1 (High): reconcile treated indexed_hash=None as hash mismatch
  The sentinel path (Jade not yet activated) sets indexed_to=head but writes
  no IndexedBlocks entries.  reconcile then scanned [indexed_from, indexed_to],
  found None != canonical_hash, and falsely triggered reorg rebuild for
  pre-Jade blocks.  Fix: skip blocks where IndexedBlocks has no entry (None
  means "never indexed", not "hash mismatch").

  Note: indexed_from.max(depth_start) from an earlier round correctly narrows
  the scan range but did NOT prevent the None comparison since the range still
  included indexed_to itself in the sentinel case.
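
A sketch of the corrected comparison, assuming the scan looks for the lowest mismatching block (the actual scan direction is not stated here). `find_fork_height` and both closures are hypothetical stand-ins for the IndexedBlocks lookup and the canonical-hash provider call.

```rust
use std::ops::RangeInclusive;

/// Returns the lowest block in `range` whose stored hash differs from the
/// canonical hash. Blocks with no IndexedBlocks entry (`None`) are skipped:
/// they were never indexed, so they cannot indicate a reorg.
fn find_fork_height(
    range: RangeInclusive<u64>,
    indexed_hash: impl Fn(u64) -> Option<[u8; 32]>,
    canonical_hash: impl Fn(u64) -> [u8; 32],
) -> Option<u64> {
    range.into_iter().find(|&n| match indexed_hash(n) {
        None => false,
        Some(stored) => stored != canonical_hash(n),
    })
}
```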

Finding 2 (Medium): startup FinishedHeight could regress past live commits
  tokio::select! is non-deterministic: if a ChainCommitted notification is
  processed before the startup watch channel fires, ExExEvent::FinishedHeight
  would be sent at H+k then regressed to H (the startup indexed_to).
  Fix: introduce send_finished_height_monotonic() that only sends when the
  new height strictly exceeds last_finished; both the startup watch arm and
  ChainCommitted/ChainReorged share this helper.  ChainReorged is the one
  permitted exception (reth ExEx docs allow height to go down on reorgs).
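
The monotonic guard might be modeled as a small tracker like the one below. This is a sketch under assumptions: `FinishedHeightTracker`, `advance`, and `reset_on_reorg` are hypothetical names, and the actual `send_finished_height_monotonic()` sends an `ExExEvent` rather than returning a flag.

```rust
/// Tracks the last height reported via FinishedHeight.
#[derive(Default)]
struct FinishedHeightTracker {
    last_finished: Option<u64>,
}

impl FinishedHeightTracker {
    /// Returns true (and records the height) only when `height` strictly
    /// exceeds the last reported one; otherwise the send is skipped.
    fn advance(&mut self, height: u64) -> bool {
        match self.last_finished {
            Some(last) if height <= last => false,
            _ => {
                self.last_finished = Some(height);
                true
            }
        }
    }

    /// Reorg path: the reported height is allowed to go down.
    fn reset_on_reorg(&mut self, height: u64) {
        self.last_finished = Some(height);
    }
}
```

In the race described above, a live commit at H+k calls `advance` first, so the later startup value H is rejected instead of regressing the reported height.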

* fix(reference-index): InProgress+sentinel re-resolves Jade; RPC internal error scrub

Finding 1 (Critical): InProgress + sentinel crash-recovery skips Jade history
  If the node crashes between writing InProgress+SENTINEL (txn 1) and writing
  Complete (txn 2), the next restart had BackfillState::InProgress and
  jade_first_block_number=SENTINEL.  The previous fix for InProgress resume
  only avoided starting from block 1, but the sentinel fast-path inside the
  InProgress branch still blindly marked Complete without re-resolving Jade.

  Fix: when InProgress+SENTINEL, re-run resolve_jade_first_block against the
  current head.  If Jade is now active, persist the real jade_first_block_number
  and continue backfill from there.  Only take the immediate-complete shortcut
  when Jade is still not active on the current head.
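
The decision for the InProgress+SENTINEL case can be sketched as follows. `ResumeAction` and `resume_in_progress_sentinel` are hypothetical names; the `resolve` closure stands in for re-running `resolve_jade_first_block` against the current head and returns `None` while Jade is still inactive.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ResumeAction {
    /// Jade still inactive at the current head: nothing to backfill.
    MarkComplete,
    /// Jade became active: persist this block number and backfill from it.
    BackfillFrom(u64),
}

/// Crash recovery when state is InProgress and jade_first_block_number
/// is still the sentinel: always re-resolve before deciding.
fn resume_in_progress_sentinel(resolve: impl FnOnce() -> Option<u64>) -> ResumeAction {
    match resolve() {
        Some(jade_first) => ResumeAction::BackfillFrom(jade_first),
        None => ResumeAction::MarkComplete,
    }
}
```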

Finding 2 (Moderate): to_rpc_error leaked internal Database/Provider/Other
  error strings verbatim to RPC callers.  The spec only documents two
  state-gating responses; anything else is an internal failure.
  Fix: map Database/Provider/Other to a fixed -32603 "internal reference
  index error" message; log the full error internally with tracing::error!.
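
A sketch of the scrubbed mapping, using the codes and messages quoted in this commit message. The `IndexError` enum and the `(i32, String)` return shape are assumptions for illustration, and `eprintln!` stands in for `tracing::error!` so the example stays std-only.

```rust
#[derive(Debug)]
enum IndexError {
    Initializing,
    Behind,
    InvalidParams(String),
    Database(String),
    Provider(String),
    Other(String),
}

/// Maps an internal error to a (code, message) pair for the RPC caller.
/// Internal details are logged but never returned.
fn to_rpc_error(err: &IndexError) -> (i32, String) {
    match err {
        IndexError::Initializing => (-32000, "reference index initializing".to_string()),
        IndexError::Behind => (-32000, "reference index is behind".to_string()),
        IndexError::InvalidParams(msg) => (-32602, msg.clone()),
        IndexError::Database(_) | IndexError::Provider(_) | IndexError::Other(_) => {
            // Stand-in for tracing::error!: keep the full detail server-side.
            eprintln!("reference index internal error: {err:?}");
            (-32603, "internal reference index error".to_string())
        }
    }
}
```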

* chore(reference-index): remove unused paired-snapshot validation

SnapshotBlockNumber/SnapshotBlockHash metadata keys and validate_paired_snapshot()
had no write path — ops tar-compress the datadir directly so these fields were
always empty and the check was always a no-op. chain_id + genesis_hash at open()
plus reconcile's canonical hash check already cover the mismatch cases that matter.

* chore(reference-index): remove unused pub methods and dead imports

- ReferenceIndexDb: drop highest_indexed_block / highest_block_reference_index
  (no callers in the workspace)
- ReferenceIndexDb: remove unused DbCursorRO import exposed by the deletion
- ReferenceIndexReader: drop db() accessor (no callers in the workspace)

* chore(reference-index): simplify tracing targets, drop sleep, clean up error wrapping

- Extract const TARGET in backfill, reconcile, exex and handler so the
  tracing target string is defined once per file and easy to grep
- Remove the 10ms inter-batch sleep in run_backfill; on a 2.6M-block
  backfill (256-block batches) this was ~100s of pure idle time with no
  benefit
- Simplify best_block_number error path in MorphRpcHandler: use the
  existing From<ProviderError> impl instead of wrapping with eyre::eyre!
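
The idle-time estimate for the removed sleep can be checked with a few lines of arithmetic. The helper name is hypothetical, and it assumes one sleep per batch:

```rust
/// Total time spent sleeping, in milliseconds, when each batch of
/// `batch_size` blocks is followed by a `sleep_ms` pause.
fn idle_sleep_ms(total_blocks: u64, batch_size: u64, sleep_ms: u64) -> u64 {
    // Ceiling division: a partial final batch still counts as a batch.
    let batches = (total_blocks + batch_size - 1) / batch_size;
    batches * sleep_ms
}
```

For 2.6M blocks in 256-block batches with a 10ms sleep, this gives 10,157 batches and 101,570 ms, in line with the ~100s figure.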

* fix(ci): satisfy clippy and cargo-deny on PR #106

- drop redundant `db.clone()` in reference_index integration test
- remove RUSTSEC-2026-0002 from advisory ignore list (no longer matched)

* fix(ci): resolve cargo-deny advisories on PR #106

Update transitive dependencies where cargo can select patched releases, and document the hickory-proto advisory exceptions that remain pinned by the current reth dependency graph.

Constraint: PR #106 cargo-deny failure blocks merge
Rejected: Broadly ignore all new advisories | patched multihash and rustls-webpki versions are available
Confidence: medium
Scope-risk: narrow
Not-tested: Skipped additional verification per user request