chore: repo hygiene refactor #19

Merged: frisitano merged 91 commits into eth-act:feat/eip8025 from frisitano:chore/repo-hygeine-refactor on Mar 27, 2026
@frisitano frisitano commented Mar 26, 2026

Overview

  • Improved dependency management and repo hygiene with crate refactoring
  • CI green run

frisitano and others added 30 commits March 7, 2026 22:57
Fix pre-existing lint errors in base branch:
- Unused variable fulu_fork_epoch in chain_spec.rs
- Large error variant warnings in execution_layer and slasher_service
- Dead code warnings in backfill_sync and custody_backfill_sync

Also fix missing fork check in proof_verification to reject pre-Fulu forks
…ields, use ssz_fixed_len, make peer_id required
Add #[allow(clippy::result_large_err)] to closures in http_api that
return Result<_, BeaconChainError> to fix clippy warnings.

Files modified:
- beacon_node/http_api/src/attestation_performance.rs
- beacon_node/http_api/src/attester_duties.rs
- beacon_node/http_api/src/block_packing_efficiency.rs
- beacon_node/http_api/src/block_rewards.rs
- beacon_node/http_api/src/sync_committee_rewards.rs
- beacon_node/http_api/src/sync_committees.rs
- beacon_node/http_api/src/ui.rs
- Add #[allow(clippy::result_large_err)] to test functions in
  attestation_verification.rs and store_tests.rs to fix check-code CI failure

- Combine boot_node_enr() and wait_for_boot_node_enr() into a single
  async boot_node_enr() that polls until the ENR has a valid TCP port.
  When OS-assigned ports (port 0) are used, the network service updates
  the ENR asynchronously via NewListenAddr events, so on slow CI runners
  the ENR may not have a valid port immediately after node startup.
  This fixes the fallback-simulator-ubuntu and debug-tests-ubuntu CI failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
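
The polling behaviour described for boot_node_enr() above can be sketched roughly as follows. This is a synchronous, std-only stand-in for the async version in the PR; the `Enr` struct, the `wait_for_valid_enr` name, and the timeout parameter are all assumptions made for illustration and are not taken from the codebase.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

// Hypothetical, minimal view of an ENR: only the TCP port matters here.
#[derive(Debug, Clone, PartialEq)]
struct Enr {
    tcp_port: Option<u16>,
}

// Polls `read_enr` until it reports a non-zero TCP port or the timeout expires.
// The timeout is an assumption; the commit message only says "polls until".
fn wait_for_valid_enr<F>(mut read_enr: F, interval: Duration, timeout: Duration) -> Option<Enr>
where
    F: FnMut() -> Enr,
{
    let deadline = Instant::now() + timeout;
    loop {
        let enr = read_enr();
        // Port 0 means the OS-assigned port has not been written back yet.
        if matches!(enr.tcp_port, Some(port) if port != 0) {
            return Some(enr);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

A caller would pass a closure that reads the node's current ENR, so a slow runner simply causes a few more iterations instead of a failure.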
CI fixes:
- Upgrade bytes to 1.11.1 to resolve RUSTSEC-2026-0007 (integer overflow vulnerability in BytesMut::reserve)
- Remove #[allow(clippy::result_large_err)] from verify_builder_bid; box at abstraction level by moving fallible Withdrawals conversion out of closure so ? uses the function's Box<InvalidBuilderPayload> return type
- Increase genesis_delay from 20s to 60s in proof_engine test fixture to accommodate node startup time
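
The verify_builder_bid change above relies on `?` converting an error into its boxed form through the standard `From<T> for Box<T>` impl. A minimal sketch of that pattern, with simplified stand-in types (the real InvalidBuilderPayload and Withdrawals differ, and `MAX_WITHDRAWALS` and `convert_withdrawals` are hypothetical):

```rust
// Hypothetical stand-in for the real (large) error type.
#[derive(Debug, PartialEq)]
enum InvalidBuilderPayload {
    TooManyWithdrawals(usize),
}

// Assumed limit, for illustration only.
const MAX_WITHDRAWALS: usize = 16;

// Fallible conversion that returns the unboxed error.
fn convert_withdrawals(raw: Vec<u64>) -> Result<Vec<u64>, InvalidBuilderPayload> {
    if raw.len() > MAX_WITHDRAWALS {
        return Err(InvalidBuilderPayload::TooManyWithdrawals(raw.len()));
    }
    Ok(raw)
}

// The error is boxed in the signature. Because the conversion runs in the
// function body rather than inside a closure, `?` boxes the error via
// `From<E> for Box<E>` and no lint allowance is required.
fn verify_builder_bid(raw: Vec<u64>) -> Result<usize, Box<InvalidBuilderPayload>> {
    let withdrawals = convert_withdrawals(raw)?;
    Ok(withdrawals.len())
}
```

Inside a closure, `?` would target the closure's own return type instead, which is why the conversion has to move out of it.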

PR review changes:
- Remove pr-description.md
- Remove UnsupportedFork variant from ExecutionProofError (no fork activation check needed for EIP-8025)
- Improve ExecutionProofStatus doc comments to clarify field semantics
- Add doc comment explaining how local_execution_proof_status is maintained in NetworkGlobals
- Replace #[allow(dead_code)] with #[cfg_attr(feature = "disable-backfill", allow(dead_code))] in backfill_sync/mod.rs and custody_backfill_sync/mod.rs to properly use feature flags
- Rename on_proof_capable_peer_connected → add_peer in ProofSync to align with other sync subsystems
- Simplify find_best_proof_capable_peer in network_context.rs to use only the ExecutionProofStatus cache (no redundant ENR check), keeping only primary selection (verified peer with highest slot)
- Remove connected_proof_capable_peers() from SyncNetworkContext (cache is now the source of truth)
- Update ProofSync::start() to iterate over cache instead of calling connected_proof_capable_peers()
- Gate PendingRangeRequest → range sync on empty in-flight ExecutionProofStatus polls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
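
The primary selection rule mentioned above (verified peer with highest slot, read only from the cache) might look like the sketch below. `CachedProofStatus` and its fields are hypothetical simplifications, and peer ids are plain strings here rather than the real PeerId type.

```rust
use std::collections::HashMap;

// Hypothetical, simplified cache entry; the real ExecutionProofStatus differs.
#[derive(Debug, Clone, PartialEq)]
struct CachedProofStatus {
    verified: bool,
    slot: u64,
}

// Returns the id of the verified peer with the highest slot, if any.
fn best_peer(cache: &HashMap<String, CachedProofStatus>) -> Option<&String> {
    cache
        .iter()
        .filter(|(_, status)| status.verified)
        .max_by_key(|(_, status)| status.slot)
        .map(|(peer_id, _)| peer_id)
}
```

Unverified peers are skipped even when they advertise a higher slot, which matches the "cache is the source of truth" intent of the review changes.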
- Simplify ExecutionProofStatus field doc comments
- Add ToExecutionProofStatus trait following ToStatusMessage pattern
- Remove execution proof tracking maps from SyncNetworkContext; ProofSync
  owns all execution-proof state and tracking
- Remove is_proof_capable_peer and find_best_proof_capable_peer from
  SyncNetworkContext; ProofSync now has best_peer() private method
- Remove on_execution_*_terminated methods from SyncNetworkContext
- Add range_request_peer tracking to ProofSync for peer-disconnect handling
- Add on_range_request_error() and on_root_request_error() for proper
  failure recovery (range resets to PendingRangeRequest to retry)
- Add refresh_peer_status() helper to ProofSync::start()
- Remove impossible current_slot < start_slot guard in request_proof_range
- Call proof_sync.add_peer() for all connecting peers (soft request,
  graceful failure for non-proof peers); remove is_proof_capable_peer gate
- Add comment explaining request_id=None vs Some in router.rs
- Handle range request errors in inject_error for proper retry logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
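
The retry behaviour described above (a failed or orphaned range request resets to PendingRangeRequest) can be illustrated with a small state machine. Only the PendingRangeRequest name comes from the commit message; the `AwaitingRange` variant, the string peer ids, and the method bodies are assumptions.

```rust
#[derive(Debug, Clone, PartialEq)]
enum RangeState {
    PendingRangeRequest,
    // Hypothetical in-flight state that remembers the serving peer.
    AwaitingRange { peer: String },
}

struct ProofSync {
    state: RangeState,
}

impl ProofSync {
    fn request_range(&mut self, peer: &str) {
        self.state = RangeState::AwaitingRange { peer: peer.to_string() };
    }

    // Any range error falls back to the pending state so the request is retried.
    fn on_range_request_error(&mut self) {
        self.state = RangeState::PendingRangeRequest;
    }

    // Only a disconnect of the peer serving the range resets the state.
    fn on_peer_disconnected(&mut self, peer: &str) {
        let serving = matches!(&self.state, RangeState::AwaitingRange { peer: p } if p.as_str() == peer);
        if serving {
            self.state = RangeState::PendingRangeRequest;
        }
    }
}
```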
- basic_sim/fallback_sim: GENESIS_DELAY 38 → 80 to account for boot_node_enr()
  polling overhead during node startup
- proof_engine: genesis_delay 60 → 120 for 3-node proof network (default +
  proof_generator + proof_verifier) which requires more startup time in CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gives the delayed node 48 seconds (3 epochs × 8 slots × 2s) to
discover peers and form a gossip mesh before the sync check at
slot 128, instead of the previous 16 seconds (1 epoch).
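
As a quick check of the arithmetic above, using the slot time and epoch length quoted in the message (the constant and function names are illustrative only):

```rust
// Values quoted in the commit message for the simulator network.
const SECONDS_PER_SLOT: u64 = 2;
const SLOTS_PER_EPOCH: u64 = 8;

// Seconds available to the delayed node when it joins `epochs` before the sync check.
fn sync_window_secs(epochs: u64) -> u64 {
    epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
}
```

With these values, `sync_window_secs(3)` gives 48 and `sync_window_secs(1)` gives 16, matching the new and previous windows.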

The narrow 16-second window was insufficient for the node to
discover peers via discv5 and receive block 128 via gossip in CI,
causing intermittent "Head not synced for node 2" failures.

Mirrors the upstream fix in sigp#8983.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce genesis delay

Remove NodeCustodyType::Supernode from default_client_config() which was
applied to ALL simulator nodes. This caused excessive data availability
overhead (every node custodying all columns), leading to finalization
failures in basic-simulator and missed blocks in fallback-simulator.

Supernode custody is preserved only on the boot node (construct_boot_node)
where it's needed to prevent earliest_available_slot issues for late-joining
node sync.

Also reduce GENESIS_DELAY from 80 to 45 seconds (upstream: 38). The 80s
delay was compensating for the Supernode overhead which is now removed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
frisitano and others added 29 commits March 19, 2026 04:04
…ation-test

feat: proof engine zkboost integration test
Closes the gap where ProofService dropped the new_payload_request_root
after requesting proofs. Now outstanding requests are tracked, and a
new background task subscribes to proof engine SSE events. When a
ProofComplete event arrives for a tracked request, the proof is
fetched, signed with a safe validator key, and submitted to the
beacon node.

Key changes:
- Track outstanding proof requests by new_payload_request_root with
  pending proof types per request
- New monitor_proof_engine_events_task subscribes to proof engine SSE
  using while-let pattern inside tokio::select with stale timeout
- Handle ProofComplete (fetch/sign/submit), ProofFailure, and timeout
  events, removing resolved proof types from the tracker
- Entry removed only when all requested proof types are resolved or
  the 300s stale timeout is hit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
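
The tracking scheme in the key changes above could be modelled roughly like this. It is a sketch under assumptions: `Root`, `ProofType`, `ProofTracker`, and the method names are hypothetical stand-ins, and the SSE subscription and the fetch/sign/submit steps are left out. Only the 300s stale timeout and the "remove when all proof types are resolved" rule come from the message.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

// Stale timeout quoted in the commit message.
const STALE_TIMEOUT: Duration = Duration::from_secs(300);

type Root = [u8; 32]; // stand-in for new_payload_request_root
type ProofType = u8; // stand-in for the real proof type identifier

struct Outstanding {
    pending: HashSet<ProofType>,
    requested_at: Instant,
}

#[derive(Default)]
struct ProofTracker {
    requests: HashMap<Root, Outstanding>,
}

impl ProofTracker {
    fn track(&mut self, root: Root, types: &[ProofType], now: Instant) {
        let pending = types.iter().copied().collect();
        self.requests.insert(root, Outstanding { pending, requested_at: now });
    }

    // Called for a completed or failed proof type. Returns true when the
    // entry was removed because no proof types remain pending.
    fn resolve(&mut self, root: &Root, proof_type: ProofType) -> bool {
        let done = match self.requests.get_mut(root) {
            Some(entry) => {
                entry.pending.remove(&proof_type);
                entry.pending.is_empty()
            }
            None => return false,
        };
        if done {
            self.requests.remove(root);
        }
        done
    }

    // Drops entries that have been outstanding for the stale timeout or longer.
    fn prune_stale(&mut self, now: Instant) {
        self.requests
            .retain(|_, o| now.duration_since(o.requested_at) < STALE_TIMEOUT);
    }

    fn len(&self) -> usize {
        self.requests.len()
    }
}
```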
…ion-loop

feat: add proof engine SSE monitor for proof completion loop
* fix: add portable feature to proof_engine_zkboost_test and fix CI deps

The zkboost-tests workflow was failing because the
proof_engine_zkboost_test crate did not define the `portable` feature
flag that the Makefile passes via `--features portable`.

- Add `portable = ["types/portable"]` to Cargo.toml features
- Add system dependency installation step (cmake, clang, etc.)
- Set CC/CXX to clang for leveldb-sys compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: remove act-specific CC/CXX env vars from workflow

The CC=clang/CXX=clang++ overrides were only needed for local act
validation, not for GitHub runners. Remove them to keep the workflow
consistent with test-suite.yml patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use Clang for C/C++ compilation in CI workflows and Dockerfile

leveldb-sys uses -Wthread-safety (a Clang-only flag) that GCC does not
support. On the fork, CI runs on ubuntu-latest where the default C++
compiler is GCC, causing all three workflows to fail. Upstream uses
custom Warp runners where this is not an issue.

Set CC=clang CXX=clang++ globally in zkboost-tests.yml, test-suite.yml,
and the Dockerfile to ensure leveldb-sys builds correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: clear stale leveldb-sys cmake cache before build

The cargo cache from previous runs contained cmake build artifacts
compiled with GCC. When switching to Clang (CC/CXX env vars), the stale
cmake cache triggers a partial reconfigure that incorrectly builds
benchmark targets, causing compilation errors.

Add a step to remove the cached leveldb-sys cmake build directory before
building. Also deleted all existing GitHub Actions caches to force clean
Clang-based builds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace deprecated try_next() with try_recv() in beacon_chain

futures::channel::mpsc::Receiver::try_next() is deprecated in favor of
try_recv(). The return type changed: try_next() returned
Result<Option<T>, TryRecvError> while try_recv() returns
Result<T, TryRecvError>. Update match arms accordingly.

With RUSTFLAGS="-D warnings" in CI, this deprecation becomes a hard
error that blocks all jobs compiling beacon_chain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix cargo fmt formatting in test_utils.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: ignore RUSTSEC-2024-0437 in cargo audit

protobuf 2.28.0 (via prometheus 0.13.4) has a known recursion crash
advisory, but it's not exploitable in our context — protobuf is only
used for Prometheus metrics serialization with trusted internal data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update deny.toml for zkboost/ethrex transitive dependencies

Allow crates (ethereum-types, protobuf, derivative, ark-ff) that are
banned upstream but required by zkboost's ethrex dependency chain.
Also allow git sources from lambdaclass, eth-act, paradigmxyz orgs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Nova <nova.tau.assistant@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add --mock-proof-engine flag for Kurtosis integration

Add mock-proof-engine feature flag that spawns an in-process mock proof
engine when enabled. This enables testing EIP-8025 in multi-node Kurtosis
networks without external proof engine dependencies.

Changes:
- Add mock-proof-engine Cargo feature to lighthouse crate
- Add --mock-proof-engine CLI flag to beacon node
- Spawn LocalProofEngine in-process when flag is set
- Auto-configure proof_engine_endpoint to mock server URL
- Add Kurtosis network config for 4-node testnet
- Add start_eip8025_testnet.sh launch script

* refactor: replace --mock-proof-engine flag with --proof-engine-endpoint http://mock

Instead of a separate CLI flag, detect the sentinel URL "http://mock" in
--proof-engine-endpoint to trigger in-process mock proof engine spawning.
This simplifies the CLI surface while keeping the same feature-gated behavior.

- Remove --mock-proof-engine CLI arg from beacon_node/src/cli.rs
- Detect http://mock in config.rs and set mock_proof_engine internally
- Add #[cfg(not(feature))] guard in main.rs for clear error when feature not compiled
- Update network_params_eip8025.yaml to use --proof-engine-endpoint=http://mock

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: improve mock proof engine logging

- Add startup log to MockProofEngineServer::new()
- Add logging to engine_verifyExecutionProofV1 endpoint
- Unify tracing target to "mock_proof_engine" (was "simulator")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* kurtosis mock proof engine

* fix: post-merge cleanup — fmt, clippy, and missing import

- Add FixedBytesExtended import for Hash256::zero() in mock request_proofs
- Fix collapsible_if clippy warnings in proof_engine.rs and proof_sync.rs
- Apply cargo fmt formatting fixes
- Regenerate Cargo.lock

* refactor: minimize source diff to lib.rs only

Revert all source-code changes except beacon_node/execution_layer/src/lib.rs
to match origin/feat/eip8025. The remaining lib.rs diff contains:
- prefer_ok helper for combining optional results
- Non-fatal proof engine error handling in new_payload/forkchoice_updated

Kurtosis scripts are retained as test infrastructure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-register MockProofNodeClient when not pre-registered

When a mock URL (http://mock/{n}/) is used but no mock has been
pre-registered in the global registry (e.g., in standalone Kurtosis
runs vs the test simulator), create and register one on the fly
instead of panicking.

Fixes startup crash when using --proof-engine-endpoint=http://mock/0/
outside of the test simulator context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert "fix: auto-register MockProofNodeClient when not pre-registered"

This reverts commit 613133d.

* fix: replace deprecated try_next() with try_recv().ok()

The futures mpsc Receiver::try_next() method is deprecated in favor of
try_recv(). Updates the match to use the new API and simplifies per clippy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use clang in Dockerfile to fix leveldb-sys build

The leveldb-sys crate passes -Wthread-safety to the C++ compiler,
which is a Clang-only flag. GCC rejects it, causing build failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: re-apply auto-register MockProofNodeClient for Kurtosis

Re-apply the auto-register fix that was previously reverted during
the minimize-diff phase. Without this, Kurtosis nodes panic on startup
with "no mock registered at index 0" when using
--proof-engine-endpoint=http://mock/0/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-register mock proof engine in VC and handle bare mock URLs

The validator client had the same panic as the beacon node when using
mock proof engine URLs. Also makes parse_mock_index accept bare
"http://mock/" URLs (defaulting to index 0) since the ethereum-package
may strip the index from vc_extra_params.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
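
One way the mock URL handling described above could behave is sketched below. The real parse_mock_index signature is not shown in this PR text, so the return type and the exact trimming rules here are assumptions.

```rust
// Hypothetical sketch: maps "http://mock/{n}/" to n, and a bare
// "http://mock" or "http://mock/" to index 0. Anything else yields None.
fn parse_mock_index(url: &str) -> Option<usize> {
    let rest = url.strip_prefix("http://mock")?;
    let rest = rest.trim_matches('/');
    if rest.is_empty() {
        // The index was stripped (or never given): default to 0.
        return Some(0);
    }
    rest.parse().ok()
}
```

Returning `None` for non-mock URLs lets the caller fall through to a real proof engine endpoint.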

* Revert "fix: replace deprecated try_next() with try_recv().ok()"

This reverts commit d317436.

* Revert "fix: use clang in Dockerfile to fix leveldb-sys build"

This reverts commit 8cfce26.

* refactor mock proof node client

* lint

---------

Co-authored-by: Nova <nova.tau.assistant@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: implement execution proof peer scoring and validator tracking

Add three-layer defense for execution proof gossip processing:
- Layer A: ObservedExecutionProofs dedup cache (IGNORE-2, IGNORE-3)
- Layer B: Error-differentiated peer scoring (per-error penalties)
- Layer C: InvalidProofTracker for banned validators (threshold=1)

Processing order: dedup → ban check → BLS verify → engine verify.
ProofStatus::Invalid downgraded from Fatal to MidTolerance for relay
peers. RPC path also feeds the validator tracker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
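
The processing order above (dedup → ban check → BLS verify → engine verify) can be sketched as a single function. Everything here is a simplified assumption: booleans stand in for the BLS and proof engine checks, `Pipeline` and `Outcome` are hypothetical names, and recording the proof as observed only after the signature check is an assumed choice (it matches the later cache poisoning fix in this PR).

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Outcome {
    IgnoredDuplicate,
    RejectedBannedValidator,
    RejectedBadSignature,
    RejectedInvalidProof,
    Accepted,
}

// Hypothetical, simplified proof message.
struct Proof {
    id: u64,
    validator_index: u64,
    signature_ok: bool, // stands in for BLS verification
    engine_ok: bool,    // stands in for proof engine verification
}

#[derive(Default)]
struct Pipeline {
    observed: HashSet<u64>, // Layer A: dedup cache
    banned: HashSet<u64>,   // Layer C: banned validators (threshold = 1)
}

impl Pipeline {
    fn process(&mut self, proof: &Proof) -> Outcome {
        if self.observed.contains(&proof.id) {
            return Outcome::IgnoredDuplicate;
        }
        if self.banned.contains(&proof.validator_index) {
            return Outcome::RejectedBannedValidator;
        }
        if !proof.signature_ok {
            // Not cached: a forged message must not block the genuine one.
            return Outcome::RejectedBadSignature;
        }
        self.observed.insert(proof.id);
        if !proof.engine_ok {
            // A single invalid proof bans the signing validator.
            self.banned.insert(proof.validator_index);
            return Outcome::RejectedInvalidProof;
        }
        Outcome::Accepted
    }
}
```

Layer B (per-error peer scoring) would hang off the returned `Outcome`; it is omitted here.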

* feat: add DB persistence for invalid proof validator tracker

- Add `InvalidProofTracker` DBColumn to HotColdDB store
- SSZ-encode/decode banned validator set via PersistedInvalidProofTracker
- Load banned validators from DB on beacon chain startup
- Persist to DB on each new ban (gossip and RPC paths)
- Add SSZ round-trip test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: format persist_to_store if-let chains

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add persistence integration tests for InvalidProofTracker

- empty_start_fallback: load from empty DB returns default tracker
- persist_and_reload: bans survive store round-trip (simulated restart)
- persist_after_unban_survives_reload: unban + re-persist correctly
  reflected after reload

All three tests use HotColdDB::open_ephemeral with MemoryStore to
exercise the full put_item/get_item path through the StoreItem impl.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* update validator public key

* clean up

---------

Co-authored-by: Nova <nova.tau.assistant@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Move observe_verification_attempt to after BLS verification (cache poisoning fix)
- Remove dead InvalidHeaderFormat error variant
- Persist InvalidProofTracker at shutdown instead of on every ban
- Remove unused slot field from InvalidProofRecord
- Upgrade invalid proof peer penalty to LowToleranceError
- Wire ObservedExecutionProofs::prune at finalization with slot-based eviction
- observe_valid_proof now records slot for pruning

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
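
The slot-based eviction mentioned above could work along these lines. `ObservedProofs` is a hypothetical stand-in for ObservedExecutionProofs, and the choice to drop entries strictly older than the finalized slot is an assumption, since the commit message does not state the boundary.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for ObservedExecutionProofs: proof root -> slot.
#[derive(Default)]
struct ObservedProofs {
    slot_by_root: HashMap<[u8; 32], u64>,
}

impl ObservedProofs {
    // Records the slot with the root so the entry can be pruned later.
    fn observe_valid_proof(&mut self, root: [u8; 32], slot: u64) {
        self.slot_by_root.insert(root, slot);
    }

    fn is_known(&self, root: &[u8; 32]) -> bool {
        self.slot_by_root.contains_key(root)
    }

    // Called at finalization: evicts entries older than the finalized slot
    // (assumed boundary) so the cache does not grow without bound.
    fn prune(&mut self, finalized_slot: u64) {
        self.slot_by_root.retain(|_, slot| *slot >= finalized_slot);
    }
}
```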
* proof engine tests

* lint
* feat: execution proof sync protocol hardening

* update tests to account for ProofSyncState::Waiting
* integrate zkboost

* improvements to sync protocol

* lint: cargo sort

* remove timeout for proof node SSE event subscription
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@frisitano frisitano merged commit c898fde into eth-act:feat/eip8025 Mar 27, 2026
32 of 33 checks passed

2 participants