fix(l1): fix broadcast_pool race and offload tx pool insertion to background task by edg-l · Pull Request #6253 · lambdaclass/ethrex

edg-l · 2026-02-24T10:28:32Z

Summary

Fix broadcast_pool race condition: clear_broadcasted_txs() wiped the entire broadcast_pool, including txs added between the read (get_txs_for_broadcast()) and the clear. With 2000 txs being inserted one-by-one and the broadcaster firing every 5ms, this reliably dropped txs that were never announced to peers. Replaced with targeted remove_broadcasted_txs() that only removes the txs that were actually fetched and broadcast.
Offload tx pool insertion to background task: Spawn the incoming Transactions message pool-insertion loop as a background tokio::spawn task instead of running it synchronously in handle_cast. This prevents the ConnectionServer from being blocked during signature recovery + storage reads for large transaction batches.

Motivation

The hive LargeTxRequest devp2p eth test intermittently fails in CI (example run). Root cause is a race condition in the transaction broadcaster:

get_txs_for_broadcast() reads the broadcast_pool (releases lock)
Meanwhile, incoming txs continue to be added to broadcast_pool
clear_broadcasted_txs() clears all of broadcast_pool, including txs added in step 2

Any tx added between steps 1 and 3 is lost forever — never broadcast but cleared from the pool. With 2000 txs being inserted sequentially and the broadcaster firing every 5ms, this race reliably drops transactions, causing the test's 2-second timeout to expire before all tx hashes are announced.

Test plan

cargo check (with and without l2 feature)
Local hive devp2p eth tests pass 3/3 runs: ./hive --sim devp2p --sim.limit "eth" --client ethrex (20/20 each run)
CI hive daily run should show P2P Eth capability back to 20/20

github-actions · 2026-02-24T10:28:57Z

🤖 Kimi Code Review

Security & Correctness Issues

1. Race condition in transaction processing (lines 952-954, 973-977)
The tx_hashes vector is collected from txs.transactions before spawning the background task, but the background task consumes txs.transactions by value. This creates a potential race where the broadcaster is notified with transaction hashes that may not actually be processed if the background task hasn't started yet.

Fix: Collect hashes after filtering in the background task, or use Arc<[Transaction]> to share ownership.

2. Missing transaction deduplication check
The code immediately spawns a task for all received transactions without checking if they're already in the pool. This could lead to unnecessary CPU usage for signature recovery on duplicate transactions.

Performance & Resource Management

3. Unbounded task spawning (line 957)
tokio::spawn() is called for every Transactions message without any backpressure mechanism. A malicious peer could send many small Transactions messages to exhaust system resources.

Fix: Use a bounded channel or semaphore to limit concurrent transaction processing tasks.

4. Missing task cancellation on peer disconnect
The spawned task continues running even if the peer disconnects. This wastes resources processing transactions from disconnected peers.

Fix: Pass an AbortHandle to the task and cancel it in the connection cleanup.

Code Quality & Best Practices

5. Inefficient cloning (line 955)
state.blockchain.clone() clones the entire blockchain handle for each message. Consider using Arc if not already the case, or pass a lightweight handle.

6. Debug logging inconsistency
The debug logs in lines 965 and 975 use peer=%peer but the original code used peer=%state.node. While functionally equivalent, maintaining consistency with the original format would be clearer.

7. Missing error handling for spawn failure
tokio::spawn() can fail if the runtime is shutting down. This should be handled gracefully.

Ethereum-Specific Concerns

8. Transaction validation timing
Moving transaction validation to a background task is generally good, but ensure that invalid transactions (wrong chain ID, insufficient gas, etc.) are still properly filtered before network propagation. The current code only checks L2-specific rules.

The change appears to be a performance optimization rather than a consensus change, so this is acceptable.

Automated review by Kimi (Moonshot AI) · custom prompt

github-actions · 2026-02-24T10:30:05Z

🤖 Codex Code Review

Findings

Potential correctness issue: tx_hashes is built from all incoming txs, but the L2/privileged filter is applied later inside the spawned task. This means InMessage::AddTxs includes hashes for txs that are explicitly rejected in L2 mode or privileged txs. If the broadcaster uses these hashes to suppress re-broadcast to the sender, you may be suppressing even though the tx never entered the pool. Consider filtering hashes to only txs that pass the L2/privileged checks (and ideally only those accepted into the pool).
File: crates/networking/p2p/rlpx/connection/server.rs:945
Unbounded task spawning per incoming Transactions message can become a DoS vector under load (many peers, large tx batches). You’ve moved CPU-heavy work off the connection task, but there’s no backpressure or limit. Consider using a bounded task queue or a semaphore to cap concurrent pool insertions.
File: crates/networking/p2p/rlpx/connection/server.rs:952

Notes

The async move captures blockchain and peer cleanly; no memory safety issues observed in this diff.
Error handling behavior for pool insertion remains the same (log and continue).

If you want, I can check the broadcaster logic to verify whether the hash suppression path makes Item 1 a real bug or just a mild inefficiency.

Next steps

I can inspect broadcaster behavior to confirm the impact of Item 1.

Automated review by OpenAI Codex · custom prompt

greptile-apps · 2026-02-24T10:30:53Z

Greptile Summary

Fixed critical race condition in transaction broadcaster that was causing transactions to be lost. The race occurred when clear_broadcasted_txs() would wipe the entire broadcast_pool including transactions added between the read and clear operations. This has been resolved by replacing the clear operation with targeted remove_broadcasted_txs() that only removes transactions that were actually fetched and broadcast.

Additionally, offloaded expensive transaction pool insertion (signature recovery and validation) to a background task to prevent blocking the ConnectionServer during large transaction batches.

Replaced clear_broadcasted_txs() with remove_broadcasted_txs(&hashes) for targeted removal
Changed broadcaster to collect and pass specific hashes of broadcast transactions
Moved transaction pool insertion to tokio::spawn background task in message handler
Extracted transaction hashes upfront before spawning background task
Cloned necessary context (blockchain, peer) for background task

Confidence Score: 4/5

Safe to merge - fixes critical race condition with well-tested changes
The race condition fix is sound and the background task offload is a standard pattern. Minor concern about unhandled spawn panic (background task silently fails if pool is shut down), but this is acceptable in practice since task failures are logged internally and don't affect correctness
No files require special attention - all changes are focused and address specific issues

Important Files Changed

Filename	Overview
crates/blockchain/mempool.rs	Renamed `clear_broadcasted_txs()` to `remove_broadcasted_txs()` with targeted removal by hash list to fix race condition
crates/networking/p2p/tx_broadcaster.rs	Updated to use targeted `remove_broadcasted_txs()` instead of clearing entire pool, and removed a trace log statement
crates/networking/p2p/rlpx/connection/server.rs	Offloaded expensive tx pool insertion to background task with `tokio::spawn` to avoid blocking ConnectionServer during validation

Sequence Diagram

sequenceDiagram
    participant B as Broadcaster
    participant M as Mempool
    participant C as ConnectionServer
    
    Note over B,M: Old behavior (race condition)
    B->>M: get_txs_for_broadcast()
    M-->>B: [tx1, tx2]
    Note over C,M: Meanwhile...
    C->>M: add_transaction(tx3)
    M->>M: broadcast_pool.insert(tx3)
    B->>M: clear_broadcasted_txs()
    M->>M: broadcast_pool.clear() ❌
    Note over M: tx3 lost! Never broadcast
    
    Note over B,M: New behavior (fixed)
    B->>M: get_txs_for_broadcast()
    M-->>B: [tx1, tx2]
    Note over C,M: Meanwhile...
    C->>M: add_transaction(tx3)
    M->>M: broadcast_pool.insert(tx3)
    B->>M: remove_broadcasted_txs([tx1, tx2])
    M->>M: remove only tx1, tx2 ✓
    Note over M: tx3 preserved for next broadcast

_{Last reviewed commit: 1ed81cb}

github-actions · 2026-02-24T10:31:46Z

Lines of code report

Total lines added: 6
Total lines removed: 0
Total lines changed: 6

Detailed view

+--------------------------------------------------------+-------+------+
| File                                                   | Lines | Diff |
+--------------------------------------------------------+-------+------+
| ethrex/crates/blockchain/mempool.rs                    | 388   | +3   |
+--------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/rlpx/connection/server.rs | 1070  | +1   |
+--------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/tx_broadcaster.rs         | 311   | +2   |
+--------------------------------------------------------+-------+------+

github-actions · 2026-02-24T10:33:07Z

🤖 Claude Code Review

Review: `perf(l1): offload incoming tx pool insertion to background task`

Summary

The change is well-motivated: spawning a tokio::task for expensive per-transaction work (ECDSA recovery, storage reads) is the correct approach to unblock the ConnectionServer. The overall architecture is sound. Below are my observations, roughly in order of importance.

Semantic correctness of the broadcaster notification ordering

server.rs lines 948 and 980–984

The key design decision is notifying the broadcaster with all received hashes before the background task actually inserts transactions into the pool. This is correct because AddTxs marks hashes as "known from this peer" in known_txs (to avoid re-broadcasting to the sender), while the actual broadcast in broadcast_txs() queries mempool.get_txs_for_broadcast() directly — it never reads from known_txs to determine what to broadcast.

However, it's worth noting two subtleties:

In L2 mode, blob and privileged tx hashes are added to known_txs (via the pre-computed tx_hashes) but will be rejected and never added to the pool. These dead entries live in known_txs for up to 10 minutes (the prune window). This was identical behavior in the old code (the broadcaster notification was also after-loop-with-all-hashes), so it is not a regression, but it is worth being aware of.
On validation failure, same story — invalid hashes accumulate in known_txs until the prune cycle. Again pre-existing, not a regression.

Fire-and-forget `JoinHandle`

server.rs line 956

tokio::spawn(async move { ... });

The returned JoinHandle is immediately dropped. If the task panics, tokio logs it but the error does not propagate anywhere. For insertion failures this is acceptable (they're already debug! logged), but a panic (e.g., from a poison lock in the mempool's RwLock) would be silently swallowed from the caller's perspective. Consider whether a tokio::task::JoinSet or an abort handle should be tracked at the connection level so panics can surface during teardown. This is a minor concern for a non-critical path, but a panic in add_transaction_to_pool during a RwLock poison scenario would go unnoticed.

Unbound concurrent task spawning under high peer load

Every incoming Transactions message from every peer spawns a new task. Each task can itself hold concurrent mempool write locks. The mempool uses a std::sync::RwLock (mempool.rs line 80), so concurrent tasks serialize on write operations — but the validate_transaction path does async storage reads before acquiring the write lock, meaning many tasks can concurrently be in the validation phase. This is unlikely to be a problem at current peer counts, but it would be worth documenting the implicit assumption that the number of simultaneous Transactions messages is bounded by peer count.

TOCTOU in `add_transaction_to_pool` (pre-existing, not introduced here)

blockchain.rs lines 2421–2432

if self.mempool.contains_tx(hash)? {
    return Ok(hash);
}
// ... validate (async, no lock) ...
self.mempool.add_transaction(hash, sender, ...)?;

The check at line 2421 and the insert at line 2432 are not atomic. Two background tasks (from two different peers) processing the same transaction can both pass contains_tx and both call add_transaction, inserting the hash into txs_order twice. This pre-existed the PR (multi-peer concurrency already existed), but the background-task pattern makes it easier to trigger from a single peer. The impact is limited — duplicate entries in txs_order cause the prune loop to remove an entry that no longer exists in transaction_pool, which is harmless but slightly wasteful. Worth a follow-up issue if not already tracked.

Minor: `#[cfg(feature = "l2")]` placement inside spawn

server.rs lines 954–955 and 958–964

The is_l2_mode binding is declared outside the spawn:

#[cfg(feature = "l2")]
let is_l2_mode = state.l2_state.is_supported();

and captured by value inside the async move. This is correct and idiomatic — bool is Copy. No issue here.

Positive observations

The comment at line 977–979 accurately describes the broadcaster's role and why pre-notification is safe. This is helpful for future readers.
Cloning Arc<Blockchain> (line 952) is cheap and correct for moving into the task.
Since is_synced is documented as monotonically transitioning false→true and never resetting (comment at blockchain.rs line 182), the pre-spawn check at line 947 cannot become stale in a way that causes over-insertion.
The drop of the superfluous continue after the add_transaction_to_pool error (old code line ~979) is a clean improvement — it was dead code since there was nothing after it in the loop body.

Verdict

The approach is correct and the fix is appropriate for the CI flake it targets. The two main points worth addressing are the dropped JoinHandle (a minor observability concern) and the pre-existing TOCTOU in add_transaction_to_pool (which this PR makes easier to trigger from a single peer). Neither is a blocker.

Automated review by Claude (Anthropic) · custom prompt

edg-l · 2026-02-24T10:34:03Z

Addressing bot reviews

Kimi findings

Race condition (hashes collected before task runs): By design. AddTxs only marks hashes as "known from this peer" in a known_txs map to avoid re-broadcasting back to the sender. It does NOT trigger any broadcast or require pool state. Actual broadcasting happens on a periodic timer that queries mempool directly via get_txs_for_broadcast().
Missing deduplication check before spawning: add_transaction_to_pool already does mempool.contains_tx(hash) as its very first check (blockchain.rs:2421). Adding another check before spawning would be redundant.
Unbounded task spawning (DoS): One task per Transactions message, not per transaction. The original code was equally vulnerable to flooding — it just blocked the ConnectionServer instead of spawning. Bounded concurrency could be a follow-up but is out of scope here.
Missing task cancellation on disconnect: The spawned tasks just do pool inserts and exit. No long-running loops or waits. Adding cancellation logic would add complexity for negligible gain on a non-critical path.
Inefficient blockchain.clone(): Blockchain is Arc-wrapped. Clone is a ref count bump, not a deep copy.
Debug logging inconsistency: We can't capture state.node (a reference) into the spawned async move block. peer is state.node.to_string() — functionally identical output.
Missing spawn failure handling: tokio::spawn only fails if the runtime is shutting down, at which point everything is tearing down anyway.
Transaction validation timing: Validation still happens in the spawned task before pool insertion. The ordering is unchanged — validate then insert.

Codex findings

L2-rejected hashes sent to broadcaster: Not a regression. The original code did the exact same thing — txs.transactions.iter().map(|tx| tx.hash()).collect() was called on ALL transactions without L2 filtering. The AddTxs handler only uses these hashes to suppress re-broadcast to the sender, so stale entries in known_txs are harmless (pruned after 10 minutes).
Unbounded task spawning: Same as Kimi build: add Github actions boilerplate #3 above.

Claude review

Good observations — agree with all of them. The TOCTOU in add_transaction_to_pool and dead known_txs entries for rejected hashes are both pre-existing and not regressions. The JoinHandle drop is acceptable since insertion failures are already debug!-logged inside the task.

Update: broadcast_pool race condition fix

Investigation revealed the true root cause of the flaky LargeTxRequest test was a race condition in the tx broadcaster, not the ConnectionServer blocking:

get_txs_for_broadcast() reads the broadcast_pool (releases lock)
Meanwhile, incoming txs continue to be added to broadcast_pool one-by-one
clear_broadcasted_txs() clears all of broadcast_pool, including txs added since the read

Any tx added between the read and the clear is lost forever — never broadcast but removed from the pool. With 2000 txs being inserted sequentially and the broadcaster firing every 5ms, this reliably drops transactions.

Fix: Replaced clear_broadcasted_txs() with remove_broadcasted_txs(hashes) that only removes the specific txs that were actually fetched and sent. After the fix, hive devp2p eth tests pass 3/3 full runs (20/20 each).

…kground task Two fixes for the flaky hive `LargeTxRequest` devp2p eth test: 1. Fix broadcast_pool race condition: `clear_broadcasted_txs()` wiped the entire broadcast_pool, including txs added between the read (`get_txs_for_broadcast`) and the clear. With 2000 txs inserted one-by-one and the broadcaster firing every 5ms, this reliably dropped txs that were never broadcast. Replace with targeted `remove_broadcasted_txs()` that only removes the txs that were actually fetched and sent. 2. Offload incoming tx pool insertion to a background `tokio::spawn` task so the ConnectionServer isn't blocked during signature recovery and storage reads for large transaction batches.

github-actions · 2026-02-24T11:46:08Z

🤖 Kimi Code Review

Review Summary

Overall, this PR improves transaction handling by moving heavy validation work off the critical path, but introduces a few correctness and safety issues.

Critical Issues

Race condition in transaction broadcasting (crates/networking/p2p/rlpx/connection/server.rs:945-985)
- The PR spawns a background task to add transactions to the pool but immediately notifies the broadcaster with transaction hashes. If the background task fails to add a transaction (e.g., validation fails), the broadcaster will still attempt to broadcast it, leading to inconsistent state.
- Fix: Only notify the broadcaster after successful pool insertion, or use a channel to communicate success/failure back to the broadcaster.
Missing transaction deduplication (crates/networking/p2p/tx_broadcaster.rs:243-247)
- remove_broadcasted_txs removes transactions from the broadcast pool, but there's no guarantee these transactions are still in the mempool. If a transaction was rejected during validation, it will be removed from broadcast pool but never existed in mempool, potentially causing issues.

Performance & Safety Issues

Unbounded spawn without error handling (crates/networking/p2p/rlpx/connection/server.rs:955)
- tokio::spawn is used without any limit or error handling. Under high load, this could create thousands of tasks.
- Fix: Use a bounded channel or semaphore to limit concurrent validation tasks.
Missing L2 configuration in spawn (crates/networking/p2p/rlpx/connection/server.rs:954-956)
- The is_l2_mode variable is defined outside the spawn but used inside, creating a potential race if the L2 configuration changes between definition and usage.
- Fix: Move the check inside the spawn or clone the value.

Minor Issues

Inefficient hash collection (crates/networking/p2p/tx_broadcaster.rs:246-247)
- Creating a new Vec<H256> just to pass to remove_broadcasted_txs is unnecessary. Consider passing an iterator or slice directly.
Debug logging inconsistency (crates/networking/p2p/rlpx/connection/server.rs:962)
- The debug log uses peer=%peer but peer is a String from to_string(), which might not be as informative as the original Node struct.

Suggested Fixes

For the race condition, consider:

// In handle_incoming_message
let (tx, rx) = tokio::sync::oneshot::channel();
tokio::spawn(async move {
    let mut successful_hashes = Vec::new();
    for tx in txs.transactions {
        if blockchain.add_transaction_to_pool(tx.clone()).await.is_ok() {
            successful_hashes.push(tx.hash());
        }
    }
    let _ = tx.send(successful_hashes);
});

if let Ok(successful_hashes) = rx.await {
    state.tx_broadcaster.cast(InMessage::AddTxs(successful_hashes, state.node.node_id())).await?;
}

For the L2 configuration issue:

let is_l2_mode = state.l2_state.is_supported(); // Clone this value
let blockchain = state.blockchain.clone();
let peer = state.node.to_string();
tokio::spawn(async move {
    let is_l2_mode = is_l2_mode; // Use the cloned value
    // ... rest of the code
});

Automated review by Kimi (Moonshot AI) · custom prompt

github-actions · 2026-02-24T11:47:32Z

🤖 Codex Code Review

Review Summary

Overall direction looks good (decoupling expensive tx validation from the connection loop and only removing broadcasted hashes). I do see a couple of correctness/robustness risks worth addressing.

Findings

Potential DoS via unbounded task spawning
In crates/networking/p2p/rlpx/connection/server.rs within handle_incoming_message (Transactions branch), each incoming Transactions message spawns a tokio::spawn task that validates and inserts all txs. A peer can flood Transactions to create unbounded concurrent tasks, increasing CPU and memory pressure.
Suggestion: gate with a bounded semaphore / task queue, or reuse a dedicated worker channel for tx validation. That keeps backpressure and avoids unbounded task creation.
Race between broadcaster notification and mempool insertion
In the same block, tx_broadcaster.cast(InMessage::AddTxs(...)) happens immediately, while mempool insertion is async in the background. If the broadcaster queries the mempool before insertion completes, those txs may not be broadcasted in that round and only later (or potentially re-broadcasted to the same peer if dedupe is tied to mempool state rather than the AddTxs tracker).
Suggestion: confirm TxBroadcaster’s per-peer dedupe uses the AddTxs hashes independently of mempool presence. If not, consider making broadcaster notification happen after insertion or adjust dedupe to avoid accidental re-broadcasts.

Notes

crates/blockchain/mempool.rs change from clear_broadcasted_txs to remove_broadcasted_txs looks fine, but it turns O(1) clear into O(n) removes. Probably fine given typical sizes, but worth noting if broadcast sets are large.

If you want, I can scan the broadcaster and mempool internals to verify the dedupe semantics and whether the race in Item 2 is a real behavior issue.

Automated review by OpenAI Codex · custom prompt

github-actions · 2026-02-24T11:47:56Z

🤖 Claude Code Review

Now I have a thorough picture of the changes. Let me write the review.

Review: fix(l1): fix broadcast_pool race and offload tx pool insertion to background task

Summary

Both fixes are well-motivated and correctly implemented. The race condition in broadcast_pool is real and the surgical fix is the right approach. The background-task offload for pool insertion is sound given the architecture of TxBroadcaster. Overall the PR is in good shape.

Fix 1 — `remove_broadcasted_txs` (`mempool.rs`)

Correct. The original clear_broadcasted_txs had a classic TOCTOU window:

get_txs_for_broadcast  →  [lock released]  →  new txs added to broadcast_pool  →  clear_broadcasted_txs

The replacement iterates only over the hashes that were actually fetched and sent. Because broadcast_txs() runs inside a single-actor GenServer (one message handled at a time), there is no concurrent broadcast_txs() caller to race with between the fetch and the remove, so the targeted removal is safe and complete.

Minor nit: the function could accept &[H256] with a single write() acquisition, which is what it already does — nothing to change here.

Fix 2 — Background task for pool insertion (`server.rs`)

Broadcaster notification ordering (`server.rs` lines 948, 980–984)

The key design choice — notifying tx_broadcaster with tx_hashes before the background task has inserted anything into the pool — is correct. AddTxs marks hashes as "known from this peer" in known_txs solely to suppress re-broadcasting back to the sender. Actual outbound broadcasts are driven by get_txs_for_broadcast(), which only returns hashes present in both transaction_pool and broadcast_pool. So if the background task hasn't finished inserting when the 1-second timer fires, those transactions simply skip that broadcast cycle and are picked up in the next one. No transactions are lost.

L2-filtered hashes in `known_txs` (`server.rs` lines 958–964)

tx_hashes is computed before the L2 filter runs inside the spawned task, so blob/privileged tx hashes are passed to AddTxs even though those transactions will be rejected and never enter the pool. These dead entries accumulate in known_txs until the 10-minute prune cycle. This is not a regression — the pre-existing code passed all hashes to AddTxs in the same way. The practical impact is negligible (stale entries in a HashMap that gets pruned).

Dropped `JoinHandle` (`server.rs` line 956)

tokio::spawn(async move { ... });

The returned JoinHandle is silently dropped. If add_transaction_to_pool panics — most plausibly from a poisoned std::sync::RwLock inside the mempool — tokio will log the panic but the connection handler will never observe it. For a non-critical insertion path with debug!-level error handling, this is acceptable. Consider tracking it with a JoinSet at the connection level if deeper observability becomes desirable, but this is not a blocker.

Unbounded spawning under adversarial peer load (`server.rs` line 956)

One task is spawned per Transactions message received from any peer. Each task can concurrently be in validate_transaction (doing async storage reads) before serialising on the mempool RwLock for the actual insert. In practice this is bounded by peer count, which is small. The previous behaviour was equally unbounded — it just blocked the ConnectionServer instead of spawning. A semaphore-based concurrency cap could be a follow-up if needed, but it is out of scope for this fix.

Pre-existing TOCTOU in `add_transaction_to_pool` (`blockchain.rs` lines 2421, 2432)

if self.mempool.contains_tx(hash)? {   // check (lock released)
    return Ok(hash);
}
// ... async validate (no lock) ...
self.mempool.add_transaction(...)?;    // write (new lock)

Two background tasks from the same or different peers processing the same hash can both pass contains_tx and both call add_transaction. Because HashMap::insert and HashSet::insert are both idempotent on duplicate keys, the actual impact is a harmless redundant write followed by two broadcast_pool inserts (also idempotent). This pre-existed the PR but the background-task pattern makes it easier to trigger from a single high-volume peer. Worth a follow-up issue.

Minor

Removing the trace!("No transactions to broadcast") log (tx_broadcaster.rs line 187) reduces noise during idle periods, which is a clean-up.
Removing the dead continue after the add_transaction_to_pool error branch is correct — there was no code after the if let Err block, so it was unreachable.
state.blockchain.clone() is a reference-count bump on an Arc<Blockchain>, not a deep copy. Correct and cheap.
is_l2_mode is bool (Copy), so its capture by the async move is correct without explicit cloning.

Verdict

Both fixes are correct and appropriate for the CI flake they target. The main items noted above (dropped JoinHandle, unbounded spawning, pre-existing TOCTOU) are either pre-existing or minor observability concerns, none of which are blockers. Approve.

Automated review by Claude (Anthropic) · custom prompt

github-actions · 2026-02-24T12:29:41Z

Benchmark Block Execution Results Comparison Against Main

Command	Mean [s]	Min [s]	Max [s]	Relative
`base`	64.823 ± 0.211	64.604	65.194	1.00 ± 0.00
`head`	64.749 ± 0.237	64.504	65.271	1.00

ElFantasma

Just a comment on the JoinHandle. It LGTM

ElFantasma · 2026-02-24T13:27:46Z

-                        debug!(peer=%state.node, "Rejecting transaction in L2 mode - {tx_type} transactions are not broadcasted in L2");
-                        continue;
-                    }
+                tokio::spawn(async move {


The JoinHandle is dropped here, making this fire-and-forget. If add_transaction_to_pool panics (e.g., poisoned RwLock), the panic is caught by the tokio runtime but won't surface clearly. More importantly, on node shutdown, in-flight insertion tasks may be silently dropped.

Not necessarily a blocker since the task is bounded (processes a fixed set of txs and the errors are already debug-logged), but worth considering whether to at least log on task failure, e.g.:

tokio::spawn(async move { // ... existing code ... }.instrument(tracing::debug_span!("tx_pool_insert")));

or store the handle in a JoinSet for clean shutdown.

…kground task (lambdaclass#6253) ## Summary - **Fix broadcast_pool race condition**: `clear_broadcasted_txs()` wiped the entire `broadcast_pool`, including txs added between the read (`get_txs_for_broadcast()`) and the clear. With 2000 txs being inserted one-by-one and the broadcaster firing every 5ms, this reliably dropped txs that were never announced to peers. Replaced with targeted `remove_broadcasted_txs()` that only removes the txs that were actually fetched and broadcast. - **Offload tx pool insertion to background task**: Spawn the incoming `Transactions` message pool-insertion loop as a background `tokio::spawn` task instead of running it synchronously in `handle_cast`. This prevents the ConnectionServer from being blocked during signature recovery + storage reads for large transaction batches. ## Motivation The hive `LargeTxRequest` devp2p eth test intermittently fails in CI ([example run](https://github.com/lambdaclass/ethrex/actions/runs/22335807040)). Root cause is a race condition in the transaction broadcaster: 1. `get_txs_for_broadcast()` reads the `broadcast_pool` (releases lock) 2. Meanwhile, incoming txs continue to be added to `broadcast_pool` 3. `clear_broadcasted_txs()` clears **all** of `broadcast_pool`, including txs added in step 2 Any tx added between steps 1 and 3 is lost forever — never broadcast but cleared from the pool. With 2000 txs being inserted sequentially and the broadcaster firing every 5ms, this race reliably drops transactions, causing the test's 2-second timeout to expire before all tx hashes are announced. ## Test plan - [x] `cargo check` (with and without `l2` feature) - [x] Local hive devp2p eth tests pass 3/3 runs: `./hive --sim devp2p --sim.limit "eth" --client ethrex` (20/20 each run) - [ ] CI hive daily run should show P2P Eth capability back to 20/20

edg-l requested a review from a team as a code owner February 24, 2026 10:28

github-actions Bot assigned edg-l Feb 24, 2026

github-actions Bot added L1 Ethereum client performance Block execution throughput and performance in general labels Feb 24, 2026

github-project-automation Bot added this to ethrex_l1 and ethrex_performance Feb 24, 2026

github-project-automation Bot moved this to Todo in ethrex_performance Feb 24, 2026

edg-l marked this pull request as draft February 24, 2026 10:37

ethrex-project-sync Bot moved this to In Progress in ethrex_l1 Feb 24, 2026

edg-l force-pushed the fix/offload-tx-pool-insertion-from-connection-server branch from 3a6895d to 1ed81cb Compare February 24, 2026 11:44

edg-l changed the title ~~perf(l1): offload incoming tx pool insertion to background task~~ fix(l1): fix broadcast_pool race and offload tx pool insertion to background task Feb 24, 2026

github-actions Bot removed the performance Block execution throughput and performance in general label Feb 24, 2026

edg-l marked this pull request as ready for review February 24, 2026 11:45

ethrex-project-sync Bot moved this from In Progress to In Review in ethrex_l1 Feb 24, 2026

edg-l removed this from ethrex_performance Feb 24, 2026

ilitteri approved these changes Feb 24, 2026

View reviewed changes

ElFantasma approved these changes Feb 24, 2026

View reviewed changes

avilagaston9 approved these changes Feb 24, 2026

View reviewed changes

iovoid approved these changes Feb 24, 2026

View reviewed changes

ilitteri added this pull request to the merge queue Feb 24, 2026

Merged via the queue into main with commit 8103e18 Feb 24, 2026
66 of 68 checks passed

ilitteri deleted the fix/offload-tx-pool-insertion-from-connection-server branch February 24, 2026 16:18

github-project-automation Bot moved this from In Review to Done in ethrex_l1 Feb 24, 2026

Conversation

edg-l commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Test plan

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Kimi Code Review

Security & Correctness Issues

Performance & Resource Management

Code Quality & Best Practices

Ethereum-Specific Concerns

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Codex Code Review

Uh oh!

greptile-apps Bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

github-actions Bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Lines of code report

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Claude Code Review

Review: perf(l1): offload incoming tx pool insertion to background task

Summary

Semantic correctness of the broadcaster notification ordering

Fire-and-forget JoinHandle

Unbound concurrent task spawning under high peer load

TOCTOU in add_transaction_to_pool (pre-existing, not introduced here)

Minor: #[cfg(feature = "l2")] placement inside spawn

Positive observations

Verdict

Uh oh!

edg-l commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Addressing bot reviews

Kimi findings

Codex findings

Claude review

Update: broadcast_pool race condition fix

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Kimi Code Review

Review Summary

Critical Issues

Performance & Safety Issues

Minor Issues

Suggested Fixes

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Codex Code Review

Uh oh!

github-actions Bot commented Feb 24, 2026

🤖 Claude Code Review

Review: fix(l1): fix broadcast_pool race and offload tx pool insertion to background task

Summary

Fix 1 — remove_broadcasted_txs (mempool.rs)

Fix 2 — Background task for pool insertion (server.rs)

Broadcaster notification ordering (server.rs lines 948, 980–984)

L2-filtered hashes in known_txs (server.rs lines 958–964)

Dropped JoinHandle (server.rs line 956)

Unbounded spawning under adversarial peer load (server.rs line 956)

Pre-existing TOCTOU in add_transaction_to_pool (blockchain.rs lines 2421, 2432)

Minor

Verdict

Uh oh!

github-actions Bot commented Feb 24, 2026

Benchmark Block Execution Results Comparison Against Main

Uh oh!

ElFantasma left a comment

Choose a reason for hiding this comment

Uh oh!

ElFantasma Feb 24, 2026

edg-l commented Feb 24, 2026 •

edited

Loading

greptile-apps Bot commented Feb 24, 2026 •

edited

Loading

github-actions Bot commented Feb 24, 2026 •

edited

Loading

Review: `perf(l1): offload incoming tx pool insertion to background task`

Fire-and-forget `JoinHandle`

TOCTOU in `add_transaction_to_pool` (pre-existing, not introduced here)

Minor: `#[cfg(feature = "l2")]` placement inside spawn

edg-l commented Feb 24, 2026 •

edited

Loading

Fix 1 — `remove_broadcasted_txs` (`mempool.rs`)

Fix 2 — Background task for pool insertion (`server.rs`)

Broadcaster notification ordering (`server.rs` lines 948, 980–984)

L2-filtered hashes in `known_txs` (`server.rs` lines 958–964)

Dropped `JoinHandle` (`server.rs` line 956)

Unbounded spawning under adversarial peer load (`server.rs` line 956)

Pre-existing TOCTOU in `add_transaction_to_pool` (`blockchain.rs` lines 2421, 2432)