Skip to content

Conversation

@jimmygchen
Copy link
Member

Issue Addressed

This issue was identified during the fusaka audit competition.

The verify_parent_block_and_finalized_descendant in data column gossip verification currently load the parent first before checking if the column descends from the finalized root.

However, the fork_choice.get_block(&block_parent_root) function also make the same check internally:

/// Returns a `ProtoBlock` if the block is known **and** a descendant of the finalized root.
pub fn get_block(&self, block_root: &Hash256) -> Option<ProtoBlock> {
if self.is_finalized_checkpoint_or_descendant(*block_root) {
self.proto_array.get_block(block_root)
} else {
None
}
}

Therefore, if the column does not descend from the finalized root, we return an UnknownParent error, before hitting the is_finalized_checkpoint_or_descendant check just below.

Which means we IGNORE the gossip message instead REJECT, and the gossip peer is not immediately penalised. This deviates from the spec.

However, worth noting that lighthouse will currently attempt to request the parent from this peer, and if the peer is not able to serve the parent, it gets penalised with a LowToleranceError, and will get banned after ~5 occurences.

if let Some(Err(RpcResponseError::VerifyError(e))) = &resp {
self.report_peer(peer_id, PeerAction::LowToleranceError, e.into());
}

This PR will penalise the bad peer immediately instead of performing block lookups before penalising it.

…data column if parent does not descend from the finalize root.
@jimmygchen jimmygchen added ready-for-review The code is ready for review v8.0.0 Q4 2025 Fusaka Mainnet Release labels Oct 9, 2025
@jimmygchen jimmygchen changed the title Reject data columns that does not descend from finalize root Reject data columns that does not descend from finalize root instead of ignoring it Oct 9, 2025
Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement, makes sense to me.

@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Oct 9, 2025
@mergify mergify bot added the queued label Oct 9, 2025
mergify bot added a commit that referenced this pull request Oct 9, 2025
@mergify mergify bot merged commit 538b704 into sigp:unstable Oct 9, 2025
37 checks passed
@mergify mergify bot removed the queued label Oct 9, 2025
jimmygchen added a commit to jimmygchen/lighthouse that referenced this pull request Oct 16, 2025
mergify bot pushed a commit that referenced this pull request Oct 16, 2025
This PR reverts #8179.

It turns out that the fix was invalid because an unknown root is always not a finalized descendant:

https://github.com/sigp/lighthouse/blob/522bd9e9c6ac167f2231525e937c9ebbcb86cf6e/consensus/proto_array/src/proto_array.rs#L976-L979

so for any data columns with unknown parents, it will always penalise the gossip peer and disconnect it pretty quickly. On a small network, the node may lose all of its peers.

The impact is pretty obvious when the peer count is small and sync speed is slow, and is therefore easily reproducible by running a fresh supernode on devnet-3.

This isn't as obvious on a live testnet like holesky / sepolia, we haven't noticed this, probably due to its high peer count and sync speed - the nodes might be able to reach head quickly before losing too many peers.


  The previous behaviour isn't ideal but safe:  triggering unknown parent lookup and penalise the bad peer if it happens to be malicious or faulty. So for now it's safer to revert the change and plan for a proper fix after the v8 release.


Co-Authored-By: Jimmy Chen <[email protected]>
jchavarri pushed a commit to jchavarri/lighthouse that referenced this pull request Oct 21, 2025
…of ignoring it (sigp#8179)

This issue was identified during the fusaka audit competition.

The [`verify_parent_block_and_finalized_descendant`](https://github.com/sigp/lighthouse/blob/62d9302e0f9dd9f94d0325411a3029b36ad90685/beacon_node/beacon_chain/src/data_column_verification.rs#L606-L627) in data column gossip verification currently load the parent first before checking if the column descends from the finalized root.

However, the `fork_choice.get_block(&block_parent_root)` function also make the same check internally:

https://github.com/sigp/lighthouse/blob/8a4f6cf0d5b6b261b2c3439ce7c05383a53d30c5/consensus/fork_choice/src/fork_choice.rs#L1242-L1249

Therefore, if the column does not descend from the finalized root, we return an `UnknownParent` error, before hitting the `is_finalized_checkpoint_or_descendant` check just below.

Which means we `IGNORE` the gossip message instead `REJECT`, and the gossip peer is not _immediately_ penalised. This deviates from the spec.

However, worth noting that lighthouse will currently attempt to request the parent from this peer, and if the peer is not able to serve the parent, it gets penalised with a `LowToleranceError`, and will get banned after ~5 occurences.

https://github.com/sigp/lighthouse/blob/ffa7b2b2b9e3b4e70678e2c749b8bc45234febd7/beacon_node/network/src/sync/network_context.rs#L1530-L1532

This PR will penalise the bad peer immediately instead of performing block lookups before penalising it.


  


Co-Authored-By: Jimmy Chen <[email protected]>
jchavarri pushed a commit to jchavarri/lighthouse that referenced this pull request Oct 21, 2025
This PR reverts sigp#8179.

It turns out that the fix was invalid because an unknown root is always not a finalized descendant:

https://github.com/sigp/lighthouse/blob/522bd9e9c6ac167f2231525e937c9ebbcb86cf6e/consensus/proto_array/src/proto_array.rs#L976-L979

so for any data columns with unknown parents, it will always penalise the gossip peer and disconnect it pretty quickly. On a small network, the node may lose all of its peers.

The impact is pretty obvious when the peer count is small and sync speed is slow, and is therefore easily reproducible by running a fresh supernode on devnet-3.

This isn't as obvious on a live testnet like holesky / sepolia, we haven't noticed this, probably due to its high peer count and sync speed - the nodes might be able to reach head quickly before losing too many peers.


  The previous behaviour isn't ideal but safe:  triggering unknown parent lookup and penalise the bad peer if it happens to be malicious or faulty. So for now it's safer to revert the change and plan for a proper fix after the v8 release.


Co-Authored-By: Jimmy Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge This PR is ready to merge. v8.0.0 Q4 2025 Fusaka Mainnet Release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants