Gossip blobs/data columns processed even after block import via EL blobs #7461

@jimmygchen

Description

On PeerDAS devnet-7, gossip messages for data columns (blobs) are still being processed even after the corresponding block has been imported using blobs fetched from the EL via engine_getBlobs.

This behaviour stems from the combination of EL blob fetching and Lighthouse’s batch re-publishing mechanism, as outlined in our blog post:

  1. The node receives a block via gossip and attempts to fetch the associated blobs from the EL.
  2. If all blobs are successfully returned, the node imports the beacon block and begins re-publishing blobs/data columns in batches (code reference).
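
A minimal sketch of this flow, with fetch_blobs_from_el, import_block and publish_in_batches as hypothetical stand-ins for the actual Lighthouse code paths:

    def fetch_blobs_from_el(versioned_hashes):
        # Stand-in for the engine_getBlobs call; the EL may return None for
        # any blob it does not have in its mempool.
        return [f"blob-{h}" for h in versioned_hashes]

    def import_block(block, blobs): ...       # stand-in for block import
    def publish_in_batches(columns): ...      # stand-in for batch re-publishing

    def on_gossip_block(block):
        blobs = fetch_blobs_from_el(block["blob_versioned_hashes"])
        # Only take this path if the EL returned every blob.
        if all(b is not None for b in blobs):
            import_block(block, blobs)      # block is imported immediately
            publish_in_batches(blobs)       # blobs / derived columns only trickle out afterwards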

Why this happens

The batch publishing mechanism was introduced to conserve bandwidth and prevent network-wide duplication. Each batch contains a randomly selected subset of blobs, ideally allowing full blob coverage early on without flooding.

However, the current batch size is 1, and the node only marks a blob as “observed” in the beacon chain cache right before it publishes it (code reference).

This means:

  • The node may import all 4 blobs from the EL, but only "observe" one at a time as it publishes them.
  • There's a 200ms delay between publishing each blob.
  • Until a blob is published and marked as observed, the node does not consider it “seen”.
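
A self-contained sketch of the resulting timing window (ignoring the random batch selection), using a plain set as a stand-in for the observed cache:

    import time

    OBSERVED = set()       # stand-in for the beacon chain "observed" cache
    BATCH_SIZE = 1
    BATCH_INTERVAL = 0.2   # 200 ms between batches

    def publish_batches(columns):
        # A column is only marked as observed right before we publish it.
        for i in range(0, len(columns), BATCH_SIZE):
            for column in columns[i:i + BATCH_SIZE]:
                OBSERVED.add(column)           # only now does it count as "seen"
                print(f"published {column}")
            time.sleep(BATCH_INTERVAL)

    def on_gossip_column(column):
        # Gossip copies arriving before the local publish are not in the cache,
        # so they go through verification/processing again despite the EL import.
        return "ignore" if column in OBSERVED else "process again"

    columns = ["col-0", "col-1", "col-2", "col-3"]
    print(on_gossip_column("col-3"))   # before publishing finishes: "process again"
    publish_batches(columns)
    print(on_gossip_column("col-3"))   # after all batches: "ignore"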

As a result, blobs/data columns arriving via gossip before being published locally may:

  • Be unnecessarily processed again,
  • Trigger additional work (e.g. redundant block imports),
  • Attempt to forward duplicate messages that were already published earlier.

Impact

This issue is particularly noticeable in PeerDAS due to the larger volume of data column messages. It contributes to performance degradation and is likely related to the behaviour described in #6439.

Ideas

  • Mark blobs as seen immediately after fetching from the EL
    This could reduce redundant processing and re-publishing. However, there are concerns about potential censorship - doing so may suppress message propagation before we’ve even forwarded or published the blob once.
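
    A minimal sketch of the reordering, using the same hypothetical stand-ins (an observed set and a batch publisher):

    observed = set()

    def publish_in_batches(columns): ...       # unchanged batch re-publishing

    def import_from_el(columns):
        observed.update(columns)               # mark everything as seen up front
        publish_in_batches(columns)            # then re-publish in batches as before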

  • Send IDONTWANT immediately upon EL fetch
    Proactively sending IDONTWANT for blobs already obtained via the EL would signal to peers that we no longer need the gossip copy, helping reduce bandwidth usage and processing overhead.
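
    A sketch of the idea; compute_message_id and send_idontwant are hypothetical helpers, and proactively emitting IDONTWANT (a gossipsub v1.2 control message) would need to be exposed by the underlying libp2p implementation:

    def compute_message_id(column):
        return f"id-{column}"                  # placeholder for the real message-id function

    def send_idontwant(message_ids):
        print("IDONTWANT", message_ids)        # placeholder for the gossipsub control message

    def on_el_blobs_fetched(columns):
        # As soon as the EL returns the blobs, tell mesh peers we do not need
        # the full gossip copies of the corresponding columns.
        send_idontwant([compute_message_id(c) for c in columns])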

  • Skip import at gossip verification using DA checker (Thanks @dapplion for suggesting this)
    In the gossip verification function, we could skip processing if the blob / data column already exists in the DA checker:

    if column in da_checker:
        return accept

    We'd still need to perform a full equality check to ensure the message matches exactly - this avoids accepting invalid blobs from peers while still saving verification & import cycles.
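
    A slightly fuller sketch of that combined check, assuming a dict keyed by (block_root, index) stands in for the DA checker:

    da_checker = {}        # (block_root, index) -> column already imported from the EL

    def gossip_verify_column(block_root, index, column):
        existing = da_checker.get((block_root, index))
        if existing is not None and existing == column:
            # Byte-identical to the message we already hold: accept without
            # re-verifying or re-importing it.
            return "accept"
        # Unknown, or not an exact match: fall through to full verification.
        return "full verification"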

  • libp2p Batch Publishing
    If the proposed libp2p batch publishing proves effective, we could consider adjusting the strategy: instead of sending 8 copies of 32 columns per batch, we could send 2 copies of 128 columns per batch (the per-batch volume stays at 256 column copies, but every column is published, and marked as observed, in the first batch). This would also eliminate the problem described here.
