
broadcast: use block id from snapshot as the CMR for initial blocks #12354

Open
AshwinSekar wants to merge 1 commit into anza-xyz:master from AshwinSekar:broadcast-bid

@AshwinSekar

Problem

When we produce the first block after loading a snapshot, the parent's shreds might not be available locally, so we cannot read the chained merkle root (CMR) from them.

Now that SIMD-0333 (#11355) is active on master, all banks will have a block id.

Summary of Changes

Use the parent bank's block id instead.
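
A minimal sketch of the idea, using toy types rather than the real agave ones. `Hash`, `Bank`, and `chained_merkle_root_for_first_shred` are hypothetical stand-ins for illustration; only the `parent_block_id()` accessor name and the `expect` message are taken from the diff in this PR.

```rust
/// Toy 32-byte hash; a stand-in for the real hash type (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub struct Hash(pub [u8; 32]);

/// Toy bank that only remembers its parent's block id (assumption).
pub struct Bank {
    parent_block_id: Option<Hash>,
}

impl Bank {
    pub fn new(parent_block_id: Option<Hash>) -> Self {
        Self { parent_block_id }
    }

    /// Same name as the accessor in the diff; the body is illustrative only.
    pub fn parent_block_id(&self) -> Option<Hash> {
        self.parent_block_id
    }
}

/// Hypothetical helper: choose the chained merkle root for the first shred
/// of a new block from the parent bank's block id, without touching the
/// parent's shreds (which may be missing right after a snapshot load).
pub fn chained_merkle_root_for_first_shred(bank: &Bank) -> Hash {
    bank.parent_block_id()
        .expect("All banks (including snapshot banks) must have a block id")
}
```

The sketch panics when the parent has no block id, which is why the review thread focuses on whether that case can ever occur.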

Comment thread turbine/src/broadcast_stage/standard_broadcast_run.rs Outdated

let parent_block_id = bank
    .parent_block_id()
    .expect("All banks (including snapshot banks) must have a block id");
Author

@AshwinSekar AshwinSekar May 7, 2026


I've checked this before, but good to double check that there's no race that can cause this to fail:

If the parent is one of our own leader banks, we've already set its block_id here:

broadcast_utils::set_block_id_and_send(

Leader slots are done sequentially so there's no way that this expect can fail.

If this is a non-leader bank, it must be frozen (we don't build on non-frozen banks). The block id is set when the bank is frozen, here:

agave/core/src/replay_stage.rs

Lines 3579 to 3582 in b6abf60

debug_assert!(block_id.is_some() || is_leader_block);
if block_id.is_some() {
    bank.set_block_id(block_id);
}

Finally, if this was a snapshot bank, SIMD-0333 guarantees that it was set:

block_id: Some(bank_fields.block_id),

block_id: RwLock::new(fields.block_id),
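
The three cases above can be written down as a small model. This is only a sketch of the argument, not agave code: `ParentOrigin` and its fields are hypothetical names, and the toy `parent_block_id` function simply encodes the claims made in this comment.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Hash(pub [u8; 32]);

/// How the parent bank came to exist on this node (hypothetical model).
pub enum ParentOrigin {
    /// We produced it as leader; broadcast sets the block id while sending.
    OurLeaderBlock { block_id_from_broadcast: Hash },
    /// We replayed it; the block id is set when the bank is frozen.
    Replayed { frozen: bool, block_id_at_freeze: Option<Hash> },
    /// It was loaded from a snapshot that carries the block id.
    Snapshot { block_id_in_snapshot: Hash },
}

/// Returns the block id the child would see for each kind of parent.
pub fn parent_block_id(origin: &ParentOrigin) -> Option<Hash> {
    match origin {
        ParentOrigin::OurLeaderBlock { block_id_from_broadcast } => {
            Some(*block_id_from_broadcast)
        }
        ParentOrigin::Replayed { frozen: true, block_id_at_freeze } => *block_id_at_freeze,
        // We never build on a non-frozen bank, so this arm models a state
        // that should not be reachable when producing a block.
        ParentOrigin::Replayed { frozen: false, .. } => None,
        ParentOrigin::Snapshot { block_id_in_snapshot } => Some(*block_id_in_snapshot),
    }
}
```

In this model the `expect` can only fail for a replayed parent that is not frozen, or one frozen without a block id, which the `debug_assert!` quoted above is meant to rule out for non-leader blocks.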


When did we start serializing this into the snapshot? I want to make sure there isn't some set of honest validators we could overlap with (e.g. FD or an older version) that are producing snapshots without this.

Edit: Looks like we only started serializing it into the snapshot last month. Does that mean 4.1 would be incompatible with 4.0 in some cases?

Author

@AshwinSekar AshwinSekar May 7, 2026


In 4.0 we added support for reading the block id field (#9644), and in 4.1 we added support for writing it, while still keeping the else Hash::default() logic (#11355).

So in 4.2 I believe we can remove the else Hash::default() logic, hard unwrap, and still have adjacent-version compatibility.
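
The version reasoning can be checked with a toy model. Everything here is an assumption made for illustration: the `Version` enum, `written_block_id`, and `read_block_id` are hypothetical, the 4.0 reader is assumed to use the same default-hash fallback as 4.1, and the proposed 4.2 hard unwrap is modelled as an `Err` rather than a panic so it can be asserted on.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub struct Hash(pub [u8; 32]);

/// Release lines discussed in the thread (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Version {
    V40,
    V41,
    V42,
}

/// What a writer of the given version stores in the snapshot's block id field.
pub fn written_block_id(writer: Version, block_id: Hash) -> Option<Hash> {
    match writer {
        // 4.0 can read the field but does not write it.
        Version::V40 => None,
        Version::V41 | Version::V42 => Some(block_id),
    }
}

/// How a reader of the given version interprets the field.
pub fn read_block_id(reader: Version, field: Option<Hash>) -> Result<Hash, &'static str> {
    match reader {
        // Fallback to the default hash when the field is absent.
        Version::V40 | Version::V41 => Ok(field.unwrap_or_default()),
        // Proposed behaviour: the field is required.
        Version::V42 => field.ok_or("snapshot is missing the block id"),
    }
}
```

Under these assumptions a 4.2 reader accepts 4.1 snapshots, a 4.1 reader accepts 4.0 snapshots by substituting the default hash, and only the non-adjacent pair (4.0 writer, 4.2 reader) is rejected.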


My concern was that some 4.0-produced snapshot might not be processable by 4.1 code plus this change.

But if I'm understanding correctly, it sounds like we will just populate the default hash in that case and not fail the expect.

Author


Yeah, sorry. I realized that master is not on 4.2 yet, so you're correct: this is problematic if we merge it in 4.1.

The snapshot still allows None in v4.1 for v4.0 compatibility:

pub(crate) block_id: Option<Hash>, // Option wrapper can be removed in version after v4.1

We can wait to merge this change until after the branch cut.

@bw-solana

I believe test_slot_interrupt is broken. We need to modify the test code to set the block ID for parent banks before handling the broadcast receive results.


@bw-solana bw-solana left a comment


The test changes look a little gross, but whatever. Seems like we could just write the block ID directly - doesn't seem any worse than directly writing the tick height to max before recursing parents
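
One way to picture "writing the block ID directly" in test setup is the sketch below. It uses a toy `Bank` with hypothetical fields and a hypothetical helper, `fill_ancestor_block_ids`; it is not the test code in this PR, and the placeholder id derived from the slot number is an arbitrary choice for illustration.

```rust
use std::sync::{Arc, RwLock};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Hash(pub [u8; 32]);

/// Toy bank with just enough state for the helper (assumption).
pub struct Bank {
    slot: u64,
    parent: Option<Arc<Bank>>,
    block_id: RwLock<Option<Hash>>,
}

impl Bank {
    pub fn new(slot: u64, parent: Option<Arc<Bank>>) -> Arc<Self> {
        Arc::new(Self { slot, parent, block_id: RwLock::new(None) })
    }

    pub fn set_block_id(&self, id: Option<Hash>) {
        *self.block_id.write().unwrap() = id;
    }

    pub fn block_id(&self) -> Option<Hash> {
        *self.block_id.read().unwrap()
    }

    pub fn parent_block_id(&self) -> Option<Hash> {
        self.parent.as_ref().and_then(|p| p.block_id())
    }
}

/// Hypothetical test helper: walk the ancestors of `bank` and give each one
/// that lacks a block id a deterministic placeholder derived from its slot.
pub fn fill_ancestor_block_ids(bank: &Bank) {
    let mut cur = bank.parent.clone();
    while let Some(p) = cur {
        if p.block_id().is_none() {
            let mut bytes = [0u8; 32];
            bytes[..8].copy_from_slice(&p.slot.to_le_bytes());
            p.set_block_id(Some(Hash(bytes)));
        }
        cur = p.parent.clone();
    }
}
```

Calling such a helper before the broadcast code runs would let a `parent_block_id().expect(...)` succeed in tests that build banks by hand.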

@codecov-commenter

codecov-commenter commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.2%. Comparing base (e25cdff) to head (1305d76).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #12354   +/-   ##
=======================================
  Coverage    83.2%    83.2%           
=======================================
  Files         849      849           
  Lines      322523   322547   +24     
=======================================
+ Hits       268495   268519   +24     
  Misses      54028    54028           