feat: introducing `BlockComponent` by ksn6 · Pull Request #8719 · anza-xyz/agave

ksn6 · 2025-10-28T02:36:44Z

Problem

We need a unified, extensible data structure to represent the components of a Solana block, covering both normal transaction entries and special markers.

The structure must support a number of Alpenglow-related needs, e.g.:

Markers for Fast Leader Handover: SIMD-0337: Markers for Alpenglow Fast Leader Handover solana-foundation/solana-improvement-documents#337
Reasoning about the Alpenglow clock: SIMD-0363: Simple Alpenglow Clock solana-foundation/solana-improvement-documents#363
Block footer markers: SIMD-0307: Add Block Footer solana-foundation/solana-improvement-documents#307
Validator tickets: SIMD-0357: Alpenglow VAT implementation solana-foundation/solana-improvement-documents#357
Storing double-Merkle roots

... all the while remaining flexible for new marker types in the future.

Summary of Changes

We introduce a new enum, BlockComponent, which represents one logical element within a block. This code was originally designed in the alpenglow repository.

We are upstreaming the code identically, solely with the exception of employing wincode for Entry serde.

codecov-commenter · 2025-10-28T03:20:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.3%. Comparing base (674bc91) to head (b31c73f).

Additional details and impacted files

@@            Coverage Diff            @@
##           master    #8719     +/-   ##
=========================================
  Coverage    83.3%    83.3%             
=========================================
  Files         852      853      +1     
  Lines      370790   373049   +2259     
=========================================
+ Hits       308906   311027   +2121     
- Misses      61884    62022    +138

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

akhi3030

I didn't review the whole PR but came up with a few comments that I think would also apply to the structs that I didn't review yet. I hope they are useful.

akhi3030 · 2025-10-30T08:10:54Z

+    DeserializationFailed(String),
+}
+
+impl fmt::Display for BlockComponentError {


Consider deriving Error for this type and then you do not need to implement Display. See consensus_pool.rs for an example.

akhi3030 · 2025-10-30T08:12:06Z

+impl Error for BlockComponentError {}
+
+// Conversion from bincode::Error to BlockComponentError
+impl From<bincode::Error> for BlockComponentError {


If you derive Error above, you might be able to use the from macro to remove this. But given how you are parsing the message, it might not work.

akhi3030 · 2025-10-30T12:35:46Z

+                if msg.contains("exceeds maximum") {
+                    // Extract numbers if possible, otherwise use defaults
+                    Self::TooManyEntries { count: 0, max: 0 }
+                } else if msg.contains("Unknown") {
+                    Self::UnknownVariant {
+                        variant_type: "Unknown".to_string(),
+                        id: 0,
+                    }
+                } else if msg.contains("Unsupported") {


this type of parsing feels quite fragile. I.e. I don't think the actual strings are considered to be part of the API of the library. I wonder what kind of impact we will have if this parsing fails.

If the parsing fails and that can cause significant breakage for us, I think we need to add tests to ensure that we can detect changes in the error strings.

If the parsing fails and that will not break anything for us, then I propose we do not implement the parsing.

akhi3030 · 2025-10-30T12:44:00Z

+        if entries.is_empty() {
+            return Err(BlockComponentError::EmptyEntryBatch);
+        }
+        Self::validate_entry_batch_length(entries.len())?;


can this function also check that len is not 0 and then the check above can be removed?

akhi3030 · 2025-10-30T12:47:19Z

+    ///
+    /// # Errors
+    /// Returns an error if the entries vector is empty or exceeds the maximum allowed size.
+    pub fn new_entry_batch(entries: Vec<Entry>) -> Result<Self, BlockComponentError> {


BlockComponent is an enum so despite having this constructor here, it is still possible to create an instance of it that will be considered invalid according to this function.

In order to ensure that we cannot construct invalid types, we would have to do something like below:

struct Entries(Vec<Entry>); pub enum BlockComponent { EntryBatch(Entries), BlockMarker(VersionedBlockMarker), }

And move the check from this function into the constructor for Entries. Note that it is important that the member of Entries is not pub. Otherwise one can bypass the validation.

akhi3030 · 2025-10-30T12:55:10Z

+
+        match self {
+            Self::EntryBatch(entries) => {
+                Self::validate_entry_batch_length(entries.len())?;


If you use the suggestion above to introduce a struct Entries(Vec<Entry>), then you can remove this check as it should be guaranteed to be valid.

akhi3030 · 2025-10-30T12:59:41Z

+    pub fn to_bytes(&self) -> Result<Vec<u8>, BlockComponentError> {
+        let mut buffer = Vec::new();
+
+        match self {
+            Self::EntryBatch(entries) => {
+                Self::validate_entry_batch_length(entries.len())?;
+
+                buffer = wincode::serialize(entries)
+                    .map_err(|e| BlockComponentError::SerializationFailed(e.to_string()))?;
+            }
+            Self::BlockMarker(marker) => {
+                // Write zero entry count
+                buffer.extend_from_slice(&0u64.to_le_bytes());
+                // Write marker data
+                let marker_bytes = marker.to_bytes()?;
+                buffer.extend_from_slice(&marker_bytes);
+            }
+        }
+
+        Ok(buffer)
+    }


Suggested change

pub fn to_bytes(&self) -> Result<Vec<u8>, BlockComponentError> {

let mut buffer = Vec::new();

match self {

Self::EntryBatch(entries) => {

Self::validate_entry_batch_length(entries.len())?;

buffer = wincode::serialize(entries)

.map_err(|e| BlockComponentError::SerializationFailed(e.to_string()))?;

}

Self::BlockMarker(marker) => {

// Write zero entry count

buffer.extend_from_slice(&0u64.to_le_bytes());

// Write marker data

let marker_bytes = marker.to_bytes()?;

buffer.extend_from_slice(&marker_bytes);

}

}

Ok(buffer)

}

pub fn to_bytes(&self) -> Result<Vec<u8>, BlockComponentError> {

match self {

Self::EntryBatch(entries) => {

Self::validate_entry_batch_length(entries.len())?;

wincode::serialize(entries)

.map_err(|e| BlockComponentError::SerializationFailed(e.to_string()))

}

Self::BlockMarker(marker) => {

// Write zero entry count

let mut buffer = 0u64.to_le_bytes().to_vec();

// Write marker data

let marker_bytes = marker.to_bytes()?;

buffer.extend_from_slice(&marker_bytes);

Ok(buffer)

}

}

}

This seems like less code and achieves the same.

akhi3030 · 2025-10-30T13:02:31Z

+            cursor += bytes_consumed;
+        }
+
+        assert_eq!(cursor, data.len());


this seems like it should return an error instead of panicking.

If we are guaranteed that the whole data will be used, that seems to suggest that data is valid and then the function should not return an error anywhere but panic everywhere.

akhi3030 · 2025-10-30T13:04:37Z

+    /// # Errors
+    /// Returns an error if any component fails to serialize.
+    pub fn to_bytes_multiple(components: &[Self]) -> Result<Vec<u8>, BlockComponentError> {
+        let mut result = Vec::new();


we can use serialized_size() below to do Vec::with_capacity() to reduce reallocations.

akhi3030 · 2025-10-30T13:05:29Z

+impl Serialize for BlockComponent {
+    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+    where
+        S: Serializer,
+    {
+        let bytes = self.to_bytes().map_err(serde::ser::Error::custom)?;
+        serializer.serialize_bytes(&bytes)
+    }
+}
+
+impl<'de> Deserialize<'de> for BlockComponent {


If we are implementing serde serialize and deserialize, then why do we need to additionally implement to_bytes(), from_bytes(), etc.?

ksn6 · 2025-11-05T02:11:01Z

@akhi3030 - appreciate the comments here.

This code is a bit messy atm - @cpubot and I are rewriting the entirety of this via wincode, which will provide performance improvements + substantially simplify the code.

Closing this PR.

ksn6 requested review from AshwinSekar, bw-solana, carllin, cpubot and wen-coding October 28, 2025 02:37

feat: introducing BlockComponent

b76277e

ksn6 force-pushed the introduce-block-component branch from b31c73f to b76277e Compare October 30, 2025 07:49

akhi3030 reviewed Oct 30, 2025

View reviewed changes

ksn6 closed this Nov 5, 2025

Conversation

ksn6 commented Oct 28, 2025

Problem

Summary of Changes

Uh oh!

codecov-commenter commented Oct 28, 2025

Codecov Report

Uh oh!

akhi3030 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ksn6 commented Nov 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants