Skip to content

Change ChainDescriptions into blobs#3787

Merged
bart-linera merged 13 commits intolinera-io:mainfrom
bart-linera:issue-2390
Apr 29, 2025
Merged

Change ChainDescriptions into blobs#3787
bart-linera merged 13 commits intolinera-io:mainfrom
bart-linera:issue-2390

Conversation

@bart-linera
Copy link
Contributor

Motivation

This will allow for easier initialization of the network at genesis, among other things.

Proposal

ChainDescription will become a blob, and OpenChain messages will be removed. Calling open_chain will create the blob, and the chain will actually get initialized when a block is proposed for it.

Test Plan

Should be covered by CI.

Release Plan

  • Nothing to do / These changes follow the usual release cycle.

Links

RemoteNode::download_blobs(&blob_ids, validators, self.blob_download_timeout)
.await
.ok_or(LocalNodeError::BlobsNotFound(blob_ids))?;
self.local_node.storage_client().write_blobs(&blobs).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But now the remote node could have tricked us, and the chain might not actually have been created.

I think we need to use something like ChainClient::update_local_node_with_blobs_from, which downloads proof from the validators that the blob was created. Except if we already have at least one correctly signed certificate from this chain—its first block can also count as a proof, I guess (if we double-check the committee information with the admin chain).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I'd consider a blocker, or at least it should have a GitHub issue and be addressed right after this PR.

let mut validators_vec = validators.iter().collect::<Vec<_>>();
validators_vec.shuffle(&mut rand::thread_rng());
for remote_node in validators_vec {
let info = self.chain_info(chain_id, validators).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, at this point we only need the next block height. Why not just set it to zero if we don't have the chain at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to return the info if target_next_block_height is zero, so we need to get it. (Maybe we could do a shortcut if it's not zero, but I'm not sure if it's worth it...)

let blob = RemoteNode::download_blob(&nodes, chain_desc_id, self.blob_download_timeout)
.await
.ok_or(LocalNodeError::BlobsNotFound(vec![chain_desc_id]))?;
self.local_node.storage_client().write_blob(&blob).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: We can't trust that blob yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also kind of a blocker.)

pub struct ChainDescription {
origin: ChainOrigin,
timestamp: Timestamp,
config: OpenChainConfig,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider inlining this type here. (Probably in a later PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/// Invariant for the states of active chains.
pub fn ensure_is_active(&self) -> Result<(), ChainError> {
pub async fn ensure_is_active(&mut self, local_time: Timestamp) -> Result<(), ChainError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe due for a rename


/// Rearranges the messages in the bundle so that the first message is an `OpenChain` message.
/// Returns whether the `OpenChain` message was found at all.
pub fn put_openchain_at_front(bundles: &mut [IncomingBundle]) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@bart-linera bart-linera force-pushed the issue-2390 branch 10 times, most recently from f69e2ed to 0de904d Compare April 28, 2025 16:38
@bart-linera bart-linera marked this pull request as ready for review April 28, 2025 16:39
.into_iter()
.filter_map(|(blob_id, blob)| {
(blob_id.blob_type == BlobType::ChainDescription)
.then(|| bcs::from_bytes::<ChainDescription>(blob.content().bytes()).unwrap())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the operation results instead, and make the operation return the child chain ID as a result? (Also below.)

@@ -432,28 +468,22 @@ where
};

// Download the parent chain.
Copy link
Contributor

@afck afck Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to do that at all anymore, I think.

(Edit: I guess we do, as a proof for the chain description blob.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually removed it 😅 My thinking is that the proof will be downloaded in ensure_has_chain_description if we don't have the relevant blob (of course, currently it doesn't check the proof, but will be fixed as a part of #2351 ).

@@ -303,13 +303,7 @@ where
}
let local_time = self.state.storage.clock().current_time();
// TODO(#2351): This sets the committee and then checks that committee's signatures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Right… that's still the case now, is it?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it is 😬

RemoteNode::download_blobs(&blob_ids, validators, self.blob_download_timeout)
.await
.ok_or(LocalNodeError::BlobsNotFound(blob_ids))?;
self.local_node.storage_client().write_blobs(&blobs).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I'd consider a blocker, or at least it should have a GitHub issue and be addressed right after this PR.

let blob = RemoteNode::download_blob(&nodes, chain_desc_id, self.blob_download_timeout)
.await
.ok_or(LocalNodeError::BlobsNotFound(vec![chain_desc_id]))?;
self.local_node.storage_client().write_blob(&blob).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also kind of a blocker.)

balance: Amount,
timestamp: Timestamp,
) -> Result<(), ChainError>
async fn create_chain(&self, description: ChainDescription) -> Result<(), ChainError>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this really belongs in linera-storage. Maybe this function should just be inlined at the call site(s) now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few call sites, so I think it still makes sense to have such a function - but I agree that it seems to be a bit outside of storage's scope. I'm not sure how to clean it up best right now, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

};
let admin_chain_description =
ChainDescription::new(origin, admin_config.clone(), timestamp);
let admin_id = admin_chain_description.id();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of this is probably duplicated somewhere else in the initialization code (where we actually write the description). Maybe it can be simplified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right, but I don't see an obvious way of doing this now - I'll take a note of it for later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
init_worker_with_chains(storage, [(description, owner, balance)]).await
}
async fn add_child_chain(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever actually do that? Insert a child chain just deus ex machina, without an OpenChain operation?

action: MessageAction::Accept,
};
let description = env
.add_child_chain(chain_2_desc.id(), sender_key_pair.public().into(), balance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see… but now this test makes less sense because there's no OpenChain message. (And the name is wrong, too.) I'd almost say this test can be removed. And then maybe we don't need add_child_chain anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also used in one more test to simulate creating a block that opens a chain, without actually using it. So while we could remove this test, this function would still have some use.

/// Controller to track fuel and storage consumption.
resource_controller: ResourceController,
/// Additional context for the runtime.
user_context: UserInstance::UserContext,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really worth creating that trait and associated type for this? We already have several optional fields in here that are None in some cases (operation vs. message vs. service call etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, honestly, but it does avoid some awkward unwrapping or returning errors from codepaths that should be unreachable.
I might even have started with an option and then decided that this is better, but I don't remember if that was how I introduced this, TBH 😬

next_message_id: message_id,
parent_id,
block_height,
timestamp,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the timestamp is actually not necessary here, because it's part of the system execution state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean 😅 I believe we need this to initialize the new system execution state with the correct timestamp (and it might be different than the one stored in the current chain's system execution state until now), or am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is while executing the block on the parent chain, is it? Specifically, executing the OpenChain operation?

The timestamp we want to assign to the child chain is the timestamp of the current block. And yes, that is exactly the timestamp in the system execution state, because we update that at the beginning of block execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see! That makes sense. I'll try to apply your suggestion, then - could be that it will let me remove this whole trait thing 👍

Copy link
Contributor

@afck afck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, since as discussed the comments can also be addressed in follow-up PRs.

@bart-linera bart-linera merged commit 49c09a9 into linera-io:main Apr 29, 2025
26 checks passed
@bart-linera bart-linera deleted the issue-2390 branch April 29, 2025 18:54
// Since the faucet chain exceeds the configured maximum length, the faucet should have
// switched after the first new chain.
assert!(other_outcome.message_id.chain_id != outcome.message_id.chain_id);
assert!(other_outcome.chain_id != outcome.chain_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should compare the parent chain IDs. I'll fix this when I port #3863 to the main branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're right! 😬 Thanks!

ndr-ds pushed a commit that referenced this pull request May 1, 2025
…end (#3870)

## Motivation

After #3787, it is no longer needed to initialize the shard of the
dual-store databases separately.

## Proposal

Remove special code added in #2734

## Test Plan

CI
github-merge-queue bot pushed a commit that referenced this pull request Jun 11, 2025
This is a completion of @bart-linera's #4075:

## Motivation

#3787 left a potential security issue: when a client was downloading a
missing `ChainDescription` blob, it would just download a blob, but
wouldn't make sure that it was legitimately created on another chain.

## Proposal

Whenever a `ChainDescription` is fetched, fetch the certificate for the
block that created it and validate it against committees known from the
admin chain.

## Test Plan

The only occurrences of `RemoteNode::download_blob[s]` are now after
`process_certificate` returned `BlobsNotFound`, i.e. after checking a
certificate's signatures.

CI should catch regressions.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- Closes #2351 
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)

## Notes

The `TODO` in `attempted_changes.rs` has been removed after verifying
that the safety issue is no longer present there: the committees are
taken from a blob, but that blob can only exist on the validator if it
has been legitimately created on another chain.

---------

Co-authored-by: Bartłomiej Kamiński <bartlomiej.kaminski@linera.io>
github-merge-queue bot pushed a commit that referenced this pull request Sep 2, 2025
## Motivation

We simulate a range of validator faults in the `linera-core` tests. One
of them is called `Malicious`, but it really just initializes the
storage with wrong chains and returns `ArithmeticError`s in some cases.
After
#3787 (comment),
the wrong initialization even causes the chains to have different chain
ID, not just a wrong balance, so they will often return `InactiveChain`
errors.

Even before that chain, the storage initialization fault didn't play
well with `set_fault_type`: Even if we switch them to `Honest`, they
will still be faulty due to their storage contents.

## Proposal

Replace `Malicious` with `NoChains`: Storage _does_ get initialized, but
these validators return `InactiveChain` anyway, when handling block
proposals, certificates or chain info queries.

## Test Plan

CI

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
- (But might as well be backported: It's test only and not problematic.)

## Links

- Closes #3858.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
afck added a commit to afck/linera-protocol that referenced this pull request Sep 4, 2025
## Motivation

We simulate a range of validator faults in the `linera-core` tests. One
of them is called `Malicious`, but it really just initializes the
storage with wrong chains and returns `ArithmeticError`s in some cases.
After
linera-io#3787 (comment),
the wrong initialization even causes the chains to have different chain
ID, not just a wrong balance, so they will often return `InactiveChain`
errors.

Even before that chain, the storage initialization fault didn't play
well with `set_fault_type`: Even if we switch them to `Honest`, they
will still be faulty due to their storage contents.

## Proposal

Replace `Malicious` with `NoChains`: Storage _does_ get initialized, but
these validators return `InactiveChain` anyway, when handling block
proposals, certificates or chain info queries.

## Test Plan

CI

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
- (But might as well be backported: It's test only and not problematic.)

## Links

- Closes linera-io#3858.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
ma2bd pushed a commit that referenced this pull request Sep 5, 2025
## Motivation

We simulate a range of validator faults in the `linera-core` tests. One
of them is called `Malicious`, but it really just initializes the
storage with wrong chains and returns `ArithmeticError`s in some cases.
After
#3787 (comment),
the wrong initialization even causes the chains to have different chain
ID, not just a wrong balance, so they will often return `InactiveChain`
errors.

Even before that chain, the storage initialization fault didn't play
well with `set_fault_type`: Even if we switch them to `Honest`, they
will still be faulty due to their storage contents.

## Proposal

Replace `Malicious` with `NoChains`: Storage _does_ get initialized, but
these validators return `InactiveChain` anyway, when handling block
proposals, certificates or chain info queries.

## Test Plan

CI

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
- (But might as well be backported: It's test only and not problematic.)

## Links

- Closes #3858.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
ma2bd pushed a commit that referenced this pull request Sep 5, 2025
## Motivation

We simulate a range of validator faults in the `linera-core` tests. One
of them is called `Malicious`, but it really just initializes the
storage with wrong chains and returns `ArithmeticError`s in some cases.
After
#3787 (comment),
the wrong initialization even causes the chains to have different chain
ID, not just a wrong balance, so they will often return `InactiveChain`
errors.

Even before that chain, the storage initialization fault didn't play
well with `set_fault_type`: Even if we switch them to `Honest`, they
will still be faulty due to their storage contents.

## Proposal

Replace `Malicious` with `NoChains`: Storage _does_ get initialized, but
these validators return `InactiveChain` anyway, when handling block
proposals, certificates or chain info queries.

## Test Plan

CI

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
- (But might as well be backported: It's test only and not problematic.)

## Links

- Closes #3858.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants