
Check artifact integrity before execution #8833

Merged
AndreiEres merged 19 commits into master from AndreiEres/check-artifact-integrity
Jun 17, 2025

Conversation

Contributor

@AndreiEres AndreiEres commented Jun 12, 2025

Fixes #677
Fixes #2399

Description

To detect potential corruption of PVF artifacts on disk, we store their checksums and verify that they match before execution. On a mismatch, we recreate the artifact.
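The store-verify-recreate flow can be sketched as follows. This is a minimal illustration, not the PR's actual code: the names Artifacts, prepare, and load_verified are invented, and std's DefaultHasher stands in for the Blake3/Twox hash used in the real implementation.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in checksum type; the real code uses a 32-byte Blake3/Twox digest.
type ArtifactChecksum = u64;

fn compute_checksum(data: &[u8]) -> ArtifactChecksum {
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// In-memory stand-in for the artifact cache: artifact id -> (bytes, stored checksum).
struct Artifacts {
    map: HashMap<String, (Vec<u8>, ArtifactChecksum)>,
}

impl Artifacts {
    /// Store the artifact together with its checksum, computed at preparation time.
    fn prepare(&mut self, id: &str, blob: Vec<u8>) {
        let checksum = compute_checksum(&blob);
        self.map.insert(id.to_string(), (blob, checksum));
    }

    /// Verify the stored checksum before execution; on a mismatch, re-prepare.
    fn load_verified(&mut self, id: &str, reprepare: impl Fn() -> Vec<u8>) -> Vec<u8> {
        let (blob, stored) = self.map.get(id).expect("artifact exists").clone();
        if compute_checksum(&blob) != stored {
            // Corruption detected: recreate the artifact from source.
            let fresh = reprepare();
            self.prepare(id, fresh.clone());
            return fresh;
        }
        blob
    }
}
```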

Integration

In Candidate Validation, we treat the error similarly to PossiblyInvalidError::RuntimeConstruction due to their close nature.

Review Notes

The Blake3 hashing algorithm was already in use here. I believe we can switch to Twox, as suggested in the issue, because the checksum does not need to be cryptographic, and we do not reveal the checksum in logs.

@AndreiEres AndreiEres added the T0-node This PR/Issue is related to the topic “node”. label Jun 12, 2025
Contributor Author

@AndreiEres AndreiEres left a comment

If we come across a corrupted artifact, we should prepare it again. Could this be a possible vulnerability, @s0me0ne-unkn0wn?

unistd::{ForkResult, Pid},
};
use polkadot_node_core_pvf_common::{
executor_interface::{prepare, prevalidate},
Contributor Author

It looks messy, but I just merged two imports of polkadot_node_core_pvf_common. In fact, only compute_checksum is newly imported.


/// Compute the checksum of the given artifact.
pub fn compute_checksum(data: &[u8]) -> ArtifactChecksum {
blake3::hash(data).into()
Contributor Author

Should we switch to twox?

Contributor

I personally don't have any preference here. Blake3's throughput is more than enough for us, so why not use it, especially given that we're already using it?

Contributor Author

I checked Twox; it seems much faster, so I decided to switch after all.

@AndreiEres AndreiEres marked this pull request as ready for review June 12, 2025 12:06
Contributor Author

/cmd prdoc --audience node_dev --bump patch

@AndreiEres AndreiEres changed the title [WIP] Check artifact integrity before execution Check artifact integrity before execution Jun 12, 2025
Contributor

@alexggh alexggh left a comment

Looks good to me.

I added some comments. I would also be interested to know whether this retry path is ever exercised by an integration test.

)
})?;

if artifact_checksum != compute_checksum(&compiled_artifact_blob) {
Contributor

How long does this take for 10 MiB? For 100 MiB?

Contributor

Blake3's throughput is ~3 Gb/sec on hardware close to our reference spec, AFAIR.

Contributor Author

According to the crate's benchmark data, 10 MiB with Blake3 takes 1-2 ms. Twox should be at least 3x faster.

Contributor

Ok, so we are not worried about this eating up too much time.
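Numbers like those quoted above are easy to sanity-check on one's own machine. The sketch below is illustrative only: time_checksum is an invented helper, and std's DefaultHasher stands in for Blake3/Twox, so absolute timings will differ from the crate benchmarks.

```rust
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

// Stand-in for the real Blake3/Twox checksum function.
fn checksum(data: &[u8]) -> u64 {
    let mut h = std::collections::hash_map::DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Hash `size_mib` mebibytes of data and report the checksum and elapsed time.
fn time_checksum(size_mib: usize) -> (u64, Duration) {
    let data = vec![0xABu8; size_mib * 1024 * 1024];
    let start = Instant::now();
    let sum = checksum(&data);
    (sum, start.elapsed())
}
```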

Comment thread polkadot/node/core/pvf/src/execute/queue.rs
Comment thread polkadot/node/core/pvf/src/execute/queue.rs
Comment thread polkadot/node/core/candidate-validation/src/lib.rs
Contributor

@s0me0ne-unkn0wn s0me0ne-unkn0wn left a comment

Looks good, left some comments but none of them are blockers!

Ok(buf)
}

pub type ArtifactChecksum = [u8; 32];
Contributor

@s0me0ne-unkn0wn s0me0ne-unkn0wn Jun 13, 2025

nit: Elsewhere in the code we're quite idiomatic and usually go with

#[repr(transparent)]
pub struct ArtifactChecksum(H256)

with corresponding AsRef implementations if needed, etc. I don't insist it should be implemented like that in this very case; it just seems to be one of our "best practices".
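A minimal, std-only sketch of the suggested newtype idiom (H256 lives in the Substrate primitives crates, so a plain [u8; 32] stands in for it here):

```rust
// Newtype wrapper around the raw digest. #[repr(transparent)] guarantees the
// same memory layout as the inner array, so conversions are free.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[repr(transparent)]
pub struct ArtifactChecksum([u8; 32]);

impl From<[u8; 32]> for ArtifactChecksum {
    fn from(bytes: [u8; 32]) -> Self {
        Self(bytes)
    }
}

impl AsRef<[u8]> for ArtifactChecksum {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}
```

The payoff over a bare type alias is that the compiler then rejects accidental mixing of checksums with other 32-byte values such as code hashes.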


Ok((pvd, pov, execution_timeout))

let artifact_checksum = framed_recv_blocking(stream)?;
let artifact_checksum =
Contributor

I'm NOT encouraging changing this right away, but...

  1. Why do we want to encode a raw 32-byte sequence? Why not transfer it as a raw 32-byte sequence?
  2. If we ought to encode, why don't we encode the entire tuple instead of doing it field by field?

Maybe a good candidate for a refactoring issue? I bet a single recv() and a single decode() are somewhat more performant than one-by-ones.
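The refactoring idea can be sketched with simple length-prefixed framing. This is illustrative only: the real code uses framed_send_blocking/framed_recv_blocking and SCALE encoding from the Polkadot codebase, and send_artifact/recv_artifact are invented names. The point is that packing (checksum, blob) into one frame costs one write/read round trip instead of two.

```rust
use std::io::{self, Read, Write};

// Write one length-prefixed frame.
fn framed_send(w: &mut impl Write, payload: &[u8]) -> io::Result<()> {
    w.write_all(&(payload.len() as u32).to_le_bytes())?;
    w.write_all(payload)
}

// Read one length-prefixed frame.
fn framed_recv(r: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut buf = vec![0u8; u32::from_le_bytes(len) as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}

/// Pack checksum and blob into a single frame instead of two.
fn send_artifact(w: &mut impl Write, checksum: &[u8; 32], blob: &[u8]) -> io::Result<()> {
    let mut payload = Vec::with_capacity(32 + blob.len());
    payload.extend_from_slice(checksum);
    payload.extend_from_slice(blob);
    framed_send(w, &payload)
}

/// Receive one frame and split it back into (checksum, blob).
fn recv_artifact(r: &mut impl Read) -> io::Result<([u8; 32], Vec<u8>)> {
    let payload = framed_recv(r)?;
    let mut checksum = [0u8; 32];
    checksum.copy_from_slice(&payload[..32]);
    Ok((checksum, payload[32..].to_vec()))
}
```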

Contributor Author

Let's do it in another PR.

Comment thread polkadot/node/core/pvf/tests/it/main.rs
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/15683054285
Failed job name: test-linux-stable

@AndreiEres AndreiEres added this pull request to the merge queue Jun 17, 2025
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 17, 2025
@AndreiEres AndreiEres added this pull request to the merge queue Jun 17, 2025
Merged via the queue into master with commit 310e81d Jun 17, 2025
257 of 259 checks passed
@AndreiEres AndreiEres deleted the AndreiEres/check-artifact-integrity branch June 17, 2025 14:12
Contributor

alexggh commented Jun 18, 2025

@AndreiEres Can we backport this to stable2506? I see no reason to wait until 2509.

@AndreiEres AndreiEres added the A4-backport-stable2506 Pull request must be backported to the stable2506 release branch label Jun 18, 2025
@paritytech-release-backport-bot

Successfully created backport PR for stable2506:

paritytech-release-backport-bot Bot pushed a commit that referenced this pull request Jun 18, 2025
(cherry picked from commit 310e81d)
EgorPopelyaev pushed a commit that referenced this pull request Jun 23, 2025
Backport #8833 into `stable2506` from AndreiEres.

This backport includes a major version bump due to internal API changes
that only affect the polkadot binary. Since stable2506 hasn’t been
released yet and no other downstream users are impacted, the change is
considered safe.

See the
[documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md)
on how to use this bot.

<!--
  # To be used by other automation, do not modify:
  original-pr-number: #${pull_number}
-->

Co-authored-by: Andrei Eres <eresav@me.com>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
alvicsam pushed a commit that referenced this pull request Oct 17, 2025

Labels

A4-backport-stable2506: Pull request must be backported to the stable2506 release branch
T0-node: This PR/Issue is related to the topic “node”.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PVF: Re-check file integrity before voting against; document
PVF: Compromised artifact file integrity can lead to disputes

5 participants