Collaborator

@basvandijk basvandijk commented May 28, 2025

The ledger_suite_orchestrator_canister is using a proc-macro from the askama crate:

#[derive(Template)]
#[template(path = "dashboard.html")]
pub struct DashboardTemplate { ... }

This is known to cause non-determinism (askama-rs/askama#461) and it's causing the Build Reproducibility check to fail in: https://github.com/dfinity/ic/actions/runs/15296835171/job/43032042139?pr=5194. It's unclear why it didn't fail earlier. #5348 is testing whether cherry-picking this fix into #5194 fixes the Build Determinism job.

We therefore apply the same workaround as in rs/http_endpoints/public/build.rs and rs/ic_os/config/build.rs: build.rs loads the template from a string instead of from a path.
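
For readers unfamiliar with the workaround, here is a minimal sketch of what such a build.rs could look like. It is not the code from this PR: the template location `templates/dashboard.html`, the output file name `dashboard_template.rs`, the helper `generate_template_source`, and the `title` field are all assumptions made for illustration (the real struct fields are elided above).

```rust
// build.rs (sketch): embed the template as a string instead of referencing it by path.
use std::{
    env, fs, io,
    path::{Path, PathBuf},
};

/// Generates Rust source for a template struct whose HTML is inlined as a
/// string literal. `{:?}` (Debug) renders `template_html` as an escaped literal.
/// `struct_body` is passed through verbatim, e.g. "pub title: String,".
fn generate_template_source(struct_name: &str, struct_body: &str, template_html: &str) -> String {
    format!(
        r#"
#[derive(Template)]
#[template(escape = "html", source = {:?}, ext = "html")]
pub struct {} {{ {} }}
"#,
        template_html, struct_name, struct_body
    )
}

fn main() -> io::Result<()> {
    // Assumed template location; adjust to the crate's layout.
    let template_path = Path::new("templates/dashboard.html");
    println!("cargo:rerun-if-changed={}", template_path.display());

    // The fallbacks below exist only so that this sketch also runs outside of Cargo.
    let html = fs::read_to_string(template_path)
        .unwrap_or_else(|_| "<h1>{{ title }}</h1>\n".to_string());
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);

    // Hypothetical struct body; the real fields are not shown in this description.
    let generated = generate_template_source("DashboardTemplate", "pub title: String,", &html);
    fs::write(out_dir.join("dashboard_template.rs"), generated)
}
```

Under these assumptions, the crate would pull in the generated file with `include!(concat!(env!("OUT_DIR"), "/dashboard_template.rs"));` while `askama::Template` is in scope, so the proc-macro only ever sees an inline `source` string and never a `path`.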

Contributor

@gregorydemay gregorydemay left a comment

Thanks a lot @basvandijk for getting to the bottom of this. Just some understanding questions from my side.

@@ -1,3 +1,4 @@
load("@rules_rust//cargo:defs.bzl", "cargo_build_script")
Contributor

Understanding question: we also use askama in other canisters. Are they not affected by the reproducibility issue as well?

  1. ic-btc-checker
  2. ic-cketh-minter
  3. ic-ckbtc-kyt (this one we plan on deleting)

format!(
r#"
#[derive(Template)]
#[template(escape = "html", source = {:?}, ext = "html")]
Contributor

Understanding question: why was this line added?

],
)

cargo_build_script(
Contributor

@basvandijk: If I understand that failure correctly, does this mean that this PR doesn't solve the problem, or that there is another source of non-determinism?

//rs/ethereum/ledger-suite-orchestrator:_ledger_suite_orchestrator_canister.wasm.gz_finalize bazel-out/k8-opt/bin/rs/ethereum/ledger-suite-orchestrator/ledger_suite_orchestrator_canister.wasm.gz: 46b544a2e01428a5cf29384e2858f0bed73c98340b563a8d4157133a64ecb6b3
!= 5379b1dc3c8cea64d032c5d6ade6a4069a234aaadc945184fe6d122de69daf65

Contributor

What do you mean? That this PR is insufficient?

Contributor

@mraszyk The description of the PR states

it's causing the Build Reproducibility check to fail in: https://github.com/dfinity/ic/actions/runs/15296835171/job/43032042139?pr=5194.

However, the job for #5348 seems to suffer from the same problem:

https://github.com/dfinity/ic/actions/runs/15303873653/job/43056720920?pr=5348

Collaborator Author

Yes, I think I turned this from a draft into a PR too soon. There seems to be another source of non-determinism.

Contributor

From the logs, it seems that the archive is built non-deterministically. This would definitely produce non-determinism in the ledger (because it embeds the archive wasm) and in the ledger suite orchestrator (which embeds, among others, the archive wasm).

Collaborator Author

@gregorydemay we're embedding other canisters into the ledger_suite_orchestrator_canister, right? Are those other canisters also using askama? That could be one explanation for why there's still non-determinism.

Contributor

@basvandijk

we're embedding other canisters into the ledger_suite_orchestrator_canister right?

yes that's correct, they are defined here

Are those other canisters also using askama?

I don't think so, none of the canisters in the ledger suite have a dependency on askama

Contributor

@basvandijk To me the root cause of the non-determinism seen here could be explained by the archive

//rs/ledger_suite/icrc1/archive:_wasm_archive_canister_u256 bazel-out/k8-opt-ST-42b6d6ef7a37/bin/rs/ledger_suite/icrc1/archive/_wasm_archive_canister_u256.wasm: 4977a63fc611b224d47a4c0004139dda86f6ef69563870011ac6e3ddbbf99b17

since this would trigger non-determinism for ledger_u256 and the ledger suite orchestrator.
It's interesting that only the u256 variant seems to have that problem (and not the u64 variant)
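
The reasoning above can be sketched as a small check: among the artifacts whose hashes differ between two builds, the likely root causes are those that do not embed any other differing artifact. The embedding graph below is a simplified assumption, and the artifact names and the `root_causes` helper are hypothetical, not taken from the build.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the mismatching artifacts that embed no other mismatching artifact.
/// Everything else may differ only because it embeds a differing artifact.
fn root_causes<'a>(
    embeds: &BTreeMap<&'a str, Vec<&'a str>>,
    mismatching: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    mismatching
        .iter()
        .copied()
        .filter(|artifact| {
            embeds
                .get(artifact)
                // Artifacts without an entry embed nothing, so they are roots.
                .map_or(true, |deps| deps.iter().all(|dep| !mismatching.contains(dep)))
        })
        .collect()
}

fn main() {
    // Assumed, simplified embedding graph: artifact -> wasm binaries it embeds.
    let embeds = BTreeMap::from([
        ("ledger_u256", vec!["archive_u256"]),
        ("ledger_suite_orchestrator", vec!["ledger_u256", "archive_u256"]),
    ]);
    // Hypothetical set of artifacts whose hashes differ between two builds.
    let mismatching =
        BTreeSet::from(["archive_u256", "ledger_u256", "ledger_suite_orchestrator"]);

    println!("{:?}", root_causes(&embeds, &mismatching)); // prints ["archive_u256"]
}
```

With this input, only the archive remains as a candidate, which matches the explanation that the ledger and the orchestrator differ merely because they embed it.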

Collaborator Author

I think you're right: the archive canister is the source.

I think I jumped to conclusions too soon, since we recently suffered from non-determinism caused by a very similar use of askama, which we fixed in #5282. But maybe the symbol name generator of the rustc WASM backend is not sensitive to a changing metadata hash, so canisters remain reproducible while things blow up on x86_64.

@basvandijk basvandijk disabled auto-merge May 28, 2025 16:43
@basvandijk basvandijk marked this pull request as draft May 28, 2025 17:46
@basvandijk basvandijk closed this Sep 16, 2025
@basvandijk basvandijk deleted the basvandijk/reproducible-ledger_suite_orchestrator branch September 16, 2025 18:17