chore: fixing M3 devcontainer builds #21611

Merged: ludamad merged 3 commits into merge-train/spartan from mr/m3-build-fixes on Mar 16, 2026

@mrzeszutko
Contributor

Fix: ARM64 Mac (M3) Devcontainer Build Failures

Problem

Building inside a devcontainer on a Mac with an Apple M3 chip fails in two ways:

  1. SIGILL crashes — The bb-sol build step crashes when running honk_solidity_key_gen, and E2E tests fail with Illegal instruction errors.
  2. Rust compilation failures — The noir build fails with can't find crate for serde and similar errors when noir and avm-transpiler build in parallel, racing on the shared CARGO_HOME.

Root Cause

SVE instructions from zig -target native

  1. CI runs on AWS Graviton (ARM64 with SVE vector extensions)
  2. The zig compiler wrapper uses -target native-linux-gnu.2.35, which on Graviton enables SVE instructions
  3. Mac M3 devcontainer (ARM64 without SVE) downloads the same cached binaries
  4. Binaries contain SVE opcodes (e.g. 0x04be4000) that Apple Silicon can't execute → SIGILL

Cache keys already include architecture via cache_content_hash (which appends $OSTYPE-$(uname -m)), so amd64 vs arm64 caches never collide. The problem is specifically that two ARM64 machines (Graviton with SVE vs Apple Silicon without SVE) share the same architecture tag but have different CPU feature sets. The fix is to stop emitting CPU-specific instructions in the first place.
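One way to confirm the feature mismatch is to compare the CPU feature flags that the kernel reports on each machine. The helper below is a minimal sketch and not part of this PR: `has_cpu_feature` is a hypothetical name, and it assumes the `Features` line that aarch64 Linux kernels expose in `/proc/cpuinfo`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): succeed if the first "Features" line
# of a cpuinfo file lists the given flag as a whole word.
has_cpu_feature() {
  local feature="$1"
  local cpuinfo="${2:-/proc/cpuinfo}"
  # Split the line into one token per line so that "sve" does not match "sve2".
  grep -m1 -i '^Features' "$cpuinfo" 2>/dev/null \
    | tr ' \t:' '\n\n\n' \
    | grep -qx "$feature"
}

if has_cpu_feature sve; then
  echo "SVE is reported by this CPU"
else
  echo "SVE is not reported by this CPU"
fi
```

Per the root cause above, this should report SVE on the Graviton CI hosts and no SVE inside an Apple Silicon devcontainer, which is why a binary built with `-target native` on the former can crash on the latter.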

Parallel Rust build race condition

The top-level bootstrap runs noir and avm-transpiler builds in parallel. Both invoke cargo build, and both share the same CARGO_HOME (~/.cargo) which contains the crate registry and download cache. When both cargo processes run concurrently, they race on shared registry state, causing downstream crates (e.g. serde-big-array, ecdsa) to fail with can't find crate errors during compilation. This does not happen on CI where builds are cached, only on local fresh builds (e.g. NO_CACHE=1).

Fixes

1. Zig compiler wrappers: explicit ARM64 target

Files: barretenberg/cpp/scripts/zig-cc.sh, barretenberg/cpp/scripts/zig-c++.sh

Changed -target native-linux-gnu.2.35 to use explicit aarch64-linux-gnu.2.35 on ARM64 Linux. This produces generic ARM64 code without CPU-specific extensions (SVE, etc.), ensuring binaries work on all ARM64 machines — Graviton, Apple Silicon, Ampere, etc.

x86_64 behavior is unchanged (still uses native).
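The target selection described in this fix can be sketched as follows. This is an illustration only, not the contents of zig-cc.sh: `pick_zig_target` is a hypothetical function name, and falling back to the original native target everywhere except ARM64 Linux is an assumption based on the description above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the wrapper's target choice (not the actual script).
# Arguments default to the running host so the function can be tested in isolation.
pick_zig_target() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  if [[ "$os" == "Linux" && ( "$arch" == "aarch64" || "$arch" == "arm64" ) ]]; then
    # Explicit architecture: generic ARM64 code, no host-specific extensions such as SVE.
    echo "aarch64-linux-gnu.2.35"
  else
    # Unchanged behavior: let zig detect the host CPU.
    echo "native-linux-gnu.2.35"
  fi
}

echo "zig target: $(pick_zig_target)"
```

A wrapper would then pass the result to the compiler as `-target "$(pick_zig_target)"`; how the real scripts forward the remaining arguments is not shown here.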

2. Extract native_cache_key variable in barretenberg bootstrap

File: barretenberg/cpp/bootstrap.sh

Extracted the repeated cache key pattern barretenberg-$native_preset-$hash into a single native_cache_key variable, used by build_native_objects, build_native, and related functions. Pure refactor, no change in cache key values.

3. Better error handling in init_honk.sh

File: barretenberg/sol/scripts/init_honk.sh

Added set -eu so the script fails immediately on error instead of silently continuing after SIGILL. Added an existence check for the honk_solidity_key_gen binary with a clear error message.
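A minimal sketch of this kind of guard is shown below. It is not the actual init_honk.sh: `require_binary` is a hypothetical helper, the error wording is illustrative, and checking for an executable file (`-x`) rather than mere existence is an assumption.

```bash
#!/usr/bin/env bash
# Fail on the first error or unset variable instead of silently continuing.
set -eu

# Hypothetical helper (not from the PR): report a clear error when a required
# binary is missing or not executable.
require_binary() {
  local path="$1"
  if [[ ! -x "$path" ]]; then
    echo "Error: required binary not found or not executable: $path" >&2
    return 1
  fi
}

# Example use (the path is a placeholder, not the real build location):
#   require_binary "./build/bin/honk_solidity_key_gen"
```

With `set -e` active, a failed `require_binary` call at the top level stops the script at that point, so a later step never runs against a missing or crashed key generator.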

4. Serialize parallel cargo builds with flock

Files: noir/bootstrap.sh, avm-transpiler/bootstrap.sh

Both scripts wrap their cargo build invocations with flock -x 200 on a shared lock file (/tmp/rustup.lock):

(
  flock -x 200
  cd noir-repo && cargo build --locked --release --target-dir target
) 200>/tmp/rustup.lock

This acquires an exclusive file lock before running cargo, so if both noir and avm-transpiler builds run in parallel, one waits for the other to finish. The lock is automatically released when the subshell exits. This eliminates the CARGO_HOME race condition without requiring changes to the top-level parallelism.
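The serialization behavior can be checked in isolation with a small experiment. The sketch below is an assumption-laden demo rather than code from the PR: `worker`, the temporary lock file, and the log file are all hypothetical, and `sleep` stands in for the cargo build.

```bash
#!/usr/bin/env bash
# Demo: two background jobs use the same flock pattern as the bootstrap scripts.
lock_file="$(mktemp)"
log_file="$(mktemp)"

worker() {
  (
    # Block until an exclusive lock on file descriptor 200 is obtained.
    flock -x 200
    echo "start $1" >> "$log_file"
    sleep 0.2   # stand-in for the long-running build
    echo "end $1" >> "$log_file"
  ) 200>"$lock_file"   # the lock is released when the subshell exits
}

worker a &
worker b &
wait

cat "$log_file"
```

Because each worker holds the lock for its whole critical section, every `start` line in the log is immediately followed by the matching `end` line, whichever worker wins the lock first.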

Notes

E2E Tests

The E2E test failures (SIGILL from illegal instructions) have the same root cause as the bb-sol SIGILL crashes: the bb binary used by the tests came from the SVE-contaminated cache. After rebuilding with these fixes, E2E tests work.

# generic ARM64 code. This prevents CPU-specific instructions (e.g. SVE on Graviton)
# from being emitted, ensuring binaries work across all ARM64 machines including
# Apple Silicon in devcontainers.
if [[ "$(uname -s)" == "Linux" ]]; then
Collaborator

This feels hacky. There must be a more targeted solution for your fix

Collaborator

Is there no generic native target that is more conservative?

Collaborator

I thought about it. I'm ok with this, but can we add -mcpu=apple_m1 and just more concisely say this is to make sure arm64 builds work on the lowest common denominator apple_m1 arm64 subset?

Collaborator

@ludamad left a comment

some notes

# Serialize cargo operations to avoid race conditions with avm-transpiler
# which may run in parallel and share the same CARGO_HOME.
(
flock -x 200
Collaborator

this would be nicer if it was of the form 'flock cmd' but would need some care to set up. looks ok

AztecBot and others added 2 commits March 16, 2026 15:09
…et (#21621)

## Summary

Implements Charlie's suggestion from
#21611:

- **Remove `TARGET_ARCH` from the `default` cmake preset** — it was set
to `skylake` but silently ignored on ARM (due to the `if(NOT ARM)`
guard), which is confusing.
- **Auto-detect architecture in `arch.cmake`** — when `TARGET_ARCH` is
not explicitly set, pick `skylake` on x86_64 and `generic` on ARM. This
matches what the cross-compile presets already do (`zig-amd64-linux`
uses `skylake`, `zig-arm64-linux` uses `generic`).

The `generic` ARM target prevents CPU-specific instructions (e.g. SVE on
Graviton) from being emitted, so cached binaries work on all ARM64
machines including Apple Silicon in devcontainers.

Presets that explicitly set `TARGET_ARCH` (like `zig-amd64-linux`,
`zig-arm64-linux`, `zig-amd64-windows`) are unaffected — the
auto-detection only kicks in when `TARGET_ARCH` is not set.

ClaudeBox log: https://claudebox.work/s/8d283e5d61970673?run=1
@ludamad enabled auto-merge (squash) March 16, 2026 19:12
@ludamad merged commit 6845b72 into merge-train/spartan Mar 16, 2026
11 checks passed
@ludamad deleted the mr/m3-build-fixes branch March 16, 2026 19:37
AztecBot added a commit that referenced this pull request Mar 16, 2026
PR #21611 introduced -march=generic for ARM to prevent SVE on Graviton,
but this broke cross-compile targets:
- arm64-android: missing CMAKE_SYSTEM_PROCESSOR meant ARM wasn't detected,
  defaulting to -march=skylake (invalid for aarch64)
- arm64-macos: -march=generic stripped AES/crypto features needed by
  libdeflate, conflicting with zig's -mcpu=apple_a14
- arm64-ios/arm64-ios-sim: same missing CMAKE_SYSTEM_PROCESSOR issue

Fix: add CMAKE_SYSTEM_PROCESSOR to arm64-ios/ios-sim/android presets,
and skip -march for ARM cross-compile targets (zig handles CPU targeting).
AztecBot added a commit that referenced this pull request Mar 16, 2026
PR #21611 added TARGET_ARCH auto-detection (skylake for x86, generic for ARM)
but this ran for cross-compiles too, where zig -target/-mcpu already handles
CPU selection. Guard auto-detection with NOT CMAKE_CROSSCOMPILING.

Also add missing CMAKE_SYSTEM_PROCESSOR to arm64-ios, arm64-ios-sim, and
arm64-android presets so ARM is properly detected for DISABLE_ASM/DISABLE_ADX.
AztecBot added a commit that referenced this pull request Mar 17, 2026
The M3 devcontainer fix (#21611) auto-detects TARGET_ARCH based on
host architecture, but this breaks cross-compilation from x86_64 to
aarch64 targets:

- arm64-android/ios: HOST is x86_64 → ARM=false → TARGET_ARCH=skylake
  → '-march=skylake' on aarch64 = 'unknown CPU' error
- arm64-macos: CMAKE_SYSTEM_PROCESSOR=aarch64 → ARM=true →
  TARGET_ARCH=generic → '-march=generic' conflicts with '-mcpu=apple_a14'
  causing libdeflate NEON/AES errors

Fix: only auto-detect TARGET_ARCH for native (non-cross) builds.
Cross-compilation presets specify their target via -target/-mcpu flags
on the compiler command line; adding -march based on HOST arch is wrong.
AztecBot added a commit that referenced this pull request Mar 17, 2026
#21611 introduced -march auto-detection in arch.cmake (skylake for x86,
generic for ARM) but this breaks cross-compilation:

1. arm64-android: CMAKE_SYSTEM_PROCESSOR defaults to host (x86_64) since
   the preset doesn't set it, so -march=skylake is passed to aarch64 zig.
2. arm64-macos: ARM is detected, but -march=generic overrides the preset's
   -mcpu=apple_a14, disabling AES/crypto extensions needed by libdeflate.

Fix: Gate auto-detection on NOT CMAKE_CROSSCOMPILING. Cross-compile presets
already control the target CPU via zig -target and -mcpu flags.

Also restores native_build_dir export accidentally removed by #21611.
AztecBot added a commit that referenced this pull request Mar 17, 2026
…ip -march for cross-compiled ARM

The M3 devcontainer fix (#21611) changed arch.cmake to auto-detect
TARGET_ARCH but didn't account for cross-compile presets (arm64-android,
arm64-ios, arm64-ios-sim) that lack CMAKE_SYSTEM_PROCESSOR. Without it,
ARM detection fails and -march=skylake is incorrectly passed to aarch64
targets.

For arm64-macos, ARM was detected but -march=generic conflicted with
zig's -mcpu=apple_a14, causing libdeflate NEON/AES compilation failures.

Fix:
- Add CMAKE_SYSTEM_PROCESSOR: aarch64 to arm64-android/ios/ios-sim
- Add CMAKE_SYSTEM_PROCESSOR: x86_64 and TARGET_ARCH: skylake to x86_64-android
- Skip -march entirely for cross-compiled ARM targets (zig already has -target/-mcpu)
AztecBot added a commit that referenced this pull request Mar 17, 2026
The arch.cmake auto-detection (added in #21611) defaults TARGET_ARCH to
'skylake' when ARM is not detected. For cross-compilation presets like
arm64-ios, arm64-android, and arm64-ios-sim, CMAKE_SYSTEM_PROCESSOR is
not set, so ARM detection fails and -march=skylake gets injected into
aarch64 builds — causing zig to error with 'unknown CPU: skylake'.

For arm64-macos, ARM is detected but -march=generic overrides zig's
-mcpu=apple_a14, causing libdeflate build failures (missing AES target
feature).

Fix: gate auto-detection on NOT CMAKE_CROSSCOMPILING. Cross-compile
toolchains (Zig) handle architecture targeting via their own flags.
Presets that explicitly set TARGET_ARCH (amd64-linux, arm64-linux) are
unaffected.

Also restores native_build_dir variable that was dropped in the build
infrastructure refactor.
alexghr pushed a commit that referenced this pull request Mar 17, 2026
## Summary

Fixes CI failure on merge-train/spartan caused by `-march=skylake` being
injected into aarch64 cross-compilation builds (arm64-android,
arm64-ios, arm64-macos).

**Root cause:** The `arch.cmake` auto-detection added in #21611 defaults
`TARGET_ARCH` to `skylake` when `ARM` is not detected. Cross-compile
presets (ios, android) don't set `CMAKE_SYSTEM_PROCESSOR`, so ARM
detection fails and `-march=skylake` gets passed to aarch64 Zig builds —
which errors with `unknown CPU: 'skylake'`. For arm64-macos,
`-march=generic` overrides Zig's `-mcpu=apple_a14`, breaking libdeflate.

**Fix:** Gate auto-detection on `NOT CMAKE_CROSSCOMPILING`.
Cross-compile toolchains handle architecture targeting via their own
flags (e.g. Zig `-mcpu`). Presets that explicitly set `TARGET_ARCH`
(amd64-linux, arm64-linux) are unaffected.

Also restores `native_build_dir` variable dropped in the build
infrastructure refactor.

## Test plan
- Verified all cross-compile presets (arm64-android, arm64-ios,
arm64-ios-sim, arm64-macos, x86_64-android) configure with zero `-march`
flags
- Verified native presets (default, amd64-linux, arm64-linux) still get
correct `-march` values
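A check along the lines of this test plan could be scripted as below. This is a hedged sketch, not the verification that was actually run: `count_march_flags` is a hypothetical helper, and it assumes each configured build directory contains a `compile_commands.json` (for example via CMake's `CMAKE_EXPORT_COMPILE_COMMANDS` option).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): count -march=... occurrences in a
# compile_commands.json file. Prints 0 when none are found.
count_march_flags() {
  local compile_commands="$1"
  grep -o -- '-march=[A-Za-z0-9_.+-]*' "$compile_commands" | wc -l | tr -d ' '
}

# Example use (the build directory name is a placeholder):
#   count_march_flags build-arm64-android/compile_commands.json
```

For a cross-compile preset the expected count is 0, while a native x86_64 preset should show a non-zero count of `-march=skylake` entries.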
charlielye added a commit that referenced this pull request Mar 17, 2026
…VX-512 on cached ARM64 binaries

The zig wrapper scripts used `-target native` which detects host CPU features
(e.g. SVE on Graviton). Cached binaries then SIGILL on Apple Silicon which
lacks SVE. The previous fix (#21611) tried `-march=generic` from cmake but
that's not a valid AArch64 flag and doesn't reliably override zig's detection.

Fix: use explicit architecture targets (aarch64/x86_64) in zig wrappers with
-mcpu=skylake on x86 for AVX2. Move TARGET_ARCH out of cross-compile presets
into zig -mcpu flags. Keep TARGET_ARCH on default preset for non-zig builds.
ludamad pushed a commit that referenced this pull request Mar 17, 2026
# fix: ARM64 devcontainer builds (skip `-march` on ARM and use explicit zig aarch64 target)

## Summary

Fixes SIGILL (Illegal Instruction) crashes and build failures on ARM64
Mac (M3/Apple Silicon) devcontainers caused by incorrect `-march`
handling introduced in #21611.

## Problem

PR #21611 originally fixed ARM64 devcontainer builds by using explicit
`aarch64-linux-gnu.2.35` zig targets. During the merge, that approach
was replaced with cmake-based auto-detection that sets
`TARGET_ARCH=generic` on ARM and passes `-march=generic` to the
compiler. This caused two distinct failures:

### 1. SIGILL crashes (`Illegal instruction`)

The zig compiler wrappers still used `-target native-linux-gnu.2.35`,
which auto-detects the host CPU. On CI (AWS Graviton with SVE
extensions), this produces binaries containing SVE instructions. These
cached binaries are then downloaded on Apple Silicon devcontainers
(ARM64 without SVE), causing SIGILL when executed — e.g.
`honk_solidity_key_gen` crashing during `barretenberg/sol` bootstrap.

The `-march=generic` flag was supposed to override this, but
`-march=generic` is **not a valid value on aarch64**. It's an x86
concept. LLVM/zig silently ignored it, so the native CPU detection still
produced SVE instructions.

### 2. Build failures (`unknown CPU: 'armv8'`)

Even attempting `-march=armv8-a` (a valid GCC/Clang aarch64 value) fails
because zig uses its own CPU naming scheme (e.g. `generic`,
`cortex_a72`, `apple_m3`), not GCC-style architecture strings. Zig
interprets `-march=armv8-a` as CPU name `armv8`, which doesn't exist →
`error: unknown CPU: 'armv8'`.

**Bottom line:** The `-march` cmake approach fundamentally doesn't work
with zig on ARM. Zig has its own architecture targeting via `-target`,
which is the correct mechanism.

## What this PR changes

### 1. `arch.cmake` — Skip `-march` auto-detection on ARM

Removed the ARM branch from the auto-detection. On x86_64, we still
auto-detect `TARGET_ARCH=skylake`. On ARM, we don't set `TARGET_ARCH` at
all, so no `-march` flag is passed — the zig wrappers handle
architecture targeting instead.

### 2. `zig-cc.sh` / `zig-c++.sh` — Explicit aarch64 target on ARM Linux

Restored the original fix from #21611 that was dropped during merge. On
ARM64 Linux, the wrappers now use `-target aarch64-linux-gnu.2.35`
instead of `-target native-linux-gnu.2.35`. This produces generic ARM64
code without CPU-specific extensions (SVE, etc.), ensuring cached
binaries work on all ARM64 machines — Graviton, Apple Silicon, Ampere,
etc.

x86_64 behavior is unchanged (still uses `-target native`).

## Context: what happened after #21611

After #21611 merged with the cmake auto-detection approach, it triggered
a cascade of follow-up PRs trying to fix the fallout:

| PR | Status | Issue |
|----|--------|-------|
| #21621 | Merged | Introduced the auto-detect approach (replaced zig wrapper fix with cmake `-march`) |
| #21356 | Merged | Added `NOT CMAKE_CROSSCOMPILING` guard for cross-compile failures |
| #21637 | Open | Attempting to fix cross-compiles + restore `native_build_dir` |
| #21660 | Open | Attempting to fix cross-compile targets |
| #21632 | Open | Attempting to fix cross-compile targets |
| #21662 | Open | Adding `CMAKE_SYSTEM_PROCESSOR` to ARM64 cross-compile presets |
| #21653 | Open | Attempting to skip auto-detection when cross-compiling |
| #21655 | Open | Attempting to skip auto-detection for cross-compilation targets |

This PR supersedes the still-open PRs above by addressing the root
cause: `-march` via cmake doesn't work with zig on ARM. The zig
`-target` mechanism is the correct approach.
charlielye added a commit that referenced this pull request Mar 17, 2026
…VX-512 on cached ARM64 binaries

The zig wrapper scripts used `-target native` which detects host CPU features
(e.g. SVE on Graviton). Cached binaries then SIGILL on Apple Silicon which
lacks SVE. The previous fix (#21611) tried `-march=generic` from cmake but
that's not a valid AArch64 flag and doesn't reliably override zig's detection.

Fix: use explicit architecture targets (aarch64/x86_64) in zig wrappers.
Hardcode -march=skylake in arch.cmake for x86. Remove TARGET_ARCH variable.