chore: stdlib byte_array and bool external audit fixes#17838
chore: stdlib byte_array and bool external audit fixes#17838iakovenkos merged 31 commits intomerge-train/barretenbergfrom
Conversation
Removed all public methods that could create or insert unconstrained field_t values: - Removed write_unconstrained() and write_at_unconstrained() - Made internal constructors from vector<field_t> truly private - Made from_constants() private (only used for padding via constant_padding()) - Removed non-const operator[] to prevent direct mutation All hash implementations now use safe write() pattern to build byte_arrays from constrained sources. This ensures lookup tables only receive properly range-constrained bytes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…i/bool-byte-array-post-audit
…i/bool-byte-array-post-audit
…sult
This change prevents concept leakage of the witness_inverted flag by
ensuring conditional_assign always returns a normalized bool_t.
Changes:
- All three branches now normalize their results:
1. Constant predicate: normalize selected value
2. Same witness: normalize lhs
3. Boolean operations: normalize result (can be unnormalized when
constants are involved, e.g., inverted_witness && constant_true)
- Updated test to verify normalization in all 512 input combinations
- Simplified gate count pinning to focus on predictable cases
- Added explicit EXPECT_FALSE(result.is_inverted()) check
The normalization cost is minimal:
- Branches 1 & 2: 0-1 gates (only if value is inverted)
- Branch 3: 0-1 gates (only if boolean ops return inverted result)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
| return values[index]; | ||
| } | ||
|
|
||
| // Non-const operator[] removed to prevent assigning unconstrained values into the array |
There was a problem hiding this comment.
nit: no need for the comment explaining why we removed the non-const operator[].
| const_values.reserve(input.size()); | ||
| for (const auto& byte : input) { | ||
| // Create constant field elements - no witness, no constraints | ||
| const_values.push_back(field_t<Builder>(parent_context, byte)); |
There was a problem hiding this comment.
might be worth adding assertion to check if indeed input values are bytes?
There was a problem hiding this comment.
I think there's no way to propagate something which is not a uint8_t here. it is a private member that's called by constant_padding() externally. I guess one can try to call it with smth that can be truncated to uint8_t, but then it either has no effect or breaks some padding assumptions in hash functions
| } | ||
|
|
||
| // Helper to compute expected gate count for conditional_assign | ||
| static size_t compute_conditional_assign_gates( |
There was a problem hiding this comment.
terribly complicated one tbh :D
| little_endian_limb_bytes[6] = limb_bytes[1]; | ||
| little_endian_limb_bytes[7] = limb_bytes[0]; | ||
| result.write(little_endian_limb_bytes); | ||
| result.write(limb_bytes.reverse()); |
There was a problem hiding this comment.
Q: Since output limb is the result of the lookup table, is it range-constrained implicitly? In that case, do we need a way to "opt-out" of range constraining while performing write?
There was a problem hiding this comment.
You're right. There's no need to constrain it. Otoh, hash and all its helpers are un-used. So I'd just keep it like that and nuke in a follow-up. We only need keccak1600 from this module.
…i/bool-byte-array-post-audit
| // Step 1: Conditionally assign field values when predicate is false | ||
| // There is one remaining edge case that happens with negligible probability, see here: | ||
| // https://github.com/AztecProtocol/barretenberg/issues/1570 | ||
| if (!input.predicate.is_constant) { |
There was a problem hiding this comment.
@federicobarbacovi can you check that this is fine? I think the values need to be conditionally assigned first and then fed to byte_array constructors
| template <typename Builder> byte_array<Builder> fields_to_bytes(Builder& builder, std::vector<field_t<Builder>>& fields) | ||
| { | ||
| byte_array<Builder> result(&builder); | ||
| byte_array<Builder> result = byte_array<Builder>::constant_padding(&builder, /*length*/ 0); |
There was a problem hiding this comment.
@federicobarbacovi just flagging that now it's the canonical way to create an empty array
BEGIN_COMMIT_OVERRIDE chore: remove more pg references and leftovers (#18260) refactor: Migrate --update_inputs flag to --vk_policy option (#18237) chore: stdlib byte_array and bool external audit fixes (#17838) chore: update naming in bb_proof_verification lib (#18269) chore: cycle group 13 (#18200) chore: universal handling for coord constancy in cycle_group (#18253) END_COMMIT_OVERRIDE
## Critical Changes ### Issues 1 & 2: Prevent Unconstrained Value Propagation **Root Cause**: Constructors allowed creation of byte_arrays with unconstrained field_t values, enabling malicious provers to inject invalid "bytes" (>255) into hash function lookup tables. **Fixes**: - **1** (6585762): Removed `byte_array(Builder*, size_t)` constructor entirely - **2** (de5b87e): Made `byte_array(Builder*, bytes_t const&)` and `byte_array(Builder*, bytes_t&&)` constructors **private** **Impact**: Establishes security boundary where: - External code must use public constructors that add 8-bit range constraints - Internal methods (slice, reverse) safely use private constructors only on already-constrained data - Unconstrained values cannot propagate to hash functions **Cascading Benefits**: Issues 5 and 8 automatically prevented at source. --- ## 📝 byte_array API Security & Correctness ### Issue 3: Allow Valid 32-Byte ↔ field_t Round-Trips - **Commit**: f899a23 - **Fix**: Removed overly restrictive `BB_ASSERT(bytes < 32)` from `operator field_t()` - **Result**: Enables testing of critical modulus boundary cases ### Issue 4: Proper Bounds Checking & Prevent Mutation - **Commits**: bf58e85, e897223 - **Fix**: Changed const `operator[]` from `assert(size > 0)` to `BB_ASSERT_LT(index, size)`; removed non-const `operator[]` - **Result**: Proper bounds validation; no direct mutation possible ### Issue 5: write_at() Cannot Accept Unconstrained Arrays - **Commit**: de5b87e (architectural fix) - **Fix**: With private bytes_t constructors, only constrained byte_arrays can exist - **Result**: write_at() validated at construction time, not call time ### Issue 6: Handle Empty Array Edge Case in slice() - **Commit**: 5f916a6 - **Fix**: Changed `BB_ASSERT_LT(offset, size)` to `BB_ASSERT_LTE(offset, size)` - **Result**: Allows valid `slice(0)` on empty arrays ### Issue 7: Document Division Safety Invariant - **Commit**: 8d9f4e1 - **Fix**: Added SAFETY comment explaining `reconstructed_hi / 2^128` divisibility invariant - **Result**: Critical assumption now documented for maintainers and auditors ### Issue 8: slice() Cannot Propagate Unconstrained Fields - **Commit**: de5b87e (architectural fix) - **Fix**: slice() uses private constructor only on already-constrained internal data - **Result**: No external bypass possible; constraints always preserved ### Issue 9: No Checks/Documentation About field_t Conversion Not Adding Constraints - **Commits**: de5b87e + 6585762 (architectural fixes) - **Fix**: The operator field_t() conversion doesn't need to check constraints because the type system now guarantees all byte_arrays contain constrained values: - **Result**: The conversion's "trust model" is valid - it's not trusting user input, it's relying on a type-level invariant enforced at construction ### Issue 10: Remove Undefined Behavior in get_string() - **Commit**: 2cf61e6 - **Fix**: Removed `get_string()` method entirely - **Result**: No signed char UB with values 128-255 --- ## 🔧 bool_t Security & Correctness ### Issue 11: Validate Builder Context in Operators - **Commit**: a9f6608 - **Fix**: Use `validate_context<Builder>()` instead of manual `context ? context : other.context` in &, |, ^, == operators - **Result**: Ensures both operands belong to same builder ### Issue 12: Use Canonical Zero Witness - **Commit**: 1fb4b6e - **Fix**: Use `context->zero_idx()` instead of same witness with q_r=0 in normalize() - **Result**: Cleaner gate construction following best practices --- ## 🚀 Performance & Correctness ### Issue 13: Honor Move Semantics in Rvalue Constructor - **Commit**: eaa34b7 - **Fix**: Changed to `values(std::move(input))` in rvalue constructor - **Result**: Avoids unnecessary copies for large arrays ### Issue 14: Prevent Underflow in reverse() on Empty Array - **Commit**: 6cf505d - **Fix**: Added early return: `if (values.empty()) { return *this; }` - **Result**: Prevents underflow from `size() - 1` computation ### Issue 15: Fix Selector Order in Boolean Equality Gate - **Commit**: cf791df - **Fix**: Swapped q_l and q_r to match canonical (q_m, q_l, q_r, q_o, q_c) order - **Result**: Conforms to documented layout; prevents future bugs if coefficients diverge ### `bool_t`: Always normalize `conditional_assign` Output - **Commit**: [d0032b2](d0032b2) - **Fix**: Fixed concept leakage in `bool_t::conditional_assign` that could return un-normalized `bool_t` and cause in APIs at higher level - **Note**: Minimal gate overhead - only 0-1 gates added when values are actually inverted. ## Summary The key insight: Making internal `bytes_t` constructors private creates a security boundary that prevents unconstrained field_t values from ever entering byte_arrays. Combined with removing unsafe constructors and the non-const operator[], this ensures that only properly range-constrained bytes can reach hash function lookup tables throughout the circuit stack. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: federicobarbacovi <171914500+federicobarbacovi@users.noreply.github.com>
Critical Changes
Issues 1 & 2: Prevent Unconstrained Value Propagation
Root Cause: Constructors allowed creation of byte_arrays with unconstrained field_t values, enabling malicious provers to inject invalid "bytes" (>255) into hash function lookup tables.
Fixes:
byte_array(Builder*, size_t)constructor entirelybyte_array(Builder*, bytes_t const&)andbyte_array(Builder*, bytes_t&&)constructors privateImpact: Establishes security boundary where:
Cascading Benefits: Issues 5 and 8 automatically prevented at source.
📝 byte_array API Security & Correctness
Issue 3: Allow Valid 32-Byte ↔ field_t Round-Trips
BB_ASSERT(bytes < 32)fromoperator field_t()Issue 4: Proper Bounds Checking & Prevent Mutation
operator[]fromassert(size > 0)toBB_ASSERT_LT(index, size); removed non-constoperator[]Issue 5: write_at() Cannot Accept Unconstrained Arrays
Issue 6: Handle Empty Array Edge Case in slice()
BB_ASSERT_LT(offset, size)toBB_ASSERT_LTE(offset, size)slice(0)on empty arraysIssue 7: Document Division Safety Invariant
reconstructed_hi / 2^128divisibility invariantIssue 8: slice() Cannot Propagate Unconstrained Fields
Issue 9: No Checks/Documentation About field_t Conversion Not Adding Constraints
Issue 10: Remove Undefined Behavior in get_string()
get_string()method entirely🔧 bool_t Security & Correctness
Issue 11: Validate Builder Context in Operators
validate_context<Builder>()instead of manualcontext ? context : other.contextin &, |, ^, == operatorsIssue 12: Use Canonical Zero Witness
context->zero_idx()instead of same witness with q_r=0 in normalize()🚀 Performance & Correctness
Issue 13: Honor Move Semantics in Rvalue Constructor
values(std::move(input))in rvalue constructorIssue 14: Prevent Underflow in reverse() on Empty Array
if (values.empty()) { return *this; }size() - 1computationIssue 15: Fix Selector Order in Boolean Equality Gate
bool_t: Always normalizeconditional_assignOutputbool_t::conditional_assignthat could return un-normalizedbool_tand cause in APIs at higher levelSummary
The key insight: Making internal
bytes_tconstructors private creates a security boundary that prevents unconstrained field_t values from ever entering byte_arrays. Combined with removing unsafe constructors and the non-const operator[], this ensures that only properly range-constrained bytes can reach hash function lookup tables throughout the circuit stack.