chore: pippenger int audit (#19302)
Conversation
…i/pippenger-audit-0
```cpp
uint32_t result = lo + (hi << lo_slice_bits);
return result;
size_t lo_bit = (hi_bit < slice_size) ? 0 : hi_bit - slice_size;
return scalar.get_bit_slice_raw(lo_bit, hi_bit);
```
Moved this method to the field core primitives. It's very efficient and can't be replaced with existing primitives. I tried creating a uint256_t and slicing it; that resulted in a 4-6% regression.
```cpp
cached_cost = total_cost;
target_bit_slice = bit_slice;
// Cost model: total_cost = num_rounds * (num_points + num_buckets * BUCKET_ACCUMULATION_COST)
auto compute_cost = [&](size_t bits) {
```
A bit of renaming here
```cpp
const size_t num_rounds = numeric::ceil_div(NUM_BITS_IN_FIELD, bits_per_slice);
for (size_t i = 0; i < num_rounds; ++i) {
    round_output = evaluate_pippenger_round(msm_data, i, affine_data, bucket_data, round_output, bits_per_slice);
    evaluate_pippenger_round(msm_data, i, affine_data, bucket_data, msm_result, bits_per_slice);
```
I don't love void methods with a bunch of params, but I think it's clear that the method modifies msm_result.
```cpp
result += round_output;
return result;
// Accumulate into running total
accumulate_round_result(msm_accumulator, bucket_result, round_index, bits_per_slice);
```
avoiding duplication here
```cpp
 * @param scratch_it Current scratch space iterator (modified in place)
 */
template <typename Curve>
__attribute__((always_inline)) static inline void process_single_point(
```
simply isolating shared logic from consume_point_schedule
```cpp
}

/**
 * @brief Process a pair of points/buckets using branchless conditional moves
    MSM<Curve>::BucketAccumulators& bucket_data,
    size_t num_input_points_processed,
    size_t num_queued_affine_points) noexcept
    MSM<Curve>::BucketAccumulators& bucket_data) noexcept
```
switched from recursive to iterative structure here
```cpp
template <typename Curve>
void MSM<Curve>::transform_scalar_and_get_nonzero_scalar_indices(std::span<typename Curve::ScalarField> scalars,
    std::vector<uint32_t>& consolidated_indices) noexcept
    std::vector<uint32_t>& nonzero_scalar_indices) noexcept
```
renaming + added docs
```cpp
size_t thread_accumulated_work = 0;
size_t current_thread_idx = 0;
for (size_t i = 0; i < num_msms; ++i) {
    BB_ASSERT_DEBUG(i < msm_scalar_indices.size());
```
redundant assert
```cpp
const size_t available_thread_work = total_thread_work - thread_accumulated_work;
const size_t work_to_assign = std::min(available_thread_work, msm_work_remaining);

work_units[current_thread_idx].push_back(MSMWorkUnit{
```
A bit cleaner when using min
```cpp
 * @return constexpr size_t
 */
template <typename Curve> size_t MSM<Curve>::get_optimal_log_num_buckets(const size_t num_points) noexcept
template <typename Curve> uint32_t MSM<Curve>::get_optimal_log_num_buckets(const size_t num_points) noexcept
```
slightly improved the clarity here
```cpp
template <typename Curve> bool MSM<Curve>::use_affine_trick(const size_t num_points, const size_t num_buckets) noexcept
{
    if (num_points < 128) {
    if (num_points < AFFINE_TRICK_THRESHOLD) {
```
all constants now live in the header
```cpp
 * @param scratch_space coordinate field scratch space needed for batched inversion
 **/
template <typename Curve>
void MSM<Curve>::add_affine_points(typename Curve::AffineElement* points,
 */
template <typename Curve>
typename Curve::Element MSM<Curve>::small_pippenger_low_memory_with_transformed_scalars(MSMData& msm_data) noexcept
typename Curve::Element MSM<Curve>::jacobian_pippenger_with_transformed_scalars(MSMData& msm_data) noexcept
```
Seems clearer this way
```cpp
for (uint32_t round = 0; round < num_rounds; ++round) {
    // Populate buckets using Jacobian accumulation
    for (size_t i = 0; i < size; ++i) {
```
Basically unfolded evaluate_small_pippenger_round; the impl is pretty concise.
```cpp
for (size_t i = 0; i < num_rounds; ++i) {
    round_output = evaluate_pippenger_round(msm_data, i, affine_data, bucket_data, round_output, bits_per_slice);
}
// Per-call allocation for WASM compatibility (thread_local causes issues in WASM)
```
Not sure whether it was thread_local specifically or its combination with accidentally disabled multi-threading in Pippenger. But the IVC integration tests were failing, so it seemed worth documenting.
```cpp
} else {
    bucket_data.buckets[bucket_index] = points[nonzero_scalar_indices[i]];
    bucket_data.bucket_exists.set(bucket_index, true);
for (uint32_t round = 0; round < num_rounds; ++round) {
```
Again, unfolded pippenger_round, just to be able to see the entire algorithm in one loop.
```cpp
    MSM<Curve>::BucketAccumulators& bucket_data,
    size_t num_input_points_processed,
    size_t num_queued_affine_points) noexcept
void MSM<Curve>::batch_accumulate_points_into_buckets(std::span<const uint64_t> point_schedule,
```
- somewhat more math-oriented name, makes it easier to match with the abstract description
- unfolded recursion
- both loops share bucket/point processing logic
```cpp
// unfortunately we need to remove const on this data type to prevent duplicating _scalars (which is typically
// large). We need to convert `_scalars` out of Montgomery form for the MSM. We then convert the scalars back
// into Montgomery form at the end of the algorithm. NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
// TODO(https://github.com/AztecProtocol/barretenberg/issues/1449): handle const correctness.
```
Tried to make it cleaner but couldn't find something that would look much cleaner; maybe I can take a look in a follow-up PR. But I added tests checking that batch MSM preserves constness.
barretenberg/cpp/src/barretenberg/ecc/fields/field_declarations.hpp — outdated, resolved
ledwards2225
left a comment
Overall LGTM - some great cleanup. A few issues with the get_bit_slice_raw method, I think, and some other minor comments.
barretenberg/cpp/src/barretenberg/ecc/scalar_multiplication/bitvector.hpp — outdated, resolved
barretenberg/cpp/src/barretenberg/ecc/fields/field_declarations.hpp — outdated, resolved
barretenberg/cpp/src/barretenberg/ecc/fields/field_declarations.hpp — outdated, resolved
barretenberg/cpp/src/barretenberg/ecc/fields/field_declarations.hpp — outdated, resolved
```cpp
AffineElement result = scalar_multiplication::MSM<Curve>::msm(points, scalar_span);

AffineElement expected(base_point * scalar_sum);
EXPECT_EQ(result, expected);
```
Is it expected that this works even though handle_edge_cases is false?
Wasn't using enough points to trigger Pippenger, actually. Thanks for catching this.
### Point Scheduling (Affine Variant Only)

Entries are packed as `(point_index << 32) | bucket_index` and sorted via **in-place MSD radix sort**. Sorting groups points by bucket, enabling efficient batch processing. The sort also detects entries with `bucket_index == 0` during the final radix pass, allowing zero-bucket entries to be skipped without a separate scan.
Maybe it's worth stressing that since c = 8, this is effectively appending the bucket index to the end of the index we use to pack the points.
federicobarbacovi
left a comment
Great work! Just some minor comments
barretenberg/cpp/src/barretenberg/ecc/scalar_multiplication/scalar_multiplication.cpp — outdated, resolved
BEGIN_COMMIT_OVERRIDE
feat: support JSON input files for bb verify command (#19800)
fix: update bootstrap.sh to use new JSON field names
chore: Update `index.js` so that `HAS_ZK` and `PUBLIC_INPUTS` variables must always be set in tests (#19884)
chore: pippenger int audit (#19302)
chore: deduplicate batch affine addition trick (#19788)
chore: transcript+codec+poseidon2 fixes (#19419)
chore!: explicitly constrain inputs and intermediate witnesses (#19826)
fix: exclude nlohmann/json from WASM builds in json_output.hpp
chore: translator circuit builder and flavor audit (#19798)
Revert "fix: exclude nlohmann/json from WASM builds in json_output.hpp"
Revert "feat: support JSON input files for bb verify command (#19800)"
Revert "fix: update bootstrap.sh to use new JSON field names"
END_COMMIT_OVERRIDE
clean up + docs + a couple of edge case tests

Closes AztecProtocol/barretenberg#486

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>