fix: allocate non-gate selectors at trace-active size instead of dyadic size #20600

Merged: johnathan79717 merged 3 commits into merge-train/barretenberg from jh/non-gate-selector-memory on Feb 18, 2026

@johnathan79717 johnathan79717 commented Feb 17, 2026

Summary

  • Allocate non-gate selectors (q_m, q_c, q_l, q_r, q_o, q_4) at trace_active_range_size() instead of dyadic_size(), using virtual zeroes for the rest
  • Fix get_polynomial_size() to return virtual_size() instead of size(), since the logical polynomial size is the dyadic circuit size
  • Fix batch_polynomials in Hypernova to handle polynomials with different backing sizes (non-gate selectors now have smaller backing than other entities like table polynomials)
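
The distinction between backing size and virtual size in the summary above can be illustrated with a minimal sketch. `SketchPolynomial` and `Fr` below are hypothetical names introduced for illustration only. This is not barretenberg's `Polynomial` class, and a 64-bit integer stands in for a field element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a field element (assumption for this sketch).
using Fr = std::uint64_t;

// Illustrative only: NOT barretenberg's Polynomial class.
class SketchPolynomial {
  public:
    // Allocate `backing_size` coefficients but present `virtual_size` logical entries.
    SketchPolynomial(std::size_t backing_size, std::size_t virtual_size)
        : coeffs_(backing_size, Fr{ 0 })
        , virtual_size_(virtual_size)
    {
        assert(backing_size <= virtual_size);
    }

    std::size_t size() const { return coeffs_.size(); }         // entries actually allocated
    std::size_t virtual_size() const { return virtual_size_; }  // logical length

    // Reads beyond the backing region yield a virtual zero; no memory is touched.
    Fr get(std::size_t i) const
    {
        assert(i < virtual_size_);
        return i < coeffs_.size() ? coeffs_[i] : Fr{ 0 };
    }

    // Writes are only valid inside the backing region.
    void set(std::size_t i, Fr value)
    {
        assert(i < coeffs_.size());
        coeffs_[i] = value;
    }

  private:
    std::vector<Fr> coeffs_;
    std::size_t virtual_size_;
};
```

Under these assumptions, `SketchPolynomial q_m((1u << 20) + 2000, 1u << 21)` allocates only about half of the dyadic size, while `q_m.virtual_size()` still reports `1 << 21`. A caller asking for the logical polynomial size should therefore query the virtual size, not the backing size.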

Benchmark results

cd barretenberg/cpp/build
BB_VERBOSE=1 BB_BENCH=1 ./bin/ultra_honk_bench --benchmark_filter="construct_proof_ultrahonk_1M_gates_dyadic_2_20$" --benchmark_repetitions=1
BB_VERBOSE=1 BB_BENCH=1 ./bin/ultra_honk_bench --benchmark_filter="construct_proof_ultrahonk_1M_gates_dyadic_2_21$" --benchmark_repetitions=1

Circuits ~2000 gates apart straddling the 2^20 boundary:

| Metric | Before | After |
|---|---|---|
| Peak RSS (2^21 dyadic) | 2854 MiB | 2375 MiB (-479 MiB, -17%) |
| Memory gap, 2^20 vs 2^21 | 603 MiB | 126 MiB (-79%) |
| Peak RSS (2^20 dyadic) | 2251 MiB | 2249 MiB (unchanged) |
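
For readers who want to check the derived figures, the memory gap and the percentages follow from the four peak-RSS measurements above. The snippet is a small arithmetic sketch; the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent, rounded to the nearest integer.
inline long percent_change(double before, double after)
{
    return std::lround((after - before) / before * 100.0);
}

// Peak RSS figures in MiB, copied from the benchmark results above.
constexpr int rss_2_21_before = 2854;
constexpr int rss_2_21_after = 2375;
constexpr int rss_2_20_before = 2251;
constexpr int rss_2_20_after = 2249;

// The "memory gap" is the difference between the 2^21 and 2^20 measurements.
constexpr int gap_before = rss_2_21_before - rss_2_20_before;
constexpr int gap_after = rss_2_21_after - rss_2_20_after;
static_assert(gap_before == 603, "gap before the change");
static_assert(gap_after == 126, "gap after the change");
```

With these definitions, `percent_change(rss_2_21_before, rss_2_21_after)` evaluates to -17 and `percent_change(gap_before, gap_after)` to -79, matching the reported values.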

Test plan

  • ultra_honk_tests — 260 passed, 5 skipped
  • chonk_tests — 20 passed
  • circuit_checker_tests — 80 passed
  • hypernova_tests — 9 passed

Resolves AztecProtocol/barretenberg#1625.

…ic size

Non-gate selectors (q_m, q_c, q_l, q_r, q_o, q_4) were allocated at
full dyadic_size() even though they are only written within the active
trace range. This caused circuits just above a power-of-two boundary
to use ~2x memory for selectors compared to circuits just below.

Changed allocation from Polynomial(dyadic_size()) to
Polynomial(trace_active_range_size(), dyadic_size()), which backs
only the active region while still presenting the full virtual size
(with virtual zeroes beyond the active range).

Also changed get_polynomial_size() to return virtual_size() instead
of size(), since the logical polynomial size is the dyadic circuit
size, not the backing memory size.

Benchmarked on circuits ~2000 gates apart straddling the 2^20 boundary:
- Peak RSS for 2^21 dyadic: 2854 MiB -> 2375 MiB (-479 MiB, -17%)
- Memory gap between 2^20 and 2^21: 603 MiB -> 126 MiB (-79%)

Resolves AztecProtocol/barretenberg#1625.
@johnathan79717 johnathan79717 added the ci-barretenberg Run all barretenberg/cpp checks. label Feb 17, 2026
When non-gate selectors are allocated with smaller backing (trace_active_range_size
instead of dyadic_size), the first polynomial in batch_polynomials may be too small
to accumulate others via add_scaled. Create a properly sized result polynomial when
needed.
@ledwards2225

@johnathan79717 I'm not sure how readily we can do this, but it should also be possible to download only the SRS content that we need. Worth investigating.

@johnathan79717

@ledwards2225 Good idea — the CRS is currently loaded at full dyadic_size() and the Pippenger point table doubles it, so at 2^21 that's ~256 MiB. Loading only what's needed for the active trace could save ~128 MiB at the dyadic boundary. I'll investigate as a follow-up.
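
The size estimate in the reply above can be reproduced with a short calculation. It assumes 64 bytes per CRS point (two 32-byte coordinates of an affine point); that figure is not stated in the thread, and the names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Assumption: one affine point occupies two 32-byte coordinates.
constexpr std::size_t bytes_per_point = 64;

// Bytes held for `num_points` CRS points; the Pippenger point table doubles the footprint.
constexpr std::size_t crs_bytes(std::size_t num_points, bool with_pippenger_table)
{
    return num_points * bytes_per_point * (with_pippenger_table ? 2 : 1);
}

static_assert(crs_bytes(std::size_t{ 1 } << 21, true) == 256 * MiB, "full dyadic size at 2^21");
static_assert(crs_bytes(std::size_t{ 1 } << 20, true) == 128 * MiB, "roughly the active trace size");
```

Under that assumption, a circuit just above 2^20 gates needs roughly 128 MiB of point data rather than 256 MiB, which is where the estimated saving of about 128 MiB comes from.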

@ledwards2225 ledwards2225 left a comment

LGTM - One conservative suggestion for a bit of extra safety

Extend the existing max_end check to also track min_start across all
polynomials. If the 0th polynomial has a greater start_index than
another polynomial in the batch, we now create a result polynomial
covering the full [min_start, max_end) range, preventing add_scaled
assertion failures.
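
The range handling described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the Hypernova code: `SketchPoly`, `batch_polynomials_sketch` and the free function `add_scaled` are hypothetical, a wrapping 64-bit integer stands in for a field element, and the sketch always allocates a fresh result, whereas the PR creates one only when needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Fr = std::uint64_t; // hypothetical stand-in for a field element

// Illustrative polynomial whose coefficients occupy [start_index, end_index()).
struct SketchPoly {
    std::size_t start_index;
    std::vector<Fr> coeffs;
    std::size_t end_index() const { return start_index + coeffs.size(); }
};

// result += scalar * other. The range of `other` must lie inside the range of `result`.
inline void add_scaled(SketchPoly& result, const SketchPoly& other, Fr scalar)
{
    assert(other.start_index >= result.start_index);
    assert(other.end_index() <= result.end_index());
    const std::size_t offset = other.start_index - result.start_index;
    for (std::size_t i = 0; i < other.coeffs.size(); ++i) {
        result.coeffs[offset + i] += scalar * other.coeffs[i];
    }
}

// Compute sum_i scalars[i] * polys[i] in a result covering [min_start, max_end).
inline SketchPoly batch_polynomials_sketch(const std::vector<SketchPoly>& polys, const std::vector<Fr>& scalars)
{
    assert(!polys.empty() && polys.size() == scalars.size());
    std::size_t min_start = polys[0].start_index;
    std::size_t max_end = polys[0].end_index();
    for (const auto& p : polys) {
        min_start = std::min(min_start, p.start_index);
        max_end = std::max(max_end, p.end_index());
    }
    // Sizing the result from the union of all ranges keeps every add_scaled call in bounds.
    SketchPoly result{ min_start, std::vector<Fr>(max_end - min_start, Fr{ 0 }) };
    for (std::size_t i = 0; i < polys.size(); ++i) {
        add_scaled(result, polys[i], scalars[i]);
    }
    return result;
}
```

If the first polynomial covers [2, 4) and the second covers [0, 5), accumulating into the first would trip both assertions in `add_scaled`; sizing the result to [0, 5) avoids that.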
@johnathan79717 johnathan79717 enabled auto-merge (squash) February 18, 2026 14:27
@johnathan79717 johnathan79717 merged commit 14a4773 into merge-train/barretenberg Feb 18, 2026
9 checks passed
@johnathan79717 johnathan79717 deleted the jh/non-gate-selector-memory branch February 18, 2026 14:32
github-merge-queue bot pushed a commit that referenced this pull request Feb 18, 2026
BEGIN_COMMIT_OVERRIDE
chore: chonk rec ver 0 (#20506)
fix: allocate non-gate selectors at trace-active size instead of dyadic
size (#20600)
chore: numeric audit 0 (#20491)
chore: prepare barretenberg-rs for crates.io publishing (#20496)
chore: add build_bench to ci-barretenberg-full (#20650)
chore: add component graphs for app-proving benchmarks (#20649)
END_COMMIT_OVERRIDE