Skip to content

feat(bb): extend_edges optimization for zero values past end_index#12703

Merged
jeanmon merged 3 commits intomasterfrom
jm/sumcheck-optimization-2
Mar 14, 2025
Merged

feat(bb): extend_edges optimization for zero values past end_index#12703
jeanmon merged 3 commits intomasterfrom
jm/sumcheck-optimization-2

Conversation

@jeanmon
Copy link
Contributor

@jeanmon jeanmon commented Mar 12, 2025

On 16 cores, avm bulk test v2, sumcheck time improves from 16 seconds to 8.5 seconds.

for (auto [poly, full_poly] : zip_view(get_all(), full_polynomials.get_all())) {
// After the initial sumcheck round, the new size is CEIL(size/2).
size_t desired_size = std::min(full_poly.end_index() / 2 + full_poly.end_index() % 2, circuit_size / 2);
size_t desired_size = full_poly.end_index() / 2 + full_poly.end_index() % 2;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a change which was forgotten in a previous PR.

@jeanmon jeanmon marked this pull request as ready for review March 12, 2025 22:14
@jeanmon jeanmon requested review from lucasxia01 and removed request for IlyasRidhuan and Maddiaa0 March 12, 2025 22:14
extended_edge = edge.template extend_to<MAX_PARTIAL_RELATION_LENGTH>();
if (multivariate.end_index() < edge_idx) {
extended_edge =
bb::Univariate<FF, MAX_PARTIAL_RELATION_LENGTH>(std::array<FF, MAX_PARTIAL_RELATION_LENGTH>{});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just taking a peek out of curiosity). I think it would be clearer here to use something like bb::Univariate<FF, MAX_PARTIAL_RELATION_LENGTH>::zero() (which I believe exists). Possibly even better to construct such a thing statically once and for all but probably not a big difference

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if this indicates there's a more fundamental optimization somewhere. E.g. does this imply that we're going on to do a bunch of computation with zero univariates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ledwards2225 In accumulate_relation_univariates(), we then have computation on zero univariates but I think that this is strongly mitigated by the skippable mechanism. There might be some fine tuning but I think this is less trivial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ledwards2225 I pushed the suggested changes on using zero() and using a static const variable.

@fcarreiro
Copy link
Contributor

Nice! I like these low hanging fruits.

@fcarreiro fcarreiro changed the title chore: extend_edges optimization for zero values past end_index feat(bb): extend_edges optimization for zero values past end_index Mar 13, 2025
@jeanmon jeanmon force-pushed the jm/sumcheck-optimization-2 branch from 575f636 to 4af8259 Compare March 13, 2025 16:43
@fcarreiro fcarreiro requested a review from ledwards2225 March 13, 2025 16:46
Copy link
Contributor

@lucasxia01 lucasxia01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I wonder if there's any other juice to be squeezed by using end_index instead of round_size in some of these other functions. Seems harder though

@jeanmon jeanmon merged commit 7913053 into master Mar 14, 2025
6 checks passed
@jeanmon jeanmon deleted the jm/sumcheck-optimization-2 branch March 14, 2025 07:25
TomAFrench added a commit that referenced this pull request Mar 14, 2025
* master:
  fix: filter for empty attestations when pulling from block (#12740)
  feat: one-way noir sync (#12592)
  feat(avm): Address derivation gadget (#12721)
  chore(ci): add workflow dispatch to masternet (#12739)
  feat: add default accounts (#12734)
  feat: gas reports and snapshots (#12724)
  fix(avm): fix vm1 fake proof size (#12733)
  feat(bb): extend_edges optimization for zero values past end_index (#12703)
  fix: remove hard coding of constructor for account manager (#12678)
  git subrepo push --branch=master noir-projects/aztec-nr
  git_subrepo.sh: Fix parent in .gitrepo file. [skip ci]
  chore: replace relative paths to noir-protocol-circuits
  git subrepo push --branch=master barretenberg
  fix: verify_honk_proof inputs generation in bootstrap (#12457)
  fix: Patches to cycle_group and cycle_group fuzzer (#12385)
jeanmon added a commit that referenced this pull request Mar 16, 2025
…2707)

This PR introduces a more even vertically distributed trace chunks
processing among threads in the univariate computation as part of
sumcheck. This leads to a substantial speed up of sumcheck.

Let `t` be the number of threads. The processing of the rows of a
circuit used to be

* thread_1 (`round_size/t` rows)
* ...
* thread_t (`round_size/t` rows) [end of circuit]

while this PR introduces the possibility of having the processing be
interleaved, and therefore the load is more uniformly balanced across
threads:

* thread_1 (`chunk_thread_portion_size` rows)
* ...
* thread_t (`chunk_thread_portion_size` rows)
* thread_1 (`chunk_thread_portion_size` rows)
* ...
* thread_t (`chunk_thread_portion_size` rows)
* ...
* [end of circuit]

This PR improves #12703 measurement from 8.5 seconds to 2.4 seconds.

---------

Co-authored-by: fcarreiro <facundo@aztecprotocol.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants