Insert MSM and FFT code and their benchmarks.#86
Conversation
ae74f12 to
8797915
Compare
|
Hi team, Rationale for this PR is mentioned in privacy-ethereum/halo2#84. Following this I'd like to:
Depending on what is merged first either this PR or privacy-ethereum/halo2#202 will need to be updated |
There was a problem hiding this comment.
This looks good! I'd like to run benchmars with and without this commit in our server cc: @AronisAt79 @ed255 So that we can also see if we really benefit from multithreading without exploding RAM comsumption.
Once we wee no RAM explosions, I'mm happy to merge this!
If you have any benchmark results in other machines or memory profiling @mratsim it would be nice if you could share it.
I'll try to run this locally and get it. But only have 16 threads avaliable.
han0110
left a comment
There was a problem hiding this comment.
LGTM! Checked locally the fft.rs and msm.rs are ported from halo2_proofs with only 2 lines of non-logic change.
Also got a question about MSM benchmark.
…ltiexp`. Split into `singlecore` and `multicore` benchmarks so Criterion's result caching and comparison over multiple runs makes sense. Rewrite point and scalar generation.
|
We could file an issue for that! |
|
The bench came from my library, with addition chains ;), so likely in Halo2curves it might even take a minute to generate the points. |
|
Hold on on merging, @einar-taiko is looking into parallelizing and caching the generation of test (coeffs, points) pairs |
Didn't do it as per your last comment :) |
Laptop measurements: k=22: 109 sec k=16: 1 sec
mratsim
left a comment
There was a problem hiding this comment.
The new bench with parallel point generation now completes under a minute
han0110
left a comment
There was a problem hiding this comment.
LGTM! Just a small suggestion, could be ignored.
|
Here are my benchmarks on my laptop (i9-11980HK 8 cores). Unfortunately I have to go to zkSummit just after so benchmark on my 18-core workstation will have to wait end of week. For 2^16 there is a factor 3-3.5x between halo2curves (90ms) and Gnark (30ms) / Constantine (22.5ms) Halo2curvesGnarkConstantine |
han0110
left a comment
There was a problem hiding this comment.
I think we still need to add required-features = ["multicore"] for msm.rs to pass the CI build without default features.
* Add field conversion to/from `[u64;4]` (privacy-ethereum#80) * feat: add field conversion to/from `[u64;4]` * Added conversion tests * Added `montgomery_reduce_short` for no-asm * For bn256, uses assembly conversion when asm feature is on * fix: remove conflict for asm * chore: bump rust-toolchain to 1.67.0 * Compute Legendre symbol for `hash_to_curve` (privacy-ethereum#77) * Add `Legendre` trait and macro - Add Legendre macro with norm and legendre symbol computation - Add macro for automatic implementation in prime fields * Add legendre macro call for prime fields * Remove unused imports * Remove leftover * Add `is_quadratic_non_residue` for hash_to_curve * Add `legendre` function * Compute modulus separately * Substitute division for shift * Update modulus computation * Add quadratic residue check func * Add quadratic residue tests * Add hash_to_curve bench * Implement Legendre trait for all curves * Move misplaced comment * Add all curves to hash bench * fix: add suggestion for legendre_exp * fix: imports after rebase * Add simplified SWU method (privacy-ethereum#81) * Fix broken link * Add simple SWU algorithm * Add simplified SWU hash_to_curve for secp256r1 * add: sswu z reference * update MAP_ID identifier Co-authored-by: Han <tinghan0110@gmail.com> --------- Co-authored-by: Han <tinghan0110@gmail.com> * Bring back curve algorithms for `a = 0` (privacy-ethereum#82) * refactor: bring back curve algorithms for `a = 0` * fix: clippy warning * fix: Improve serialization for prime fields (privacy-ethereum#85) * fix: Improve serialization for prime fields Summary: 256-bit field serialization is currently 4x u64, ie. the native format. This implements the standard of byte-serialization (corresponding to the PrimeField::{to,from}_repr), and an hex-encoded variant of that for (de)serializers that are human-readable (concretely, json). - Added a new macro `serialize_deserialize_32_byte_primefield!` for custom serialization and deserialization of 32-byte prime field in different struct (Fq, Fp, Fr) across the secp256r, bn256, and derive libraries. - Implemented the new macro for serialization and deserialization in various structs, replacing the previous `serde::{Deserialize, Serialize}` direct use. - Enhanced error checking in the custom serialization methods to ensure valid field elements. - Updated the test function in the tests/field.rs file to include JSON serialization and deserialization tests for object integrity checking. * fixup! fix: Improve serialization for prime fields --------- Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com> * refactor: (De)Serialization of points using `GroupEncoding` (privacy-ethereum#88) * refactor: implement (De)Serialization of points using the `GroupEncoding` trait - Updated curve point (de)serialization logic from the internal representation to the representation offered by the implementation of the `GroupEncoding` trait. * fix: add explicit json serde tests * Insert MSM and FFT code and their benchmarks. (privacy-ethereum#86) * Insert MSM and FFT code and their benchmarks. Resolves taikoxyz/zkevm-circuits#150. * feedback * Add instructions * feeback * Implement feedback: Actually supply the correct arguments to `best_multiexp`. Split into `singlecore` and `multicore` benchmarks so Criterion's result caching and comparison over multiple runs makes sense. Rewrite point and scalar generation. * Use slicing and parallelism to to decrease running time. Laptop measurements: k=22: 109 sec k=16: 1 sec * Refactor msm * Refactor fft * Update module comments * Fix formatting * Implement suggestion for fixing CI --------- Co-authored-by: David Nevado <davidnevadoc@users.noreply.github.com> Co-authored-by: Han <tinghan0110@gmail.com> Co-authored-by: François Garillot <4142+huitseeker@users.noreply.github.com> Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com> Co-authored-by: einar-taiko <126954546+einar-taiko@users.noreply.github.com>
* Add field conversion to/from `[u64;4]` (privacy-ethereum#80) * feat: add field conversion to/from `[u64;4]` * Added conversion tests * Added `montgomery_reduce_short` for no-asm * For bn256, uses assembly conversion when asm feature is on * fix: remove conflict for asm * chore: bump rust-toolchain to 1.67.0 * Compute Legendre symbol for `hash_to_curve` (privacy-ethereum#77) * Add `Legendre` trait and macro - Add Legendre macro with norm and legendre symbol computation - Add macro for automatic implementation in prime fields * Add legendre macro call for prime fields * Remove unused imports * Remove leftover * Add `is_quadratic_non_residue` for hash_to_curve * Add `legendre` function * Compute modulus separately * Substitute division for shift * Update modulus computation * Add quadratic residue check func * Add quadratic residue tests * Add hash_to_curve bench * Implement Legendre trait for all curves * Move misplaced comment * Add all curves to hash bench * fix: add suggestion for legendre_exp * fix: imports after rebase * Add simplified SWU method (privacy-ethereum#81) * Fix broken link * Add simple SWU algorithm * Add simplified SWU hash_to_curve for secp256r1 * add: sswu z reference * update MAP_ID identifier Co-authored-by: Han <tinghan0110@gmail.com> --------- Co-authored-by: Han <tinghan0110@gmail.com> * Bring back curve algorithms for `a = 0` (privacy-ethereum#82) * refactor: bring back curve algorithms for `a = 0` * fix: clippy warning * fix: Improve serialization for prime fields (privacy-ethereum#85) * fix: Improve serialization for prime fields Summary: 256-bit field serialization is currently 4x u64, ie. the native format. This implements the standard of byte-serialization (corresponding to the PrimeField::{to,from}_repr), and an hex-encoded variant of that for (de)serializers that are human-readable (concretely, json). - Added a new macro `serialize_deserialize_32_byte_primefield!` for custom serialization and deserialization of 32-byte prime field in different struct (Fq, Fp, Fr) across the secp256r, bn256, and derive libraries. - Implemented the new macro for serialization and deserialization in various structs, replacing the previous `serde::{Deserialize, Serialize}` direct use. - Enhanced error checking in the custom serialization methods to ensure valid field elements. - Updated the test function in the tests/field.rs file to include JSON serialization and deserialization tests for object integrity checking. * fixup! fix: Improve serialization for prime fields --------- Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com> * refactor: (De)Serialization of points using `GroupEncoding` (privacy-ethereum#88) * refactor: implement (De)Serialization of points using the `GroupEncoding` trait - Updated curve point (de)serialization logic from the internal representation to the representation offered by the implementation of the `GroupEncoding` trait. * fix: add explicit json serde tests * Insert MSM and FFT code and their benchmarks. (privacy-ethereum#86) * Insert MSM and FFT code and their benchmarks. Resolves taikoxyz/zkevm-circuits#150. * feedback * Add instructions * feeback * Implement feedback: Actually supply the correct arguments to `best_multiexp`. Split into `singlecore` and `multicore` benchmarks so Criterion's result caching and comparison over multiple runs makes sense. Rewrite point and scalar generation. * Use slicing and parallelism to to decrease running time. Laptop measurements: k=22: 109 sec k=16: 1 sec * Refactor msm * Refactor fft * Update module comments * Fix formatting * Implement suggestion for fixing CI * Re-export also mod `pairing` and remove flag `reexport` to alwasy re-export (privacy-ethereum#93) fix: re-export also mod `pairing` and remove flag `reexport` to alwasy re-export * fix regression in privacy-ethereum#93 reexport field benches aren't run (privacy-ethereum#94) fix regression in privacy-ethereum#93, field benches aren't run * Fast modular inverse - 9.4x acceleration (privacy-ethereum#83) * Bernstein yang modular multiplicative inverter (#2) * rename similar to privacy-ethereum#95 --------- Co-authored-by: Aleksei Vambol <77882392+AlekseiVambol@users.noreply.github.com> * Fast isSquare / Legendre symbol / Jacobi symbol - 16.8x acceleration (privacy-ethereum#95) * Derivatives of the Pornin's method (taikoxyz#3) * renaming file * make cargo fmt happy * clarifications from privacy-ethereum#95 (comment) [skip ci] * Formatting and slightly changing a comment --------- Co-authored-by: Aleksei Vambol <77882392+AlekseiVambol@users.noreply.github.com> * chore: delete bernsteinyang module (replaced by ff_inverse) * Bump version to 0.4.1 --------- Co-authored-by: David Nevado <davidnevadoc@users.noreply.github.com> Co-authored-by: Han <tinghan0110@gmail.com> Co-authored-by: François Garillot <4142+huitseeker@users.noreply.github.com> Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com> Co-authored-by: einar-taiko <126954546+einar-taiko@users.noreply.github.com> Co-authored-by: Mamy Ratsimbazafy <mamy_github@numforge.co> Co-authored-by: Aleksei Vambol <77882392+AlekseiVambol@users.noreply.github.com>





This PR moves MSM and FFT to
halo2curvesas suggested in #84.Adds benchmarks for FFT and MSM to serve as comparison standard in the future: current situation vs privacy-ethereum/halo2#40 vs #29 vs post #163.
Note: This might require moving privacy-ethereum/halo2#202 to
halo2curvescc @jonathanpwangResolves taikoxyz/zkevm-circuits#150.