einar/pr/extended.jacobian.coordinates#1
Conversation
|
I get even less improvement than previously reported which is quite strange |
| }); | ||
| sum | ||
| }) | ||
| } |
There was a problem hiding this comment.
Is the plan to remove this after the review? This duplicates halo2_proofs https://github.com/privacy-scaling-explorations/halo2/blob/f3487575f00e627705fcb7778d7d1049eed79601/halo2_proofs/src/arithmetic.rs#L28-L174
mratsim
left a comment
There was a problem hiding this comment.
The EC implementation is correct with a small issue on inline usage that will lead to code explosion.
The multiexp implementation should be done by just modifying the buckets accumulation scheme in https://github.com/privacy-scaling-explorations/halo2/blob/f3487575f00e627705fcb7778d7d1049eed79601/halo2_proofs/src/arithmetic.rs#L67-L71
Also, there are 2 passes over the coeffs+base instead of 1 in this PR which should explain most of the slowness compared to theoretical (15%) and conservative estimate (10% speedup).
|
|
||
| // Jacobian implementations | ||
|
|
||
| // From affine to homogeneous |
There was a problem hiding this comment.
Is it from affine to homogeneous or from affine to Jacobian?
This seems to contradict the comment above
There was a problem hiding this comment.
Agreed. I added the comment because I found it confusing too, since $name refers to homogeneous coordinates and $name_affine to affine coordinates. Alternatively, I can remove the first comment, since non-extended Jacobian coordinates should be gone anyway.
| let zzz1 = p.zzz; | ||
|
|
||
| // curve constants | ||
| let a = Self::curve_constant_a(); |
There was a problem hiding this comment.
a is always equal to 0 for pairing curves (those needed for Snarks) and it's also equal to 0 for secp256k1.
If Rust provides compile time evaluation, it would be extremely useful to specialize for a == 0
There was a problem hiding this comment.
At the very least a comment is needed to point out an future easy optimization opportunity
There was a problem hiding this comment.
a is always equal to 0 for pairing curves (those needed for Snarks) and it's also equal to 0 for secp256k1.
If Rust provides compile time evaluation, it would be extremely useful to specialize for a == 0
I have changed it to:
// curve constants
const A: $base = $name_jac_ext::curve_constant_a();
This should enforce compile time evaluation and hopefully imply that
let m = (x1_sqr.double()+x1_sqr) + A*zz1.square();
gets optimized to
let m = (x1_sqr.double()+x1_sqr)
for the curves where
There was a problem hiding this comment.
You should use if A == 0 and if A != 0.
The multiplication and addition operations are pure assembly and will not be optimized away.
There was a problem hiding this comment.
We can make it that way, no problem, but I politely remain skeptical of the argument. When the value is known to be 0 at compile time, like in our case, the assembly will indeed be optimized away. I tested it with -opt-level=2 here https://godbolt.org/z/4Gr41o8Kr
There was a problem hiding this comment.
We can make it that way, no problem, but I politely remain skeptical of the argument. When the value is known to be 0 at compile time, like in our case, the assembly will indeed be optimized away. I tested it with
-opt-level=2here https://godbolt.org/z/4Gr41o8Kr
I realise that * is overloaded for field arithmetic, so my argument may or may not be applicable.
| let w = u*v; | ||
| let s = x1 * v; | ||
| let x1_sqr = x1.square(); | ||
| let m = (x1_sqr.double()+x1_sqr) + a*zz1.square(); |
There was a problem hiding this comment.
with a = 0, a*zz1.square() can be skipped
There was a problem hiding this comment.
I tried commenting out the last term and then ran cargo test. All tests pass.
I put in a debug_assert_eq!(a, 0);, so if we can catch future curves where a != 0.
| } | ||
|
|
||
| /// <http://www.hyperelliptic.org/EFD/g1p/auto-shortw-xyzz.html#doubling-dbl-2008-s-1> | ||
| #[inline] |
There was a problem hiding this comment.
#[inline] is likely not worth it here. Especially because the underlying field operations are already inline so it will lead to code size explosion:
halo2curves/src/bn256/assembly.rs
Lines 11 to 12 in 2264867
There was a problem hiding this comment.
Okay. I will remove it from the add as well then.
| let num_threads = num_cpus::get(); | ||
| if n > num_threads && n > 32 { | ||
| let chunk = n / num_threads; | ||
| let results: Vec<C::ExtendedJacobianCoordinates> = coeffs |
There was a problem hiding this comment.
Only the inner buckets result should be collected in Extended Jacobian.
Extended Jacobian mixed addition is faster than projective, but for plain addition, projective is faster.
There was a problem hiding this comment.
So the signature of best_multiexp should stay with projective coordinates
| }) | ||
| .sample_size(30); | ||
| } | ||
| } |
There was a problem hiding this comment.
To measure perf of this PR without multithreading interference, a serial benchmark is needed as well.
| } | ||
| } | ||
|
|
||
| pub(crate) fn multiexp_serial<C: CurveJacExt>(coeffs: &[C::Scalar], bases: &[C]) -> C::ExtendedJacobianCoordinates { |
There was a problem hiding this comment.
similarly, the signature of multiexp_serial should stay homogeneous projective
| .enumerate() | ||
| .rev() | ||
| .map(|(i, bucket)| { | ||
| for (coeff, base) in coeffs.iter().zip(bases.iter()) { |
There was a problem hiding this comment.
The map and for are inefficient, you loop twice over the data.
You should loop over the coeff+base and directly mutate/add_assign in the matching bucket
| } | ||
| bucket | ||
| }) | ||
| .fold(C::jac_ext_identity(), |mut sum, bucket| { |
There was a problem hiding this comment.
The final reduction should use projective coordinates (and explicitly convert the bucket sum to projective)
|
Note on our discussion on constant_a, some more changes upstream privacy-ethereum#82. |
…_curve()` is implemented
d52078b to
747c6e2
Compare
2239b6a to
81a0782
Compare

This resolves taikoxyz/zkevm-circuits#121
To measure the performance impact: