Currently, most of the time in the signing protocol is spent in Montgomery exponentiation. Key refresh is split between exponentiation and prime number generation, but the latter is mainly exponentiation again (most of the time is spent in Miller-Rabin tests). So improving exponentiation performance would help a lot.
Possible avenues:
1. Replace schoolbook multiplication with Karatsuba or Toom-Cook. This may start making a difference at our integer sizes (2048 bits). This has to be done within crypto-bigint, see RustCrypto/crypto-bigint#66 ("Improve multiplication").
2. Use wNAF exponentiation instead of the current fixed-window one (for the cases where the exponent is not secret). This has to be done within crypto-bigint.
3. crypto-bigint's pow() supports exponents of arbitrary size (that is, you can raise a Uint<N> to a Uint<M> power). We currently only raise Uint<N> to Uint<N>, and implement Uint<N>^Uint<2*N> and Uint<N>^Uint<4*N> by breaking the exponent into halves and exponentiating separately. If we could use the arbitrary-size exponentiation, it could make this faster, because we would not have to calculate x^{2^N} separately to merge the halves: it is already calculated by the fixed-window algorithm.
4. In some places where we calculate x^y mod N we also know phi(N) (the totient), so we can instead calculate x^(y mod phi(N)) mod N. If y is large (of the order of N^2), this may be faster than direct exponentiation.
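To make the exponent-splitting avenue concrete (the one about Uint<N>^Uint<2*N>), here is a toy sketch on machine integers. It does not use crypto-bigint: `mul_mod`, `pow_mod_bits` and `pow_mod_split` are hypothetical helper names, a 64-bit exponent stands in for Uint<2*N> and its 32-bit halves for Uint<N>, and plain variable-time square-and-multiply stands in for the fixed-window algorithm.

```rust
/// Modular multiplication on u64, using a u128 intermediate to avoid overflow.
/// Assumes n != 0.
fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

/// Left-to-right square-and-multiply over the low `bits` bits of `e`.
/// Toy code: NOT constant-time, unlike what a real implementation needs
/// for secret exponents.
fn pow_mod_bits(x: u64, e: u64, bits: u32, n: u64) -> u64 {
    let mut acc = 1 % n;
    for i in (0..bits).rev() {
        acc = mul_mod(acc, acc, n);
        if (e >> i) & 1 == 1 {
            acc = mul_mod(acc, x, n);
        }
    }
    acc
}

/// The "split in halves" approach: e = hi * 2^32 + lo, so
/// x^e = (x^(2^32))^hi * x^lo (mod n).
fn pow_mod_split(x: u64, e: u64, n: u64) -> u64 {
    let hi = e >> 32;
    let lo = e & 0xFFFF_FFFF;

    // Extra work needed only to merge the halves: x^(2^32) via 32 squarings.
    let mut x_shift = x % n;
    for _ in 0..32 {
        x_shift = mul_mod(x_shift, x_shift, n);
    }

    let high = pow_mod_bits(x_shift, hi, 32, n);
    let low = pow_mod_bits(x, lo, 32, n);
    mul_mod(high, low, n)
}
```

Calling `pow_mod_bits(x, e, 64, n)` plays the role of the single pass over the full-width exponent: it performs the same squarings as the two half-width passes combined, but needs neither the 32 extra squarings for `x_shift` nor the final merge multiplication.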
Did some investigation of point 4; capturing that here.
The reason it's ok to reduce y modulo phi(N) is that x is overwhelmingly likely to be coprime to N (the only prime factors of N are p and q), so we can apply Euler's theorem, x^(phi(N)) ≡ 1 mod N.
phi(N) is on the order of N - 2*sqrt(N), which is the same magnitude as N. (Why? N is the product of two primes each close to sqrt(N), given how we search for our primes, so phi(N) = (p-1)(q-1) = N - (p+q) + 1 ≈ N - 2*sqrt(N).)
If y is large, on the order of N^2 or larger, reducing it mod phi(N) brings the exponent down to the order of N rather than N^2, at the cost of one extra reduction.
Reducing y from the order of N^2 to the order of N roughly halves the exponentiation cost. (Why? The number of squarings and multiplications scales with the bit length of the exponent, and log(N^2) = 2*log(N).)
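The reduction described above can be sketched with toy numbers. This is an illustration only: it uses u64/u128 arithmetic rather than crypto-bigint, the helper names (`mulmod_u64`, `powmod_u128`, `gcd_u64`, `pow_mod_with_totient`) are hypothetical, and returning `None` when x shares a factor with N is a conservative choice made for the sketch, not something taken from the actual code.

```rust
/// Modular multiplication on u64 with a u128 intermediate. Assumes n != 0.
fn mulmod_u64(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

/// Right-to-left square-and-multiply with a wide (u128) exponent.
/// Toy code: variable-time, so unsuitable for secret exponents.
fn powmod_u128(x: u64, mut e: u128, n: u64) -> u64 {
    let mut base = x % n;
    let mut acc = 1 % n;
    while e > 0 {
        if e & 1 == 1 {
            acc = mulmod_u64(acc, base, n);
        }
        base = mulmod_u64(base, base, n);
        e >>= 1;
    }
    acc
}

fn gcd_u64(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// Computes x^y mod N for N = p*q by first reducing y modulo phi(N).
/// Assumes p and q are distinct primes. Returns None if gcd(x, N) != 1,
/// since Euler's theorem is only applied here for x coprime to N.
fn pow_mod_with_totient(x: u64, y: u128, p: u32, q: u32) -> Option<u64> {
    let n = p as u64 * q as u64;
    if gcd_u64(x, n) != 1 {
        return None;
    }
    let phi = (p as u64 - 1) * (q as u64 - 1);
    let y_reduced = y % phi as u128; // the one extra reduction
    Some(powmod_u128(x, y_reduced, n))
}
```

With p = 65537 and q = 65521, N is about 2^32, so a 96-bit y is well above N^2; after reduction the exponent has at most 32 bits, and the result matches `powmod_u128(x, y, n)` computed directly.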
I find it difficult to guesstimate just how much of a speedup 4) would give us, but it seems likely that it ends up faster.
I wrote an artificial benchmark today comparing "vanilla" exponentiation (x^y mod N, with a 2048-bit base and a 4096-bit exponent) with x^(y mod phi(N)) mod N and, perhaps unsurprisingly, the latter is 2x as fast. The question is how much of that speedup seeps through in the end, i.e. how many such exponentiations are we actually doing in the hot path?
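A rough operation count is consistent with the 2x figure. The sketch below counts modular multiplications for plain binary square-and-multiply; that is an assumption made for simplicity (crypto-bigint uses a fixed-window algorithm, whose constants differ), and `count_mults` is a hypothetical helper, not part of any library.

```rust
/// Number of modular multiplications (squarings included) that left-to-right
/// binary square-and-multiply performs for exponent `e`, starting from x
/// at the leading set bit.
fn count_mults(e: u128) -> u32 {
    if e == 0 {
        return 0;
    }
    let bits = 128 - e.leading_zeros();
    // One squaring per bit below the leading one,
    // plus one multiplication per additional set bit.
    (bits - 1) + (e.count_ones() - 1)
}
```

For an all-ones exponent, `count_mults(u128::MAX)` is 254 while `count_mults(u64::MAX as u128)` is 126, so halving the bit length of the exponent roughly halves the work.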