-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
P-521 implementation with Fiat-crypto field arithmetic #376
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in the middle of make_tables.go. Will continue reviewing tomorrow.
|
||
if err := writeP521Table("p521_table.h"); err != nil { | ||
fmt.Fprintf(os.Stderr, "Error writing p521_table.h: %s\n", err) | ||
os.Exit(1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is different indentations in these lines than the ones above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the indentation is fixed in the whole file now.
ret := make([]uint64, words) | ||
mask := big.NewInt((1 << 58) - 1) | ||
tmp := new(big.Int).Set(n) | ||
for i := 0; i < words; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the stopping condition be tmp == 0 as an additional check on the correct calculation of words
? Alternatively, we can check after the loop that tmp =0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no reason for additional checks here, there are no checks in the existing functions and the script is used only for generating the table. Any mistake here and the table would be wrong, which means the unit tests for the curve would fail.
mask := big.NewInt((1 << bits) - 1) | ||
ret[i] = new(big.Int).And(tmp, mask).Uint64() | ||
tmp.Rsh(tmp, bits) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment, not sure if there is a value in checking tmp=0 here.
Also the indentation seems off in the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my comment above. Indentation fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still continuing
@@ -414,9 +514,40 @@ func writeU32Mont(w io.Writer, curve elliptic.Curve, n *big.Int, indent int) err | |||
}) | |||
} | |||
|
|||
type writeBigIntFunc func(w io.Writer, curve elliptic.Curve, n *big.Int, indent int) error | |||
func writeU64(w io.Writer, curve elliptic.Curve, n *big.Int, indent int, bitSizes []uint) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think bitSizes
is not used in this function nor the following one, nor in writeU32. Is there a reason to have it in the arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't get me started on Go's limitations :) Because of the way all these functions are defined and used I couldn't find a more elegant way than adding the parameter to all the functions even if it's not used everywhere
// For each digit |d| in the current group read the corresponding point | ||
// from the table and add it to |res|. If |d| is negative, negate | ||
// the point before adding it to |res|. | ||
int16_t d = rnaf[j]; | ||
// is_neg = (d < 0) ? 1 : 0 | ||
int16_t is_neg = (d >> 7) & 1; | ||
int16_t is_neg = (d >> 15) & 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Thank you. There is another occurrence at line 772.
Just curious, was it causing errors in P-521?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it could cause errors only with window size >7 I think. Fixed 772.
fiat_secp521r1_selectznz(out, !!t, z, nz); | ||
} | ||
|
||
static void p521_from_generic(p521_felem out, const EC_FELEM *in) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep coming back to remember this: It may be worth it to add as a comment here (and in p384.c and p256.c) and before _to_generic
that the input and output are in little-endian representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
crypto/fipsmodule/ec/p521.c
Outdated
out[P521_MUL_NWINDOWS - 1] = window; | ||
} | ||
|
||
// fiat_p521_select_point selects the |idx|-th projective point from the given |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function and the following one kept the fiat_
prefix which was removed in p384 ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! changed
crypto/fipsmodule/ec/p521.c
Outdated
p521_to_generic(&r->X, res[0]); | ||
p521_to_generic(&r->Y, res[1]); | ||
p521_to_generic(&r->Z, res[2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these lines needed? I don't think r is used afterwards until they're repeated again at l. 675.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, leftover from debugging.
crypto/fipsmodule/ec/p521.c
Outdated
p521_felem acc, t128, t16, t2, t256, t32, t4; | ||
p521_felem t512, t516, t518, t519, t64, t8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we can rearrange the variable names in their order of appearance, which I think is their ascending order, unless you'd like to keep it true to where it's copied from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…s#376) * Workaround to mark the DEFAULT SecureRandom algorithm thread-safe This is a workaround to ensure that the `nextBytes` calls to SecureRandom are not synchronized when specifying the alias algorithm `DEFAULT` of ACCP SecureRandom provider. Without this workaround, the `nextBytes` calls are synchronized and the multi-threaded performance of ACCP SecureRandom drops: concretely, with x threads concurrently generating randomness, average time per thread gets multiplied by x. In more details, OpenJDK/Corretto requires thread safe implementations of SecureRandom to specify the attribute ThreadSafe (https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/security/SecureRandom.html). If this attribute is not set to true, the nextBytes calls are synchronized. ACCP sets this attribute when registering SecureRandom. It also registers an alias `DEFAULT` to the algorithm `LibCryptoRng` that it registers. Attributes of services are converted to attributes at the provider level, using the naming convention: `Service.Algorithm Attribute`. Thus, setting the `ThreadSafe` attribute leads to setting `SecureRandom.LibCryptoRng ThreadSafe=true`, but not `SecureRandom.DEFAULT ThreadSafe=true`. Unfortunately, the JDK does not follow aliases and thus the DEFAULT algorithm is not considered thread safe. See, e.g., https://github.com/corretto/corretto-17/blob/e0d038c50d5c61683a9270218b32c5aed2b0b466/src/java.base/share/classes/java/security/SecureRandom.java#L229-L236 FAQ: > Who is affected by the performance drop if this workaround is not applied? Anyone instantiating explicitly the `DEFAULT` algorithm of ACCP SecureRandom (i.e., `SecureRandom.getInstance("DEFAULT", "AmazonCorrettoCryptoProvider")`) and using it concurrently in multiple threads. This includes the benchmarks in the subfolder `benchmarks/` of this repo. > I am using `new SecureRandom`, am I affected? No. `new SecureRandom()` uses the algorithm `LibCryptoRng` and not the alias `DEFAULT` (assuming ACCP is set as first cryptographic provider). > I did not register ACCP SecureRandom, am I affected? No. > Can this workaround impact other crypto providers like BouncyCastle (which also provides a `DEFAULT` algorithm for SecureRandom)? No, attributes are stored at the provider level. This change will not affect other providers. * Add unit tests and checks when setting DEFAULT to be ThreadSafe This commit improves the way SecureRandom algorithms (LibCryptoRng and its alias DEFAULT) are registered and adds unit tests. 1. Check that DEFAULT points to LibCryptoRng before setting it to be ThreadSafe. This is to mitigate risks that a future code change uses a different non-ThreadSafe algorithm for the DEFAULT alias. 2. Add unit tests checking that both LibCryptoRng and DEFAULT are ThreadSafe. --------- Co-authored-by: Fabrice Benhamouda <[email protected]>
Following reversal of PR aws#376 in PR aws#388, using the ACCP SecureRandom algorithm `DEFAULT` (which is an alias of `LibCryptoRng`) yields lower performance in multi-threaded settings. See aws#376 for details. This PR is meant to use the algorithm `LibCryptoRng` in the benchmark, instead of its alias `DEFAULT`. This solves the performance drop in multi-threaded settings. This also corresponds to the most common use of ACCP SecureRandom. Indeed, if ACCP is the first security provider and if ACCP SecureRandom is registered, then when instantiating SecureRandom as `new SecureRandom()`, the algorithm `LibCryptoRng` will be selected. Co-authored-by: Fabrice Benhamouda <[email protected]>
Issues:
CryptoAlg-923
Description of changes:
This PR adds the implementation of P-521 curve that uses Fiat-crypto field arithmetic. The implementation is done in the same way as for P-384. The main changes are:
third_party
directory (filesp521_32.h
andp521_64.h
)p521.c
. The file contains implementation of:make_tables.go
script to support generating the pre-computed table for P-521 (the generated file isp521_tables.h
).Finally, the performance improvements of this change are:
The raw numbers measure by
./tool/bssl speed -filter "P-521"
(x86) EC2 instance with Intel(R) Xeon(R) Platinum 8124M CPU:
Before:
Did 451 ECDH P-521 operations in 1099027us (410.4 ops/sec)
Did 792 ECDSA P-521 signing operations in 1083113us (731.2 ops/sec)
Did 781 ECDSA P-521 verify operations in 1077091us (725.1 ops/sec)
After:
Did 1716 ECDH P-521 operations in 1065345us (1610.7 ops/sec)
Did 4400 ECDSA P-521 signing operations in 1023071us (4300.8 ops/sec)
Did 1903 ECDSA P-521 verify operations in 1076533us (1767.7 ops/sec)
(Arm) EC2 instance with Graviton 2:
Before:
Did 154 ECDH P-521 operations in 1071138us (143.8 ops/sec)
Did 275 ECDSA P-521 signing operations in 1078510us (255.0 ops/sec)
Did 264 ECDSA P-521 verify operations in 1073877us (245.8 ops/sec)
After:
Did 572 ECDH P-521 operations in 1075348us (531.9 ops/sec)
Did 1540 ECDSA P-521 signing operations in 1054963us (1459.8 ops/sec)
Did 627 ECDSA P-521 verify operations in 1090150us (575.2 ops/sec)
Call-outs:
Point out areas that need special attention or support during the review process. Discuss architecture or design changes.
Testing:
How is this change tested (unit tests, fuzz tests, etc.)? Are there any testing steps to be verified by the reviewer?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.