Conversation
src/Nethermind/Nethermind.Evm.Precompiles/BN254PairingPrecompile.cs
Outdated
Show resolved
Hide resolved
Fixed |
d0792f5 to
9966518
Compare
Co-authored-by: Lukasz Rozmej <lukasz.rozmej@gmail.com>
rubo
left a comment
There was a problem hiding this comment.
Apart from using Miller loop/final exp combo, other changes don't bring any performance benefits according to my benchmarks, just made the code less readable:
add
| Method | Mean | Error | StdDev |
|------- |---------:|---------:|---------:|
| RunOld | 848.0 ns | 3.18 ns | 2.82 ns |
| RunNew | 855.4 ns | 17.02 ns | 18.92 ns |
mul
| Method | Mean | Error | StdDev |
|------- |---------:|---------:|---------:|
| RunOld | 21.24 us | 0.238 us | 0.211 us |
| RunNew | 21.93 us | 0.437 us | 0.448 us |
Better? Though it was more about BN254Pairing than Add or Mul |
There was a problem hiding this comment.
Pull Request Overview
This PR optimizes the BN254 elliptic curve precompiles by improving memory management, implementing SIMD-accelerated byte reversal operations, and enhancing the pairing check algorithm. The changes focus on performance improvements while maintaining correctness.
Key changes:
- Replaced
ArrayPoolandstackalloc-based memory management with direct byte arrays to reduce allocations and copies - Implemented hardware-accelerated byte reversal using AVX2, SSE, and Vector128 intrinsics with scalar fallback
- Optimized pairing check to use Miller loop followed by single final exponentiation instead of full pairing per pair
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| BN254PairingPrecompile.cs | Simplified memory management by removing ArrayPool usage and passing inputData.Span directly to BN254.CheckPairing |
| BN254MulPrecompile.cs | Added named constants for input/output lengths and simplified output allocation |
| BN254AddPrecompile.cs | Added named constants, implemented input padding logic to avoid unnecessary copies when input matches expected length |
| BN254.cs | Major refactoring: changed signatures to use ReadOnlySpan, added SIMD-optimized byte reversal methods, optimized pairing check algorithm, and improved deserialization methods to work with pointers |
2a683fb to
7aec521
Compare
Doesn't look like they made any difference
Well, there are lots of changes bringing no performance gain but a negligible loss (at least on my machine)? |
7aec521 to
9d894e4
Compare
|
Added change to annotate some lightweight methods with |
d117412 to
7584279
Compare
|
Though is more about BN254Pairing than Mul and Add |
|
Overall, the situation is the following: Pairing check: where almost all the improvement comes from the Miller loop/final exp combo: Thus, all other bells and whistles add just ~5% improvement. For add: On each run, the changes are slower, although slightly. The same for the mul: |
Then maybe better to leave them intact? They're clearly worse now.
That's interesting, but that doesn't seem to make any difference in this particular case. |
Are you using 96 byte inputs for Mul and 128 byte inputs for Add in the benchmarks? |
The difference for both is less than the error; the difference on Add is ~16 cpu cycles? (4.3 ns) |
Marchhill
left a comment
There was a problem hiding this comment.
please verify no hive test regressions
| return true; | ||
| } | ||
|
|
||
| private static unsafe bool IsZero64(byte* ptr) |
There was a problem hiding this comment.
Could these methods be moved to somewhere they can be reused, like byte array extensions or something
There was a problem hiding this comment.
Not sure want to encourage pointers 😅
Will move if crops up again
Co-authored-by: Marc <Marchhill@users.noreply.github.com>
Looks ok |
* Add more logging in MultiSyncModeSelector (#9616) * Add more logging * fix for seq * feat: Add configurable EIP-2935 ring buffer size (#9611) * Blockchain Engine Tests support (#9394) * initial commit * fix normal blockchain tests * tidy * restore disposes * comment out BALs * fix var declaration * don't set basefeepergas if null * use network from genesis in blockchain test * update blockchain test base * add tracer to blockchain tests runner * tidy * tidy * add genesis processing timeout * check for null head block * try undo some changes * detect failure to process genesis * check removal is error * add back checks for genesis spec * only add noenginerequeststracker in tests * comment sealed block check * try remove timeout * only configure merge for engine tests * fix merge module init * add back timeout and remove sealer * await new payloads * use reflection for engine rpc method calling --------- Co-authored-by: Marc Harvey-Hill <10379486+Marchhill@users.noreply.github.com> * use zero address when from address not specified in rpc calls (#9578) * use zero address for null values * small test * fix proof rpc * fix test and add more changes * Allow serving snap requests for more than 128 blocks (#9602) * Initial plan * Add SnapServingMaxDepth configuration and update LastNStateRootTracker Co-authored-by: tanishqjasoria <11698398+tanishqjasoria@users.noreply.github.com> * Add clarifying comments for configuration changes Co-authored-by: tanishqjasoria <11698398+tanishqjasoria@users.noreply.github.com> * Get reorgDepth from config instead of hardcoding in test Co-authored-by: LukaszRozmej <12445221+LukaszRozmej@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: tanishqjasoria <11698398+tanishqjasoria@users.noreply.github.com> Co-authored-by: Tanishq Jasoria <jasoriatanishq@gmail.com> Co-authored-by: LukaszRozmej <12445221+LukaszRozmej@users.noreply.github.com> * Remove console log from FileTestsSource (#9622) Removed console log for loading test file. * Correct docs value for Blocks.BlockProductionMaxTxKilobytes (#9620) * Update OP Superchain chains (#9629) Co-authored-by: emlautarom1 <emlautarom1@users.noreply.github.com> * Auto-update fast sync settings (#9628) Co-authored-by: rubo <rubo@users.noreply.github.com> * feat: write AckMessage directly to IByteBuffer without temp array (#9623) * Optimize Ripemd (#9627) * Allow precompile cache to be switched off by config (#9633) * Mainnet Osaka, BPO1, BPO2 forks (#9615) * Change rlp limits and add logs (#9631) * log rlp guard messages * Adding stack trace * Increase receipts limit to 1024 * fix for stack trace * try fix multiline * fix * whitespace * try fix * More logs and potential fixes * log fix * Revert "More logs and potential fixes"This reverts commit ec71c87.# Conflicts:# src/Nethermind/Nethermind.Network/MessageSerializationService.csRevert "log fix"This reverts commit b3b1f5f. Revert "More logs and potential fixes" This reverts commit ec71c87. * try weird fix * Revert "try weird fix" This reverts commit 0bdfb3d. * revert packages.json * simplify log * Don't abbreviate ForkchoiceStateV1 hashes * revert spammy ForkchoiceStateV1 * Fix missed dispose on StorageRange in ProgressTracker * fix test * Optimise CALL by throwing stack underflow earlier (#9581) * fail fast * add unit tests --------- Co-authored-by: Marc Harvey-Hill <10379486+Marchhill@users.noreply.github.com> * Optimize BN254Pairing call (#9621) * Optimize BN254Pairing call * Don't modify input * Skip locals * Tweak inlining * Update src/Nethermind/Nethermind.Evm.Precompiles/BN254.cs Co-authored-by: Lukasz Rozmej <lukasz.rozmej@gmail.com> * Less fixed * Optimize * Optimize * formatting * formatting * Tidy * Use constants * Feedback * Skip init * Faster * More skip init * Make mul ReadOnlySpan, add comments * Simplify * Skip locals * Ruben making me work for it * More working for it * Still working for it * Update benchmarks * Apply suggestions from code review Co-authored-by: Marc <Marchhill@users.noreply.github.com> * Feedback --------- Co-authored-by: Lukasz Rozmej <lukasz.rozmej@gmail.com> Co-authored-by: Marc <Marchhill@users.noreply.github.com> * Update OP Superchain chains (#9643) Co-authored-by: LukaszRozmej <LukaszRozmej@users.noreply.github.com> * Log/decrease noise (#9642) * decrease Ethash cache miss log * Decrease block downloader invalid bloc log * Update X handle (#9634) * Better logs on invalid orphan (#9641) * Try fixing world wrong block * not this, so keep validating Withdrawals * Update all op chain configs (#9645) * update configs * update all configs * Add `NetworkId` flag to `InitConfig` (#9476) * add flag with networkId * cosmetic * Update src/Nethermind/Nethermind.Api/IInitConfig.cs Co-authored-by: Ruben Buniatyan <rubo@users.noreply.github.com> * fix * # Conflicts: # src/Nethermind/Nethermind.Api/IInitConfig.cs # src/Nethermind/Nethermind.Api/InitConfig.cs * Revert "# Conflicts:" This reverts commit 7f25638. * postmerge fix * Apply suggestion from @benaadams --------- Co-authored-by: Ruben Buniatyan <rubo@users.noreply.github.com> Co-authored-by: Ben {chmark} Adams <thundercat@illyriad.co.uk> * Revert "HACK: enable in fusaka" This reverts commit 7cfad14. * HACK: enable in pectra --------- Co-authored-by: Lukasz Rozmej <lukasz.rozmej@gmail.com> Co-authored-by: Daniil Ankushin <ankushin.daniil42@gmail.com> Co-authored-by: Marc <Marchhill@users.noreply.github.com> Co-authored-by: Marc Harvey-Hill <10379486+Marchhill@users.noreply.github.com> Co-authored-by: Tanishq Jasoria <jasoriatanishq@gmail.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: tanishqjasoria <11698398+tanishqjasoria@users.noreply.github.com> Co-authored-by: LukaszRozmej <12445221+LukaszRozmej@users.noreply.github.com> Co-authored-by: Ben {chmark} Adams <thundercat@illyriad.co.uk> Co-authored-by: core-repository-dispatch-app[bot] <173070810+core-repository-dispatch-app[bot]@users.noreply.github.com> Co-authored-by: emlautarom1 <emlautarom1@users.noreply.github.com> Co-authored-by: rubo <rubo@users.noreply.github.com> Co-authored-by: VolodymyrBg <aqdrgg19@gmail.com> Co-authored-by: LukaszRozmej <LukaszRozmej@users.noreply.github.com> Co-authored-by: Marcin Sobczak <77129288+marcindsobczak@users.noreply.github.com> Co-authored-by: Marcin Sobczak <marcindsobczak@gmail.com>

Changes
mclBn_millerLoopinside the loop and onemclBn_finalExpat the end removes N‑1 final exponentiations, which are the expensive part. On BN254 that is a meaningful win at even modest N.Types of changes
What types of changes does your code introduce?
Testing
Requires testing