Benchmark against other EVMs #7

Closed
lightclient opened this issue Nov 9, 2021 · 13 comments

@lightclient

It would be cool to benchmark against other EVM implementations, especially evmone which AFAIK is currently the fastest EVM interpreter.

This would probably be a good benchmark for arithmetic: ethereum/evmone#320

@rakita
Member

rakita commented Nov 10, 2021

This will be very useful, thank you lightclient!

On my laptop I am getting around 210-220ms; I didn't expect it to be that big. I will need to spin up perf to see if anything stands out.

@rakita
Member

rakita commented Nov 14, 2021

It is a little bit faster now, I am getting around 110-120ms. The memory and stack that I took from sputnik were not optimized, and signed operations could probably be done better.

@rakita
Member

rakita commented Nov 15, 2021

And now it is more in the range of ~85-95ms.

@rakita
Member

rakita commented Nov 17, 2021

~75-80ms now, on my laptop.

@rakita
Member

rakita commented Nov 23, 2021

sdiv looks like the next in line for optimization.
[image: Screenshot from 2021-11-23 02-11-49]

@rakita
Member

rakita commented Dec 12, 2021

This is where the story becomes interesting, and evmone is really great. I added a few more optimizations: static gas is precalculated per gas_block and applied when needed, plus some other small tweaks, but div is still a big performance hit.
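To illustrate the idea of precalculating static gas per block, here is a minimal sketch. It is not revm's actual gas_block code, which is not shown in this thread. The function names, the reduced cost table, and the block-splitting rule (a block starts at JUMPDEST and ends after STOP, JUMP, or JUMPI) are assumptions made for illustration; the opcode values and the listed costs are the standard EVM ones.

```rust
// Hypothetical sketch: sum the static gas of each basic block ahead of time,
// so the interpreter can charge it once per block instead of once per opcode.

const STOP: u8 = 0x00;
const ADD: u8 = 0x01;
const DIV: u8 = 0x04;
const JUMP: u8 = 0x56;
const JUMPI: u8 = 0x57;
const JUMPDEST: u8 = 0x5b;
const PUSH1: u8 = 0x60;
const PUSH32: u8 = 0x7f;

/// Static gas cost for a small subset of opcodes.
fn static_gas(op: u8) -> u64 {
    match op {
        STOP => 0,
        ADD => 3,
        DIV => 5,
        JUMP => 8,
        JUMPI => 10,
        JUMPDEST => 1,
        PUSH1..=PUSH32 => 3,
        _ => 0, // opcodes not modelled in this sketch
    }
}

/// Returns `(block start pc, total static gas)` for every basic block.
fn gas_blocks(code: &[u8]) -> Vec<(usize, u64)> {
    let mut blocks = Vec::new();
    let (mut start, mut gas, mut open) = (0usize, 0u64, false);
    let mut pc = 0usize;
    while pc < code.len() {
        let op = code[pc];
        // A JUMPDEST is a jump target, so it closes any block still open.
        if op == JUMPDEST && open {
            blocks.push((start, gas));
            gas = 0;
            open = false;
        }
        if !open {
            start = pc;
            open = true;
        }
        gas += static_gas(op);
        pc += 1;
        // Skip the immediate bytes that follow PUSH1..PUSH32.
        if (PUSH1..=PUSH32).contains(&op) {
            pc += (op - PUSH1 + 1) as usize;
        }
        // Control flow leaves the block after these opcodes.
        if matches!(op, STOP | JUMP | JUMPI) {
            blocks.push((start, gas));
            gas = 0;
            open = false;
        }
    }
    if open {
        blocks.push((start, gas));
    }
    blocks
}
```

For the bytecode PUSH1 2, PUSH1 3, ADD, PUSH1 8, JUMP, JUMPDEST, PUSH1 1, STOP this yields two blocks: one starting at pc 0 with 20 static gas and one starting at pc 8 with 4. Dynamic costs (memory expansion, storage access) still have to be charged per opcode.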

It seems there is a big difference depending on whether I am running Windows or Linux. Windows is usually faster by ~8-10ms, and I am still unsure which part of the code is responsible for that. All measurements above were done on Windows.

The measurements below differ only in the implementation behind the div and sdiv opcodes: here

With Parity u256 div I am getting ~68-72ms on Windows and ~77-80ms on Linux, and the flamegraph looks like this:
[image: parityu256]

With zkp_u256 I got a boost: ~58-61ms on Windows and ~67-68ms on Linux.
[image: zkpu256]

zkp u256 uses __udivti3 to divide 2-by-1 words: here. It is a lot faster even with an unneeded Option unwrap; I will remove that and measure again a bit later.
Parity u256 uses its own custom 2-by-1 div, which is even slower. From the flamegraph it seems all the time is spent in this function: here

Parity_u256 should probably just switch to u128, which would likely give better performance.
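As a rough sketch of what "just switch to u128" could look like for the 2-by-1 word division, the helper below divides a 128-bit numerator by a 64-bit divisor using native u128 arithmetic. The name div_mod_word_u128 and its signature are hypothetical, not the actual parity or zkp code. On common 64-bit targets the compiler typically lowers a u128 division with a non-constant divisor to a call to __udivti3, but that is a codegen detail worth checking on your own toolchain.

```rust
/// Hypothetical helper: divides the 128-bit value `(hi << 64) | lo` by `d`.
/// Returns `(quotient, remainder)`.
/// Precondition: `hi < d`, which guarantees the quotient fits in 64 bits
/// (and also implies `d != 0`).
fn div_mod_word_u128(hi: u64, lo: u64, d: u64) -> (u64, u64) {
    debug_assert!(hi < d, "quotient would not fit in one word");
    let n = ((hi as u128) << 64) | (lo as u128);
    let d = d as u128;
    ((n / d) as u64, (n % d) as u64)
}
```

The precondition is the usual one for the inner step of schoolbook long division, where the high word is the remainder of the previous step and is therefore already smaller than the divisor.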

evmone uses an optimized version that seems even faster than the built-in __udivti3, so there are even more improvements that can be made. The amazing Pawel gave us info on its speed: https://groups.google.com/g/llvm-dev/c/5PqUC4nB_DQ/m/DaCBItw4AAAJ

running on: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz

flamegraphs as svg if somebody wants to look in detail:
flamegraphs.zip

I feel like there are a lot of small improvements that can be made, but we will see how big an impact they will have.

@rakita
Member

rakita commented Dec 12, 2021

Swapping parity u256's div_mod_word for the zkp_u256 one gives me a good boost: ~64-66ms on Linux, which is even better than zkp_u256 itself.
link

The same result was obtained by just uncommenting the existing code in parity u256's div_mod_word.

@rakita
Member

rakita commented Dec 13, 2021

~56-58ms on Windows with the improved parity u256.

The test is in bin/revm-test/ and is run with cargo run --release

@rakita
Member

rakita commented Dec 14, 2021

My test was called only once per execution, and I would run it multiple times to get a range of timings. I changed that and introduced a loop, so the test is now executed 50 times. After a few iterations I am getting a better time than on Windows:

elapsed: 53.666179ms
0: 65.588152ms
1: 63.255175ms
2: 57.723127ms
3: 56.212264ms
4: 53.734064ms
5: 53.121586ms
6: 53.089055ms
7: 53.133512ms
8: 53.082209ms
9: 53.090587ms
10: 53.045255ms
11: 53.880638ms
12: 53.16134ms
13: 52.969316ms
14: 53.033339ms
15: 53.167286ms
16: 53.091371ms
17: 53.054458ms
18: 53.067683ms
19: 53.243839ms
20: 53.085979ms
21: 53.122794ms
22: 53.06014ms
23: 53.123104ms
24: 53.072308ms
25: 53.119213ms
26: 53.072579ms
27: 53.094516ms
28: 53.139832ms
29: 53.038691ms
30: 53.094649ms
31: 53.293706ms
32: 52.844196ms
33: 51.876471ms
34: 52.991977ms
35: 53.015948ms
36: 53.241124ms
37: 52.784502ms
38: 52.94318ms
39: 52.920714ms
40: 52.792951ms
41: 53.023354ms
42: 53.096627ms
43: 53.086917ms
44: 52.479412ms
45: 52.817731ms
46: 53.05368ms
47: 52.982625ms
48: 53.16019ms
49: 53.135602ms
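For anyone who wants to reproduce this kind of measurement, a loop like the one described can be sketched as follows. This is not the actual code in bin/revm-test/; time_runs and report are hypothetical names, and the workload closure stands in for one full EVM execution.

```rust
use std::time::{Duration, Instant};

/// Runs `workload` `iterations` times and records the wall-clock time of each run.
fn time_runs<F: FnMut()>(iterations: usize, mut workload: F) -> Vec<Duration> {
    let mut timings = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        let start = Instant::now();
        workload();
        timings.push(start.elapsed());
    }
    timings
}

/// Prints one `index: duration` line per run, using `Duration`'s Debug format.
fn report(timings: &[Duration]) {
    for (i, t) in timings.iter().enumerate() {
        println!("{}: {:?}", i, t);
    }
}
```

Repeating the run inside one process lets caches and branch predictors warm up, which is presumably why the first few iterations above are slower than the rest.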

And I am getting close to evmone:

advanced/total/snailtracer/benchmark        51468 us        51466 us           13 gas_rate=4.47271G/s gas_used=230.193M
baseline/total/snailtracer/benchmark        46800 us        46762 us           15 gas_rate=4.92267G/s gas_used=230.193M
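The gas_rate figures in this output are consistent with gas_used divided by the second time column. A small helper makes the arithmetic explicit; gas_rate here is a hypothetical function written for this comment, not part of evmone or revm.

```rust
/// Gas throughput in gas per second for a run that consumed `gas_used` gas
/// in `micros` microseconds.
fn gas_rate(gas_used: f64, micros: f64) -> f64 {
    gas_used / (micros * 1e-6)
}
```

With gas_used = 230.193M, 51466 us gives about 4.47 G/s and 46762 us gives about 4.92 G/s, matching the two lines above. Assuming revm consumes the same amount of gas on this workload, a 53ms run would correspond to roughly 4.3 G/s.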

@rakita
Member

rakita commented Dec 23, 2021

After binding intx directly I am getting even better results, comparable with evmone (the changes are on the intx branch):

mean: 48.905952ms
median: 48.82769ms
0: 49.88344ms
1: 50.16717ms
2: 47.413608ms
3: 48.678762ms
4: 48.776993ms
5: 48.747ms
6: 48.434196ms
7: 48.795624ms
8: 49.002815ms
9: 48.859757ms
10: 48.972574ms
11: 48.752764ms
12: 48.724995ms
13: 48.790919ms
14: 48.897968ms
15: 48.52337ms
16: 49.149537ms
17: 49.326058ms
18: 48.927653ms
19: 49.293851ms
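The mean and median lines can be computed from the per-run timings along these lines. This is a sketch with hypothetical function names, not the code on the intx branch. For an even number of runs it averages the two middle values, which is consistent with the median reported above for these 20 runs.

```rust
use std::time::Duration;

/// Arithmetic mean of the timings, or `None` for an empty slice.
fn mean(timings: &[Duration]) -> Option<Duration> {
    if timings.is_empty() {
        return None;
    }
    let total: Duration = timings.iter().sum();
    Some(total / timings.len() as u32)
}

/// Median of the timings, or `None` for an empty slice.
/// With an even count, the two middle values are averaged.
fn median(timings: &[Duration]) -> Option<Duration> {
    if timings.is_empty() {
        return None;
    }
    let mut sorted = timings.to_vec();
    sorted.sort();
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2
    } else {
        sorted[mid]
    })
}
```

The median is less sensitive than the mean to the slower warm-up iterations at the start of a run.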

And the flamegraph with that change looks like this (zipped svg file: flamegraph.zip):

[image: flamegraph_final]

I will not merge intx into the main branch; the proper way would be to reimplement it in Rust. There are two issues tracking that for future improvements: #22 and #23

I feel like this is okay to close: revm got very close to evmone and the timings look good. There are probably some optimizations that can be done on the Host part. evmone uses MockedHost for testing, while revm has only the standard host impl and a mock Database (you can see from the flamegraph that sload takes a lot of time), but I will leave this for later. It was a fun ride.

@rakita rakita closed this as completed Dec 23, 2021
@aewc

aewc commented Sep 15, 2022

Is there any clear documentation comparing the performance of REVM with other EVMs, especially parity EVM which is also based on Rust?

@rakita
Member

rakita commented Sep 18, 2022

> Is there any clear documentation comparing the performance of REVM with other EVMs, especially parity EVM which is also based on Rust?

If you find one, please forward it to me.
In general, this issue compares revm with evmone, and there is a comparison with sputnikvm here: cassc/rust-evm-bench#2

@flyq
Contributor

flyq commented Jan 29, 2023
