Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(interpreter): Unify instruction fn signature #283

Merged
merged 6 commits into from
Dec 9, 2022

Conversation

rakita
Copy link
Member

@rakita rakita commented Dec 4, 2022

All opcode instructions are now functions with same signature:
pub fn jumpi(interpreter: &mut Interpreter, _host: &mut dyn Host)

This is revm internal change and it does not affect external API.

This allows the compiler to optimize calls in the interpreter hot loop and it gives around ~10% faster snailtracer execution on i7. I am interested to see if the same boost can be found on macs with M processors.

This is one of steps needed to define multiple dispatch strategies related to interpreter loop. As instructions now have same signature they can be put in an array, this can have potentially new ways to use revm as it is easier to just switch function pointer to something else (custom function to sload/sstore?)

Spec was simplified and is_static is moved as variable to interpreter (Previously it was inside Spec)

There was one regression with ruint in casting U256 to saturated usize here. It was faster to just check limbs.

microbench added to revm-tests

@rakita rakita force-pushed the rakita/unify_instr_fn branch from 958bddf to fb8073d Compare December 4, 2022 13:40
@rakita
Copy link
Member Author

rakita commented Dec 4, 2022

new run (with microbench)

Snailtracer benchmark (3.4s) ...  41_297_247.647 ns/iter (0.999 R²)

old run:

elapsed: 47.478018ms
0: 47.639335ms
1: 47.21674ms
2: 47.112436ms
3: 46.998897ms
4: 46.768525ms
5: 46.722415ms
6: 46.711912ms
7: 46.705151ms
8: 46.658515ms
9: 46.649658ms
10: 46.642895ms
11: 46.631915ms
12: 46.622408ms
13: 46.618776ms
14: 46.617359ms
15: 46.612672ms
16: 46.605036ms
17: 46.58608ms
18: 46.543365ms
19: 46.531848ms
20: 46.524966ms
21: 46.523186ms
22: 46.521023ms
23: 46.514446ms
24: 46.496036ms
end!

There is not a lot that can be seen from flamegraph, but there they are:

flamegraph new:
revm_unification

flamegraph old:
revm_main_before_unitication

@0xDmtri
Copy link
Contributor

0xDmtri commented Dec 4, 2022

OS: Mac OS
Chip: M1 Pro

Test info for rakita/unify_instr_fn:

Hello, world!
Snailtracer benchmark (3.1s) ...  33_138_990.401 ns/iter (0.999 R²)
end!

Test info for main:

Hello, world!
elapsed: 35.072378ms
0: 35.08125ms
1: 35.058541ms
2: 34.864458ms
3: 34.665416ms
4: 34.451916ms
5: 34.328375ms
6: 34.279208ms
7: 34.239041ms
8: 34.213ms
9: 34.201791ms
10: 34.199416ms
11: 34.196083ms
12: 34.192875ms
13: 34.191083ms
14: 34.188583ms
15: 34.1665ms
16: 34.153166ms
17: 34.152708ms
18: 34.127666ms
19: 34.097875ms
20: 34.091416ms
21: 34.091083ms
22: 34.085625ms
23: 34.074ms
24: 34.054541ms
end!

Copy link
Collaborator

@gakonst gakonst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this PR as it has a ton of useful changes. Could we please isolate them in separate PRs, starting with the bench change, followed by the individual opcode changes, followed by the function signature unification? That way we'll know for sure what change caused what performance gain.

Cargo.toml Show resolved Hide resolved
bins/revm-test/src/bin/snailtracer.rs Show resolved Hide resolved
bins/revm-test/src/bin/snailtracer.rs Show resolved Hide resolved
crates/revm/src/evm_impl.rs Show resolved Hide resolved
crates/revm/src/instructions.rs Show resolved Hide resolved
crates/revm/src/instructions.rs Show resolved Hide resolved
crates/revm/src/instructions.rs Show resolved Hide resolved
crates/revm/src/instructions/arithmetic.rs Show resolved Hide resolved
crates/revm/src/instructions/arithmetic.rs Show resolved Hide resolved
crates/revm/src/instructions/bitwise.rs Show resolved Hide resolved
@rakita
Copy link
Member Author

rakita commented Dec 4, 2022

thanks for the review @gakonst!

I experimented with this idea (and a few others) 3 months ago and got pretty decent results, this was one of them. Will try to play with additional dispatch strategies but I am assuming that benefit will not be significant but would potentially allow more creative ways to use revm (as in injecting the instruction fn, I will need to see how it goes).

So this pull request is what I expected to happen it was not surprising.

@gakonst
Copy link
Collaborator

gakonst commented Dec 4, 2022

That makes sense - your PR of course, no strong opinions just wanted to take a close look. Love the outcome. Can we add the new benchmarks to the evm-bench repo?

@rakita rakita merged commit 90fe01e into main Dec 9, 2022
@gakonst
Copy link
Collaborator

gakonst commented Dec 9, 2022

@ziyadedher can you rerun https://github.com/ziyadedher/evm-bench with latest main? ^^

@ziyadedher
Copy link

@gakonst ooh cool! Yeah, will do tonight.

@rakita
Copy link
Member Author

rakita commented Dec 9, 2022

I have one more round of small optimizations, but this should give a good boost in performance.

@ziyadedher
Copy link

I also see ~10% bumps on snailtracer! But interestingly, I don't see any noticeable differences on other benches, see ziyadedher/evm-bench@aad8de9

This might make sense since the optimization is primarily in dispatches and whatnot it seems. Just wanted to float the results!

@rakita rakita deleted the rakita/unify_instr_fn branch December 24, 2022 18:58
@rakita
Copy link
Member Author

rakita commented Dec 24, 2022

I assumed it would be more, around 15%. Yeah, it is mostly dispatch optimization and interpreter loop, I will play with it a little bit more in future but I don't expect a lot.

revm is a little bit different from other evm's as it has both the Host and Interpreter together. This means that for test execution, SSLOAD and SSTORE access two HashMaps to find value, one for substate and one for cached value, If I remove this (planned here: #307), test for revm will be more accurate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants