stylus_benchmark #2827
base: master
Conversation
This reverts commit 4fd9d10.
general approach seems great.
…R_PATH> --scenario <SCENARIO>
use std::path::PathBuf;

fn generate_add_i32_wat(mut out_path: PathBuf) -> eyre::Result<()> {
    let number_of_ops = 20_000_000;
I'm skeptical of the faithfulness of this benchmark when compared to actual execution. Stylus programs are usually tiny, given the contract size limitation (24kb), so real programs will live in the CPU cache. Generating a big program that doesn't fit the CPU cache might shadow the processing time with the memory transfer overhead between RAM and the cache. My theory is that the memory transfer will become the bottleneck, and the measurement of instructions won't be precise. For instance, the addition instruction might appear much more expensive than in real usage because of the CPU caching problem.
Instead of creating a program with millions of instructions, I suggest you create a program with a few thousand instructions inside a loop. Since you have a few thousand instructions, the overhead of the loop increment and branch instructions should be minimal.
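The suggested structure can be sketched as a small WAT generator. This is a hypothetical illustration, not the PR's actual API: the function name, parameters, and emitted locals are all invented here. The point is that only a few thousand add instructions are emitted once, inside a loop, so the generated code stays cache-resident while the loop bookkeeping cost is amortized.

```rust
/// Hypothetical sketch: emit a .wat module that repeats a small block of
/// i32.add instructions inside a loop, instead of millions of straight-line
/// adds. Names and parameters are illustrative only.
fn generate_add_i32_loop_wat(ops_per_iteration: usize, iterations: u32) -> String {
    let mut wat = String::new();
    wat.push_str("(module\n  (func (export \"run\") (local $i i32) (local $acc i32)\n");
    wat.push_str("    (loop $l\n");
    // Hot block: small enough to fit in the CPU instruction cache.
    for _ in 0..ops_per_iteration {
        wat.push_str("      (local.set $acc (i32.add (local.get $acc) (i32.const 1)))\n");
    }
    // Loop bookkeeping: one increment and one branch per pass, amortized
    // over ops_per_iteration adds.
    wat.push_str("      (local.set $i (i32.add (local.get $i) (i32.const 1)))\n");
    wat.push_str(&format!(
        "      (br_if $l (i32.lt_u (local.get $i) (i32.const {iterations})))\n"
    ));
    wat.push_str("    )))\n");
    wat
}
```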
You are right :), the size of the program was significantly impacting benchmark performance.
The strategy you mentioned is around 5 times faster than not using a loop at all.
Instead of using toggle_benchmark to mark a single code block to be benchmarked, I added start_benchmark and end_benchmark instructions, which make it possible to benchmark multiple execution blocks.
I tried to benchmark only the execution block inside the loop, but calling start_benchmark/end_benchmark multiple times introduces a performance overhead that is not worthwhile compared to having a single start_benchmark/end_benchmark block that wraps the loop.
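The marker placement described above can be sketched as follows. This is a hypothetical helper, not code from the PR: start_benchmark and end_benchmark are the PR's custom jit instructions, emitted here as plain text around a caller-supplied loop body to show that one pair wraps the whole loop rather than each iteration.

```rust
/// Hypothetical sketch: wrap an entire loop in a single
/// start_benchmark/end_benchmark pair, so the marker overhead is paid
/// once rather than once per iteration.
fn wrap_loop_in_benchmark(loop_wat: &str) -> String {
    format!(
        "(func (export \"run\")\n    start_benchmark\n{loop_wat}\n    end_benchmark)\n"
    )
}
```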
Thanks!!!
let exec = &mut WasmEnv::default();

let module = exec_program(
Is this code benchmarking only the JIT execution? I'm worried about the differences between the JIT and the interpreter, so it would be nice to benchmark non-JIT execution as well.
That is a good point.
By interpreter, do you mean execution through the prover's binary?
@tsahee, any takes on that?
I intend to add more scenarios to be benchmarked in a future PR, together with a more detailed look at the results. But here is the output of the benchmarks implemented in this PR, which was run on an Apple M3 Pro:
Resolves NIT-2757
This PR adds the stylus_benchmark binary.
It will mainly be used to fine-tune the ink prices that are charged today.
It deterministically creates .wat programs with lots of instructions to be benchmarked.
If this PR is approved, more .wat program generators, benchmarking other instructions, will be added in a future PR.
This PR introduces two new WASM instructions to jit, start_benchmark and end_benchmark.
Code blocks between start_benchmark and end_benchmark instructions will be benchmarked.
stylus_benchmark uses jit as a library.