`execute` Benchmark Mode

After the `execute` live session, we concluded that `execute` needs separate modes depending on whether it is being run to test consensus or to benchmark.

Current issues:
- `execute` sends every transaction to the chain regardless of whether it belongs to the benchmark test's setup or its workload, so it is very difficult, if not impossible, to cleanly separate the blocks that contain setup from the blocks that contain workload.
- There is no reporting of when (in which block) these transactions landed on the chain, so it is difficult to build a dashboard that highlights the execution of a single block when we don't know what that block contains test-wise.
- We currently use the `state_test` or `blockchain_test` specs to construct benchmark tests as well, which forces us to infer what each transaction in the test is trying to do. We should instead have `benchmark_test` and `benchmark_state_test` specs that contain extra fields to clearly separate and label each transaction. Accounts could also be labeled setup-only or workload-only; for example, a setup-only account could be one whose only purpose is to send the transactions that deploy setup contracts.
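
To make the labeling and reporting gap concrete, here is a minimal sketch of what phase-labeled transactions would enable. Everything in it is hypothetical: `Phase`, `LabeledTx`, `classify_blocks`, and `workload_only_blocks` are illustrative names, not existing spec fields or `execute` APIs, and the `inclusion` mapping is assumed to come from transaction receipts.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """Hypothetical label a benchmark spec could attach to each transaction."""

    SETUP = "setup"
    WORKLOAD = "workload"


@dataclass(frozen=True)
class LabeledTx:
    """Hypothetical record: a transaction hash plus the phase it belongs to."""

    tx_hash: str
    phase: Phase


def classify_blocks(labeled_txs, inclusion):
    """Map block number -> set of phases found in that block.

    `inclusion` maps tx hash -> block number (assumed to come from receipts).
    """
    phases_by_block = defaultdict(set)
    for tx in labeled_txs:
        phases_by_block[inclusion[tx.tx_hash]].add(tx.phase)
    return dict(phases_by_block)


def workload_only_blocks(phases_by_block):
    """Return the blocks that contain workload transactions and nothing else."""
    return sorted(
        block
        for block, phases in phases_by_block.items()
        if phases == {Phase.WORKLOAD}
    )
```

With a report like this, a dashboard could restrict itself to the blocks returned by `workload_only_blocks` and flag any block that mixes setup and workload transactions.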

Potential path forward:
- Give `benchmark_test` and `benchmark_state_test` fields that allow the test writer to clearly state which transactions are part of the setup and which are part of the workload.
- Add a new `execute remote --benchmark` or `execute benchmark` that can only run tests that explicitly use the correct benchmark specs (`benchmark_test` and `benchmark_state_test`), and disallow `execute remote` and `execute hive` from running these tests.
- Allow `execute benchmark` to explicitly set up the remote blockchain by deploying the contracts and accounts that the workload is going to need, and to check that any stubbed contracts are already in place and correct.
  - We could use a two-phase approach (similar to how we fill engine-x tests) where the first phase runs all benchmark tests in setup mode, so it only deploys the contracts, and the second phase actually runs the tests.
  - The setup phase could even be deterministic, so that we can check whether all the contracts are already there before deploying any of them. This could be achieved with a setup seed: a single setup could then serve many runs of the workload phase just by passing a `--setup-seed` flag or similar.
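
A rough sketch of the deterministic setup check, under stated assumptions: `SetupContract`, `setup_address`, and `plan_setup` are hypothetical names, the address derivation is a placeholder hash rather than the real CREATE/CREATE2 rule, and `get_code` stands in for something like an `eth_getCode` call against the remote chain.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupContract:
    """Hypothetical description of a contract the workload depends on."""

    name: str
    runtime_code: bytes


def setup_address(setup_seed: int, contract: SetupContract) -> str:
    """Derive a stable placeholder address from the seed and the contract.

    Assumption: this is NOT the real CREATE/CREATE2 derivation; it only
    shows that the same seed always yields the same address.
    """
    digest = hashlib.sha256(
        setup_seed.to_bytes(32, "big")
        + contract.name.encode()
        + contract.runtime_code
    ).digest()
    return "0x" + digest[-20:].hex()


def plan_setup(setup_seed, contracts, get_code):
    """Return the (address, contract) pairs that still need deploying.

    `get_code(address) -> bytes` is an assumed callback that returns the
    code currently on chain (empty bytes if nothing is deployed).
    """
    missing = []
    for contract in contracts:
        address = setup_address(setup_seed, contract)
        on_chain = get_code(address)
        if on_chain == contract.runtime_code:
            continue  # Already deployed and correct: nothing to do.
        if on_chain:
            raise ValueError(f"unexpected code at {address} for {contract.name}")
        missing.append((address, contract))
    return missing
```

The first phase would deploy whatever `plan_setup` returns; later workload runs with the same seed would get an empty plan and could skip setup entirely.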


cc @jochem-brouwer @LouisTsai-Csie @kamilchodola @marcindsobczak

`execute` Benchmark Mode #2112

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

execute Benchmark Mode #2112

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`execute` Benchmark Mode #2112