After the `execute` live session, we came to the conclusion that there needs to be a separation of modes depending on whether we are running `execute` to test consensus or to benchmark.
Current issues:
- `execute` sends every transaction to the chain regardless of whether the transaction is part of the benchmark test's setup or its workload, so it is very difficult, if not impossible, to clearly separate the blocks that contain setup from the blocks that contain the workload of these tests.
- There's no reporting whatsoever of when (which block) these transactions landed in the blockchain, and therefore it's difficult to construct a dashboard that highlights the execution of a single block when we don't really know its contents (test-wise).
- Currently we use `state_test` or `blockchain_test` specs to also construct benchmark tests, but this means we have to infer what each transaction of the test is trying to do. We should have `benchmark_test` and `benchmark_state_test` specs that contain extra fields to clearly separate and label each transaction. Accounts can also be labeled setup-only or workload-only; for example, a setup-only account could be an account whose only purpose is to send transactions that deploy setup contracts.
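To make the labeling idea concrete, here is a rough Python sketch of what explicitly labeled transactions could look like, and how per-phase block reporting would fall out of it. Every name here (`Phase`, `LabeledTransaction`, `BenchmarkSpec`, `blocks_by_phase`) is hypothetical and only for illustration; none of it is an existing API in the framework.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    """Hypothetical label stating what a transaction is for."""

    SETUP = "setup"
    WORKLOAD = "workload"


@dataclass(frozen=True)
class LabeledTransaction:
    """A transaction plus an explicit phase label (illustrative only)."""

    sender: str
    data: bytes
    phase: Phase


@dataclass
class BenchmarkSpec:
    """Stand-in for a benchmark spec that keeps every transaction labeled."""

    transactions: list = field(default_factory=list)

    def by_phase(self, phase):
        """Return only the transactions that belong to the given phase."""
        return [tx for tx in self.transactions if tx.phase is phase]


def blocks_by_phase(spec, block_of):
    """Group block numbers by the phase of the transactions they included.

    `block_of` maps a transaction's index in the spec to the number of the
    block it landed in. Transactions without an entry are skipped.
    """
    result = {phase: set() for phase in Phase}
    for index, tx in enumerate(spec.transactions):
        if index in block_of:
            result[tx.phase].add(block_of[index])
    return result
```

With labels like these, a dashboard would no longer need to infer intent: it could ask directly which blocks contain workload transactions and ignore the rest.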
Potential path forward:
- Have `benchmark_test` and `benchmark_state_test` include fields that allow the test writer to clearly state which transactions are part of the setup and which transactions are part of the workload.
- Have a new `execute remote --benchmark` or `execute benchmark` that can only run tests that explicitly use the correct benchmark specs (`benchmark_test` and `benchmark_state_test`), and disallow `execute remote` or `execute hive` from running these tests.
- Allow `execute benchmark` to explicitly set up the remote blockchain by deploying the contracts and accounts that the workload is going to need, along with checking that any stubbed contracts are already in place and correct.
  - We could use a two-phase approach (similar to how we fill engine-x tests) where the first phase runs all benchmark tests in setup mode, so it only deploys the contracts, and the second phase actually runs the tests.
  - The setup phase could even be deterministic, so we check whether all the contracts are already there before deploying any of them. This could be achieved with a setup seed, so we can run the workload phase many times over with a single setup just by passing a `--setup-seed` flag or similar.
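As a rough illustration of the deterministic setup idea, the Python sketch below derives a stable address per contract from a setup seed, checks everything before deploying anything, and only deploys what is missing. All names (`derive_address`, `FakeChain`, `run_setup`) are hypothetical, and the SHA-256 based derivation is just a stand-in assumption: on a real chain the addresses would come from the deployer and nonce, or from `CREATE2`.

```python
import hashlib


def derive_address(setup_seed, label):
    """Derive a deterministic 20-byte address from a seed and a label.

    Stand-in derivation for illustration only, not how the EVM computes
    contract addresses.
    """
    digest = hashlib.sha256(f"{setup_seed}:{label}".encode()).digest()
    return "0x" + digest[-20:].hex()


class FakeChain:
    """In-memory stand-in for the remote chain (hypothetical)."""

    def __init__(self):
        self.code = {}

    def get_code(self, address):
        # Empty bytes means nothing is deployed at the address.
        return self.code.get(address, b"")

    def deploy(self, address, code):
        self.code[address] = code


def run_setup(chain, setup_seed, contracts):
    """Setup phase: deploy only the contracts that are not there yet.

    `contracts` maps a label to the expected code. All addresses are checked
    before any deployment, and a mismatch aborts the whole setup. Returns the
    labels that had to be deployed.
    """
    planned = {label: derive_address(setup_seed, label) for label in contracts}
    missing = []
    for label, address in planned.items():
        existing = chain.get_code(address)
        if existing == b"":
            missing.append(label)
        elif existing != contracts[label]:
            raise ValueError(f"unexpected code at {address} for {label!r}")
    for label in missing:
        chain.deploy(planned[label], contracts[label])
    return missing
```

Running the setup twice with the same seed deploys nothing the second time, which is what would let the workload phase be repeated against a single setup; a different seed yields a fresh set of addresses.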
cc @jochem-brouwer @LouisTsai-Csie @kamilchodola @marcindsobczak