Refactor evm Instruction to be a c-like enum #8914
| arr[REVERT as usize] = InstructionInfo::new("REVERT", 2, 0, GasPriceTier::Zero); | ||
| static ref INSTRUCTIONS: [Option<InstructionInfo>; 0x100] = { | ||
| let mut arr = [None; 0x100]; | ||
| arr[STOP as usize] = Some(InstructionInfo::new("STOP", 0, 0, GasPriceTier::Zero)); |
This tab formatting displays correctly in my Emacs, but it looks like it doesn't render properly in the GitHub interface.
Maybe it's better to just use a single space as the separator -- we don't gain much readability from multiple tabs.
tomusdrw left a comment
Looks good, although I would insist on having some representative benchmarks for the baseline before merging any EVM changes.
The earlier we catch performance regressions or improvements, the better.
| #[doc = "Convert from u8 to the given enum"] | ||
| pub fn from_u8(value: u8) -> Option<Self> { | ||
| match value { | ||
| $( $discriminator => Some($variant) ),+, |
Are we sure it's as performant as borrowing from the static array?
Instruction is repr(u8), and here we use match to convert a raw u8 into that repr(u8) enum. We're still borrowing the instruction info from the INSTRUCTIONS static array, in info(&self).
Anyway I'll try a benchmark on the new code.
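For readers following along, here is a minimal sketch of the kind of macro being discussed: it declares a `repr(u8)` c-like enum and generates `from_u8` as a `match` over the discriminants. The macro name `enum_with_from_u8` and the four-opcode subset are assumptions for illustration, not the PR's actual definitions.

```rust
// Sketch only: the macro name and the opcode subset are hypothetical.
macro_rules! enum_with_from_u8 {
    (
        $(#[$enum_attr:meta])*
        pub enum $name:ident {
            $( $variant:ident = $discriminator:literal ),+ $(,)?
        }
    ) => {
        $(#[$enum_attr])*
        pub enum $name {
            $( $variant = $discriminator ),+
        }

        impl $name {
            /// Convert a raw byte to the enum, or `None` if no variant matches.
            pub fn from_u8(value: u8) -> Option<Self> {
                match value {
                    $( $discriminator => Some($name::$variant), )+
                    _ => None,
                }
            }
        }
    };
}

enum_with_from_u8! {
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    #[repr(u8)]
    pub enum Instruction {
        STOP = 0x00,
        ADD = 0x01,
        JUMPDEST = 0x5b,
        PUSH1 = 0x60,
    }
}
```

Going the other way, `Instruction::PUSH1 as u8` is a plain cast on a `repr(u8)` enum, so only the byte-to-enum direction needs the generated `match`.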
| if let Some(instruction) = instruction { | ||
| if instruction == instructions::JUMPDEST { | ||
| jump_dests.insert(position); | ||
| } else if instruction.is_push() { |
Maybe change that to `if let Some(xx) = instruction.push_bytes() {`?
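A rough sketch of what a `push_bytes()` helper along these lines could look like. The `Opcode` wrapper is a hypothetical stand-in for the PR's `Instruction` type; the PUSH1..PUSH32 range (0x60 to 0x7f) is the standard EVM encoding.

```rust
/// Hypothetical stand-in for `Instruction`, wrapping the raw opcode byte.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Opcode(pub u8);

const PUSH1: u8 = 0x60;
const PUSH32: u8 = 0x7f;

impl Opcode {
    /// Number of immediate data bytes that follow a PUSH opcode,
    /// or `None` when the opcode is not a PUSH.
    pub fn push_bytes(self) -> Option<usize> {
        if (PUSH1..=PUSH32).contains(&self.0) {
            Some((self.0 - PUSH1 + 1) as usize)
        } else {
            None
        }
    }
}
```

Returning an `Option` folds the `is_push()` check and the byte count into a single call, which is what makes the `if let Some(..)` form possible.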
| while reader.position < code.len() { | ||
| let instruction = code[reader.position]; | ||
| let opcode = code[reader.position]; | ||
| let instruction = Instruction::from_u8(opcode); |
Might be worth mapping the code to Instructions during the first phase along with jumpdestination analysis.
The current issue is that we only calculate valid jump destinations upon the first JUMP instruction, so on L129 that info may not be available.
I'll try and see (probably in a future PR) whether the jumpdests can be calculated incrementally along with the exec loop. Another idea is to store the jumpdest result in the db -- that info is static and never changes for a particular code array.
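For context, the jump destination analysis mentioned above amounts to a single pass over the bytecode that records each JUMPDEST position while skipping PUSH immediates. The sketch below is a standalone approximation: the function name and the use of `BTreeSet` are assumptions, not the PR's code.

```rust
use std::collections::BTreeSet;

const JUMPDEST: u8 = 0x5b;
const PUSH1: u8 = 0x60;
const PUSH32: u8 = 0x7f;

/// Collect the positions of all JUMPDEST opcodes in `code`.
/// Bytes that are PUSH immediates are skipped, so a 0x5b inside
/// push data is not treated as a valid destination.
pub fn find_jump_destinations(code: &[u8]) -> BTreeSet<usize> {
    let mut jump_dests = BTreeSet::new();
    let mut position = 0;
    while position < code.len() {
        let opcode = code[position];
        if opcode == JUMPDEST {
            jump_dests.insert(position);
        } else if (PUSH1..=PUSH32).contains(&opcode) {
            // Skip the immediate bytes belonging to this PUSH.
            position += (opcode - PUSH1 + 1) as usize;
        }
        position += 1;
    }
    jump_dests
}
```

Because the result depends only on the code bytes, it can be computed once per code array and cached, which is the property the db idea relies on.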
Here's the result got from
This PR:
Master Branch:
Looks like the conversion to enum somehow makes it run just slightly faster. I don't know whether this is representative enough. We can also try some benchmarks using some of the jsontests.
I got some benchmarks using #8944. Currently, running the comparison requires manually checking out two branches, so it's not hassle-free. Another issue is that many tests only run for a really short time, and I cannot find a reliable way to make some of the results really comparable.
Below is a tuple of tests that run faster in this PR, 0-10% slower, 10-20% slower, 20-30% slower, 30-40% slower, 40-50% slower or 50%+ slower:
This tuple varies a lot each time I run the tests and gather the csv.
In particular, the performance tests in vm folder might be representative:
The value is
Those results are gathered by running
@sorpaas Great, that's good enough for me. I think the performance tests and perhaps
jimpo left a comment
Nice refactor! (and speedup)
| GasPriceTier::Invalid | ||
| impl GasPriceTier { | ||
| /// Returns the index in schedule for specific `GasPriceTier` | ||
| pub fn idx(&self) -> usize { |
Could these also be numbered with a C-style enum instead of having the separate mapping method?
That "c-style enum" is actually just GasPriceTier. :)
We still need this idx function. It's used in schedule, where we dispatch based on the index to know about the default gas.
Ah, didn't realize the values were assigned implicitly. In that case, seems like this method could just be implemented as *self as usize instead of the match, or inlined.
I'm not really sure about that -- the performance is mostly the same, but I'm a little concerned about clarity.
Unlike Instruction, GasPriceTier has no natural enum-to-number correspondence. I tried adding the index to the enum and making GasPriceTier a c-like enum. The issue, however, is that other numbers also correspond to GasPriceTier (like the default gas cost), so the result might be a little confusing.
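To make the trade-off in this thread concrete, here is a small sketch comparing the two ways of writing the index method. The `Tier` enum is a hypothetical four-variant reduction of `GasPriceTier`, and the `TIER_STEP_GAS` table holds illustrative values rather than the actual schedule.

```rust
/// Hypothetical reduced version of `GasPriceTier`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    Zero,
    Base,
    VeryLow,
    Low,
}

/// Illustrative default gas per tier, indexed by the tier's schedule index.
const TIER_STEP_GAS: [u64; 4] = [0, 2, 3, 5];

impl Tier {
    /// Explicit mapping: the index is spelled out per variant.
    pub fn idx_match(&self) -> usize {
        match *self {
            Tier::Zero => 0,
            Tier::Base => 1,
            Tier::VeryLow => 2,
            Tier::Low => 3,
        }
    }

    /// Cast-based mapping: relies on the implicit discriminants 0, 1, 2, ...
    pub fn idx_cast(&self) -> usize {
        *self as usize
    }
}

/// Look up the default gas for a tier through its index.
pub fn default_gas(tier: Tier) -> u64 {
    TIER_STEP_GAS[tier.idx_cast()]
}
```

Both methods return the same index here. The cast version is shorter, but it silently depends on declaration order, while the explicit `match` keeps the index visibly separate from the other numbers (such as the gas cost) associated with a tier.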
| instructions::LOG0...instructions::LOG4 => { | ||
| let no_of_topics = instructions::get_log_topics(instruction); | ||
| instructions::LOG0 | instructions::LOG1 | instructions::LOG2 | instructions::LOG3 | instructions::LOG4 => { | ||
| let no_of_topics = instruction.log_topics().expect("log_topcis always return some for LOG* instructions; qed"); |
| let requirements = gasometer.requirements(ext, instruction, info, &stack, self.mem.size())?; | ||
| if do_trace { | ||
| ext.trace_prepare_execute(reader.position - 1, instruction, requirements.gas_cost.as_u256()); | ||
| ext.trace_prepare_execute(reader.position - 1, instruction as u8, requirements.gas_cost.as_u256()); |
nit: Could use opcode instead of instruction as u8.
| if info.tier == instructions::GasPriceTier::Invalid { | ||
| return Err(vm::Error::BadInstruction { | ||
| instruction: instruction | ||
| instruction: instruction as u8 |
nit: Could use opcode instead of instruction as u8.
We don't have an opcode variable in verify_instruction. Because instruction as u8 is basically a no-op, I think this would be okay?
Please rebase on latest master to fix CI.
rel #6744
This allows more type checking for instructions, and also lets us organize the related functions in a more Rust-y way.
- `from_u8` using `match`. A small part of this macro is from the `enum_derive` crate. But that crate defines other functions (like `from_u64`, etc.) which we don't use.
- `INSTRUCTIONS` table is made private, and it's accessed from `Instruction::info(&self)`.
- `exec_stack_instruction` to `exec_instruction`. This allows Rust to type check match arms, and refuse to compile if a new opcode is added but its execution logic is not defined.
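The compile-time check described in the last point comes from matching exhaustively on the enum without a wildcard arm. A minimal sketch, assuming a hypothetical three-opcode `Op` enum and a toy `u64` stack rather than the interpreter's real types:

```rust
/// Hypothetical three-opcode subset, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Op {
    STOP = 0x00,
    ADD = 0x01,
    MUL = 0x02,
}

/// Execute one instruction on a toy stack.
/// Returns `false` when execution should halt.
pub fn exec_instruction(op: Op, stack: &mut Vec<u64>) -> bool {
    // No `_ =>` arm: adding a new `Op` variant without handling it here
    // makes this match non-exhaustive, which is a compile error.
    match op {
        Op::STOP => false,
        Op::ADD => {
            let a = stack.pop().unwrap_or(0);
            let b = stack.pop().unwrap_or(0);
            stack.push(a.wrapping_add(b));
            true
        }
        Op::MUL => {
            let a = stack.pop().unwrap_or(0);
            let b = stack.pop().unwrap_or(0);
            stack.push(a.wrapping_mul(b));
            true
        }
    }
}
```

With a raw `u8` opcode a wildcard arm is unavoidable, so a forgotten opcode would only surface at runtime.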