MetaVM: initial base class for an extensible EVM #441

pinkiebell · 2019-02-18T18:56:42Z

Introducing the VM.MetaVM class to support developers implementing
special logic (override specific opcode handlers) along other things
for state manipulation.

Related #410

axic · 2019-02-18T21:45:44Z

I think this is somewhat conflicting/replicating #424. If it does, it would be better to join efforts. Can you have a look?

holgerd77 · 2019-02-19T08:51:46Z

@pinkiebell yes, you should definitely coordinate with @s1na. It is generally risky to start such heavy rewrites/refactoring work without (deeper) prior discussion, though this looks promising. Nevertheless this can lead to a PR being rejected at the end (due to other work conflicting/too heavy API changes/introduction of new language features which can't be supported yet/...).

pinkiebell · 2019-02-19T10:25:22Z

@axic @holgerd77
Thanks for the pointer!
I looked at the EEI PR, I guess we can merge the functionality somehow, @s1na do you have an update of the PR status?

My primary intention with this PR is the ability to overwrite opcode handlers, in a way that is most flexible for other developers to take advantage of that. This is something the EEI does not cover.

Would be nice to get a discussion/decision about this feature, as the solEVM-enforcer depends on this functionality for example.
I will fix the remaining tests and rebase the PR, as I need to use it anyway 😁

holgerd77 · 2019-02-19T10:47:30Z

Haven't had a deeper look, but I have a first impression that this is all functionality that might be better settled directly on runCode or I am wrong here?

Especially since there is a tendency (see comment) to separate this inner VM (and maybe rather call it EVM or something) and the outer processing engine.

//cc @s1na Thoughts?

pinkiebell · 2019-02-19T10:58:21Z

Haven't had a deeper look, but I have a first impression that this is all functionality that might be better settled directly on runCode or I am wrong here?

runCode would be quite overloaded if we add more callbacks, state manipulation functions and so on.
As a developer who worked with the ethereumjs-vm, I can say that the extensible class-based system is really easy to work with.

Especially since there is a tendency (see comment) to separate this inner VM (and maybe rather call it EVM or something) and the outer processing engine.

//cc @s1na Thoughts?

The MetaVM could be further split-off. I also want to refactor the other runCall etc functions but did not in this PR, because that would be too big to review at once.

for use-cases like this are becoming pretty straightforward:

class MyVM extends MetaVM {
  async handleLOG(runState) {
    await super.handleLOG(runState)

    // do some custom magic with runState.logs
  }

  async handleMUL(runState) {
    const x = runState.stack.pop()
    const y = runState.stack.pop()

    // custom logic
    runState.stack.push(xy)
  }

  async runNextStep (runState) {
    // I can also modify runState for each execution step
    // And run the thing...
    await super.runNextStep(runState)
    // custom logic here? :)
  }
}

Introducing the `VM.MetaVM` class to support developers implementing special logic (override specific opcode handlers) along other things for state manipulation. Related #410

dist-files are now using some core-js polyfills

s1na · 2019-02-19T14:18:11Z

Thanks for the PR. It really helps to hear pain points from users of the library. I'm sure we can join forces to satisfy you requirements somehow. In the meanwhile, a few points come to mind:

My goal with Refactor evm and prepare for ewasm integration #424 was to somewhat structure the evm-related code to allow for integration of ewasm. I thought it might help with the design if I partly integrate ewasm (Basic support for ewasm precompiles #431) first, and that's the current status.
What I had in mind is not limited to EEI and partly covers what you're trying to do as well.

I see the VM class as a connector for the high-level pieces (runBlockchain, stateManager, etc.) (despite the name it's not only about evm) and personally don't think it's a good place to have low-level evm-related code.

My current thinking on the design is to have an evm module which has an Interpreter class (which would encompass most of the current runCode), as well as one for the opcode handlers (opFns), a class for the environment and interface (EEI and stateManager), etc. Somewhat similar to geth. Do you think something like that could work?

pinkiebell · 2019-02-19T16:20:21Z

Thanks for the PR. It really helps to hear pain points from users of the library. I'm sure we can join forces to satisfy you requirements somehow. In the meanwhile, a few points come to mind:

My goal with Refactor evm and prepare for ewasm integration #424 was to somewhat structure the evm-related code to allow for integration of ewasm. I thought it might help with the design if I partly integrate ewasm (Basic support for ewasm precompiles #431) first, and that's the current status.

What I had in mind is not limited to EEI and partly covers what you're trying to do as well.

I see the VM class as a connector for the high-level pieces (runBlockchain, stateManager, etc.) (despite the name it's not only about evm) and personally don't think it's a good place to have low-level evm-related code.

My current thinking on the design is to have an evm module which has an Interpreter class (which would encompass most of the current runCode), as well as one for the opcode handlers (opFns), a class for the environment and interface (EEI and stateManager), etc. Somewhat similar to geth. Do you think something like that could work?

Sounds great and is probably the same I intended for the MetaVM, except the EEI, stateManager and the high-level connector stuff.
You can think of the MetaVM as the Interpreter class (in this PR) and I also planned to move opFns directly into MetaVM, right now I have this little prototype wrapper (https://github.com/ethereumjs/ethereumjs-vm/pull/441/files#diff-7225c1b45dd3c5509c3db9d4aa729535R268)

holgerd77 · 2019-02-20T10:59:01Z

@s1na Not sure, haven't looked too deeply into it. Is it now possible to leave out the eWASM VM implementation stuff from #431 and just extract the agnostic outer structure where both eWASM and EVM can be plugged in and use this to continue here?

holgerd77 · 2019-02-20T11:02:53Z

@s1na @pinkiebell I would agree with Sina and see the VM as a high-level connector class and rather not go up to some meta VM but down to runCode for the refactoring work.

holgerd77 · 2019-02-28T09:20:22Z

How can we proceed here? This is a great and work-intensive PR and we should spent some time to think about possible integration paths. @s1na can you identify parts of the code here which can be integrated within the structure you have got in mind? Is it possible that you do some structural preparation, e.g. by opening a PR just introducing the relevant structural updates, and then you and @pinkiebell can work together on the PR?

holgerd77 · 2019-02-28T09:21:45Z

Side notes for @pinkiebell:

Sina will be on leave until early next week, so there probably won't be an imminent answer
If you both are going to EthCC (Sina does, I am not), it would probably make very much sense to meet! 😄

s1na · 2019-03-03T16:55:54Z

Sorry for being inactive here for some time. This topic is definitely on my mind and I'm thinking about how best we can proceed here. It would be much easier if we could meet at ethcc to discuss in person, @pinkiebell let me know if you'll be there.

Regarding the VM and higher-level stuff: In the current design the library has a single entry point (VM), and all the exposed code is exported as methods of VM. This is something that I believe can be improved in the modernisation effort (probably longer-term though), by exporting a variety of classes instead of only one. This would be much more flexible and modular. I'm not sure yet what these classes could be, and any ideas are welcome.

I think we can discuss the design in #456, then use this PR as inspiration, and introduce multiple small subsequent PRs to merge the logic gradually. This is also what I've been doing with #424, I only started drafting a structure there, and have been trying to gradually merge ideas from that in smaller PRs (e.g. #442, #425). I'll start working on a rough sketch for the structure that we can use as a starting point for discussion.

pinkiebell · 2019-03-03T22:36:35Z

Sorry for being inactive here for some time. This topic is definitely on my mind and I'm thinking about how best we can proceed here. It would be much easier if we could meet at ethcc to discuss in person, @pinkiebell let me know if you'll be there.

I'm not attending ethcc, but @johannbarbie and maybe others from leapDAO going to be there ;).

Regarding the VM and higher-level stuff: In the current design the library has a single entry point (VM), and all the exposed code is exported as methods of VM. This is something that I believe can be improved in the modernisation effort (probably longer-term though), by exporting a variety of classes instead of only one. This would be much more flexible and modular. I'm not sure yet what these classes could be, and any ideas are welcome.

Yes, if we could have something like a Memory, Stack etc classes to extend would be nice.
I think it makes sense to move that to a new 'umbrella' class though, in a sense that we don't change/introduce something new to the 'old' interface (VM).

I think we can discuss the design in #456, then use this PR as inspiration, and introduce multiple small subsequent PRs to merge the logic gradually. This is also what I've been doing with #424, I only started drafting a structure there, and have been trying to gradually merge ideas from that in smaller PRs (e.g. #442, #425). I'll start working on a rough sketch for the structure that we can use as a starting point for discussion.

👍

holgerd77 · 2019-03-08T09:39:24Z

Hi @krzkaczor if you could join the discussion here with all your experience on VM prototyping and refactoring that would be really valuable! 😃

s1na · 2019-03-11T13:20:14Z

I went through your code in more detail and here's how I suggest we proceed here (as a first step):

Leave VM mostly untouched for now.
Move MetaVM to lib/vm and rename to Interpreter (won't be a base class for VM anymore).
runCode only instantiates an Interpreter and runs it and parses the results (parseVmResults). For emitting events, Interpreter should accept a stepHook function and call it in runStep.
Interpreter.runStep calls a method getOpHandler (or something like execOp), which gets the handler from opFns.js instead of adding handler functions directly to the Interpreter prototype. For custom handlers users can extend the Interpreter and override this function. This leaves us somewhat more flexible for coming up with designs in future.
Make preprocessValidJumps a private method of Interpreter.
Refactor stack manipulation in evm #460 has changed how the stack operates in the vm, but I think after rebasing only a few minor modifications would be required.

Please feel free to discuss any of them, if you disagree.

pinkiebell · 2019-03-11T22:45:07Z

I went through your code in more detail and here's how I suggest we proceed here (as a first step):

Leave VM mostly untouched for now.

Move MetaVM to lib/vm and rename to Interpreter (won't be a base class for VM anymore).

runCode only instantiates an Interpreter and runs it and parses the results (parseVmResults). For emitting events, Interpreter should accept a stepHook function and call it in runStep.

Interpreter.runStep calls a method getOpHandler (or something like execOp), which gets the handler from opFns.js instead of adding handler functions directly to the Interpreter prototype. For custom handlers users can extend the Interpreter and override this function. This leaves us somewhat more flexible for coming up with designs in future.

Make preprocessValidJumps a private method of Interpreter.

Refactor stack manipulation in evm #460 has changed how the stack operates in the vm, but I think after rebasing only a few minor modifications would be required.

Please feel free to discuss any of them, if you disagree.

Sounds good. The only thing we have to consider is the event emitting stuff like in runCall.
As of now it expects a (this = VM).emit function.
If we change the design to the Interpreter class and VM does not extend Interpreter, we run into problems.

We can bubble-up events but that is ugly by design, we can introduce a Intepreter-delegate class,
or just class VM extends Interpreter. Thoughts? 🙂

s1na · 2019-03-12T09:10:17Z

Hmm, good point 🤔

I think we eventually have to pass the VM object to Interpreter, e.g. for initializing runState (hopefully we can improve on this later), and runState has a _vm property which can be used to call runCall (in opFns). I agree it's not the cleanest design, but we'll iterate on this 😃

One reason I'm resisting a direct coupling between VM (maybe a better name for VM would be state execution engine), is that we soon want to integrate ewasm, which will have an interpreter of its own.

holgerd77 · 2019-03-12T09:37:12Z

If I'm not mistaken in my thinking/analysis I think it should be the other way around, the Interpreter + execution engine should be passed as an option to the outer processing engine (atm called: the VM), similar to the state manager.

If someone wants to use events/do other modifications one is passing ones own inner VM, otherwise it gets instantiated with the EVM interpreter as default. Does this make sense?

s1na · 2019-03-12T10:10:51Z

Maybe an Interpreter can be passed to runCode as an optional opt. If it's undefined, runCode would instantiate a new one. But the way things are currently, Interpreter.initRunState would also need access to the VM object. Maybe runCode could pass the vm to initRunState (technically pass self, since runCode is defined on VM). @holgerd77 what do you think?

I'll be thinking about how we can refactor runState.

s1na · 2019-03-12T11:52:47Z

Initial impression (these can be done in later PRs):

Limit runState to internal evm state:
- pc, gasLeft, code, etc.
- stack and memory objects
- a nested results object which includes returnValue, stopped, logs, `gasRefund, etc.
In addition to this runState, opcode handlers should only have access to a restricted API (EEI) for fetching message, tx and block context, reading/writing state, etc. However, CALL opcodes change the message context, so the Interpreter should take care of updating the EEI before calls.

I'm now thinking whether it makes sense to have Interpreter only deal with a new object calledMessage, and move the actual opcode handling loop, stack, memory to Message. Initially and on calls, Interpreter would create a new Message object, call execute on it, and pass the results to parent message. In other words, Interpreter is for the transaction level (in this case maybe another name would be clearar), Message is for the message level. Does that make sense?

holgerd77 · 2019-03-12T12:31:45Z

This limitation of runState to just the internal EVM state makes super-much sense together with some well-defined and restricted access ways to the outer environment/processing through the EEI. Also already had some look couple of days ago what actually went inside runState over time and how it is logically clustered.

holgerd77 · 2019-03-12T12:41:58Z

Can you try some concept-grasping outline what the different tasks of the Interpreter class are respectively what they could be, to get a bit clearer picture of the different responsibilities?

I have a rough feeling that this Message concept goes in a good direction but also can't fully grasp the different module responsibilities and interplay between them, maybe we want to elaborate on this a bit more, at some point it might also be helpful to sketch something on paper.

s1na · 2019-03-19T09:33:52Z

Hey @pinkiebell, just wanted to clarify that the last few comments are not directly for this PR and were for the future (will move the discussion to #456). Please let me know if any problem arose, or you can't work on this in the near-term future, because my work depends on this.

pinkiebell · 2019-03-19T10:29:51Z

Hey @pinkiebell, just wanted to clarify that the last few comments are not directly for this PR and were for the future (will move the discussion to #456). Please let me know if any problem arose, or you can't work on this in the near-term future, because my work depends on this.

@s1na
👋 Sorry, I'm loaded with other stuff, so I can't really turn-around the changes in the near-term
but I hope that some bits of this PR are at least a bit useful 🙂

holgerd77 · 2019-03-19T10:42:48Z

@pinkiebell thanks for the work already done! 😀 So is it ok if we directly copy over parts of your code? Sorry that you don't get direct attribution then. 😕

pinkiebell · 2019-03-19T10:47:16Z

@pinkiebell thanks for the work already done! 😀 So is it ok if we directly copy over parts of your code? Sorry that you don't get direct attribution then. 😕

@holgerd77
👋No problem 🙂. One thing to note tho: The babel changes I introduced needs some more investigation because I just stumbled yesterday on a missing regenerator-runtime on a fresh project.
Just a dependency issue I guess, if you use the async/await stuff.

s1na · 2019-03-19T10:51:51Z

@pinkiebell as Holger said, your work is definitely valuable and I'm really grateful :) Sorry it turned out this way, as you might have noticed we're in the middle of an overhaul...

I hope I can still ping you for comments, specially for the evm debugging and state manipulation parts.

missing regenerator-runtime

Yeah I've also faced that in another library, I think adding @babel/runtime to dependencies would solve this.

pinkiebell · 2019-03-19T11:45:39Z

I hope I can still ping you for comments, specially for the evm debugging and state manipulation parts.

👍 Of course you can 😄

holgerd77 · 2020-01-20T17:50:00Z

Superseded by the refactoring work by @s1na in #424 and following work, will close.

pinkiebell added 5 commits February 19, 2019 12:23

MetaVM: initial base class for an extensible EVM

20d17d0

Introducing the `VM.MetaVM` class to support developers implementing special logic (override specific opcode handlers) along other things for state manipulation. Related #410

regen docs & add lib/metaVM.js

69acd8d

lint

68e735e

fix tests and upgrade to babel 7

4a3d574

dist-files are now using some core-js polyfills

fix tests?

7bf0ab1

remove direct use of self.run{Call,JIT,Code}

6222b82

holgerd77 mentioned this pull request Feb 28, 2019

Major Version Release Planning / Refactor Strategy / v3 and v4 Releases #455

Closed

11 tasks

pinkiebell added 5 commits March 1, 2019 14:27

runCall: check for self.emit

ff44a36

lint

fe113ad

omg, I'm stupid

5e84f68

move self.emit checks

218d52c

lint

d6746e5

s1na mentioned this pull request Apr 1, 2019

Use EEI methods in opFns #486

Merged

holgerd77 closed this Jan 20, 2020

evertonfraga added the package: vm label Feb 27, 2020

s1na mentioned this pull request May 18, 2020

VM: Run transaction with custom EVM/Interpreter / API Discussion #636

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MetaVM: initial base class for an extensible EVM #441

MetaVM: initial base class for an extensible EVM #441

pinkiebell commented Feb 18, 2019

axic commented Feb 18, 2019

holgerd77 commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

holgerd77 commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

s1na commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

holgerd77 commented Feb 20, 2019

holgerd77 commented Feb 20, 2019

holgerd77 commented Feb 28, 2019

holgerd77 commented Feb 28, 2019

s1na commented Mar 3, 2019

pinkiebell commented Mar 3, 2019

holgerd77 commented Mar 8, 2019

s1na commented Mar 11, 2019

pinkiebell commented Mar 11, 2019

s1na commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

s1na commented Mar 12, 2019

s1na commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

s1na commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

holgerd77 commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

s1na commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

holgerd77 commented Jan 20, 2020

MetaVM: initial base class for an extensible EVM #441

MetaVM: initial base class for an extensible EVM #441

Conversation

pinkiebell commented Feb 18, 2019

axic commented Feb 18, 2019

holgerd77 commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

holgerd77 commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

s1na commented Feb 19, 2019

pinkiebell commented Feb 19, 2019

holgerd77 commented Feb 20, 2019

holgerd77 commented Feb 20, 2019

holgerd77 commented Feb 28, 2019

holgerd77 commented Feb 28, 2019

s1na commented Mar 3, 2019

pinkiebell commented Mar 3, 2019

holgerd77 commented Mar 8, 2019

s1na commented Mar 11, 2019

pinkiebell commented Mar 11, 2019

s1na commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

s1na commented Mar 12, 2019

s1na commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

holgerd77 commented Mar 12, 2019

s1na commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

holgerd77 commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

s1na commented Mar 19, 2019

pinkiebell commented Mar 19, 2019

holgerd77 commented Jan 20, 2020