cmd, core, eth/tracers: support fancier js tracing by karalabe · Pull Request #15516 · ethereum/go-ethereum

karalabe · 2017-11-18T12:24:09Z

The first feature of this PR expands our JavaScript tracing capabilities with built-in tracers. Tracers supported out of the box:

// noopTracer is just the barebone boilerplate code required from a JavaScript
// object to be usable as a transaction tracer.
debug.traceTransaction(txHash, {tracer: "noopTracer"})

// opcountTracer is a sample tracer that just counts the number of instructions
// executed by the EVM before the transaction terminated.
debug.traceTransaction(txHash, {tracer: "opcountTracer"})

// callTracer is a full blown transaction tracer that extracts and reports all
// the internal calls made by a transaction, along with any useful information.
debug.traceTransaction(txHash, {tracer: "callTracer"})

// prestateTracer outputs sufficient information to create a local execution of
// the transaction from a custom assembled genesis block.
debug.traceTransaction(txHash, {tracer: "prestateTracer"})

// 4byteTracer searches for 4byte-identifiers, and collects them for post-processing.
// It collects the method identifiers along with the size of the supplied data, so
// a reversed signature can be matched against the size of the data.
debug.traceTransaction(txHash, {tracer: "4byteTracer"})

// evmdisTracer returns sufficient information from a trace to perform evmdis-style
// disassembly.
debug.traceTransaction(txHash, {tracer: "evmdisTracer"})

To add a new tracer, create a file called my_awesome_tracer.js in eth/tracers/internal/tracers based on the example tracers above, run go generate ./eth/tracers/..., build Geth, and from the console call debug.traceTransaction(txHash, {tracer: "myAwesomeTracer"}).
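As a rough illustration of what such a file might contain, here is a minimal sketch modelled on the shape of the bundled noopTracer and opcountTracer (an object with step, fault and result methods). The log.op.toString() accessor is assumed from the tracer environment, and the runMock driver at the bottom is purely hypothetical: it exists only so the snippet can be run standalone under Node.js and is not part of Geth.

```javascript
// Hypothetical my_awesome_tracer.js: counts executed opcodes by name.
// Inside eth/tracers/internal/tracers the file would hold only the object
// literal; it is bound to a variable here so the sketch runs under Node.js.
var myAwesomeTracer = {
  // Per-opcode execution counts, keyed by opcode name.
  counts: {},

  // step is invoked for every opcode the EVM executes.
  step: function (log, db) {
    var name = log.op.toString(); // assumed accessor on the step log
    this.counts[name] = (this.counts[name] || 0) + 1;
  },

  // fault is invoked when execution hits an error; nothing to record here.
  fault: function (log, db) {},

  // result is invoked once at the end and returns the tracer's output.
  result: function (ctx, db) {
    return this.counts;
  }
};

// Mock driver (an assumption for this example, not Geth code): feeds a fixed
// opcode sequence through the tracer and returns its result.
function runMock(tracer, opNames) {
  opNames.forEach(function (name) {
    tracer.step({ op: { toString: function () { return name; } } }, null);
  });
  return tracer.result({}, null);
}

var mockResult = runMock(myAwesomeTracer, ["PUSH1", "PUSH1", "ADD", "STOP"]);
console.log(JSON.stringify(mockResult)); // {"PUSH1":2,"ADD":1,"STOP":1}
```

The tracer object itself sticks to ES5 syntax on the assumption that the embedded engine may not accept newer language features.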

The second feature-set reworks the tracing API endpoints.

It simplifies the return value of a block trace so that it's no longer {validated, structlogs, error} but simply tracer, error. I.e. if the block fails validation, why not return that in the error? The return type also needed to change because with JavaScript tracers the result is not necessarily StructLogs. The new format is consistent with traceTransaction, except that instead of a single result it returns an array.

The commit also introduces a traceChain endpoint, which can trace transactions from multiple blocks. Since that's potentially a very long running operation (hours) and can also result in huge amounts of data, this endpoint is not a plain API call, rather a subscription:

$ nc -U /work/temp/rinkeby/geth.ipc
{"id": 1, "method": "debug_subscribe", "params": ["traceChain", "0x0", "0xffff", {"tracer": "callTracer"}]}

{"jsonrpc":"2.0","id":1,"result":"0xe1deecc4b399e5fd2b2a8abbbc4624e2"}
{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0x37","hash":"0xdb16f0d4465f2fd79f10ba539b169404a3e026db1be082e7fd6071b4c5f37db7","traces":[{"from":"0x31b98d14007bdee637298086988a0bbd31184523","gas":"0x0","gasUsed":"0x0","input":"0x","output":"0x","time":"1.077µs","to":"0x2ed530faddb7349c1efdbf4410db2de835a004e4","type":"CALL","value":"0xde0b6b3a7640000"}]}}}
{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0xf43","hash":"0xacb74aa08838896ad60319bce6e07c92edb2f5253080eb3883549ed8f57ea679","traces":[{"from":"0x31b98d14007bdee637298086988a0bbd31184523","gas":"0x0","gasUsed":"0x0","input":"0x","output":"0x","time":"1.568µs","to":"0xbedcf417ff2752d996d2ade98b97a6f0bef4beb9","type":"CALL","value":"0xde0b6b3a7640000"}]}}}
{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0xf47","hash":"0xea841221179e37ca9cc23424b64201d8805df327c3296a513e9f1fe6faa5ffb3","traces":[{"from":"0xbedcf417ff2752d996d2ade98b97a6f0bef4beb9","gas":"0x4687a0","gasUsed":"0x12e0d","input":"0x6060604052341561000c57fe5b5b6101828061001c6000396000f30060606040526000357c0100000000000000000000000000000000000000000000000000000000900463ffffffff168063230925601461003b575bfe5b341561004357fe5b61008360048080356000191690602001909190803560ff1690602001909190803560001916906020019091908035600019169060200190919050506100c5565b604051808273ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200191505060405180910390f35b6000600185858585604051806000526020016040526000604051602001526040518085600019166000191681526020018460ff1660ff1681526020018360001916600019168152602001826000191660001916815260200194505050505060206040516020810390808403906000866161da5a03f1151561014257fe5b50506020604051035190505b9493505050505600a165627a7a7230582054abc8e7b2d8ea0972823aa9f0df23ecb80ca0b58be9f31b7348d411aaf585be0029","output":"0x60606040526000357c0100000000000000000000000000000000000000000000000000000000900463ffffffff168063230925601461003b575bfe5b341561004357fe5b61008360048080356000191690602001909190803560ff1690602001909190803560001916906020019091908035600019169060200190919050506100c5565b604051808273ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200191505060405180910390f35b6000600185858585604051806000526020016040526000604051602001526040518085600019166000191681526020018460ff1660ff1681526020018360001916600019168152602001826000191660001916815260200194505050505060206040516020810390808403906000866161da5a03f1151561014257fe5b50506020604051035190505b9493505050505600a165627a7a7230582054abc8e7b2d8ea0972823aa9f0df23ecb80ca0b58be9f31b7348d411aaf585be0029","time":"658.529µs","to":"0x5481c0fe170641bd2e0ff7f04161871829c1902d","type":"CREATE","value":"0x0"}]}}}
{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0xfff","hash":"0x254ccbc40eeeb183d8da11cf4908529f45d813ef8eefd0fbf8a024317561ac6b"}}}

The API will stream back one RPC notification per non-empty block. An exception is the very last block, which will be reported even if empty so the user knows the stream is done.
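A client consuming this stream could look roughly like the sketch below. It assumes the messages arrive as newline-delimited JSON, as in the nc session above, and that the caller knows the last block it asked for so it can tell when the stream is done. The collectTraces function and the trimmed sample messages are illustrative only; most fields of the real notifications are omitted.

```javascript
// Parses newline-delimited JSON-RPC messages from a traceChain subscription
// and groups the reported traces by block number.
function collectTraces(lines, endBlock) {
  const byBlock = new Map();
  let subscription = null;
  let done = false;

  for (const line of lines) {
    if (line.trim() === "") continue;
    const msg = JSON.parse(line);

    // The reply to the subscribe request carries the subscription id.
    if (msg.id !== undefined && typeof msg.result === "string") {
      subscription = msg.result;
      continue;
    }
    // Skip anything that is not a notification for our subscription.
    if (msg.method !== "debug_subscription") continue;
    if (msg.params.subscription !== subscription) continue;

    const res = msg.params.result;
    const number = parseInt(res.block, 16);
    // The final block may be reported without any traces at all.
    byBlock.set(number, res.traces || []);
    if (number === endBlock) done = true;
  }
  return { subscription, byBlock, done };
}

// Trimmed-down sample messages (most fields omitted for brevity).
const sampleLines = [
  '{"jsonrpc":"2.0","id":1,"result":"0xe1deecc4b399e5fd2b2a8abbbc4624e2"}',
  '{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0x37","traces":[{"type":"CALL","value":"0xde0b6b3a7640000"}]}}}',
  '{"jsonrpc":"2.0","method":"debug_subscription","params":{"subscription":"0xe1deecc4b399e5fd2b2a8abbbc4624e2","result":{"block":"0xfff"}}}',
];

const streamSummary = collectTraces(sampleLines, 0xfff);
console.log(streamSummary.byBlock.size, streamSummary.done);
```

Detecting the end of the stream by block number rather than by a missing traces field is a design choice of this sketch: the last block is reported whether or not it is empty, so the absence of traces alone would not be a reliable signal.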

Furthermore, the commit makes individual block tracing concurrent in the transactions (limited to num cores) and also makes chain tracing concurrent in the blocks (limited to num cores).

Another important change is the transition from ottovm to duktape as our JavaScript engine for the tracers. This nets us a 5x performance increase in exchange for a much looser integration between Go <-> JavaScript types. This is not something we want to do for the console, but it's completely acceptable for the tracer, where we don't want to cross over generic types anyway.

holiman · 2017-11-18T12:53:42Z

Could we have the block number as well? It's relevant since the block number determines the 'ruleset' with regard to forks.

It's in there in a crude way https://github.com/karalabe/go-ethereum/blob/d2540d978e9e93b926d6b2183c5e1dac5a62f1ba/eth/tracers/tracer.go#L517.

We can rework it in a followup PR that pulls in all the execution context (parent block, current block, chain config).

holiman · 2017-11-18T12:57:21Z

This may be correct, using the same check for evm.depth. I think it's a bit unintuitive, though. Would it be possible to put a deferred call in the clause above, where you call CaptureStart?

holiman · 2017-11-18T12:59:11Z

you forgot STATICCALL .. ?

It might even be possible to just check the first nybble for the 'call' class of opcodes.
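The nybble idea can be sketched as follows, reading "first nybble" as the high four bits of the opcode byte (an assumption about what was meant). The opcode values are the standard EVM ones for the 0xf* range as of Byzantium; the helper names isSystemOp and isCallOrCreate are made up for this example.

```javascript
// EVM opcodes in the 0xf* range (as of Byzantium).
const SYSTEM_OPS = {
  CREATE: 0xf0,
  CALL: 0xf1,
  CALLCODE: 0xf2,
  RETURN: 0xf3,
  DELEGATECALL: 0xf4,
  STATICCALL: 0xfa,
  REVERT: 0xfd,
  SELFDESTRUCT: 0xff,
};

// True for any opcode whose high nybble is 0xf.
function isSystemOp(op) {
  return (op & 0xf0) === 0xf0;
}

// The nybble check alone also matches RETURN, REVERT and SELFDESTRUCT, so a
// second step is still needed to pick out the opcodes that open a new frame.
function isCallOrCreate(op) {
  return (
    isSystemOp(op) &&
    op !== SYSTEM_OPS.RETURN &&
    op !== SYSTEM_OPS.REVERT &&
    op !== SYSTEM_OPS.SELFDESTRUCT
  );
}

console.log(isSystemOp(SYSTEM_OPS.STATICCALL), isCallOrCreate(SYSTEM_OPS.RETURN));
```

The cheap mask filters out the vast majority of opcodes up front, which is the appeal of the approach; the explicit exclusions then handle the few 0xf* opcodes that are not calls or creates.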

holiman · 2017-11-18T13:00:09Z

You can't assume all traces will be on byzantium!

But the precompiles that don't exist won't be called on non-byzantium chains, and if they are, won't do anything interesting.

holiman · 2017-11-18T13:02:51Z

Maybe you should call the type calltype/createtype or something, instead of CALL/CREATE, since the opcodes CALL/CREATE are not really involved.

holiman · 2017-11-18T13:04:00Z

Also staticcall

holiman · 2017-11-18T13:05:57Z

So there are two cases here:

log.depth == : call failed or there was no code there. Check last stack item to find out which case it was.
log.depth < : probably a value-call within static context, which dropped us back a level. Need to close two calls contexts.

holiman · 2017-11-18T13:06:52Z

Might still fail, though.. ? Reverting with too large memory, for example, will result in a common throw

holiman · 2017-11-18T13:07:45Z

Wait, you're not checking the last stack item to see whether the call was successful? What if it threw?

holiman · 2017-11-20T09:14:23Z

I think there's something wrong with the nesting.

Testing with this one:

Etherscan:

https://etherscan.io/tx/0x6bcf8c5abc7e530abaca039fdd7e8d9c7620bcb57b2c0ad89b91f56d4ca928e9

Contract 0x4f2a0bd524d2748504c0047b79da6f86595fdd4c  
  TRANSFER  0.202126112178803008 Ether  to  0x209c4784ab1e8183cf58ca33cb740efbf3fc18ef
  TRANSFER  0.202126112178803008 Ether  to  0x32be343b94f860124dc4fee278fdcbd38c102d88

> var twointernals="0x6bcf8c5abc7e530abaca039fdd7e8d9c7620bcb57b2c0ad89b91f56d4ca928e9"
undefined
> x = debug.traceTransaction(twointernals, {tracer: "callTracer"})
{
  calls: [{
      calls: [{...}],
      from: "0x4f2a0bd524d2748504c0047b79da6f86595fdd4c",
      gas: "0x51e7",
      gasUsed: "0x24c8",
      input: "0x",
      output: "0x",
      to: "0x209c4784ab1e8183cf58ca33cb740efbf3fc18ef",
      type: "CALL",
      value: "0x2ce18a0cc3f9140"
  }],
  from: "0x52bc44d5378309ee2abf1539bf71de1b7d7be3b5",
  gas: "0x7148",
  gasUsed: "0x4850",
  input: "0x",
  output: "0x",
  time: "6.007043ms",
  to: "0x4f2a0bd524d2748504c0047b79da6f86595fdd4c",
  type: "CALL",
  value: "0x2ce18a0cc3f9140"
}
> x.calls[0].calls[0].to
"0x32be343b94f860124dc4fee278fdcbd38c102d88"
> x.calls[0].to
"0x209c4784ab1e8183cf58ca33cb740efbf3fc18ef"

It looks like the last one (to 0x32..) is nested within the first one, which I don't think is the case. They're two consecutive internal calls, IIUC etherscan correctly.

EDIT: I've looked more closely, I think it's actually correct, and etherscan shows it erroneously.
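To compare the nested callTracer output against a flat list like the one Etherscan shows, the call tree can be walked recursively. The sketch below assumes only the calls, type and to fields visible in the output above; flattenCalls is a hypothetical helper, and the sample object keeps just the addresses shown in the console session.

```javascript
// Walks a callTracer result depth-first and returns one entry per call frame,
// annotated with its nesting depth (0 is the top-level transaction).
function flattenCalls(frame, depth = 0, out = []) {
  out.push({ depth, type: frame.type, to: frame.to });
  for (const child of frame.calls || []) {
    flattenCalls(child, depth + 1, out);
  }
  return out;
}

// Abbreviated version of the trace above (only the fields used here).
const nestedTrace = {
  type: "CALL",
  to: "0x4f2a0bd524d2748504c0047b79da6f86595fdd4c",
  calls: [
    {
      type: "CALL",
      to: "0x209c4784ab1e8183cf58ca33cb740efbf3fc18ef",
      calls: [{ to: "0x32be343b94f860124dc4fee278fdcbd38c102d88" }],
    },
  ],
};

const flatCalls = flattenCalls(nestedTrace);
for (const c of flatCalls) {
  console.log("  ".repeat(c.depth) + c.to);
}
```

Printed with indentation, the call to 0x32be... appears one level below the call to 0x209c..., which is the nesting discussed above.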

holiman · 2017-11-27T10:04:27Z

A couple of comments...

Would save some space if empty or null-results were not returned.
Would be great if closing the channel (from caller side) terminated the tracing

holiman · 2017-11-27T10:06:19Z

Also, I'd love to have some more data in there:

blocknumber
gaslimit

And have access to the database in the end, at result

#git diff internal/ethapi/tracer.go
diff --git a/internal/ethapi/tracer.go b/internal/ethapi/tracer.go
index b7fef2c..a56cfc9 100644
--- a/internal/ethapi/tracer.go
+++ b/internal/ethapi/tracer.go
@@ -354,6 +354,10 @@ func (jst *JavascriptTracer) CaptureState(env *vm.EVM, pc uint64, op vm.OpCode,
                jst.log["depth"] = depth
                jst.log["account"] = contract.Address()
 
+               //A bit of a hack - should be placed in CaptureStart/CaptureEnd, but there's no evm in those methods
+               jst.ctx["blocknumber"] = env.BlockNumber.Uint64()
+               jst.ctx["gasLimit"] = env.GasLimit
+
                delete(jst.log, "error")
                if err != nil {
                        jst.log["error"] = err
@@ -375,7 +379,7 @@ func (jst *JavascriptTracer) CaptureEnd(output []byte, gasUsed uint64, t time.Du
                jst.ctx["error"] = err.Error()
        }
        ctxvalue, _ := jst.vm.ToValue(jst.ctx)
-       jst.result, jst.err = jst.callSafely("result", ctxvalue)
+       jst.result, jst.err = jst.callSafely("result", ctxvalue, jst.dbvalue)
        if jst.err != nil {
                jst.err = wrapError("result", jst.err)
        }

Arachnid · 2017-12-20T11:46:04Z

Nit: 'procuded'

Arachnid · 2017-12-20T11:46:56Z

Couldn't you return an error here?

Well yes, but why bother if it's truly not implemented?

It's just that panicking when you could return an error seems like a generally bad idea.

Fair enough, given that it's live code (even if tracing), it's better not to panic. I'll fix.

Arachnid · 2017-12-20T11:48:27Z

Maybe add something like "this allows us to discard entries that are no longer referenced from the current state"

Arachnid · 2017-12-20T11:50:38Z

Isn't this a fast-forward, not a rewind?

I've renamed it to config.Reexec. It's arguably a bit better. I'm not really fond of fast forward, as it doesn't really convey that it's used to regenerate missing state.

Arachnid · 2017-12-20T11:54:53Z

This seems like a near duplicate of the chain tracing goroutine. Could these be extracted into their own function?

I don't think the duplication here is that much, and I'd rather keep the code easier to read in exchange for a bit of copy-paste.

OK, I deduplicated some minor code by using computeStateDB in traceChain too, but in the end I reverted it: currently I can interrupt historical chain generation if I close the connection, but this requires returning a subscription to the client before actually doing any data processing. That being said, I can still tell the subscriber fast if historical state is missing, even with fast forwarding.

This means that I would need to split computeStateDB into two separate methods, one that looks up if we have a block available, and one that regenerates the state. Furthermore it would also require reworking the chain tracing so that the second phase still happens concurrently on a live subscription. Even if I were to do that however, the chain tracing still needs to process blocks sequentially, so it still needs all that duplicated functionality within itself anyway.

It may be doable the other way around, by using traceChain instead of computeStateDB in the other methods, but that seems a bit messy to spin up all the bells and whistles for a limited subset of the functionality.

If you have specific ideas where you think this might be simplified I'm all ears, but I don't see any low-hanging fruit where we can dedup without making the code convoluted.

Arachnid · 2017-12-20T11:55:43Z

This also looks like it duplicates functionality in the chain tracer.

Yes, this is duplicated a bit. It's not trivial cleaning it up due to minor differences internally, but I'll try to make an attempt. It would make messy code appear only once.

Arachnid · 2017-12-20T11:59:03Z

Won't making the size part of the returned key make it hard to match against signatures with variable length inputs?

@holiman PTAL at this one

Removing the size is trivial during postprocessing, if you want that. The reverse problem is that if you don't know the selector for 0xdeadfeed, and want to brute-force it based on a word-list of methodnames, it definitely helps to know that the input is e.g. 32 bytes, so you can ignore some common args like (uint,uint).

So iirc, this one spits out ["0xdeadfeed-32", ...]. Makes it also possible to distinguish between actual call data and just 'data', which does not have to align to 32-byte boundaries.
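Stripping or using the size during post-processing might look like the sketch below. It assumes keys of the form "selector-size" as described above, and that size counts the bytes following the 4-byte selector (an assumption of this example). The function names and the sample keys are hypothetical.

```javascript
// Splits a key such as "0xdeadfeed-32" into its selector and data size.
function parseKey(key) {
  const dash = key.lastIndexOf("-");
  if (dash < 0) throw new Error("unexpected key: " + key);
  return {
    selector: key.slice(0, dash),
    size: parseInt(key.slice(dash + 1), 10),
  };
}

// Groups the observed data sizes by selector, which effectively drops the
// size from the key while keeping it available for brute-force matching.
function sizesBySelector(keys) {
  const out = {};
  for (const key of keys) {
    const { selector, size } = parseKey(key);
    (out[selector] = out[selector] || []).push(size);
  }
  return out;
}

// ABI-encoded static arguments occupy whole 32-byte words; anything else is
// more likely to be plain data than a method call.
function looksAbiEncoded(size) {
  return size % 32 === 0;
}

// Hypothetical tracer output.
const sampleKeys = ["0xdeadfeed-32", "0xdeadfeed-64", "0xcafebabe-5"];
const grouped = sizesBySelector(sampleKeys);
console.log(grouped, looksAbiEncoded(32), looksAbiEncoded(5));
```

Keeping the size in the tracer output and discarding it later, as here, costs little and preserves the information needed for signature guessing.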

Arachnid · 2017-12-20T12:00:49Z

Since 'op' will be undefined if it's not an 0xf* opcode, why not put the entire following section in an if, or break out early?

op is accessed various places and there are code segments in between that do not depend on op and still need to run even if op is undefined.

op will be undefined if syscall is false, so you could at least simplify the following checks to not check both.

My hunch was that evaluating whether syscall is false or not is a boolean operation and should be faster than mucking around with op. Running both codes for the start of Ropsten, there's indeed a slight performance increase for the current code versus the one that doesn't check for syscall:

Check both:

start=0 end=36863 current=22126 transactions=5027 elapsed=1m21.076843668s

vs. check only op:

start=0 end=36863 current=21778 transactions=4844 elapsed=1m20.62278927s

Not much, about 4% difference.
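One way to read the two log lines is to normalise by elapsed time, since the runs were stopped at slightly different points. The sketch below does that arithmetic; parseGoDuration is a hypothetical helper that only understands the "XmY.Zs" shape seen in these logs, and the object field names are invented for the example.

```javascript
// Converts a duration such as "1m21.076843668s" into seconds.
// Only the minutes-plus-seconds shape seen in the logs above is supported.
function parseGoDuration(s) {
  const m = /^(?:(\d+)m)?(\d+(?:\.\d+)?)s$/.exec(s);
  if (!m) throw new Error("unsupported duration: " + s);
  return (m[1] ? parseInt(m[1], 10) * 60 : 0) + parseFloat(m[2]);
}

// Transactions traced per second for one run.
function throughput(run) {
  return run.transactions / parseGoDuration(run.elapsed);
}

// Figures copied from the two log lines above.
const checkBoth = { transactions: 5027, elapsed: "1m21.076843668s" };
const checkOpOnly = { transactions: 4844, elapsed: "1m20.62278927s" };

const gain = throughput(checkBoth) / throughput(checkOpOnly) - 1;
console.log((gain * 100).toFixed(1) + "% more transactions per second");
```

Comparing raw transaction counts instead (5027 vs. 4844) gives a slightly larger figure, so depending on the measure the difference lands somewhere between 3% and 4%.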

karalabe · 2017-12-20T16:03:43Z

@Arachnid I think I've fixed or replied to all your comments, PTAL.

karalabe · 2017-12-20T18:16:10Z

@cdetrio Please check whether this breaks any fuzzers or other tools you have for cross client comparisons. If yes, please provide a way to repro.

holiman

I think this looks good enough to merge: most of it is new functionality, and we can work out kinks in the implementation after merge.

From what I can tell, this should not break any core functionality. It may very well change the way evm --json works, but I think we can handle that.

There are a few outstanding comments, though, e.g. my request for block number in the CaptureStart. It's not critical to get that in, though, but if you have any particular reason not to add it, please answer the comment so I know (otherwise I may add it myself in a later PR).

I'm ok with merging this so it doesn't bitrot over christmas.

karalabe · 2017-12-21T08:42:05Z

I think the block number is already part of "ctx", it was you who added it in the first run of "step".

That being said, I was also pondering pushing in the entire parent block, current block and chain config as context to the tracers to allow assembling more complete "prestate" traces. Since these would require a bit of meshing and would also cover the block number, I'd just postpone "properly" doing it to a followup PR.

I would also like to support running multiple tracers at once somehow, so if we end up with 2-3-4 really good tracers eventually, we could get them all listen on etherscan without them having to trace the entire chain 4 times over.

cdetrio · 2017-12-21T09:07:16Z

Trace comparisons with evmlab are all passing, so evm --json still works as expected.

karalabe · 2017-12-21T09:15:26Z

@cdetrio Sweet, thanks for the confirm!

holiman · 2017-12-21T10:08:16Z

I think the block number is already part of "ctx", it was you who added it in the first run of "step".

IIRC, that was a bit of a hack: the step should contain step-specific information. At that time, there was no start-of-execution entrypoint, so I put it there. Now that we have a "start-of-execution", we should put context information which is not per-step in there instead.

But as I said, we can fix it later if you prefer.

karalabe · 2017-12-21T10:26:12Z

I'd vote for fixing later when we add the other contextual infos too.

AyushyaChitransh · 2018-07-09T12:16:19Z

I have currently bookmarked this PR. Would love to know if this description is available in any wiki section.

* cmd, core, eth/tracers: support fancier js tracing
* eth, internal/web3ext: rework trace API, concurrency, chain tracing
* eth/tracers: add three more JavaScript tracers
* eth/tracers, vendor: swap ottovm to duktape for tracing
* core, eth, internal: finalize call tracer and needed extras
* eth, tests: prestate tracer, call test suite, rewinding
* vendor: fix windows builds for tracer js engine
* vendor: temporary duktape fix
* eth/tracers: fix up 4byte and evmdis tracer
* vendor: pull in latest duktape with my upstream fixes
* eth: fix some review comments
* eth: rename rewind to reexec to make it more obvious
* core/vm: terminate tracing using defers

karalabe requested review from Arachnid and holiman November 18, 2017 12:30

holiman reviewed Nov 18, 2017

Arachnid reviewed Dec 20, 2017

karalabe added this to the 1.8.0 milestone Dec 20, 2017

karalabe added the please review label Dec 21, 2017

holiman approved these changes Dec 21, 2017

karalabe and others added 13 commits December 21, 2017 11:31

cmd, core, eth/tracers: support fancier js tracing

b6467c1

eth, internal/web3ext: rework trace API, concurrency, chain tracing

5fb85d4

eth/tracers: add three more JavaScript tracers

3947b54

eth/tracers, vendor: swap ottovm to duktape for tracing

087b79d

core, eth, internal: finalize call tracer and needed extras

becec01

eth, tests: prestate tracer, call test suite, rewinding

4aecb6c

vendor: fix windows builds for tracer js engine

4de0342

vendor: temporary duktape fix

42d03fc

eth/tracers: fix up 4byte and evmdis tracer

e7a1b2e

vendor: pull in latest duktape with my upstream fixes

09f9917

eth: fix some review comments

1a4a187

eth: rename rewind to reexec to make it more obvious

8ce0b32

core/vm: terminate tracing using defers

9ceee7c

Arachnid approved these changes Dec 21, 2017

karalabe merged commit 5258785 into ethereum:master Dec 21, 2017

karalabe mentioned this pull request Jan 8, 2018

debug.traceTransaction is a memory hog #15826

Closed

5chdn mentioned this pull request Jan 30, 2018

Merge go-ethereum updates into 2.5.4 Musicoin/go-musicoin#77

Merged

medvedev1088 mentioned this pull request Jul 16, 2018

Add support for internal transactions blockchain-etl/ethereum-etl#53

Closed

medvedev1088 mentioned this pull request Oct 3, 2018

Schema for internal transactions blockchain-etl/ethereum-etl#72

Closed

medvedev1088 mentioned this pull request Oct 24, 2018

Add support for geth traces blockchain-etl/ethereum-etl#114

Closed

karalabe mentioned this pull request Dec 3, 2018

Internal transactions #18223

Closed

liamaharon mentioned this pull request Jan 7, 2020

Built in tracers don't seem to work ConsenSys-archive/ganache#523

Closed

saman-pasha mentioned this pull request Oct 18, 2021

sigsev when tracing with light client #23766

Closed
