Receipts, return values, and events #65

Closed
adlerjohn opened this issue Feb 12, 2021 · 9 comments · Fixed by #196
Labels
comp:FVM Component: FuelVM enhancement New feature or request

Comments

@adlerjohn
Contributor

adlerjohn commented Feb 12, 2021

This issue is for tracking how to handle three related concepts: receipts (did this transaction succeed or fail, and why), return values (returning data from the VM up all the way to an external process), and events (logs that are fired by contracts on state changes, e.g. whenever a token is transferred).

Receipts

This can be accomplished entirely outside of the VM, with the block proposer appending some metadata to each transaction indicating whether its script succeeded or failed, and the failure reason based on some standard enumeration of reasons (e.g. out of gas, illegal math op, forbidden write, etc.). If the receipt doesn't match up, it can be proven fraudulent by simply running the last instruction of the trace.
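As a rough illustration of the metadata described above, here is a minimal Python sketch. The `Status` codes, the `Receipt` fields, and `is_fraudulent` are hypothetical names introduced for this example; the issue only lists the failure reasons informally and does not define an encoding.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical numbering; the issue names these reasons only as examples.
    SUCCESS = 0
    OUT_OF_GAS = 1
    ILLEGAL_MATH_OP = 2
    FORBIDDEN_WRITE = 3


@dataclass(frozen=True)
class Receipt:
    tx_id: bytes      # identifier of the transaction this receipt is for
    status: Status    # outcome claimed by the block proposer


def is_fraudulent(claimed: Receipt, recomputed_status: Status) -> bool:
    """A claimed receipt is fraudulent if re-execution ends differently."""
    return claimed.status != recomputed_status
```

In this sketch `recomputed_status` stands in for whatever a verifier obtains by re-running the final step of the execution trace.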

Return Values

Returning values from the VM up to an external process is of dubious value. Volatile in-memory data should probably never be returned in this way, and storage slots can be returned by simply querying the underlying KV store. The additional complexity of supporting variable-length return data might not be worth it.

Events

Events improve application developer ergonomics by having a commitment to all fired events in a block in the block header, filterable by topics. However, events are hugely burdensome to nodes and abused to no end for cheap indexed permanent storage. An alternative to native events would be to allow local simulation of individual transaction execution (which the VM design is intended to support) and "watching" particular storage slot changes.

Another alternative would be having a LOG opcode that does nothing in the VM itself other than accept parameters, and whose execution can be logged client-side. Whether this is then committed to in the block is outside the scope of the VM.
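The LOG idea can be sketched with a toy stack machine in Python. Everything here is an assumption made for illustration: the instruction tuples, the `PUSH`/`ADD` opcodes, and the `on_log` hook are not part of any FuelVM specification. The point is only that LOG leaves VM state untouched while a client may observe its parameters.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical instruction format: (opcode name, integer parameters).
Instruction = Tuple[str, Tuple[int, ...]]


def run(
    program: List[Instruction],
    on_log: Optional[Callable[[Tuple[int, ...]], None]] = None,
) -> List[int]:
    """Execute a toy program and return the final stack."""
    stack: List[int] = []
    for op, args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "LOG":
            # No effect on VM state; only the client-side hook sees it.
            if on_log is not None:
                on_log(args)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack
```

Running the same program with and without a hook gives the same final stack, which is the property that keeps logging out of consensus in this sketch.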

@adlerjohn adlerjohn added the question Further information is requested label Feb 12, 2021
@adlerjohn adlerjohn self-assigned this Feb 12, 2021
@SilentCicero
Member

@Arachnid if you could rebuild Ethereum again, how would you have handled Receipts / Logs and Events?

Do you think we should be doing more on the log front, or just have it as a basic way to send data to the outside world, but leave it up to clients to decide how data is managed and used?

@Arachnid

I actually think that Ethereum did event logging very well. The basic model is that a contract can log all the deltas to important information (for example, account balances on an ERC20) and then an external process can replay them to reconstruct the state. This state can include things that aren't actually stored in state at all, making it a very good match with attempts to reduce state storage.
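The replay model described above can be sketched in a few lines of Python. The `Transfer` shape and the `replay` helper are hypothetical and only loosely modelled on an ERC20 transfer event; a real indexer would decode events from receipts rather than take them as Python objects.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Transfer(NamedTuple):
    sender: str
    recipient: str
    amount: int


def replay(events: Iterable[Transfer]) -> Dict[str, int]:
    """Rebuild balances purely from logged deltas, in order."""
    balances: Dict[str, int] = defaultdict(int)
    for event in events:
        balances[event.sender] -= event.amount
        balances[event.recipient] += event.amount
    return dict(balances)
```

Nothing in `replay` reads contract storage, so the reconstructed view can include values the contract never keeps in state.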

Addressing John's points in order:

> This can be accomplished entirely outside of the VM, with the block proposer appending some metadata to each transaction indicating whether its script succeeded or failed, and the failure reason based on some standard enumeration of reasons (e.g. out of gas, illegal math op, forbidden write, etc.). If the receipt doesn't match up, it can be proven fraudulent by simply running the last instruction of the trace.

I'm not quite sure what this is getting at - how would a receipt be inside the VM?

> Return values from the VM up to an external process is of dubious value. Volatile in-memory data should probably never be returned in this way, and storage slots can be returned by simply querying the underlying KV store. The additional complexity of supporting variable-length return data might not be worth it.

Return values from transactions are one thing that I wish nodes did support. This is partly useful for ordinary return data, but also for propagating exception/error information effectively.

Tangentially, the receipt should definitely contain a commitment to the ending PC value, so that tooling can trivially point to where execution terminated, making it easier to debug and to associate (offchain) error messages etc.

Relying on callers understanding and fetching data from underlying storage is a bad idea. It makes abstraction impossible, and means that any standards (eg, ERC20) have to enforce how the contract stores data - that becomes part of the interface, when it should be an implementation detail. It also interacts badly with contracts that want to store data offchain, putting further burdens on clients.

> Events improve application developer ergonomics by having a commitment to all fired events in a block in the block header, filterable by topics. However, events are hugely burdensome to nodes and abused to no end for cheap indexed permanent storage. An alternative to native events would be to allow local simulation of individual transaction execution (which the VM design is intended to support) and "watching" particular storage slot changes.

Events are incredibly useful! It's not necessary to store the actual events in the blocks or receipts, but there should at least be enough indexing data to be able to narrow down which blocks and transactions contain information of interest, so that indexing nodes don't need to scan every single transaction when they care about only a single contract.

A lighter-weight option compared to Ethereum would be to only index on contract address and perhaps one user-specified field, which would be enough for nearly all sensible uses of events.
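A minimal Python sketch of such a lighter-weight index follows, assuming each event carries a contract address and a single user-chosen `topic`. The `Event` fields and both helper functions are hypothetical. The index records only which blocks are worth scanning, not the events themselves.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple


class Event(NamedTuple):
    block: int      # height of the block the event was fired in
    contract: str   # address of the emitting contract
    topic: str      # the single user-specified indexed field
    data: bytes     # unindexed payload


Index = Dict[Tuple[str, str], Set[int]]


def build_index(events: Iterable[Event]) -> Index:
    """Map (contract, topic) to the set of block heights containing a match."""
    index: Index = {}
    for event in events:
        index.setdefault((event.contract, event.topic), set()).add(event.block)
    return index


def blocks_of_interest(
    index: Index, contract: str, topic: Optional[str] = None
) -> List[int]:
    """Return the sorted block heights an indexing node needs to scan."""
    blocks: Set[int] = set()
    for (c, t), heights in index.items():
        if c == contract and (topic is None or t == topic):
            blocks |= heights
    return sorted(blocks)
```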

@SilentCicero
Member

@Arachnid @adlerjohn

  1. I second the program counter idea for logs, you will want to know if the entire program succeeded or got to the end of execution.
  2. I'm fine with the lighter-weight log option compared to Ethereum, but what would that mean for enforcement on the contract side?

@adlerjohn adlerjohn added the comp:FVM Component: FuelVM label Feb 18, 2021
@adlerjohn
Contributor Author

To follow up on this:

  1. A receipt and log root can be included for each transaction with non-zero script length. I propose a new transaction field for each. These don't need to be posted on-chain since they can be re-generated by running each transaction, but they do need to be committed to in the transactions root.
  2. The log root can be the root of an MMR or maybe a simple binary Merkle tree, since it's append-only.
  3. Since we support multiple native assets as of #117 (Support native tokens), transfer opcodes will also have to fire events, in addition to manual events through LOG.
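For the simple binary Merkle tree option, a log root could be computed along these lines. This Python sketch makes several assumptions that the thread does not settle: SHA-256 as the hash, `0x00`/`0x01` prefixes to separate leaves from inner nodes, and promoting an unpaired node to the next level unchanged.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def log_root(logs: List[bytes]) -> bytes:
    """Root of a binary Merkle tree over an append-only list of logs."""
    if not logs:
        return _h(b"")  # assumed convention for the empty tree
    # Leaf and inner-node prefixes are an assumption of this sketch.
    level = [_h(b"\x00" + log) for log in logs]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(_h(b"\x01" + level[i] + level[i + 1]))
            else:
                next_level.append(level[i])  # promote the unpaired node
        level = next_level
    return level[0]
```

This version recomputes the root from scratch on every call; an MMR would avoid that by keeping only the current peaks, at the cost of a slightly more involved structure.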

@adlerjohn adlerjohn added enhancement New feature or request and removed question Further information is requested labels Feb 28, 2021
@Arachnid

It sounds like you're conflating transactions and receipts? It should be possible to create an object that represents a transaction and not its result, right? And a receipt should be a separate object that commits to the transaction it's for, and what its result was.

Depending on your replay protection mechanism, too, you might need to allow for the possibility that one transaction could have multiple executions.

@adlerjohn
Contributor Author

adlerjohn commented Feb 28, 2021

> It sounds like you're conflating transactions and receipts? It should be possible to create an object that represents a transaction and not its result, right?

So...this is probably completely out of the blue since you probably haven't been following the conversations in this repo and especially not the conversations I've had with other-Nick offline (because they're, you know, offline 😂). But our scheme is to allow block proposers to malleate transaction data.

As an example, OutputVariable is an output whose recipient, amount, and color are not guaranteed at transaction sending time, because they're set by TRANSFEROUT opcodes during state-dependent execution. The way to guarantee transaction senders can properly sign over the transaction is simple and stupid: they just zero out any fields that are unknown at transaction signing time. Similarly for computing the transaction ID. When committing to a Merkle root of transactions, the block proposer will malleate these fields with the results from execution.
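The zero-for-signing scheme can be sketched in Python as follows. The field widths, the SHA-256 hash, and the `tx_id` and `committed_bytes` helpers are assumptions made for this illustration, not the actual transaction format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputVariable:
    recipient: bytes  # assumed 32 bytes
    amount: int       # assumed to fit in 8 bytes
    color: bytes      # assumed 32 bytes

    def serialize(self) -> bytes:
        return self.recipient + self.amount.to_bytes(8, "big") + self.color


# What the sender signs over: every malleable field zeroed.
ZEROED = OutputVariable(bytes(32), 0, bytes(32))


def tx_id(script: bytes, output: OutputVariable) -> bytes:
    """The ID ignores the malleable output, so execution cannot change it."""
    return hashlib.sha256(script + ZEROED.serialize()).digest()


def committed_bytes(script: bytes, output: OutputVariable) -> bytes:
    """What the block proposer commits to, with execution results filled in."""
    return script + output.serialize()
```

In this sketch the ID stays stable whatever the proposer writes into the output, while the committed bytes differ but keep the same length.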

This is a completely different paradigm from contemporary blockchains, where block proposers simply order transactions and that's it.

With that out of the way, it should be obvious how and why receipts should be a transaction field:

  1. It's simply merging what would be two Merkle trees into one Merkle tree, which means lower costs to submit block headers.
  2. We have so many malleable fields that if we had a separate Merkle tree for all of them, things would get super complicated. So we just merge everything into a single BP-malleable tree.

@Arachnid

It still seems like these should be two different structures? You can have a "transaction", which contains only the fields that the transaction issuer determines, and a "receipt" or "processed transaction" or whatever, that includes the transaction and all the additional fields. Otherwise it seems like you're going to have confusion between what it is a signer produces and sends to the chain, and what's included in the chain - calling them the same thing when one is a superset of the other.

@adlerjohn
Contributor Author

adlerjohn commented Feb 28, 2021

Maybe from a purist point of view? I don't think that inherently makes things simpler. One of the reasons to put malleable fields inline and zero them on signing is that the transaction is placed in memory on VM initialization. Another is that transaction parsing needs to be done in the EVM.

  1. If you put the malleable fields at the end, then now whenever you modify a certain output you need to go find out exactly where that field is all the way in a different place.
  2. If you put the malleable fields inline and just delete them completely for signing, then the size of the signed transaction isn't the same as the size of the final transaction. Which means you now need to do a bunch of individual memory copies as you parse the transaction if you want to compute the transaction hash, then shrink the byte array or use inline assembly. With zeroing out you can do a single memcopy then zero out the appropriate fields then abi.encodePacked and you're done.
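The trade-off between the two layouts can be illustrated on raw bytes. This Python sketch assumes the malleable fields are given as hypothetical `(offset, length)` pairs. Zeroing takes one copy followed by in-place writes and preserves the size, whereas deleting has to stitch together the pieces around each field.

```python
from typing import Iterable, List, Tuple

Field = Tuple[int, int]  # hypothetical (offset, length) of a malleable field


def zero_fields(final_tx: bytes, malleable: Iterable[Field]) -> bytes:
    """One copy, then overwrite each malleable field with zeros in place."""
    buf = bytearray(final_tx)
    for offset, length in malleable:
        buf[offset:offset + length] = bytes(length)
    return bytes(buf)


def strip_fields(final_tx: bytes, malleable: Iterable[Field]) -> bytes:
    """Delete each malleable field, copying the surrounding pieces one by one."""
    pieces: List[bytes] = []
    cursor = 0
    for offset, length in sorted(malleable):
        pieces.append(final_tx[cursor:offset])
        cursor = offset + length
    pieces.append(final_tx[cursor:])
    return b"".join(pieces)
```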

> you're going to have confusion

This can be abstracted away at the UI/library level. Users (and developers) should never need to know about this process, unless they want to do some low-level stuff, at which point they should be expected to be able to understand.

@Arachnid

Arachnid commented Mar 1, 2021

> Maybe from a purist point of view? I don't think that inherently makes things simpler.

From an outsider's point of view, having a single object represent both a transaction request signed by the sender, and the result of executing that transaction, is hideously confusing.

> One of the reasons to put malleable fields inline and zero them on signing is that the transaction is placed in memory on VM initialization.

If you need those fields, you can put a 'Receipt' into memory. If you don't, you can put a transaction in. One can be a prefix of the other.
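One way to read the prefix suggestion is sketched below in Python. The `Transaction` and `Receipt` layouts are hypothetical, with `end_pc` standing in for the ending PC commitment proposed earlier in the thread. The only property being illustrated is that a receipt's encoding begins with the encoding of the transaction it is for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    script: bytes  # what the sender determines and signs

    def serialize(self) -> bytes:
        # Assumed encoding: 2-byte length followed by the script bytes.
        return len(self.script).to_bytes(2, "big") + self.script


@dataclass(frozen=True)
class Receipt:
    tx: Transaction
    status: int   # assumed single-byte result code
    end_pc: int   # assumed 8-byte program counter at termination

    def serialize(self) -> bytes:
        # The transaction encoding comes first, so it is a strict prefix.
        return (
            self.tx.serialize()
            + bytes([self.status])
            + self.end_pc.to_bytes(8, "big")
        )
```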

> If you put the malleable fields inline and just delete them completely for signing, then the size of the signed transaction isn't the same as the size of the final transaction. Which means you now need to do a bunch of individual memory copies as you parse the transaction if you want to compute the transaction hash, then shrink the byte array or use inline assembly. With zeroing out you can do a single memcopy then zero out the appropriate fields then abi.encodePacked and you're done.

I think you're expecting I know some things about your data encoding here that I don't, because I don't understand what you're saying.

What I'm suggesting is as much about nomenclature as anything else: please don't give two different things the same name - even if it's "just at low level", look at the number of low-level Ethereum abstractions that leak through to higher-layer developers.

3 participants