Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix merkle tree malleability #7

Conversation

GarrickOllivander
Copy link
Contributor

Duplicating last transaction hash for odd number of tree leafs allows for malleability, as
MerkleHash([a,b,c])=MerkleHash([a,b,c,c])
The receiver could not tell if the tree had odd number of leafs or the tree was corrupt containing transaction c twice.

To fix, define
MerkleHash[a,b,c]=MerkleHash([a,b,c,0])

Assuming that it is impossible to create valid transaction with hash 0

This issue came up with Bitcoin, see https://bitcointalk.org/index.php?topic=102395.0

@merope07
Copy link
Contributor

merope07 commented Oct 29, 2016

Good catch Garrick -- another potential issue with the current Merkle tree code is that it may be possible to make an input or output that looks like a hash, which provides another sort of malleability. This is nearly a problem in Bitcoin (we're safe because it requires an impractical amount of grinding to create a transaction that looks like a hash) and not a problem in grin in practice (unlike Bitcoin we have no scripts or any other "freeform" fields so the grinding is even worse). But as a matter of defense in depth I'd like to kill this dead.

I've been working on a PR to replace the current Merkle tree code with Peter Todd's MMR algorithm prepending every element with the current tree depth before hashing. This is described here and I have a working C implementation that shouldn't be too hard to port to Rust.

In addition to being a well-analyzed algorithm this also makes presentations of pruned trees much simpler, because the prepended heights tell the verifier enough of the shape of the tree that it can infer what is a parent of what, and also presents no malleability vectors. I think the current codebase has no use for pruned trees, but this may be something we want to support in the future, e.g. so that nodes can store only a small random percentage of the blockchain while being able to bootstrap the stuff that they do have.

I also want to make the code a bit more efficient and generalize it a bit so that it will support sum-trees without needing additional algorithmic support. We may want to store inputs and outputs in sum-trees because this enables more forms of partial validation.

If you're working on hashing/Merkle structures too we should probably coordinate. What are your thoughts on this?

~M

@GarrickOllivander
Copy link
Contributor Author

I am only poking around in the code for now and learning the language. Would not yet attack a bigger problem. As for MMR I like the idea. You are likely aware of Peter's more recent writeup on its use, but here for reference: https://petertodd.org/2016/delayed-txo-commitments

@merope07
Copy link
Contributor

Ok, cool, I'll write a general sum-MMR algorithm, then I propose the following data structures:

  • transaction Inputs/Outputs in a tree where we sum all commitment values
  • TxProof in a tree where all the "remainder" values are summed (then checking the sum means checking these two roots are the same). Signatures TBD, I understand Andrew is thinking about aggregate Schnorr signatures so maybe we will have one signature outside of the tree (?)
  • UTXO commitment is a sum tree of "spent count" (and gets updated in place when txos are spent). Then given a hash at depth n a verifier knows she needs more data iff the spent-count is less than 2^n

I'll give more details in a PR once I've worked everything out.

@ignopeverell
Copy link
Contributor

Nicely spotted @GarrickOllivander, thanks for taking a look, I'll apply this pull request as it's a net improvement regardless of other additions.

The current Merkle tree code was mostly a placeholder, you'll see that I'm a firm believer in incremental implementation. And we're going to need soonish something to represent the UTXO Merkle tree. We'll need cheap lookups, additions and deletions as well as efficient store and update operations on storage (likely key/value). I'd also like a solution that's as simple as possible, Ethereum's Patricia tree makes me shudder. Believe @apoelstra had some ideas on the subject as well.

Any exploration and help here would be greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants