StateProofs: New block header field - SHA256 merkle root of the transactions #3829

Merged on May 10, 2022 (35 commits)

Conversation

@Aharonee Aharonee commented Mar 24, 2022

Summary

Currently, the TxnRoot block header field contains the root of the Merkle tree built from the transactions in the block, using the SHA512_256 hash function.
Since the Ethereum VM (and others) does not support SHA512_256 natively, we have added a new header field, which will be used by light clients deployed on other networks to verify Algorand blocks.
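
To make the distinction concrete, here is a minimal sketch (not code from this PR) that hashes the same bytes with both functions using only the Go standard library. The helper name `digests` and the sample input are illustrative assumptions. Both digests are 32 bytes long, but they are unrelated values, so a verifier that only has SHA256 available cannot check a root built with SHA512_256.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
)

// digests returns the SHA-512/256 and SHA-256 digests of the same input.
// Both are 32 bytes, but they come from different hash functions.
// (Hypothetical helper, for illustration only.)
func digests(data []byte) (legacy [32]byte, evmFriendly [32]byte) {
	return sha512.Sum512_256(data), sha256.Sum256(data)
}

func main() {
	legacy, evmFriendly := digests([]byte("example payset bytes"))
	fmt.Printf("sha512_256: %x\n", legacy)
	fmt.Printf("sha256:     %x\n", evmFriendly)
}
```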

@Aharonee Aharonee changed the title Crypto: New block header field - SHA256 merkle root of the transactions StateProofs: New block header field - SHA256 merkle root of the transactions Mar 24, 2022
@Aharonee Aharonee force-pushed the SHA256-Block-Header branch from 30b5a09 to 3ca179a Compare March 27, 2022 08:19
codecov-commenter commented Mar 27, 2022

Codecov Report

Merging #3829 (396fc3b) into master (5925aff) will increase coverage by 0.01%.
The diff coverage is 69.93%.

@@            Coverage Diff             @@
##           master    #3829      +/-   ##
==========================================
+ Coverage   49.80%   49.81%   +0.01%     
==========================================
  Files         409      409              
  Lines       68929    69145     +216     
==========================================
+ Hits        34332    34448     +116     
- Misses      30891    30981      +90     
- Partials     3706     3716      +10     
Impacted Files                                   Coverage Δ
daemon/algod/api/server/v1/handlers/handlers.go  0.63% <0.00%> (ø)
daemon/algod/api/server/v2/handlers.go           0.00% <0.00%> (ø)
data/bookkeeping/genesis.go                      0.00% <0.00%> (ø)
data/transactions/signedtxn.go                   29.62% <0.00%> (-3.71%) ⬇️
data/transactions/transaction.go                 35.80% <0.00%> (-0.35%) ⬇️
libgoal/libgoal.go                               2.78% <0.00%> (ø)
netdeploy/remote/deployedNetwork.go              19.77% <0.00%> (ø)
crypto/hashes.go                                 56.09% <33.33%> (-3.91%) ⬇️
ledger/internal/eval.go                          67.28% <33.33%> (ø)
data/bookkeeping/block.go                        56.61% <65.21%> (+1.37%) ⬆️
... and 12 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -539,6 +560,31 @@ func (block Block) paysetCommit(t config.PaysetCommitType) (crypto.Digest, error
}
}

func (block Block) paysetCommitSHA256() (crypto.Digest, error) {
params, ok := config.Consensus[block.CurrentProtocol]
Contributor

I think that this condition is already checked in the caller function.

Contributor Author

I still need the params to verify that EnableSHA256TxnRootHeader is on; just ignoring the ok return value seems like bad practice.

Contributor Author

Actually, I think I'll just move the EnableSHA256TxnRootHeader check to the caller as well instead.

if err != nil {
return crypto.Digest{}, err
}
// in case there are no leaves (e.g empty block with 0 txns) the merkle root is a slice with length of 0.
Contributor

Since we use a vector commitment (VC), the root will never be empty, even if there are no transactions in the block.
This is because we pad the VC to a full tree.
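
The padding argument can be illustrated with a simplified sketch. This is not go-algorand's merklearray implementation: the `"L"` leaf prefix, the `"B"` padding marker, and the function names are hypothetical choices made only for this example. The point it shows is that once every slot in a full binary tree holds some digest, the root is a real hash even when the block contributes zero leaves.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash hashes a real leaf with a hypothetical "L" domain prefix.
func leafHash(data []byte) [32]byte {
	return sha256.Sum256(append([]byte("L"), data...))
}

// bottomHash is the hypothetical digest placed in padding slots.
func bottomHash() [32]byte {
	return sha256.Sum256([]byte("B"))
}

// paddedRoot pads the leaves to a full binary tree (width is a power of
// two, at least 1) and hashes sibling pairs upward until one node remains.
func paddedRoot(leaves [][]byte) [32]byte {
	width := 1
	for width < len(leaves) {
		width *= 2
	}
	level := make([][32]byte, width)
	for i := range level {
		if i < len(leaves) {
			level[i] = leafHash(leaves[i])
		} else {
			level[i] = bottomHash()
		}
	}
	for len(level) > 1 {
		next := make([][32]byte, len(level)/2)
		for i := range next {
			var pair [64]byte
			copy(pair[:32], level[2*i][:])
			copy(pair[32:], level[2*i+1][:])
			next[i] = sha256.Sum256(pair[:])
		}
		level = next
	}
	return level[0]
}

func main() {
	fmt.Printf("empty block root: %x\n", paddedRoot(nil))
	fmt.Printf("3-txn block root: %x\n",
		paddedRoot([][]byte{[]byte("a"), []byte("b"), []byte("c")}))
}
```

In this sketch an empty input still yields a single padding slot, so the root is the padding digest rather than a zero-length or all-zero value.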

Contributor

@id-ms id-ms left a comment

Overall this looks quite good!
Apart from some minor changes, I would recommend four tests:
  1. e2e test: run the network on the current version and make sure that TxnRoot.DigestSha256 is zero.
  2. e2e test: run the network on the future version and make sure that TxnRoot.DigestSha256 is not zero.
  3. Unit test: verify that the catchpoint service checks that TxnRoot.DigestSha256 is present or absent according to the version.
  4. Unit test: verify that the evaluator checks that TxnRoot.DigestSha256 is present or absent according to the version.
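
The property these tests target can be sketched as a small version-gated check. Everything below is a hypothetical stand-in rather than the evaluator or catchpoint code: `consensusParams` models only the EnableSHA256TxnRootHeader flag mentioned in this PR, and `checkSHA256Root` and the error values are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type digest [32]byte

// consensusParams is a hypothetical stand-in for the protocol parameters;
// only the flag discussed in this PR is modeled.
type consensusParams struct {
	EnableSHA256TxnRootHeader bool
}

var (
	errUnexpectedRoot = errors.New("SHA256 txn root is set but the protocol version does not support it")
	errRootMismatch   = errors.New("SHA256 txn root does not match the recomputed value")
)

// checkSHA256Root requires the field to be zero before the feature is
// enabled and to equal the recomputed root once it is enabled.
func checkSHA256Root(p consensusParams, got, want digest) error {
	if !p.EnableSHA256TxnRootHeader {
		if got != (digest{}) {
			return errUnexpectedRoot
		}
		return nil
	}
	if got != want {
		return errRootMismatch
	}
	return nil
}

func main() {
	var zero, root digest
	root[0] = 1 // placeholder for a recomputed root
	current := consensusParams{}
	future := consensusParams{EnableSHA256TxnRootHeader: true}
	fmt.Println(checkSHA256Root(current, zero, root)) // field absent on current version
	fmt.Println(checkSHA256Root(future, zero, root))  // field missing on future version
}
```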

data/bookkeeping/block.go (outdated, resolved)
@Aharonee Aharonee marked this pull request as ready for review March 29, 2022 13:57
config/consensus.go (outdated, resolved)
daemon/algod/api/client/restClient.go (outdated, resolved)
daemon/algod/api/server/v2/handlers.go (outdated, resolved)
data/bookkeeping/block.go (outdated, resolved)
data/bookkeeping/block.go (outdated, resolved)
@Aharonee Aharonee assigned cce and unassigned cce Mar 31, 2022
@Aharonee Aharonee requested review from cce and algorandskiy March 31, 2022 10:44
@Aharonee Aharonee self-assigned this Mar 31, 2022
daemon/algod/api/server/v2/handlers.go (resolved)
@@ -131,6 +131,13 @@ type (
ParticipationUpdates
}

// TxnRoot represents the root of the merkle tree generated from the transaction in this block.
TxnRoot struct {
Contributor

I wonder if we can somehow write a test to ensure that the old and new serialized formats match.

Contributor Author

I can hardcode an old serialized BlockHeader in a test and make sure it deserializes correctly into the updated struct.
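
A minimal sketch of that kind of compatibility test follows. It uses `encoding/json` from the standard library as a stand-in for the msgpack codec, and the `oldHeader` and `newHeader` structs are reduced, hypothetical versions of the real BlockHeader; only the `txn` and `txn256` keys are taken from this PR. The idea is that bytes produced by the old struct decode into the new struct with the added field left empty, and re-encoding yields the same bytes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldHeader is a reduced, hypothetical model of the header before this change.
type oldHeader struct {
	Round   uint64 `json:"rnd"`
	TxnRoot []byte `json:"txn,omitempty"`
}

// newHeader adds the SHA256 root; omitempty keeps old encodings unchanged.
type newHeader struct {
	Round      uint64 `json:"rnd"`
	TxnRoot    []byte `json:"txn,omitempty"`
	TxnRoot256 []byte `json:"txn256,omitempty"`
}

func encodeOld(h oldHeader) ([]byte, error) { return json.Marshal(h) }
func encodeNew(h newHeader) ([]byte, error) { return json.Marshal(h) }

// decodeNew decodes bytes (possibly produced by the old struct) into the new struct.
func decodeNew(b []byte) (newHeader, error) {
	var h newHeader
	err := json.Unmarshal(b, &h)
	return h, err
}

func main() {
	oldBytes, err := encodeOld(oldHeader{Round: 10, TxnRoot: []byte{1, 2, 3}})
	if err != nil {
		panic(err)
	}
	decoded, err := decodeNew(oldBytes)
	if err != nil {
		panic(err)
	}
	newBytes, err := encodeNew(decoded)
	if err != nil {
		panic(err)
	}
	fmt.Println("new field empty:", len(decoded.TxnRoot256) == 0)
	fmt.Println("encodings match:", string(oldBytes) == string(newBytes))
}
```

In a real test the old bytes would be a hardcoded constant captured from the previous struct rather than produced in the same run.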

data/bookkeeping/block_test.go (outdated, resolved)
data/bookkeeping/block_test.go (outdated, resolved)
test/e2e-go/features/transactions/proof_test.go (outdated, resolved)
@Aharonee
Contributor Author

Update: since the TXID is generated using SHA512_256, using it as a leaf in a transaction tree built with SHA256 is useless, because a verifier cannot check that the transaction content corresponds to that specific TXID.
I've added support for a SHA256 TXID for this specific use case.

@@ -630,7 +622,6 @@
}
},
"required": [
"hashtype",
"idx",
"proof",
"stibhash",
Contributor

Should we add a comment that guides the client to interpret the proof field as a concatenated array in which every element is 32 bytes?

Contributor Author

We could, but I don't think it's specifically related to this PR or to this particular change. We didn't change that aspect of the proof.
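
For readers of the API, the interpretation described in this thread can be sketched as follows: the proof bytes are treated as a concatenation of 32-byte digests and split into individual path elements. The function name `splitProof` and the error text are assumptions for this example, not part of the REST API or of go-algorand.

```go
package main

import (
	"errors"
	"fmt"
)

const digestSize = 32

// splitProof splits a concatenated proof into 32-byte path elements.
// The returned slices alias the input rather than copying it.
func splitProof(concat []byte) ([][]byte, error) {
	if len(concat)%digestSize != 0 {
		return nil, errors.New("proof length is not a multiple of 32 bytes")
	}
	path := make([][]byte, 0, len(concat)/digestSize)
	for len(concat) > 0 {
		path = append(path, concat[:digestSize])
		concat = concat[digestSize:]
	}
	return path, nil
}

func main() {
	proof := make([]byte, 3*digestSize) // placeholder bytes, not a real proof
	path, err := splitProof(proof)
	if err != nil {
		panic(err)
	}
	fmt.Println("path elements:", len(path))
}
```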

daemon/algod/api/server/v2/test/handlers_test.go (outdated, resolved)
daemon/algod/api/server/v2/test/handlers_test.go (outdated, resolved)
data/bookkeeping/block_test.go (resolved)
// block, along with their ApplyData, as a Merkle tree vector commitment, using SHA256. This allows the
// caller to either extract the root hash (for inclusion in the block
// header), or to generate proofs of membership for transactions that are
// in this block.
Contributor

Should we add here that we also use SHA256 for the TXID and for signedTransactionInBlock?

Contributor Author

That's internal logic; I'm not sure it's relevant here.

data/transactions/signedtxn.go (outdated, resolved)
data/transactions/transaction.go (outdated, resolved)
test/e2e-go/features/transactions/proof_test.go (outdated, resolved)
copy(d[:], proofconcat)
proof.Path = append(proof.Path, d[:])
proofconcat = proofconcat[len(d):]
generateProof := func(h crypto.HashType, prfRsp generated.ProofResponse) (p merklearray.Proof) {
Contributor

why not just use a function?

Contributor Author

It's only used by this test, so why contaminate the namespace? Also, I think it's a bit easier to reason about, since you can read the code sequentially.

@Aharonee Aharonee force-pushed the SHA256-Block-Header branch 2 times, most recently from 4263d31 to 323bba4 Compare April 25, 2022 08:05
Contributor

@algorandskiy algorandskiy left a comment

LGTM, a couple of minor comments.
@winder / @AlgoStephenAkiki could you have a look at REST API changes?

partitiontest.PartitionTest(t)
a := require.New(t)

// This serialized block header was generated from V32 e2e test, using the old BlockHeader struct which contains only TxnRoot SHA512_256 value
Contributor

could you maybe add the commands used to obtain this value?

Contributor Author

I checked out the master branch and added an encode and print in the middle of an e2e test to get a reasonably complete BlockHeader.
If it breaks, you can always check out the previous commit and look at the actual decoded struct, right?

Contributor

Yes, I understand. Just add a couple of pointers, like:

  1. edit XYZ test
  2. add the following (example)
blk, err := ledger.Block(10)
require.NoError(t, err)
fmt.Printf("%x\n", protocol.Encode(&blk))

This is good for test maintainability.

Contributor Author

The problem is that we have to check out an old commit to encode a BlockHeader, since I'm changing the struct in this PR. Is there a better way you can think of, or should I just add a comment on how to do that and maybe mention a commit hash?

data/bookkeeping/txn_merkle.go (outdated, resolved)
Comment on lines +555 to +556
"sha512_256",
"sha256"
Contributor

Would merkle-tree or vector-commitment be better names here? That seems like it would be more useful/descriptive than the generic hashing function being used to represent these things.

Contributor Author

Both are necessary to verify the proof, but the reason we added this parameter is that SHA512_256 is not supported natively on the EVM. The vector commitment is just an optimization of the Merkle tree that is a bit more secure cryptographically.

Contributor

I think I understand. I thought this was switching between two different algorithms. But this is literally just using a different hash function while otherwise doing the same thing?

Contributor Author

So it's actually both: the different hash function was a necessity, while the different algorithm is used because it's a "better" implementation of a Merkle tree (it's not strictly necessary, but since it has been integrated it should almost always be used instead of a standard Merkle tree).
Do you think it should be more explicit in the parameter?

"sha512_256_merkle",
"sha256_vector"

Contributor

@winder winder left a comment

Minor changes requested.

Comment on lines 137 to 138
DigestSha256 crypto.Digest `codec:"txn256"` // root of transaction vector commitment merkle tree using SHA256 hash function
DigestSha512_256 crypto.Digest `codec:"txn"` // root of transaction merkle tree using SHA512_256 hash function
Contributor

These names don't seem helpful to me. DigestSha256 and DigestSha512_256 are implementation details.

Previously the name was simply TxnRoot. The name says nothing about the data except for what it represents. With the new names, there is nothing about what the data represents, just how it was implemented.

This is similar to the conversation we had about the REST APIs. I think this would be significantly clearer if the variables were named MerkleTree and VectorCommitment.

@algorandskiy I'm curious what you think about this too.

Contributor

I agree, MerkleTree or MerkleTreeTxnRoot sounds better

Contributor

@id-ms id-ms May 4, 2022

I'm afraid it is more complicated than that. We need to remember that TxnRoot might also represent a flat commitment (i.e., a hash of the concatenation of all the paysets in the block). That was the implementation in earlier protocol versions. Therefore, I'm not sure that using the words MerkleTree or VectorCommitment is a good idea.

So we might want to change the structure name to TxnCommitment or similar, but the fields above are just the result of a hash and cannot really state what the input was.

@@ -574,6 +575,10 @@ func (v2 *Handlers) GetProof(ctx echo.Context, round uint64, txid string, params
return badRequest(ctx, err, errNoTxnSpecified, v2.Log)
}

if params.Hashtype != nil && *params.Hashtype != "sha512_256" && *params.Hashtype != "sha256" {
Contributor

@winder winder May 10, 2022

nit: UnmarshalHashType and/or Sha512_256.String() / Sha256.String() can be used to simplify some of these checks and leverage the compiler to validate these strings.
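
The suggestion can be sketched with a self-contained, hypothetical hash-type enum. The `HashType` type, `parseHashType`, and `resolveHashType` below are written for this example and are not the go-algorand crypto package API; the choice of SHA512_256 as the default when the parameter is absent is also an assumption. The design point is that the accepted strings are derived from the typed constants' `String()` values, so the handler does not repeat string literals.

```go
package main

import "fmt"

// HashType is a hypothetical enum of supported hash functions.
type HashType uint8

const (
	Sha512_256 HashType = iota
	Sha256
)

// String returns the wire name of the hash type.
func (h HashType) String() string {
	switch h {
	case Sha512_256:
		return "sha512_256"
	case Sha256:
		return "sha256"
	default:
		return "unknown"
	}
}

// parseHashType maps a wire name back to its HashType by comparing
// against String(), so names are defined in exactly one place.
func parseHashType(s string) (HashType, error) {
	for _, h := range []HashType{Sha512_256, Sha256} {
		if s == h.String() {
			return h, nil
		}
	}
	return 0, fmt.Errorf("unsupported hash type %q", s)
}

// resolveHashType handles an optional query parameter.
// Assumption: a missing parameter falls back to SHA512_256.
func resolveHashType(param *string) (HashType, error) {
	if param == nil {
		return Sha512_256, nil
	}
	return parseHashType(*param)
}

func main() {
	requested := "sha256"
	h, err := resolveHashType(&requested)
	fmt.Println(h, err)
	bad := "md5"
	_, err = resolveHashType(&bad)
	fmt.Println(err)
}
```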

Contributor

@winder winder left a comment

LGTM

@algorandskiy algorandskiy merged commit bd18d04 into algorand:master May 10, 2022
Stibhash: stibhash[:],
Idx: uint64(idx),
Treedepth: uint64(proof.TreeDepth),
Hashtype: proof.HashFactory.HashType.String(),
Contributor

So you removed hashtype from the response, but it is still listed as required in the ProofResponse YAML spec...

algorandskiy pushed a commit that referenced this pull request May 17, 2022
This should ensure make msgp has been run for changes
that impact msgp serialization on CI builds.

It looks like #3829 was merged but missed changes to agreement/msgp_gen.go
and gci updates from algorand/msgp#14 were not incorporated into #3919.
PhearZero pushed a commit to PhearNet/crypto that referenced this pull request Jan 17, 2025
…actions (algorand#3829)

Currently, the TxnRoot block header contains the root of the merkle tree
built from the transactions in the block, using the SHA512_256 hash function.
Since the Ethereum VM (and others) does not support SHA512_256 natively,
we have added a new header, which will be used by the Light Clients
deployed on other networks in order to verify Algorand blocks.

Co-authored-by: algoidan <[email protected]>
PhearZero pushed a commit to PhearNet/crypto that referenced this pull request Jan 17, 2025
This should ensure make msgp has been run for changes
that impact msgp serialization on CI builds.

It looks like algorand#3829 was merged but missed changes to agreement/msgp_gen.go
and gci updates from algorand/msgp#14 were not incorporated into algorand#3919.
6 participants