-
Notifications
You must be signed in to change notification settings - Fork 491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StateProofs: New block header field - SHA256 merkle root of the transactions #3829
Conversation
30b5a09
to
3ca179a
Compare
Codecov Report
@@ Coverage Diff @@
## master #3829 +/- ##
==========================================
+ Coverage 49.80% 49.81% +0.01%
==========================================
Files 409 409
Lines 68929 69145 +216
==========================================
+ Hits 34332 34448 +116
- Misses 30891 30981 +90
- Partials 3706 3716 +10
Continue to review full report at Codecov.
|
data/bookkeeping/block.go
Outdated
@@ -539,6 +560,31 @@ func (block Block) paysetCommit(t config.PaysetCommitType) (crypto.Digest, error | |||
} | |||
} | |||
|
|||
func (block Block) paysetCommitSHA256() (crypto.Digest, error) { | |||
params, ok := config.Consensus[block.CurrentProtocol] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this condition is already checked in the caller function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need the params to verify that EnableSHA256TxnRootHeader
is on, just ignoring the ok
return value seems like bad practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think I'll just move the EnableSHA256TxnRootHeader
check to the caller as well instead
data/bookkeeping/block.go
Outdated
if err != nil { | ||
return crypto.Digest{}, err | ||
} | ||
// in case there are no leaves (e.g empty block with 0 txns) the merkle root is a slice with length of 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we use VC. the root will never be empty (even if there are no txn in the blocks).
This is due to the fact that we pad the VC to a full tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks quite good!
apart from some minor changes, I would recommend 4 tests
1- e2e test: run the network in the current version and make sure that TxnRoot.DigestSha256
is zero.
2-e2e test: run the network in the future version and make sure that TxnRoot.DigestSha256
is not zero.
3- a unit test that verifies that the catchpoint service makes sure that TxnRoot.DigestSha256
is present/not present according to the version.
4- a unit test that verifies that the evaluator makes sure that TxnRoot.DigestSha256
is present/not present according to the version.
data/bookkeeping/block.go
Outdated
@@ -131,6 +131,13 @@ type ( | |||
ParticipationUpdates | |||
} | |||
|
|||
// TxnRoot represents the root of the merkle tree generated from the transaction in this block. | |||
TxnRoot struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder if we can somehow write a test to ensure the old serialized and the new formats match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can use an instance of old serialized BlockHeader
hardcoded in some test and make sure it can be de-serialized correctly into the updated struct.
Update: since the TXID is generated using SHA512_256, using this as a leaf for transactions tree using SHA256 is useless, as you cannot verify that that transaction content is related to this specific TXID. |
daemon/algod/api/algod.oas3.yml
Outdated
@@ -630,7 +622,6 @@ | |||
} | |||
}, | |||
"required": [ | |||
"hashtype", | |||
"idx", | |||
"proof", | |||
"stibhash", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add a comment that guides the client to interpret the proof
field as a concatenated array in which every element is 32-bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could but I don't think it's specifically related to this PR, or this particular change. We didn't change that aspect of the proof.
// block, along with their ApplyData, as a Merkle tree vector commitment, using SHA256. This allows the | ||
// caller to either extract the root hash (for inclusion in the block | ||
// header), or to generate proofs of membership for transactions that are | ||
// in this block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add here that we use the sha256
also for TXID and for signedTransactionInBlock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's internal logic I'm not sure it's relevant here
copy(d[:], proofconcat) | ||
proof.Path = append(proof.Path, d[:]) | ||
proofconcat = proofconcat[len(d):] | ||
generateProof := func(h crypto.HashType, prfRsp generated.ProofResponse) (p merklearray.Proof) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just use a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only used by this test so why contaminate the namespace? also I think it's a bit easier to reason about since you can read the code in a sequential manner.
4263d31
to
323bba4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, couple minor comments.
@winder / @AlgoStephenAkiki could you have a look at REST API changes?
data/bookkeeping/block_test.go
Outdated
partitiontest.PartitionTest(t) | ||
a := require.New(t) | ||
|
||
// This serialized block header was generated from V32 e2e test, using the old BlockHeader struct which contains only TxnRoot SHA512_256 value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you maybe add the commands used to obtain this value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked out the master branch and added encode + print in the middle of an e2e test to get a somewhat full BlockHeader
.
If it breaks you can alway checkout the previous commit and look at the actual decoded struct, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I understand. Just add couple pointers, like:
- edit XYZ test
- add the following (example)
blk, err := ledger.Block(10)
require.NoError(t, err)
fmt.Printf("%x\n", protocol.Encode(&blk))
This is good for the tests maintainability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is we have to checkout an old commit to encode a BlockHeader
, since I'm changing the struct in this PR. Is there a better way you can think of or should I just add some comment at to how to do that and maybe hint a commit hash?
"sha512_256", | ||
"sha256" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would merkle-tree or vector-commitment be better names here? That seems like it would be more useful/descriptive than the generic hashing function being used to represent these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are necessary to verify the proof, but the SHA512_256 not being supported natively on the EVM is the reason we've added this parameter, vector commitment is just an optimization of merkle tree that's a bit more secure cryptographically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand. I thought this was switching between two different algorithms. But this is literally just using a different hash function while otherwise doing the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it's actually both, the different hash function was a necessity, while the different algorithm is used since it it's a "better" implementation of merkle tree (it's not actually necessary but since it has been integrated it should almost always be used instead of a standard merkle tree).
Do you think it should be more explicit in the parameter?
"sha512_256_merkle",
"sha256_vector"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes requested.
Co-authored-by: Will Winder <[email protected]>
Co-authored-by: Will Winder <[email protected]>
data/bookkeeping/block.go
Outdated
DigestSha256 crypto.Digest `codec:"txn256"` // root of transaction vector commitment merkle tree using SHA256 hash function | ||
DigestSha512_256 crypto.Digest `codec:"txn"` // root of transaction merkle tree using SHA512_256 hash function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These names don't seem helpful to me. DigestSha256 and DigestSha512_256 are implementation details.
Previously the name was simply TxnRoot
. The name says nothing about the data except for what it represents. With the new names, there is nothing about what the data represents, just how it was implemented.
This is similar to the conversation we had about the REST APIs. I think this would be significantly clearer if the variables were named MerkleTree
and VectorCommitment
.
@algorandskiy I'm curious what you think about this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, MerkleTree
or MerkleTreeTxnRoot
sounds better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid it is more complicated than that. we need to remember that TxnRoot
might also represent a flat commitment (i.e hashing the concatenation of all the paysets in the block). That was the implementation on earlier protocol versions. Therefore I'm not sure using the word merkleTree
or VectorCommitment
is good.
so we might want to change the structure name to TxnCommitment
or similar, but those fields above are just the result of hash and cannot really state what was the input
@@ -574,6 +575,10 @@ func (v2 *Handlers) GetProof(ctx echo.Context, round uint64, txid string, params | |||
return badRequest(ctx, err, errNoTxnSpecified, v2.Log) | |||
} | |||
|
|||
if params.Hashtype != nil && *params.Hashtype != "sha512_256" && *params.Hashtype != "sha256" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: UnmarshalHashType
and/or Sha512_256.String()
/ Sha256.String()
can be used to simplify some of these checks and leverage the compiler to validate these strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Stibhash: stibhash[:], | ||
Idx: uint64(idx), | ||
Treedepth: uint64(proof.TreeDepth), | ||
Hashtype: proof.HashFactory.HashType.String(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you removed hashtype from the response, but it is still listed as required in the ProofResponse YAML spec...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should ensure make msgp has been run for changes that impact msgp serialization on CI builds. It looks like #3829 was merged but missed changes to agreement/msgp_gen.go and gci updates from algorand/msgp#14 were not incorporated into #3919.
…actions (algorand#3829) Currently, the TxnRoot block header contains the root of the merkle tree built from the transactions in the block, using the SHA512_256 hash function. Since the Ethereum VM (and others) does not support SHA512_256 natively, we have added a new header, which will be used by the Light Clients deployed on other networks in order to verify Algorand blocks. Co-authored-by: algoidan <[email protected]>
This should ensure make msgp has been run for changes that impact msgp serialization on CI builds. It looks like algorand#3829 was merged but missed changes to agreement/msgp_gen.go and gci updates from algorand/msgp#14 were not incorporated into algorand#3919.
Summary
Currently, the TxnRoot block header contains the root of the merkle tree built from the transactions in the block, using the SHA512_256 hash function.
Since the Ethereum VM (and others) does not support SHA512_256 natively, we have added a new header, which will be used by the Light Clients deployed on other networks in order to verify Algorand blocks.