wire: cache the non-witness serialization of MsgTx to memoize part of…#1376
Roasbeef wants to merge 4 commits into btcsuite:master
There's a bug here in that it doesn't detect mutations in the underlying |
|
@jcvernaleo (as per #1530)
|
|
Here's a recent run late in the chain: From this we can see that when checking signatures, we actually spend most of our time just encoding the tx again to compute the sighash. Here's a flame graph for another view: |
|
I need a refresher on this but if memory serves right, doing away with the channel stuff in binaryFreeList and using a sync.Pool got rid of a big chunk of the time taken.
But yeah if this is something that doesn't need to happen in the first place with memoization that'd be much better. |
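For illustration, a minimal sketch of the `sync.Pool` idea mentioned above. This is not the btcd `binaryFreeList` code; `scratchPool` and `appendUint32` are hypothetical names, and the 8-byte buffer size is an assumption chosen so the buffer can hold any fixed-width integer.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
)

// scratchPool hands out reusable 8-byte scratch buffers. It is a
// hypothetical stand-in for a channel-backed free list.
var scratchPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 8)
		return &b
	},
}

// appendUint32 encodes v as little-endian using a pooled scratch buffer
// and appends the four bytes to dst.
func appendUint32(dst []byte, v uint32) []byte {
	bp := scratchPool.Get().(*[]byte)
	buf := (*bp)[:4]
	binary.LittleEndian.PutUint32(buf, v)
	dst = append(dst, buf...)
	scratchPool.Put(bp) // return the buffer for reuse
	return dst
}

func main() {
	out := appendUint32(nil, 1)
	fmt.Printf("%x\n", out) // 01000000
}
```

Storing a pointer to the slice in the pool avoids an extra allocation when the value is boxed into an interface on `Put`.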
|
Related PR re binary free list: #1426 |
… TxHash In this commit, we add a new field to the `MsgTx` struct: `cachedSeralizedNoWitness`. As we decode the main transaction, we use an `io.TeeReader` to copy the non-witness bytes into this new field. As a result, we can fully cache the tx serialization needed to compute the TxHash. That serialization shows up on profiles during IBD, so caching this value lets us optimize TxHash calculation across the entire daemon.
c73570b to 0924825
|
Rebased! |
|
Dug a bit, and I realize the issue is in the test itself, it mutates transactions, with stuff like |
|
Pushed a tx implementing the above, don't think it's ideal though...worried about weird edge cases where transaction generation code uses a tx as a scratch pad and makes multiple versions to sign, with this it'll give the wrong Also added a commit that'll re-use the non-witness serialization for |
Crypt-iQ left a comment
I looked in both btcd and lnd to see if we modify wire.MsgTx after calculating the hash, but couldn't find any instances. The code looks good, but I think we need to be extra careful and see if there are any instances across our codebases
func (msg *MsgTx) SerializeNoWitness(w io.Writer) error {
	if msg.cachedSeralizedNoWitness != nil {
		w.Write(msg.cachedSeralizedNoWitness)
	}
// the rawTxTeeReader here, as these are segwit specific bytes.
var (
	flag [1]byte
	hasWitneess bool
guggero left a comment
While I like the elegance of using an io.TeeReader (TIL by the way), I'm not sure if this would break a bunch of things outside of btcd, namely in lnd and related projects.
What if we make the caching optional with an additional struct member boolean (or if memory footprint is a concern, use a bit within the Version field) that turns the caching on?
That way we could both benefit from the optimization within btcd while not breaking any assumptions for outside code.
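One way the opt-in suggestion could look, as a sketch only. `optTx`, the `CacheHash` flag, and the `serializations` counter are hypothetical and exist purely to show that the zero value keeps today's behavior while opted-in callers skip repeated serialization.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// optTx is a hypothetical toy transaction with opt-in caching.
type optTx struct {
	Version        uint32
	LockTime       uint32
	CacheHash      bool   // hypothetical flag; false keeps the old behavior
	cached         []byte // only populated when CacheHash is true
	serializations int    // counts real serializations, for the demo
}

// serialize returns the encoded bytes, reusing the cache only when the
// caller has opted in.
func (t *optTx) serialize() []byte {
	if t.CacheHash && t.cached != nil {
		return t.cached
	}
	t.serializations++
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:4], t.Version)
	binary.LittleEndian.PutUint32(buf[4:8], t.LockTime)
	if t.CacheHash {
		t.cached = buf
	}
	return buf
}

// hash double-SHA256s the serialization.
func (t *optTx) hash() [32]byte {
	first := sha256.Sum256(t.serialize())
	return sha256.Sum256(first[:])
}

func main() {
	plain := &optTx{Version: 2}
	plain.hash()
	plain.hash()

	opted := &optTx{Version: 2, CacheHash: true}
	opted.hash()
	opted.hash()

	fmt.Println(plain.serializations, opted.serializations) // 2 1
}
```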
// Write out the actual number of inputs as this won't be the very byte
// series after the versino of segwit transactions.
if WriteVarInt(&rawTxBuf, pver, count); err != nil {
Need to add err := WriteVarInt().
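To spell out why the missing assignment matters, here is a self-contained sketch. `writeCount` is a hypothetical stand-in for the real call, and `failingWriter` exists only to force an error; the point is that the buggy form compiles, discards the call's result, and tests whatever `err` already held.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingWriter always returns an error from Write.
type failingWriter struct{}

func (failingWriter) Write(p []byte) (int, error) {
	return 0, errors.New("write failed")
}

// writeCount is a hypothetical stand-in for the varint write: it emits a
// single count byte and reports any write error.
func writeCount(w io.Writer, count uint8) error {
	_, err := w.Write([]byte{count})
	return err
}

// encodeBuggy mirrors the flagged pattern: the call's error is dropped
// and a stale err from the enclosing scope is tested instead.
func encodeBuggy(w io.Writer, count uint8) error {
	var err error // in real code this would be set by earlier steps
	if writeCount(w, count); err != nil {
		return err
	}
	return nil
}

// encodeFixed assigns the call's result before testing it.
func encodeFixed(w io.Writer, count uint8) error {
	if err := writeCount(w, count); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(encodeBuggy(failingWriter{}, 3)) // <nil>
	fmt.Println(encodeFixed(failingWriter{}, 3)) // write failed
}
```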
// this transaction without witness data. When we decode a transaction,
// we'll write out the non-witness bytes to this so we can quickly
// calculate the TxHash later if needed.
cachedSeralizedNoWitness []byte
nit: s/cachedSeralizedNoWitness/cachedSerializedNoWitness/
// useful to be able to get the correct txid after mutating a transaction's
// state.
func (msg *MsgTx) WipeCache() {
	msg.cachedSeralizedNoWitness = nil
Don't we also need to call this in the helper methods like AddTxIn() and AddTxOut()? Or is the assumption that TxHash() would in practice only be called once the transaction is fully built, so the cache doesn't need to be invalidated?
I'm mostly worrying about uses of MsgTx outside of btcd, where I'm not sure we can 100% guarantee that we're always using this pattern...
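A small sketch of the concern, using a toy type instead of `MsgTx`. `mutTx`, `addOutput`, and `wipeCache` are hypothetical names. It shows that a helper can invalidate the cache on its own, but a direct field write bypasses the helper and leaves a stale hash until the cache is wiped by hand.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// mutTx is a hypothetical toy transaction that memoizes its serialization.
type mutTx struct {
	outputs []uint64
	cached  []byte
}

// serialize encodes each output as 8 little-endian bytes and memoizes
// the result.
func (t *mutTx) serialize() []byte {
	if t.cached != nil {
		return t.cached
	}
	buf := make([]byte, 0, len(t.outputs)*8)
	for _, v := range t.outputs {
		for i := 0; i < 8; i++ {
			buf = append(buf, byte(v>>(8*i)))
		}
	}
	t.cached = buf
	return buf
}

// hash returns the SHA-256 of the (possibly cached) serialization.
func (t *mutTx) hash() [32]byte { return sha256.Sum256(t.serialize()) }

// wipeCache drops the memoized bytes so the next hash re-serializes.
func (t *mutTx) wipeCache() { t.cached = nil }

// addOutput mutates through a helper, which invalidates the cache itself.
func (t *mutTx) addOutput(v uint64) {
	t.outputs = append(t.outputs, v)
	t.wipeCache()
}

func main() {
	tx := &mutTx{}
	tx.addOutput(1)
	before := tx.hash()

	tx.addOutput(2)                  // helper invalidates the cache
	fmt.Println(before != tx.hash()) // true

	stale := tx.hash()
	tx.outputs[0] = 99              // direct field write bypasses the helper
	fmt.Println(stale == tx.hash()) // true: the hash is now stale

	tx.wipeCache()
	fmt.Println(stale == tx.hash()) // false
}
```

Because the fields of a struct like this are exported and freely writable in practice, invalidating in helpers alone cannot cover every mutation path.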
Not a bad idea, agree that this could have a lot of unintended consequences. The other approach uses |
|
Closing in favor of #2023, see this comment for some details: #2023 (comment) TL;DR: I think we need to remove/optimize the binary free list instead. |


