Skip to content

btcutil: reuse serialized tx during TxHash#2023

Closed
kcalvinalvin wants to merge 1 commit intobtcsuite:masterfrom
kcalvinalvin:dont-serialize-tx-for-txhash
Closed

btcutil: reuse serialized tx during TxHash#2023
kcalvinalvin wants to merge 1 commit intobtcsuite:masterfrom
kcalvinalvin:dont-serialize-tx-for-txhash

Conversation

@kcalvinalvin
Copy link
Copy Markdown
Collaborator

btcutil.Block caches the serialized raw bytes of the block during ibd. This serialized block bytes includes the serialized tx. The current tx hash generation will re-serialized the de-serialized tx to create the raw bytes and it'll only then hash that.

This commit changes the code so that the re-serialization never happens, saving tons of cpu and memory overhead.

@coveralls
Copy link
Copy Markdown

coveralls commented Aug 21, 2023

Pull Request Test Coverage Report for Build 7284089562

  • 45 of 71 (63.38%) changed or added relevant lines in 2 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 56.626%

Changes Missing Coverage Covered Lines Changed/Added Lines %
btcutil/tx.go 24 50 48.0%
Files with Coverage Reduction New Missed Lines %
connmgr/connmanager.go 1 86.27%
Totals Coverage Status
Change from base Build 7278303266: 0.02%
Covered Lines: 28660
Relevant Lines: 50613

💛 - Coveralls

Copy link
Copy Markdown
Collaborator

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the speedup?

Comment thread btcutil/tx.go Outdated
@kcalvinalvin
Copy link
Copy Markdown
Collaborator Author

kcalvinalvin commented Nov 16, 2023

What is the speedup?

The speedup is resulted from the fact that the tx is never serialized. Since we receive it serialized in the block, we just hash those bytes, saving a lot of overhead during txhash calculation.

It's a different optimization from #1978 and this speedup is specific to ibd.

I tried to show it via cpu profiling but without the utxocache PR #1955, the speedup isn't observable because the utxo i/o is such a big bottleneck

@kcalvinalvin kcalvinalvin marked this pull request as ready for review November 20, 2023 05:04
@Roasbeef Roasbeef requested a review from guggero November 29, 2023 18:37
Comment thread btcutil/block.go
Comment thread btcutil/tx.go Outdated
Copy link
Copy Markdown
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first round to load some context. Can't really add much to the discussion whether to use this PR or #1376, other than I think this code is probably harder to reason about.

Comment thread btcutil/tx.go Outdated
Comment thread btcutil/tx.go Outdated
Comment thread btcutil/tx.go Outdated
Copy link
Copy Markdown
Collaborator

@Crypt-iQ Crypt-iQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, I did a once over to see if btcutil.Tx had any cases where it was modified after using Hash but couldn't find any. I think somebody else should also double-check just to be safe

Comment thread btcutil/tx.go Outdated
@Roasbeef
Copy link
Copy Markdown
Member

Roasbeef commented Dec 8, 2023

Code looks good, I did a once over to see if btcutil.Tx had any cases where it was modified after using Hash but couldn't find any.

Yeah I think that's one advtange: for block validation, things come off the wire, then go into the btcutil wrapper, after which they're more or less immutable.

@Roasbeef
Copy link
Copy Markdown
Member

Running a combined branch for a sync test (utxo cache, this PR, and #1978), so far on my machine it's on schedule for a 6-ish hour sync (connected to bitcoind node over LAN, same machine).

One thing that still shows up is this hotspot later into the chain with full validation:
Screenshot 2023-12-15 at 3 53 53 PM

Currently of the opinion that my attempt in #1376 may not be the best way to go about it, due to unintended consequences (PSBT interactions signing the long sighash, etc). Spending more time with routines like calcSignatureHash, it's also clear that it could actually lead to incorrect behavior when computing the sighash. When computing the non-segwit sighash, we blank out the sigScripts of inputs not currently being spent. But the cached version of SerializeNoWitness would emit the serialized version we read on the wire, which would include the sigScripts that were blanked our earlier.

@Roasbeef
Copy link
Copy Markdown
Member

Zooming in a bit more, we can see that the channel based binary free list logic is actually what is making things so slow here when writing inputs/outputs:
Screenshot 2023-12-15 at 4 07 51 PM

@Roasbeef Roasbeef closed this Dec 16, 2023
@Roasbeef
Copy link
Copy Markdown
Member

Roasbeef commented Dec 16, 2023

Oops, closed the wrong version...

Also here's the full SVG from above:
profile001

Comment thread btcutil/tx.go
btcutil.Block caches the serialized raw bytes of the block during ibd.
This serialized block bytes includes the serialized tx. The current tx
hash generation will re-serialized the de-serialized tx to create the
raw bytes and it'll only then hash that.

This commit changes the code so that the re-serialization never happens,
saving tons of cpu and memory overhead.
@kcalvinalvin kcalvinalvin force-pushed the dont-serialize-tx-for-txhash branch from 3afc77e to 83605e4 Compare December 21, 2023 04:43
@kcalvinalvin
Copy link
Copy Markdown
Collaborator Author

Rebased and addressed the review comments.

@guggero guggero self-requested a review December 21, 2023 07:34
Comment thread btcutil/tx.go
@Roasbeef
Copy link
Copy Markdown
Member

Did the final nits in #2081

@Roasbeef Roasbeef closed this Dec 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants