core: miner: reduce allocations in block building #33375
MariusVanDerWijden merged 9 commits into ethereum:master from
Conversation
Reduces overall block building allocations by 1%.
```go
// are replay-protected as well as unprotected homestead transactions.
// Deprecated: always use the Signer interface type
type EIP155Signer struct {
	chainId, chainIdMul *big.Int
```
We still use the EIP-155 signer for recovering the sender of legacy transactions, right?
Did you observe that a significant share of allocations comes from here?
Yes, we still use it, and every time we instantiate any signer we get this allocation:
```go
// BenchmarkSigner-14   11503028   100.0 ns/op   144 B/op   4 allocs/op
// BenchmarkSigner-14   21896109   46.88 ns/op    64 B/op   2 allocs/op
func BenchmarkSigner(b *testing.B) {
	id := big.NewInt(1)
	for range b.N {
		NewPragueSigner(id)
	}
}
```
```
ROUTINE ======================== github.com/ethereum/go-ethereum/core/types.NewEIP155Signer in github.com/ethereum/go-ethereum/core/types/transaction_signing.go
 240049308  480632052 (flat, cum) 0.061% of Total
         .          . 333:	chainId, chainIdMul *big.Int
         .          . 334:}
         .          . 335:
         .          . 336:func NewEIP155Signer(chainId *big.Int) EIP155Signer {
         .          . 337:	if chainId == nil {
         .          . 338:		chainId = new(big.Int)
 240049308  480632052 339:	}
         .          . 340:	return EIP155Signer{
         .          . 341:		chainId:    chainId,
         .          . 342:		chainIdMul: new(big.Int).Mul(chainId, big.NewInt(2)),
         .          . 343:	}
         .          . 344:}
```
But it's achieved at the cost of sender recovery, right? I feel there are still tons of legacy transactions in the network.
Not really, it's not much slower IIRC. Also, the last number I have in my head for legacy-type txs was something like 10% of the network. Let me quickly write a benchmark.
```go
func BenchmarkRecoverSig(b *testing.B) {
	key, _ := defaultTestKey()
	tx := NewTransaction(0, common.Address{}, new(big.Int), 0, new(big.Int), nil)
	tx, err := SignTx(tx, NewEIP155Signer(big.NewInt(1)), key)
	if err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()
	for b.Loop() {
		_, err := Sender(NewPragueSigner(big.NewInt(1)), tx)
		if err != nil {
			b.Error(err)
		}
	}
}

// BenchmarkRecoverSig-14    7953165   151.8 ns/op   184 B/op   6 allocs/op
// BenchmarkRecoverSig-14   17003580   70.78 ns/op    88 B/op   3 allocs/op
```
Looks like it's much faster if you factor in the signer creation.
The slowdown, if you don't factor in the signer creation, is also pretty negligible IMHO:

```go
// BenchmarkRecoverSig-14   172942711   6.947 ns/op   0 B/op   0 allocs/op
// BenchmarkRecoverSig-14   164950395   7.335 ns/op   0 B/op   0 allocs/op
```
```go
// goos: darwin
// goarch: arm64
// pkg: github.com/ethereum/go-ethereum/core/types
// cpu: Apple M1 Pro
// BenchmarkBigMul
// BenchmarkBigMul-8   22464758   52.80 ns/op
func BenchmarkBigMul(b *testing.B) {
	id := big.NewInt(1)
	for b.Loop() {
		x := new(big.Int).Mul(id, big.NewInt(2))
		_ = x
	}
}
```
Add/Sub is much faster than Mul.
| } | ||
|
|
||
| func opGasprice(pc *uint64, evm *EVM, scope *ScopeContext) ([]byte, error) { | ||
| v, _ := uint256.FromBig(evm.GasPrice) |
I guess you think Clone is cheaper than FromBig?
No, using a uint256 in the EVM is better than a big.Int that has to be converted every time we use it.
```go
// BenchmarkOpGasprice-14   145359624   8.234 ns/op   0 B/op   0 allocs/op
// BenchmarkOpGasprice-14   250398700   4.838 ns/op   0 B/op   0 allocs/op
func BenchmarkOpGasprice(b *testing.B) {
	b.ReportAllocs()
	evm := NewEVM(BlockContext{}, nil, params.TestChainConfig, Config{})
	evm.GasPrice = big.NewInt(1_000_000_000)
	scope := &ScopeContext{
		Stack: newstack(),
	}
	pc := uint64(0)
	for i := 0; i < b.N; i++ {
		scope.Stack.data = scope.Stack.data[:0]
		if _, err := opGasprice(&pc, evm, scope); err != nil {
			b.Fatalf("opGasprice error: %v", err)
		}
	}
}
```
```go
	revert(*StateDB)

	// dirtied returns the Ethereum address modified by this journal entry.
	dirtied() *common.Address
```
Will it introduce a noticeable impact on allocations? Just out of curiosity.
```
ROUTINE ======================== github.com/ethereum/go-ethereum/core/state.(*journal).append in github.com/ethereum/go-ethereum/core/state/journal.go
3735047495 3735047495 (flat, cum) 0.48% of Total
         .          . 101:func (j *journal) append(entry journalEntry) {
 594360687  594360687 102:	j.entries = append(j.entries, entry)
3005545312 3005545312 103:	if addr := entry.dirtied(); addr != nil {
 135141496  135141496 104:		j.dirties[*addr]++
         .          . 105:	}
```

Almost half a percent, but it's also in the critical path, so it might make a small impact.
Are we sure allocations can be reduced by adopting the new approach? The address object is returned as a copy alongside the boolean flag.
Escape analysis shows "moved to heap: ch" in journal.go on master for lines 297/328/343/357/372/387/402, caused by `return &ch.account` in `dirtied() *Address`. Those escapes go away with `(Address, bool)`. On master:

```shell
$ go test ./core/state -run TestNonExistent -gcflags='-m=2' 2>&1 | rg 'journal\.go' | rg 'moved to heap: ch|escapes to heap: ch'
core/state/journal.go:297:7: moved to heap: ch
core/state/journal.go:328:7: moved to heap: ch
core/state/journal.go:343:7: moved to heap: ch
core/state/journal.go:357:7: moved to heap: ch
core/state/journal.go:372:7: moved to heap: ch
core/state/journal.go:387:7: moved to heap: ch
core/state/journal.go:402:7: moved to heap: ch
```
I recently went on a longer flight and started profiling the geth block production pipeline.
This PR contains a bunch of individual fixes, split into separate commits;
I can drop some if necessary.
Benchmarking is not super easy: the benchmark I wrote is a bit non-deterministic,
so I will try to write a better benchmark later.