Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Amortize clearing unsorted cache entries (Juno genesis fix) #12885

Merged
merged 10 commits into from
Aug 18, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
* [#12634](https://github.com/cosmos/cosmos-sdk/pull/12634) Move `sdk.Dec` to math package.
* [#12596](https://github.com/cosmos/cosmos-sdk/pull/12596) Remove all imports of the non-existent gogo/protobuf v1.3.3 to ease downstream use and go workspaces.
* [#12187](https://github.com/cosmos/cosmos-sdk/pull/12187) Add batch operation for x/nft module.
* [#12886](https://github.com/cosmos/cosmos-sdk/pull/12886) Amortize cost of processing cache KV store

### State Machine Breaking

Expand Down
36 changes: 36 additions & 0 deletions store/cachekv/search_benchmark_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
package cachekv

import (
db "github.com/tendermint/tm-db"
"strconv"
"testing"
)

func BenchmarkLargeUnsortedMisses(b *testing.B) {
for i := 0; i < b.N; i++ {
cache := map[string]*cValue{}
unsorted := map[string]struct{}{}
for i := 0; i < 100_000; i++ {
key := "A" + strconv.Itoa(i)
unsorted[key] = struct{}{}
cache[key] = &cValue{}
}

for i := 0; i < 100_000; i++ {
key := "Z" + strconv.Itoa(i)
unsorted[key] = struct{}{}
cache[key] = &cValue{}
}

store := Store{
cache: cache,
unsortedCache: unsorted,
sortedCache: db.NewMemDB(),
}
blazeroni marked this conversation as resolved.
Show resolved Hide resolved
for k := 0; k < 10000; k++ {
// cache has A + Z values
// these are within range, but match nothing
store.dirtyItems([]byte("B1"), []byte("B2"))
}
}
}
13 changes: 12 additions & 1 deletion store/cachekv/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ import (
"github.com/cosmos/cosmos-sdk/store/tracekv"
"github.com/cosmos/cosmos-sdk/store/types"
"github.com/cosmos/cosmos-sdk/types/kv"
"github.com/tendermint/tendermint/libs/math"
)

// cValue represents a cached value.
Expand Down Expand Up @@ -280,6 +281,7 @@ const (

// Constructs a slice of dirty items, to use w/ memIterator.
func (store *Store) dirtyItems(start, end []byte) {
const THRESHOLD = 1024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this outside the method and give it a better name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also explain the reasoning behind 1024 in a godoc comment :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, though 1024 pre-dates this change and apparently is a magic number: #10026 (comment)

It's re-used in this PR since it made sense (to me) that the minimum number of elements removed matches the threshold to sort.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexanderbez moved and renamed. Since it's magic and documented where it's used, I haven't added a godoc comment. Open to suggestions.

startStr, endStr := conv.UnsafeBytesToStr(start), conv.UnsafeBytesToStr(end)
if startStr > endStr {
// Nothing to do here.
Expand All @@ -294,7 +296,7 @@ func (store *Store) dirtyItems(start, end []byte) {
// O(N^2) overhead.
// Even without that, too many range checks eventually becomes more expensive
// than just not having the cache.
if n < 1024 {
if n < THRESHOLD {
for key := range store.unsortedCache {
if dbm.IsKeyInDomain(conv.UnsafeStrToBytes(key), start, end) {
cacheValue := store.cache[key]
Expand Down Expand Up @@ -325,6 +327,15 @@ func (store *Store) dirtyItems(start, end []byte) {
startIndex = 0
}

// since we took the time to sort the cache, we should use that effort
// we store at least THRESHOLD values -- cost of storing all is amortized across multiple calls
if endIndex-startIndex < THRESHOLD {
endIndex = math.MinInt(startIndex+THRESHOLD, len(strL)-1)
if endIndex-startIndex < THRESHOLD {
startIndex = math.MaxInt(endIndex-THRESHOLD, 0)
}
}
odeke-em marked this conversation as resolved.
Show resolved Hide resolved

kvL := make([]*kv.Pair, 0)
for i := startIndex; i <= endIndex; i++ {
key := strL[i]
Expand Down