
feat: Fast storage optimization for queries and iterations #468

Merged: 123 commits merged into cosmos:master on Apr 9, 2022

Conversation

@p0mvn (Member) commented Feb 13, 2022

Link to the original spec

Link to the original PR in Osmosis

Background
Historically, IAVL has performed poorly during state machine execution and when responding to queries against live state. This release speeds up these routines by an order of magnitude, relieving significant pressure for all users of the IAVL database.

Details

This PR introduces an auxiliary fast storage system to IAVL: a copy of the latest state in a form much more amenable to efficient querying and iteration.

Prior to this PR, all gets and iterations suffered from two significant performance penalties:

  • Every get/iteration had to walk the tree structure.
  • Every node (including leaves) was stored on disk, keyed by its SHA256 Merkle hash.

Because all nodes were indexed by their Merkle tree inner node hash, data locality was broken: every Get() that should have been served from RAM or CPU caches instead turned into a random leveldb file access.

Fast storage nodes are instead indexed on disk by their logical key. This preserves data locality for the latest state, significantly improving iterations and queries (between 5x and 30x, depending on the particular benchmark), while introducing negligible overhead for writes.
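As an illustrative sketch of this layout (the prefix constant and helper function are assumptions for illustration, not the PR's exact identifiers), fast nodes can be written under their logical key with a constant prefix, so that a single ascending range scan over the prefix walks the latest state in key order:

```go
package main

import "fmt"

// fastKeyPrefix is an assumed constant: the PR describes fast-node keys as
// beginning with the prefix "f" so that they sort together on disk.
const fastKeyPrefix = 'f'

// fastNodeKey builds the on-disk key for a fast node from its logical key.
// Because every fast node shares the prefix and keeps its logical ordering,
// a range scan over the "f" keyspace visits the latest state in key order.
func fastNodeKey(logicalKey []byte) []byte {
	key := make([]byte, 0, len(logicalKey)+1)
	key = append(key, fastKeyPrefix)
	return append(key, logicalKey...)
}

func main() {
	fmt.Printf("%q\n", fastNodeKey([]byte("balance/addr1"))) // "fbalance/addr1"
}
```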

Downgrade-re-upgrade protection
We introduced downgrade and re-upgrade protection, guarding against a potential downgrade of iavl followed by re-enabling fast storage. This is done by storing metadata about the current storage version and the latest live state.
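A minimal sketch of that guard, assuming a stored storage version compared against a threshold plus a comparison of the latest fast-node version against the latest tree version (names here are illustrative, not the PR's exact API):

```go
package main

import "fmt"

// Illustrative constant only: the PR stores the storage version under a
// metadata key such as "mstorage_version" and compares it against a
// fastStorageVersionValue threshold; the exact identifiers may differ.
const fastStorageVersionThreshold = 2

// shouldMigrate reports whether the fast-storage migration must run: either
// the recorded storage version predates fast storage, or the fast index on
// disk lags behind the latest tree version (which happens after a downgrade
// writes new versions without maintaining fast nodes).
func shouldMigrate(storedVersion, latestFastVersion, latestTreeVersion int64) bool {
	return storedVersion < fastStorageVersionThreshold ||
		latestFastVersion != latestTreeVersion
}

func main() {
	fmt.Println(shouldMigrate(1, 0, 10))  // true: storage version too low
	fmt.Println(shouldMigrate(2, 9, 10))  // true: fast index is stale
	fmt.Println(shouldMigrate(2, 10, 10)) // false: up to date
}
```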

Summary of Changes
IAVL is divided into two trees: mutable_tree and immutable_tree. Sets only happen on the mutable tree.

Things that needed to change or be investigated for getting, setting, and the fast node:

  • mutable tree

    • GetVersioned
      • Change the logic to check the FastNode cache first, then do the GetImmutable logic
    • Set
      • Cache unsaved fast nodes in memory, avoid persisting them to disk right away
      • Update unsaved removals if needed
    • Remove
      • Cache unsaved removals in memory, avoid persisting changes to disk
      • Update unsaved additions if needed
    • SaveVersion
      • Persist unsaved fast node additions and removals to disk
      • Sort by key before saving to db to ensure data locality
      • Potential optimizations:
        • Look into if removing before writing to db is faster
        • Look into if combining sorted removals with sorted additions is faster than doing them one after the other
        • Conclusion is documented here: Develop #13
    • Iterate
      • Now that unsaved changes are cached in the mutable tree, we cannot reuse the immutable tree's Iterate implementation, so this new method is introduced (see the merge sketch after this list).
      • Iterate efficiently in sorted order by keeping 2 pointers:
        1. to the next element on disk
        2. to the next unsaved change in the mutable tree
      • Compare the keys under the 2 pointers on each iteration and advance the smaller one
    • Iterator
      • The immutable tree is embedded in the mutable tree. Because of how composition works in Go, the mutable tree can access all methods and fields of the immutable tree, so we need to override the immutable tree's implementation and return an invalid iterator by default.
      • The iterator in the mutable tree is invalid because we cannot support updates and delayed iterations at the same time.
    • Get
      • For the same reason as in Iterate, we must check unsaved additions first before attempting to use the strategy employed by the immutable tree
    • enableFastStorageAndCommit and its variations
      • These methods perform the automatic upgrade to fast storage when the system detects that the fast cache is not in use. Detection works by checking the value of a special metadata key called mstorage_version, where m is a new metadata prefix. If the stored version is lower than the fastStorageVersionValue threshold, the migration is triggered.
      • LoadVersion, LazyLoadVersion
        • added upgrade logic and downgrade + re-upgrade protection. It works by storing the storage version in the database.
        • if storage version is low -> upgrade
        • if the latest fast node version on disk does not match latest version in ndb -> upgrade
      • isUpgradeable - determines if the upgrade is going to be performed. This method can be called on the SDK side to determine if we should log a warning message
  • immutable_tree

    • Get and GetWithIndex
      • Renamed Get to GetWithIndex. GetWithIndex always uses the default live state traversal strategy.
      • Introduced a new Get method. Get attempts to use the fast cache first and only falls back to the regular tree traversal strategy if the fast cache is disabled or the tree is not of the latest version.
    • Iterator
      • Returns either the regular or the fast iterator (see more below), depending on whether fast storage is enabled and the migration is complete.
  • nodedb

    • updated and tested the underlying storage management logic
  • fast_iterator

    • Introduced and tested a new iterator that binds directly to the database iterator by scanning for keys that begin with the prefix f, which stands for fast. All fast nodes are sorted on disk by key in ascending order, so we can simply traverse the disk sequentially, ensuring efficient hardware access.
  • unsaved_fast_iterator

    • iterates over the latest state via fast nodes, taking into account unsaved changes to fast nodes that could have been made within a mutable tree
  • testing

    • unit tests for mutable tree Get, Set, Save, Delete
    • unit tests for immutable - Get, Iterator - fast and slow
    • some unit tests for nodedb
    • updated randomized tests that function like integration tests
    • updated bench tests
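As a hedged sketch of the two-pointer merge described under Iterate above (the types and names are simplified assumptions, not the PR's unsaved_fast_iterator API), the idea is an ordinary merge of two sorted streams where the in-memory change shadows the on-disk value on a key collision:

```go
package main

import (
	"bytes"
	"fmt"
)

type kv struct{ key, value []byte }

// mergeIterate models the mutable tree's Iterate: it merges the sorted
// on-disk fast nodes with the sorted unsaved in-memory changes, always
// yielding the smaller key next. On a key collision the unsaved change
// wins, since it is newer than what is on disk.
func mergeIterate(disk, unsaved []kv, fn func(key, value []byte)) {
	i, j := 0, 0
	for i < len(disk) || j < len(unsaved) {
		switch {
		case j == len(unsaved): // only disk entries remain
			fn(disk[i].key, disk[i].value)
			i++
		case i == len(disk): // only unsaved entries remain
			fn(unsaved[j].key, unsaved[j].value)
			j++
		default:
			switch c := bytes.Compare(disk[i].key, unsaved[j].key); {
			case c < 0:
				fn(disk[i].key, disk[i].value)
				i++
			case c > 0:
				fn(unsaved[j].key, unsaved[j].value)
				j++
			default: // same key: the unsaved change shadows the disk value
				fn(unsaved[j].key, unsaved[j].value)
				i++
				j++
			}
		}
	}
}

func main() {
	disk := []kv{{[]byte("a"), []byte("1")}, {[]byte("c"), []byte("3")}}
	unsaved := []kv{{[]byte("b"), []byte("2")}, {[]byte("c"), []byte("30")}}
	mergeIterate(disk, unsaved, func(k, v []byte) { fmt.Printf("%s=%s\n", k, v) })
	// Prints: a=1, b=2, c=30
}
```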

Benchstat

~/projects/iavl (master)$ benchstat bench_old_large.log bench_fast_large.log
name                                                             old time/op    new time/op    delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16        11.6µs ±18%     2.1µs ± 6%  -81.86%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                        13.0µs ± 3%     1.5µs ±21%  -88.06%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/iteration-16                          7.05s ± 9%     0.52s ± 2%  -92.67%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/update-16                             220µs ±10%     218µs ± 9%     ~     (p=0.841 n=5+5)
Large/memdb-1000000-100-16-40/block-16                             23.0ms ± 3%    24.3ms ± 2%   +5.28%  (p=0.016 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16    21.5µs ±11%     5.7µs ± 2%  -73.60%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                    21.8µs ± 5%     2.7µs ±28%  -87.75%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                      16.6s ±11%      0.5s ± 1%  -96.81%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/update-16                         245µs ±11%     269µs ±10%     ~     (p=0.222 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                         28.8ms ± 9%    31.8ms ±19%     ~     (p=0.151 n=5+5)

name                                                             old alloc/op   new alloc/op   delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16          566B ±37%       88B ± 0%  -84.45%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                          606B ± 0%       67B ±46%  -88.91%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/iteration-16                          888MB ± 0%     184MB ± 0%  -79.28%  (p=0.016 n=4+5)
Large/memdb-1000000-100-16-40/update-16                            36.7kB ± 3%    37.2kB ± 3%     ~     (p=0.222 n=5+5)
Large/memdb-1000000-100-16-40/block-16                             3.80MB ± 1%    3.86MB ± 1%   +1.62%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16    2.72kB ±29%    0.99kB ± 0%  -63.51%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                    2.70kB ± 1%    0.34kB ±46%  -87.34%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                     3.71GB ± 0%    0.30GB ± 0%  -92.00%  (p=0.016 n=4+5)
Large/goleveldb-1000000-100-16-40/update-16                        79.8kB ± 6%    84.1kB ±10%     ~     (p=0.421 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                         9.05MB ± 9%    9.82MB ±16%     ~     (p=0.095 n=5+5)

name                                                             old allocs/op  new allocs/op  delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16          10.8 ±39%       3.0 ± 0%  -72.22%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                          11.0 ± 0%       1.0 ± 0%  -90.91%  (p=0.000 n=5+4)
Large/memdb-1000000-100-16-40/iteration-16                          16.0M ± 0%      3.0M ± 0%  -81.25%  (p=0.016 n=4+5)
Large/memdb-1000000-100-16-40/update-16                               466 ± 8%       459 ± 8%     ~     (p=1.000 n=5+5)
Large/memdb-1000000-100-16-40/block-16                              52.0k ± 0%     53.2k ± 0%   +2.29%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16      37.2 ±29%      19.0 ± 0%  -48.92%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                      36.0 ± 0%       6.0 ±50%  -83.33%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                      49.8M ± 0%      5.3M ± 0%  -89.34%  (p=0.016 n=4+5)
Large/goleveldb-1000000-100-16-40/update-16                           684 ± 6%       729 ±12%     ~     (p=0.310 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                          81.5k ± 7%     84.8k ±12%     ~     (p=0.421 n=5+5)

Old Benchmark
Date: 2022-01-22 12:33 AM PST
Branch: dev/iavl_data_locality with some modifications to the bench tests

go version go1.17.6 linux/amd64

Init Tree took 78.75 MB
goos: linux
goarch: amd64
pkg: github.com/cosmos/iavl/benchmarks
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
BenchmarkMedium/memdb-100000-100-16-40/query-no-in-tree-guarantee-slow-8         	   95841	     12315 ns/op	     592 B/op	      12 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/query-hits-slow-8                         	   85990	     15533 ns/op	     760 B/op	      15 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/iteration-slow-8                          	       2	 838458850 ns/op	88841084 B/op	 1600092 allocs/op
--- BENCH: BenchmarkMedium/memdb-100000-100-16-40/iteration-slow-8
    bench_test.go:109: completed 100000 iterations
    bench_test.go:109: completed 100000 iterations
    bench_test.go:109: completed 100000 iterations
BenchmarkMedium/memdb-100000-100-16-40/update-8                                  	   10000	    148915 ns/op	   27312 B/op	     335 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/block-8                                   	      76	  16865728 ns/op	 2910568 B/op	   35668 allocs/op
Init Tree took 46.71 MB
BenchmarkMedium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-8     	   55309	     22354 ns/op	    1550 B/op	      30 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/query-hits-slow-8                     	   43566	     27137 ns/op	    2093 B/op	      39 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/iteration-slow-8                      	       1	2285116100 ns/op	225813440 B/op	 3857215 allocs/op
--- BENCH: BenchmarkMedium/goleveldb-100000-100-16-40/iteration-slow-8
    bench_test.go:109: completed 100000 iterations
BenchmarkMedium/goleveldb-100000-100-16-40/update-8                              	    6194	    307266 ns/op	   40138 B/op	     406 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/block-8                               	      28	  40663600 ns/op	 5150422 B/op	   53771 allocs/op
PASS
ok  	github.com/cosmos/iavl/benchmarks	25.797s

Latest Benchmark
Date: 2022-01-22 10:15 AM PST
Branch: roman/fast-node-get-set

go version go1.17.6 linux/amd64

Init Tree took 114.29 MB
goos: linux
goarch: amd64
pkg: github.com/cosmos/iavl/benchmarks
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
BenchmarkMedium/memdb-100000-100-16-40/query-no-in-tree-guarantee-fast-8         	  672999	      1887 ns/op	     112 B/op	       4 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/query-no-in-tree-guarantee-slow-8         	   95888	     11884 ns/op	     440 B/op	       8 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/query-hits-fast-8                         	  891831	      1208 ns/op	      16 B/op	       0 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/query-hits-slow-8                         	   79842	     15644 ns/op	     607 B/op	      11 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/iteration-fast-8                          	      20	  63956090 ns/op	18400254 B/op	  300000 allocs/op
--- BENCH: BenchmarkMedium/memdb-100000-100-16-40/iteration-fast-8
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
	... [output truncated]
BenchmarkMedium/memdb-100000-100-16-40/iteration-slow-8                          	       2	 947611750 ns/op	88841044 B/op	 1600092 allocs/op
--- BENCH: BenchmarkMedium/memdb-100000-100-16-40/iteration-slow-8
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
BenchmarkMedium/memdb-100000-100-16-40/update-8                                  	    9198	    174306 ns/op	   27524 B/op	     342 allocs/op
BenchmarkMedium/memdb-100000-100-16-40/block-8                                   	      58	  19855266 ns/op	 2948779 B/op	   36495 allocs/op
Init Tree took 66.82 MB
BenchmarkMedium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-8     	  228343	      4938 ns/op	     814 B/op	      16 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-8     	   59304	     18046 ns/op	    1420 B/op	      24 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/query-hits-fast-8                     	  611349	      1684 ns/op	      93 B/op	       1 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/query-hits-slow-8                     	   50778	     23126 ns/op	    2005 B/op	      34 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/iteration-fast-8                      	      12	  94702442 ns/op	29327220 B/op	  522988 allocs/op
--- BENCH: BenchmarkMedium/goleveldb-100000-100-16-40/iteration-fast-8
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
    bench_test.go:117: completed 100000 iterations
	... [output truncated]
BenchmarkMedium/goleveldb-100000-100-16-40/iteration-slow-8                      	       1	1716585400 ns/op	235504072 B/op	 3998006 allocs/op
--- BENCH: BenchmarkMedium/goleveldb-100000-100-16-40/iteration-slow-8
    bench_test.go:117: completed 100000 iterations
BenchmarkMedium/goleveldb-100000-100-16-40/update-8                              	    8994	    257683 ns/op	   44702 B/op	     447 allocs/op
BenchmarkMedium/goleveldb-100000-100-16-40/block-8                               	      31	  44907345 ns/op	 6973362 B/op	   72924 allocs/op
PASS
ok  	github.com/cosmos/iavl/benchmarks	43.513s

Benchmarks Interpretation
Highlighting the difference in performance from the latest benchmarks:

Old branch is dev/iavl_data_locality
New branch is roman/fast-node-get-set

Initial size: 100,000 key-val pairs
Block size: 100 keys
Key length: 16 bytes
Value length: 40 bytes

Query with no guarantee of the key being in the tree:

  • Old: 22354 ns/op
  • New on regular logic: 18046 ns/op
  • New on fast logic: 4938 ns/op
  • New fast logic shows a 77% decrease in time relative to the old branch

Query with the key guaranteed to be in the latest tree:

  • Old: 27137 ns/op
  • New on regular logic: 23126 ns/op
  • New on fast logic: 1684 ns/op
  • New fast logic shows a 93% decrease in time relative to the old branch

Iteration:

  • Old: 2285116100 ns/op
  • New on old logic: 1716585400 ns/op
  • New on fast logic: 94702442 ns/op
  • New fast logic shows a 96% decrease in time relative to the old branch

Update:
Run Set; if the iteration count is divisible by blockSize, attempt SaveVersion, and if more than 20 saved versions have accumulated, delete the oldest version.

  • Old: 307266 ns/op
  • New: 257683 ns/op
  • New logic shows a 16% decrease in time relative to the old branch

Block:
For each key in a block, run Get and Set. At the end of the block, run SaveVersion, and if more than 20 saved versions have accumulated, delete the oldest version (a sketch of this loop follows the numbers below).

  • Old: 40663600 ns/op
  • New: 44907345 ns/op
  • New logic shows a ~10% increase in time relative to the old branch
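For concreteness, here is a sketch of the block benchmark loop just described. The Tree interface and function names are assumptions for illustration; this is not the repo's bench_test.go:

```go
package benchsketch

import (
	"math/rand"
	"testing"
)

// Tree is a stand-in for the benchmarked tree; the method set mirrors the
// description above but is an assumption, not iavl's exact API.
type Tree interface {
	Get(key []byte) []byte
	Set(key, value []byte)
	SaveVersion() (int64, error)
	DeleteVersion(version int64) error
}

// benchmarkBlock models the "block" benchmark procedure: per block, Get and
// Set blockSize random keys, then SaveVersion; once more than maxHistory
// versions have been saved, the oldest one is deleted.
func benchmarkBlock(b *testing.B, t Tree, keys [][]byte, blockSize int, maxHistory int64) {
	oldest := int64(1)
	for i := 0; i < b.N; i++ {
		for j := 0; j < blockSize; j++ {
			k := keys[rand.Intn(len(keys))]
			_ = t.Get(k)
			t.Set(k, []byte("value"))
		}
		version, err := t.SaveVersion()
		if err != nil {
			b.Fatal(err)
		}
		if version-oldest >= maxHistory {
			if err := t.DeleteVersion(oldest); err != nil {
				b.Fatal(err)
			}
			oldest++
		}
	}
}
```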

@p0mvn (Member, Author) commented Mar 29, 2022

Updates:

  • Added several data race fixes that were identified by running nodes in Osmosis.
  • Hardcoded the fast node cache size to 100000. We identified that using the same value as for the regular node cache caused continuous RAM growth, so we capped it at a lower value. I'm currently working on a refactor to provide this value as a config and will open another PR with the change.
  • This PR is updated with the latest changes and caught up with the master branch.
  • My local tests and linter are passing. Could someone run the CI, please? I don't have the permissions.
  • gobencher is failing for reasons outside of my control: https://github.com/cosmos/iavl/pull/468/checks?check_run_id=5726930260

@tac0turtle (Member) commented:
Thanks for the update, and sorry for the delay; I was off last week. We will get to this ASAP.

@robert-zaremba (Collaborator) commented:
Could you run the benchmarks with https://pkg.go.dev/golang.org/x/perf/cmd/benchstat ?

@p0mvn (Member, Author) commented Apr 8, 2022

Better benchstat summary:

~/projects/iavl (master)$ benchstat bench_old_large.log bench_fast_large.log
name                                                             old time/op    new time/op    delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16        11.6µs ±18%     2.1µs ± 6%  -81.86%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                        13.0µs ± 3%     1.5µs ±21%  -88.06%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/iteration-16                          7.05s ± 9%     0.52s ± 2%  -92.67%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/update-16                             220µs ±10%     218µs ± 9%     ~     (p=0.841 n=5+5)
Large/memdb-1000000-100-16-40/block-16                             23.0ms ± 3%    24.3ms ± 2%   +5.28%  (p=0.016 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16    21.5µs ±11%     5.7µs ± 2%  -73.60%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                    21.8µs ± 5%     2.7µs ±28%  -87.75%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                      16.6s ±11%      0.5s ± 1%  -96.81%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/update-16                         245µs ±11%     269µs ±10%     ~     (p=0.222 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                         28.8ms ± 9%    31.8ms ±19%     ~     (p=0.151 n=5+5)

name                                                             old alloc/op   new alloc/op   delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16          566B ±37%       88B ± 0%  -84.45%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                          606B ± 0%       67B ±46%  -88.91%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/iteration-16                          888MB ± 0%     184MB ± 0%  -79.28%  (p=0.016 n=4+5)
Large/memdb-1000000-100-16-40/update-16                            36.7kB ± 3%    37.2kB ± 3%     ~     (p=0.222 n=5+5)
Large/memdb-1000000-100-16-40/block-16                             3.80MB ± 1%    3.86MB ± 1%   +1.62%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16    2.72kB ±29%    0.99kB ± 0%  -63.51%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                    2.70kB ± 1%    0.34kB ±46%  -87.34%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                     3.71GB ± 0%    0.30GB ± 0%  -92.00%  (p=0.016 n=4+5)
Large/goleveldb-1000000-100-16-40/update-16                        79.8kB ± 6%    84.1kB ±10%     ~     (p=0.421 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                         9.05MB ± 9%    9.82MB ±16%     ~     (p=0.095 n=5+5)

name                                                             old allocs/op  new allocs/op  delta
Large/memdb-1000000-100-16-40/query-no-in-tree-guarantee-16          10.8 ±39%       3.0 ± 0%  -72.22%  (p=0.008 n=5+5)
Large/memdb-1000000-100-16-40/query-hits-16                          11.0 ± 0%       1.0 ± 0%  -90.91%  (p=0.000 n=5+4)
Large/memdb-1000000-100-16-40/iteration-16                          16.0M ± 0%      3.0M ± 0%  -81.25%  (p=0.016 n=4+5)
Large/memdb-1000000-100-16-40/update-16                               466 ± 8%       459 ± 8%     ~     (p=1.000 n=5+5)
Large/memdb-1000000-100-16-40/block-16                              52.0k ± 0%     53.2k ± 0%   +2.29%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-no-in-tree-guarantee-16      37.2 ±29%      19.0 ± 0%  -48.92%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/query-hits-16                      36.0 ± 0%       6.0 ±50%  -83.33%  (p=0.008 n=5+5)
Large/goleveldb-1000000-100-16-40/iteration-16                      49.8M ± 0%      5.3M ± 0%  -89.34%  (p=0.016 n=4+5)
Large/goleveldb-1000000-100-16-40/update-16                           684 ± 6%       729 ±12%     ~     (p=0.310 n=5+5)
Large/goleveldb-1000000-100-16-40/block-16                          81.5k ± 7%     84.8k ±12%     ~     (p=0.421 n=5+5)

Since the API has changed and the bench tests were rewritten between master and the current branch, I had to create a custom version of bench_test.go that works for both branches.

@robert-zaremba I will do a new one in the next few days, sorry for the delay. However, here's the latest one.

@tac0turtle tac0turtle merged commit 0dcb21b into cosmos:master Apr 9, 2022
@lyh169 commented Jul 25, 2022

[screenshot: mtx field changed from sync.RWMutex to sync.Mutex]

Hi, why was mtx changed from sync.RWMutex to sync.Mutex? I think sync.RWMutex should also work.

@marbar3778 thanks

@robert-zaremba (Collaborator) commented:
@lyh169 there was a benchmark suggesting that (the plain Mutex was faster).
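For illustration only, here is a sketch of how such a comparison can be measured with `go test -bench=.`; this is not the benchmark referenced above, and the observed ordering depends on workload and contention:

```go
package mutexbench

import (
	"sync"
	"testing"
)

// For short critical sections under contention, RWMutex's extra reader
// bookkeeping can make it slower than a plain Mutex.
func BenchmarkMutex(b *testing.B) {
	var mu sync.Mutex
	var counter int
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.Lock()
			counter++
			mu.Unlock()
		}
	})
	_ = counter
}

func BenchmarkRWMutex(b *testing.B) {
	var mu sync.RWMutex
	var counter int
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.Lock() // writes still need the exclusive lock
			counter++
			mu.Unlock()
		}
	})
	_ = counter
}
```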

orphans map[string]int64 // Nodes removed by changes to working tree.
versions map[int64]bool // The previous, saved versions of the tree.
allRootLoaded bool // Whether all roots are loaded or not(by LazyLoadVersion)
unsavedFastNodeAdditions map[string]*FastNode // FastNodes that have not yet been saved to disk
@giskook (Contributor) commented Aug 31, 2022:

Why do unsavedFastNodeAdditions and unsavedFastNodeRemovals not need mtx protection? Can the maps not be accessed concurrently? @p0mvn

versions map[int64]bool // The previous, saved versions of the tree.
allRootLoaded bool // Whether all roots are loaded or not(by LazyLoadVersion)
unsavedFastNodeAdditions map[string]*FastNode // FastNodes that have not yet been saved to disk
unsavedFastNodeRemovals map[string]interface{} // FastNodes that have not yet been removed from disk
A contributor commented:

Hi, reviewing the code, the unsavedFastNodeRemovals values are only ever bool, so why not use map[string]struct{}? @p0mvn

A collaborator replied:
That is a good optimization.
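For context, a small sketch of the suggested optimization (illustrative code, not from the PR): an empty struct value occupies no space, so sets in Go are conventionally modeled as map[string]struct{} rather than maps of bool or interface{} values:

```go
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	// A set modeled as map[string]struct{}: the empty struct occupies zero
	// bytes, whereas interface{} values carry a two-word header.
	removals := make(map[string]struct{})
	removals["key1"] = struct{}{}

	_, removed := removals["key1"] // membership test
	fmt.Println(removed)           // true

	var empty struct{}
	var boxed interface{}
	fmt.Println(unsafe.Sizeof(empty)) // 0
	fmt.Println(unsafe.Sizeof(boxed)) // 16 on 64-bit platforms
}
```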


## 0.17.3 (December 1, 2021)

### Improvements
A collaborator commented:
Please remove the 3 lines above. We are targeting master, so #468 should go under Unreleased.

{"badgerdb", 1000, 100, 4, 10},
// {"cleveldb", 1000, 100, 4, 10},
// {"boltdb", 1000, 100, 4, 10},
// {"rocksdb", 1000, 100, 4, 10},
A collaborator commented:
Could you enable the rocksdb benchmarks as well?

versions = 32 // number of versions to generate
versionOps = 4096 // number of operations (create/update/delete) per version
versions = 8 // number of versions to generate
versionOps = 1024 // number of operations (create/update/delete) per version
A collaborator commented:
Why was this changed?

Comment on lines +15 to +27

start, end []byte

valid bool

ascending bool

err error

ndb *nodeDB

nextFastNode *FastNode

fastIterator dbm.Iterator

A collaborator suggested the following change (removing the blank lines between the fields):

Suggested change:
start, end []byte
valid bool
ascending bool
err error
ndb *nodeDB
nextFastNode *FastNode
fastIterator dbm.Iterator
start, end := iter.fastIterator.Domain()

if start != nil {
start = start[1:]
A collaborator commented:
Why do we remove the first byte? Could you add a doc comment?

iter.valid = iter.valid && iter.fastIterator.Valid()
if iter.valid {
iter.nextFastNode, iter.err = DeserializeFastNode(iter.fastIterator.Key()[1:], iter.fastIterator.Value())
iter.valid = iter.err == nil
A collaborator commented:
Shouldn't we check iter.fastIterator.Valid() here as well?

@robert-zaremba (Collaborator) commented:
wow, I've just realized that I had not submitted comments.
