
add findCache for memoryIdx #1233

Merged
merged 22 commits into from
Mar 12, 2019

Conversation

woodsaj
Member

@woodsaj woodsaj commented Mar 8, 2019

Cache patterns and their results in an LRU.
When new defs are added to the index, only cached patterns that match
the new series name are invalidated. The cache is purged after every
prune task runs and after any call to delete items from the index.
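
For orientation, here is a minimal sketch of the shape this takes. The type names, field layout, and the use of hashicorp/golang-lru are assumptions for illustration, not necessarily the exact code in this PR.

package memory

import (
	"sync"

	lru "github.com/hashicorp/golang-lru"
)

// FindCache caches find patterns and their resulting nodes, per org.
type FindCache struct {
	sync.RWMutex
	cache map[uint32]*lru.Cache // orgId -> LRU of pattern -> result nodes
	size  int                   // max entries per org LRU
}

func NewFindCache(size int) *FindCache {
	return &FindCache{
		cache: make(map[uint32]*lru.Cache),
		size:  size,
	}
}

// PurgeAll drops every org's cached results, e.g. after a prune task
// runs or after deletes from the index.
func (c *FindCache) PurgeAll() {
	c.Lock()
	defer c.Unlock()
	c.cache = make(map[uint32]*lru.Cache)
}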

@woodsaj woodsaj requested review from replay and robert-milan March 8, 2019 19:26
@woodsaj woodsaj force-pushed the findCache branch 3 times, most recently from 8e3f270 to 2c70e84 on March 11, 2019 06:13
woodsaj added 2 commits March 11, 2019 14:43
When new series are received we need to invalidate any findCache
patterns that match the series name. If there are lots of new series
being created, this can have a negative performance impact. This change
causes the cache to be disabled for 1 minute when the rate of new series
is higher than what we can process.
- add config settings for:
  - find-cache-invalidate-queue: number of new series names to process.
    Each new series name is compared against expressions in the findCache;
    if an expression matches the new name, it is removed from the cache.
  - find-cache-backoff: amount of time to disable the findCache for when
    the invalidate-queue fills up. If there are lots of new series being
    added then scanning the cache for patterns that match can become
    expensive, so it is best to just disable the cache for a while.
- log a message when the cache is disabled due to the invalidateQueue
  limit being reached.
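
Sketched out, the overflow handling described above might look roughly like this. It continues the sketch from the PR description; the invalidateRequest type, the extra FindCache fields, and the queue mechanics are assumptions for illustration (stdlib log and time imports assumed).

// Continuing the sketch above; additionally assumes these FindCache fields:
//
//	invalidateQueue chan invalidateRequest // sized via find-cache-invalidate-queue
//	backoff         map[uint32]time.Time   // org -> end of backoff window
//	backoffTime     time.Duration          // find-cache-backoff

type invalidateRequest struct {
	orgId uint32
	name  string
}

// InvalidateFor queues a new series name so that cached patterns matching
// it can be removed. If the queue is full we cannot keep up, so rather
// than blocking ingestion we drop the org's cache and back off for a while.
func (c *FindCache) InvalidateFor(orgId uint32, name string) {
	c.Lock()
	defer c.Unlock()
	select {
	case c.invalidateQueue <- invalidateRequest{orgId: orgId, name: name}:
	default:
		log.Printf("findCache: invalidate queue full, disabling cache for org %d for %s", orgId, c.backoffTime)
		c.backoff[orgId] = time.Now().Add(c.backoffTime)
		delete(c.cache, orgId)
	}
}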
@replay
Contributor

replay commented Mar 11, 2019

Over the past few weeks we've been trying to make MT use less memory while sacrificing a bit more CPU (e.g. object interning). This change goes in the opposite direction again (more mem / less CPU), so maybe we should disable the find cache by default and only enable it when we know we need it?

findCacheMiss.Inc()
return nil, false
}
nodes, ok := cache.Get(pattern)
Contributor

@robert-milan robert-milan Mar 11, 2019


Is there a chance that cache could have already been deleted before attempting this call in some fringe situations?

Edit:
I think all of the calls are wrapped inside RLocks or Locks in the index.

Member Author

@woodsaj woodsaj Mar 11, 2019


It is possible that one thread calls cache, ok := c.cache[orgId] then releases the lock, and that before nodes, ok := cache.Get(pattern) is executed another thread gets a write lock and calls delete(c.cache, orgId). But it doesn't really matter. This is still safe, as the items in c.cache[orgId] are pointers. The end result is that calls to findCache.Get() will return results based on the content of the cache when the RLock() was acquired, and not on the contents of the cache at the specific time that nodes, ok := cache.Get(pattern) is executed.
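
Condensed, the access pattern being described looks like this (a sketch: Node stands in for the index's tree node type, and findCacheHit is an assumed counter alongside the findCacheMiss shown in the diff):

func (c *FindCache) Get(orgId uint32, pattern string) ([]*Node, bool) {
	c.RLock()
	cache, ok := c.cache[orgId]
	c.RUnlock()
	if !ok {
		findCacheMiss.Inc()
		return nil, false
	}
	// Even if another goroutine grabs the write lock and deletes
	// c.cache[orgId] right here, cache still points to the LRU that
	// existed when the RLock was held, so the lookup below is safe.
	nodes, ok := cache.Get(pattern)
	if !ok {
		findCacheMiss.Inc()
		return nil, false
	}
	findCacheHit.Inc()
	return nodes.([]*Node), true
}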

@woodsaj
Member Author

woodsaj commented Mar 11, 2019

Over the past few weeks we've been trying to make MT use less memory while sacrificing a bit more CPU (e.g. object interning). This change goes in the opposite direction again (more mem / less CPU),

It is more memory, but not much, as the cache is just a map of strings that point to slices of pointers. Also, though heap memory could potentially be higher, instances with moderate query loads will see a reduction in allocations (which are needed to search the tree), which will help keep RSS lower.

Contributor

@robert-milan robert-milan left a comment


How are the benchmarks looking?

Contributor

@replay replay left a comment


LGTM

@Dieterbe
Contributor

+1 to what Robert said.
Any PR with an impact on performance should show relevant benchmark results.

@woodsaj
Member Author

woodsaj commented Mar 12, 2019

Comparison between master and this branch for the relevant benchmarks:

benchmark                                         old ns/op     new ns/op     delta
BenchmarkConcurrent4Find/partitioned-8            150287        40939         -72.76%
BenchmarkConcurrent4Find/unPartitioned-8          43931         22132         -49.62%
BenchmarkConcurrent8Find/partitioned-8            201209        37329         -81.45%
BenchmarkConcurrent8Find/unPartitioned-8          59867         19462         -67.49%
BenchmarkConcurrentInsertFind/partitioned-8       412081        42666         -89.65%
BenchmarkConcurrentInsertFind/unPartitioned-8     364459        20785         -94.30%

benchmark                                         old allocs     new allocs     delta
BenchmarkConcurrent4Find/partitioned-8            2324           257            -88.94%
BenchmarkConcurrent4Find/unPartitioned-8          637            187            -70.64%
BenchmarkConcurrent8Find/partitioned-8            2324           257            -88.94%
BenchmarkConcurrent8Find/unPartitioned-8          637            187            -70.64%
BenchmarkConcurrentInsertFind/partitioned-8       11788          258            -97.81%
BenchmarkConcurrentInsertFind/unPartitioned-8     10115          187            -98.15%

benchmark                                         old bytes     new bytes     delta
BenchmarkConcurrent4Find/partitioned-8            95997         27794         -71.05%
BenchmarkConcurrent4Find/unPartitioned-8          34411         18879         -45.14%
BenchmarkConcurrent8Find/partitioned-8            95987         27792         -71.05%
BenchmarkConcurrent8Find/unPartitioned-8          34411         18881         -45.13%
BenchmarkConcurrentInsertFind/partitioned-8       524699        27841         -94.69%
BenchmarkConcurrentInsertFind/unPartitioned-8     484934        18897         -96.10%

@Dieterbe
Contributor

This needs more comments about the design and implementation.
The backoff/InvalidateQueue logic in particular is hard to follow.

@Dieterbe
Contributor

Can we add a fast path for the find-cache-disabled case, so that it immediately returns out of Add/Purge/.. calls when disabled?

@woodsaj
Member Author

woodsaj commented Mar 12, 2019

Can we add a fast path for the find-cache-disabled case, so that it immediately returns out of Add/Purge/.. calls when disabled?

We already do that. Caches are per orgId; if c.cache[orgId] doesn't exist we immediately return.

@woodsaj woodsaj merged commit 707e6ce into master Mar 12, 2019
@woodsaj woodsaj deleted the findCache branch March 12, 2019 19:48
@Dieterbe
Contributor

Dieterbe commented Mar 13, 2019

Can we add a fast path for the find-cache-disabled case, so that it immediately returns out of Add/Purge/.. calls when disabled?

We already do that. Caches are per orgId; if c.cache[orgId] doesn't exist we immediately return.

This doesn't seem to be true. Add() will simply add the new cache if it doesn't exist yet, so any subsequent calls will see a cache and operate on it.

@woodsaj
Member Author

woodsaj commented Mar 13, 2019

This doesn't seem to be true. Add() will simply add the new cache if it doesn't exist yet, so any subsequent calls will see a cache and operate on it.

https://github.com/grafana/metrictank/blob/master/idx/memory/find_cache.go#L76-L79

@Dieterbe
Contributor

Not sure what you're trying to say here. t would be time.Time{} there (the zero time), so that branch wouldn't be taken. What am I missing?

@woodsaj
Member Author

woodsaj commented Mar 13, 2019

t would be time.Time{}

If the cache is disabled, t will be a time in the future. https://github.com/grafana/metrictank/blob/master/idx/memory/find_cache.go#L138

@Dieterbe
Contributor

Future-proofing the above URLs:

// dont init the cache if we are in backoff mode.
if time.Until(t) > 0 {
	return
}

c.backoff[orgId] = time.Now().Add(c.backoffTime)
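
In other words: when the invalidate queue overflows, c.backoff[orgId] is set to a time in the future, and the first branch above then prevents Add() from re-initializing that org's cache until the backoff expires.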

@Dieterbe
Contributor

We cleared it up in a call: the cache cannot be disabled.
