Improve shards evictions in searchable snapshot cache service #67160
tlrx merged 19 commits into elastic:master
Pinging @elastic/es-distributed (Team:Distributed)
/**
 * Generates one or more cache files using the specified {@link CacheService}. Each cache file has been written at least once.
 */
protected List<CacheFile> randomCacheFiles(CacheService cacheService) throws Exception {
This has been moved from the PersistentCacheTests class and slightly changed so that it always generates at least one random cache file and always reads/writes a range in each cache file.
I think it is ok to only wait for running shard evictions to complete and not process other shard evictions that have not been processed yet: if the shard files are still on disk, they will be reused or removed after the node starts again, and if the shard files have been deleted from disk, they won't be reloaded at node startup by the persistent cache logic.
Agreed. We want to terminate as fast as we can.
I wonder if a simple read-write lock scheme could suffice? Processing the eviction takes the read-lock (and checks that state==started) and stop takes the write lock.
@henningandersen This relates to one of your comments on #66173 - sorry for the time it took me to get there.
This commit removes an assertion that makes many tests fail on CI until elastic#67160 is merged. That pull request has been opened to fix the underlying issues that trip the assertion, and it brings the assertion back. Relates elastic#67160
henningandersen left a comment
Thanks for addressing this. I have a few suggestions to simplify the code a bit. I might have missed a finer point on why this cannot work out.
I wonder if we could remove runningShardsEvictions if we make pendingShardsEvictions a Map<ShardEviction, Future> and register the future that threadPool.generic().submit() returns in the map? Since we do this under the lock, we should be able to check whether it exists before submitting and registering in the map.
evictCacheFilesIfNeeded would get from the map (under lock) and wait for the future (not under lock).
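The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CacheService code: the class name ShardEvictionTracker and its method names are hypothetical, a String key stands in for ShardEviction, and a plain ExecutorService stands in for threadPool.generic().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch: a single map of pending evictions to their futures. */
class ShardEvictionTracker {
    private final Object mutex = new Object();
    // A String key stands in for ShardEviction (assumption for this sketch).
    private final Map<String, Future<?>> pendingShardsEvictions = new HashMap<>();
    // Stands in for threadPool.generic(); assumed to run tasks on other threads.
    private final ExecutorService executor;

    ShardEvictionTracker(ExecutorService executor) {
        this.executor = executor;
    }

    /** Submits the eviction unless one is already registered; returns true if submitted. */
    boolean markShardAsEvicted(String shard, Runnable evictCacheFiles) {
        synchronized (mutex) {
            if (pendingShardsEvictions.containsKey(shard)) {
                return false; // already pending, do not submit twice
            }
            Future<?> future = executor.submit(() -> {
                try {
                    evictCacheFiles.run();
                } finally {
                    // Blocks until the submitter has released the mutex,
                    // so the entry is always registered before it is removed.
                    synchronized (mutex) {
                        pendingShardsEvictions.remove(shard);
                    }
                }
            });
            pendingShardsEvictions.put(shard, future);
            return true;
        }
    }

    /** Looks up the future under the lock, then waits for it outside the lock. */
    void waitForEvictionIfNeeded(String shard) throws InterruptedException, ExecutionException {
        final Future<?> future;
        synchronized (mutex) {
            future = pendingShardsEvictions.get(shard);
        }
        if (future != null) {
            future.get();
        }
    }

    int pendingCount() {
        synchronized (mutex) {
            return pendingShardsEvictions.size();
        }
    }
}
```

Waiting outside the lock keeps other shards free to register or complete their own evictions while one caller blocks on a future.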
I agree this simplifies the code. One of the motivations for differentiating pending and running shard evictions was to allow a searchable snapshot shard to immediately execute the eviction of cache files, while with your suggestion the shard would have to wait for the submitted runnable to complete in the generic thread pool before moving forward with recovery. Maybe we are OK with this?
That is a good point, thanks for clarifying.
With the current threading this seems impossible to happen, since evictCacheFilesIfNeeded is always called on the generic thread pool. Since the evict of shard folders must happen before restoring the same searchable snapshot shard again, the initial shard evict job would come before a subsequent restore of the same searchable snapshot shard in the generic queue. Thus if the restore is running, the evict job must either have started to run or completed.
I would be inclined to rely on this for now. If those were separate threads, I suppose waiting would likely be out of the question, so it would require a new design if we change the threading model.
the initial shard evict job would come before a subsequent restore of the same searchable snapshot shard in the generic queue
Thanks Henning, I think you're right. I went with your suggestion of simplification.
I am not sure we want a timeout here? I have a hard time reasoning through what it means when the wait times out - and if we can safely handle that, we should just avoid the wait altogether (not proposing that). I do see value, though, in first waiting for some seconds, then outputting a warning before waiting indefinitely.
I think we can also just do without a timeout. In tests we will interrupt if the shutdown takes too long anyway, so no timeout and logging the interrupted exception is good enough IMO.
Sure. I went with Henning's suggestion of using a read/write lock for this.
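The read/write lock scheme discussed in this thread could look roughly like the sketch below. It is a minimal illustration, not the merged implementation: the class EvictionLifecycle and its method names are hypothetical, and a boolean flag stands in for the service lifecycle state.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: evictions take the read lock, stop takes the write lock. */
class EvictionLifecycle {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Stands in for the "state == started" check; guarded by the lock.
    private boolean started = true;

    /** Runs the eviction only while the service is started; returns whether it ran. */
    boolean processEviction(Runnable eviction) {
        lock.readLock().lock();
        try {
            if (started == false) {
                return false; // service is stopping or stopped, skip the eviction
            }
            eviction.run();
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Blocks until all running evictions have released the read lock, then stops. */
    void stop() {
        lock.writeLock().lock();
        try {
            started = false;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Several evictions can hold the read lock concurrently, while stop() only waits for those already running; evictions that start after stop() see the flag and return immediately, which matches the goal of terminating as fast as possible.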
original-brownbear left a comment
Some smaller comments. I didn't go through all the details yet since there are some open design questions from Henning.
Maybe it would be better to use a GroupedActionListener + PlainActionFuture here so we collect the Exceptions if any occur instead of suppressing them?
I don't think this really needs a timeout setting, does it? It's really only relevant for tests to begin with, and we can just hard-code a reasonable value like 10s (that's what we did in pretty much all other similar places, like waiting for recoveries to finish/cancel etc).
No, it does not. I went with a hard-coded timeout of 10s, after which it waits indefinitely.
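A bounded wait followed by an unbounded one, as described here, might be sketched like this. The helper class and method names are hypothetical, the timeout is a parameter so the sketch can be exercised quickly (the discussion settles on 10s), and System.err stands in for a real logger.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

/** Hypothetical sketch: wait up to a timeout, warn, then wait indefinitely. */
class BoundedThenUnboundedWait {

    /**
     * Acquires the lock. Returns true if the timeout elapsed first and a
     * warning was emitted before the lock was finally acquired.
     */
    static boolean lockWithWarning(Lock lock, long timeout, TimeUnit unit) throws InterruptedException {
        if (lock.tryLock(timeout, unit)) {
            return false; // acquired within the timeout, nothing to report
        }
        // System.err stands in for a logger in this sketch.
        System.err.println("still waiting for shard evictions after " + timeout + " " + unit);
        lock.lock(); // no timeout from here on
        return true;
    }
}
```

With the read/write lock scheme, the lock passed in would be the write lock taken when stopping the service.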
...-snapshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheService.java
NIT: This can just return Future
I think this could just be simplified with computeIfAbsent in an else branch?
Thanks a lot @henningandersen for the multiple comments and suggestions!
…c#67160) The searchable snapshot's cache service is notified when cache files of a specific shard must be evicted. The notifications are usually done in a cluster state applier thread that calls the CacheService#markShardAsEvictedInCache method. The markShardAsEvictedInCache method adds the shard to an internal set of ShardEviction and submits the eviction of the shard to the generic thread pool. Because nothing prevents the cache service (and persistent cache service) from being closed before all shard evictions are processed, it is possible that invalidating a cache file fails and trips an assertion (as happened in many recent test failures: elastic#66958, elastic#66730). This commit changes the CacheService so that it now waits for the evictions of shards to complete before closing the cache and persistent cache services.
#67519) Same commit message as above.
#67517) Same commit message as above.
The searchable snapshot's cache service is notified when cache files of a specific shard must be evicted. The notifications are usually done in a cluster state applier thread that calls the CacheService#markShardAsEvictedInCache method.
The markShardAsEvictedInCache method adds the shard to an internal set of ShardEviction and submits the eviction of the shard to the generic thread pool. Because nothing prevents the cache service (and persistent cache service) from being closed before all shard evictions are processed, it is possible that invalidating a cache file fails and trips an assertion (as happened in many recent test failures: #66958, #66730).
This pull request changes the CacheService so that it now waits for the evictions of shards to complete before closing the cache and persistent cache services. Like before, it allows searchable snapshot shards that have been previously marked as evicted to start by forcing the eviction of existing cache files. Finally, it removes the KeyedLock used before and improves test coverage.