Improve shards evictions in searchable snapshot cache service #67519

Merged
tlrx merged 1 commit into elastic:7.11 from tlrx:improve-shards-evictions-7.11
Jan 14, 2021
@tlrx tlrx (Member) commented Jan 14, 2021

The searchable snapshot cache service is notified when the cache files
of a specific shard must be evicted. These notifications usually happen
on a cluster state applier thread, which calls the
CacheService#markShardAsEvictedInCache method.

The markShardAsEvictedInCache method adds the shard to an internal set
of ShardEviction and submits the eviction of the shard to the generic
thread pool. Because nothing prevents the cache service (and the
persistent cache service) from being closed before all shard evictions
are processed, invalidating a cache file can fail and trip an assertion
(as happened in many recent test failures: #66958, #66730).
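
To make the failure mode concrete, here is a minimal Java sketch of the pattern described above. It is not the Elasticsearch code: `RacyCacheSketch`, its fields, and the `String` shard ids are hypothetical stand-ins, and a plain `ExecutorService` plays the role of the generic thread pool. The only point it tries to show is that `close()` returns without looking at pending evictions, so an eviction that runs afterwards finds the cache already closed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in that mimics the behaviour before the fix; not the real CacheService.
class RacyCacheSketch {
    private final ExecutorService genericPool; // assumption: stands in for the generic thread pool
    private final Set<String> pendingEvictions = ConcurrentHashMap.newKeySet();
    private volatile boolean open = true;
    final AtomicInteger evicted = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger(); // counts evictions that ran after close()

    RacyCacheSketch(ExecutorService genericPool) {
        this.genericPool = genericPool;
    }

    // Records the shard as pending and hands the actual eviction to the pool.
    void markShardAsEvictedInCache(String shardId) {
        if (pendingEvictions.add(shardId)) {
            genericPool.execute(() -> {
                pendingEvictions.remove(shardId);
                if (open) {
                    evicted.incrementAndGet();
                } else {
                    failed.incrementAndGet(); // cache was closed before the eviction ran
                }
            });
        }
    }

    // Closes immediately, without waiting for pending evictions.
    void close() {
        open = false;
    }
}
```

In this sketch, if the pool is still busy when `close()` is called, the queued eviction runs against a closed cache and is counted in `failed`. That is a simplified analogue of the situation in which invalidating a cache file fails.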

This commit changes the CacheService so that it now waits for the evictions
of shards to complete before closing the cache and persistent cache services.
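
One way to picture the new behaviour is the sketch below. It is an illustration under assumptions, not the code from this commit: the class name `WaitingCacheSketch`, the monitor-based wait, the `Thread.sleep` that simulates slow file invalidation, and the `String` shard ids are all hypothetical, and the real change may use a different synchronization mechanism. The idea shown is simply that `close()` first stops accepting new evictions, then blocks until the pending set is empty, and only then marks the cache as closed.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for the cache service; not the real CacheService.
class WaitingCacheSketch {
    private final ExecutorService genericPool;                    // assumption: the generic thread pool
    private final Set<String> pendingEvictions = new HashSet<>(); // guarded by "this"
    private boolean acceptingEvictions = true;                    // guarded by "this"
    private volatile boolean cacheOpen = true;
    final AtomicInteger evicted = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    WaitingCacheSketch(ExecutorService genericPool) {
        this.genericPool = genericPool;
    }

    // Returns false if the service is closing or the shard is already pending.
    synchronized boolean markShardAsEvictedInCache(String shardId) {
        if (acceptingEvictions == false || pendingEvictions.add(shardId) == false) {
            return false;
        }
        try {
            genericPool.execute(() -> processShardEviction(shardId));
        } catch (RejectedExecutionException e) {
            pendingEvictions.remove(shardId); // never leave an entry that nobody will process
            throw e;
        }
        return true;
    }

    private void processShardEviction(String shardId) {
        try {
            Thread.sleep(20); // simulate slow cache file invalidation
            if (cacheOpen) {
                evicted.incrementAndGet();
            } else {
                failed.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failed.incrementAndGet();
        } finally {
            synchronized (this) {
                pendingEvictions.remove(shardId);
                notifyAll(); // wake up close() if it is waiting
            }
        }
    }

    // Waits for all pending evictions to complete before closing the cache.
    synchronized void close() {
        acceptingEvictions = false;
        boolean interrupted = false;
        while (pendingEvictions.isEmpty() == false) {
            try {
                wait(); // releases the monitor so eviction tasks can finish
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        cacheOpen = false;
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this ordering, every eviction accepted before `close()` observes an open cache, and any call to `markShardAsEvictedInCache` made after `close()` has started is rejected instead of being queued. The sketch assumes the pool is shut down only after the service is closed.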

Backport of #67160 for 7.11.1

@tlrx tlrx added the :Distributed/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs), backport, and v7.11.1 labels Jan 14, 2021
@elasticmachine elasticmachine added the Team:Distributed (Meta label for distributed team) label Jan 14, 2021
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx tlrx merged commit 1ee5d91 into elastic:7.11 Jan 14, 2021
@tlrx tlrx deleted the improve-shards-evictions-7.11 branch January 14, 2021 16:01