Simplify evictable cache implementation#10949
Conversation
eccab78 to
2299a21
Compare
lib/trino-collect/src/test/java/io/trino/collect/cache/TestEvictableCache.java
Outdated
Show resolved
Hide resolved
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
why did you choose to invalidate first the dataCache and not the tokens ?
There was a problem hiding this comment.
Let's consider 2 threads: T1, T2
//invalidation call T1
Token token = tokens.remove(key); // 1
if (token != null) {
dataCache.invalidate(token); // 3
// get call T2
Token token = tokens.computeIfAbsent(key, ignored -> newToken); // 2
return dataCache.get(token); // 4
tokens map will contain an outdated useless token.
Per com.google.common.cache.Cache javadoc:
@param the type of the cache's values, which are not permitted to be null
In case that the value returned by dataCache.get(token) is null we should maybe remove remove the token from tokens. I'm not sure however if there's any gain in this.
There was a problem hiding this comment.
tokensmap will contain an outdated useless token.
that would be a problem, but i don't see this happening in your example.
There was a problem hiding this comment.
It does not look correct. It feels you can get leaked entry in dataCache.
Consider following flow:
* (T1) invalidateAll
* (T1) dataCache.invalidateAll();
* entries from cache are gone
* (T2) put(K, V)
* (T2) load new value into `dataCache` for token which is present in the `tokens` map
* (T1) `tokens.clear()`
Now we have an entry in dataCache for token which is no longer present in tokens map.
Note: maybe that is not possible if we removalListeners are executed synchronously as part of dataCache.invalidateAll();. But I am. not sure if that is true.
There was a problem hiding this comment.
It does not look correct. It feels you can get leaked entry in dataCache.
correct, there is a small window where a value is loaded and subsequently made inaccessible.
this isn't a leak though, assuming the cache is bounded -- which was my assumption, but was not guaranteed.
I am adding a check in EvictableCacheBuilder mandating the cache to be bounded.
There was a problem hiding this comment.
Feel safer would be to get a copy of
tokens;
that would require switching away from ConcurrentHashMap, right?
There was a problem hiding this comment.
(T1) invalidateAll call
dataCache.invalidateAll(); //1
tokens.clear(); //4
T2 get(key, valueLoader) call
Token<K> token = tokens.computeIfAbsent(key, ignored -> newToken); //2
return dataCache.get(token, valueLoader); //3
After //4 the dataCache still contains an "lost" entry for which there is no corresponding tokens entry.
The same problem may occur if we flip the statements in the invalidateAll call to have the tokens cleared first.
There was a problem hiding this comment.
Reproduction scenario which showcases that the cache size is sometimes greater than 0 after calling invalidateAll:
@Test
public void testInvalidateAllAndLoadConcurrently()
throws Exception
{
AtomicLong remoteState = new AtomicLong(1);
Cache<Integer, Long> cache = EvictableCacheBuilder.newBuilder().maximumSize(10_000).build();
Integer key = 42;
int threads = 10_000;
ExecutorService executor = newFixedThreadPool(threads);
try {
List<Future<Void>> futures = IntStream.range(0, threads)
.mapToObj(threadNumber -> executor.submit(() -> {
if (threadNumber%2 == 0) {
cache.get(key, remoteState::get);
}else{
cache.invalidateAll();
long cacheSize = cache.size();
if (cacheSize > 0){
System.out.println(cacheSize);
}
}
return (Void) null;
}))
.collect(toImmutableList());
futures.forEach(MoreFutures::getFutureValue);
}
finally {
executor.shutdownNow();
executor.awaitTermination(10, SECONDS);
}
}
There was a problem hiding this comment.
After //4 the
dataCachestill contains an "lost" entry for which there is no correspondingtokensentry.
absolutely.
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
2299a21 to
2501486
Compare
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
2501486 to
ad35166
Compare
There was a problem hiding this comment.
I don't understand why this is disabled
There was a problem hiding this comment.
because it has maximumSize(0)
lib/trino-collect/src/main/java/io/trino/collect/cache/EvictableCache.java
Outdated
Show resolved
Hide resolved
eaf6eeb to
6bd36cd
Compare
lib/trino-collect/src/test/java/io/trino/collect/cache/TestEvictableCache.java
Outdated
Show resolved
Hide resolved
6bd36cd to
42f50c2
Compare
|
rebased (conflicts) |
There was a problem hiding this comment.
From hashCode javadoc:
As much as is reasonably practical, the hashCode method defined
by class {@code Object} does return distinct integers for
distinct objects
IIUC this says same hashcode can be returned for different objects present in VM at the same time (maybe I am reading it incorrectly).
Given that does it make sense to put hashcode as part of the toString result?
There was a problem hiding this comment.
i found it helpful and default Object.toString does exactly that, but i see it could be misleading.
3061254 to
a2ee24c
Compare
There was a problem hiding this comment.
nit: this is not really related to the current commit.
Reusing the token-based approach of `EvictableLoadingCache` allows implementing both evictable `Cache` (previously known as `EvictableCache`) and evictable `LoadingCache` (`EvictableLoadingCache`) in a concise manner. This also introducing a builder for the caches (`EvictableCacheBuilder`), which unlocks flexibility of `CacheBuilder` (like weighted cache entries), while still preventing unsupported usage patterns. For example, registering `removalListener` is not allowed, as removal listener is internally used by the cache implementation.
|
CI #10932, some PT timeouts |
978e914 to
a48b9af
Compare
|
CI #10932 |
Reusing the token-based approach of
EvictableLoadingCacheallowsimplementing both evictable
Cache(previously known asEvictableCache) and evictableLoadingCache(EvictableLoadingCache)in a concise manner. This also introducing a builder for the caches
(
EvictableCacheBuilder), which unlocks flexibility ofCacheBuilder(like weighted cache entries), while still preventingunsupported usage patterns. For example, registering
removalListeneris not allowed, as removal listener is internally used by the cache
implementation.