Replace caffeine cache with reference to full map #2214

Merged
merged 45 commits into from
Aug 6, 2021
Changes from 28 commits
Commits
3b66cc1
Split up caffeine cache
Jun 24, 2021
72348b6
Fix unit tests
Jun 24, 2021
bc29c9c
Update cache to LoadingCache
Jun 24, 2021
652d42a
Use isEnabled
Jun 24, 2021
df17bbe
Clean up test module
Jun 24, 2021
8cb4138
Clean up more imports
Jun 24, 2021
a020b92
Update ttls and deploy states call
Jun 25, 2021
7a77ce2
Rename config values
Jun 25, 2021
80bb901
Missed a value
Jun 25, 2021
1e95103
update logging
Jun 25, 2021
7b1ca90
rename to ApiCache
Jun 29, 2021
e8af8b1
Update deploy api cache load
Jun 30, 2021
ee556cd
Split up cache enabling config values
Jun 30, 2021
50e1dee
Add debug for cache loading
Jun 30, 2021
69ebebc
Use a set as key
Jul 1, 2021
f7174d3
Use cache loader for async reloading
Jul 16, 2021
488b35c
copypasta error config value
Jul 16, 2021
dd24913
Use AtomiceReference map instead of LoadingCache
Jul 23, 2021
5947b22
Remove caffeine dep
Jul 23, 2021
6a0732b
update zk call
Jul 26, 2021
9373f5e
another log line and return typo
Jul 26, 2021
1517b59
More debug lines
Jul 26, 2021
ebaa6ed
Add try/catch
Jul 26, 2021
b2a1f25
Removed a heavy debug line
Jul 26, 2021
e3e3ea2
Be more selective with debug info
Jul 27, 2021
f2a2e22
Trying to get around first run exception
Jul 27, 2021
5b2069b
Update timing for scheduled reloader
Jul 27, 2021
bfbcf36
Reloader keeps starting before curator framework
Jul 27, 2021
642ebe1
Single threaded executor, lifecycle manage reloading
Jul 28, 2021
c23fcaf
Add managers to test Lifecycle
Jul 28, 2021
76aaa10
Load map ref when started
Jul 28, 2021
14a9ab9
use request cache for other calls
Jul 28, 2021
9e7555b
Update caching to 5 seconds
Jul 29, 2021
efcd68b
Add skipCache flag
Jul 29, 2021
5b97f8b
Add flag to deploy manager side
Jul 30, 2021
254d6e4
make skipApiCache a setter and deprecate
Jul 30, 2021
12d6162
More debug and fallback to ZK
Jul 30, 2021
862c22e
Finesse logging
Jul 30, 2021
3514f4b
ApiCache disabled for leader & missed params
Aug 3, 2021
9cafb6c
Try implement leader listener
Aug 3, 2021
2223fdb
Logging around starting and stopping reloader on leader update
Aug 4, 2021
9cdd205
give ApiCache the leaderlatch instead
Aug 4, 2021
73875d6
debug logs for leader clean up
Aug 4, 2021
6f2849b
More debug logs
Aug 4, 2021
e7bc9d9
Make all get* logs trace
Aug 4, 2021
5 changes: 0 additions & 5 deletions SingularityService/pom.xml
@@ -510,11 +510,6 @@
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
</dependency>

</dependencies>

<build>
@@ -1,13 +1,10 @@
package com.hubspot.singularity;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.base.Function;
import com.google.inject.Binder;
import com.google.inject.Module;
import com.google.inject.Provides;
import com.google.inject.Singleton;
import com.google.inject.name.Named;
import com.hubspot.dropwizard.guicier.DropwizardAwareModule;
import com.hubspot.mesos.client.SingularityMesosClientModule;
import com.hubspot.mesos.client.UserAndPassword;
@@ -29,14 +26,10 @@
import com.hubspot.singularity.resources.SingularityOpenApiResource;
import com.hubspot.singularity.resources.SingularityResourceModule;
import com.hubspot.singularity.scheduler.SingularitySchedulerModule;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

public class SingularityServiceModule
extends DropwizardAwareModule<SingularityConfiguration> {
public static final String REQUESTS_CAFFEINE_CACHE =
"singularity.service.resources.request";
private final Function<SingularityConfiguration, Module> dbModuleProvider;
private Optional<Class<? extends LoadBalancerClient>> lbClientClass = Optional.empty();

@@ -135,16 +128,4 @@ public IndexViewConfiguration provideIndexViewConfiguration(
.contains(SingularityAuthenticatorClass.WEBHOOK)
);
}

@Provides
@Singleton
@Named(REQUESTS_CAFFEINE_CACHE)
public Cache<String, List<SingularityRequestParent>> getRequestsCaffeineCache() {
SingularityConfiguration configuration = getConfiguration();

return Caffeine
.newBuilder()
.expireAfterWrite(configuration.getCaffeineCacheTtl(), TimeUnit.SECONDS)
.build();
}
}
@@ -444,10 +444,12 @@ public class SingularityConfiguration extends Configuration {
private double statusQueueNearlyFull = 0.8;

// Enable caffeine cache on heavily requested endpoint
private boolean useCaffeineCache = false;
private boolean useApiCacheInRequestManager = false;
private boolean useApiCacheInDeployManager = false;

// Caffeine cache ttl
private int caffeineCacheTtl = 1;
// Atomic Reference cache TTLs
private int deployCacheTtlInSeconds = 1;
private int requestCacheTtlInSeconds = 1;

public long getAskDriverToKillTasksAgainAfterMillis() {
return askDriverToKillTasksAgainAfterMillis;
@@ -2076,19 +2078,35 @@ public void setStatusQueueNearlyFull(double statusQueueNearlyFull) {
this.statusQueueNearlyFull = statusQueueNearlyFull;
}

public boolean useCaffeineCache() {
return useCaffeineCache;
public boolean useApiCacheInRequestManager() {
return useApiCacheInRequestManager;
}

public void setUseCaffeineCache(boolean useCaffeineCache) {
this.useCaffeineCache = useCaffeineCache;
public void setUseApiCacheInRequestManager(boolean useApiCacheInRequestManager) {
this.useApiCacheInRequestManager = useApiCacheInRequestManager;
}

public int getCaffeineCacheTtl() {
return caffeineCacheTtl;
public boolean useApiCacheInDeployManager() {
return useApiCacheInDeployManager;
}

public void setCaffeineCacheTtl(int caffeineCacheTtl) {
this.caffeineCacheTtl = caffeineCacheTtl;
public void setUseApiCacheInDeployManager(boolean useApiCacheInDeployManager) {
this.useApiCacheInDeployManager = useApiCacheInDeployManager;
}

public int getDeployCacheTtl() {
return deployCacheTtlInSeconds;
}

public void setDeployCacheTtl(int deployCacheTtlInSeconds) {
this.deployCacheTtlInSeconds = deployCacheTtlInSeconds;
}

public int getRequestCacheTtl() {
return requestCacheTtlInSeconds;
}

public void setRequestCacheTtl(int requestCacheTtlInSeconds) {
this.requestCacheTtlInSeconds = requestCacheTtlInSeconds;
}
}
@@ -0,0 +1,77 @@
package com.hubspot.singularity.data;

import com.google.inject.Inject;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ApiCache<K, V> {
private static final Logger LOG = LoggerFactory.getLogger(ApiCache.class);

public final boolean isEnabled;
private final AtomicReference<Map<K, V>> zkValues;
private final Supplier<Map<K, V>> supplyMap;

@Inject
public ApiCache(
boolean isEnabled,
int cacheTtl,
Supplier<Map<K, V>> supplyMap,
ScheduledExecutorService executor
) {
this.isEnabled = isEnabled;
this.supplyMap = supplyMap;
this.zkValues = new AtomicReference<>(new HashMap<>());

if (this.isEnabled) {
executor.scheduleAtFixedRate(this::reloadZkValues, 2, cacheTtl, TimeUnit.SECONDS);
Contributor Author

I put in a 2-second delay because the reloading thread kept starting before the CuratorFrameworkImpl had started.

Member

Would it make more sense to do the first call to these inside one of the Managed classes that run during startup? Then we can:

  • Make sure that the first fetch runs after Curator has started
  • Avoid making ZK calls in a constructor, rather than in a method on an already instantiated class
  • Have the cache filled before the service actually starts listening for requests

I'm thinking of something like a start() method that we can call somewhere in SingularityLifecycleManaged.
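
The start() idea in the comment above could look roughly like the sketch below. This is not the merged ApiCache: the class name StartableApiCache, the stop() method, and the assumption that a startup hook calls start() once the ZooKeeper client is ready are all hypothetical and only illustrate the three points in the list.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch, not the merged ApiCache: loading is deferred to start().
class StartableApiCache<K, V> {
  private final AtomicReference<Map<K, V>> values =
    new AtomicReference<>(Collections.emptyMap());
  private final Supplier<Map<K, V>> supplyMap;
  private final int cacheTtlSeconds;
  private final ScheduledExecutorService executor;

  StartableApiCache(
    int cacheTtlSeconds,
    Supplier<Map<K, V>> supplyMap,
    ScheduledExecutorService executor
  ) {
    // The constructor only stores its collaborators; it makes no remote calls.
    this.cacheTtlSeconds = cacheTtlSeconds;
    this.supplyMap = supplyMap;
    this.executor = executor;
  }

  // Assumed to be invoked by a startup hook after the ZooKeeper client is ready.
  void start() {
    reload(); // synchronous first fill, so the cache is populated before requests arrive
    executor.scheduleAtFixedRate(
      this::reload,
      cacheTtlSeconds,
      cacheTtlSeconds,
      TimeUnit.SECONDS
    );
  }

  // Assumed to be invoked on shutdown.
  void stop() {
    executor.shutdownNow();
  }

  private void reload() {
    try {
      values.set(new HashMap<>(supplyMap.get()));
    } catch (Exception e) {
      // Keep serving the previous snapshot if a reload fails. Swallowing the
      // exception also keeps the scheduled task from being cancelled.
    }
  }

  V get(K key) {
    return values.get().get(key);
  }
}
```

With this shape, the fixed initial delay is no longer needed, because the first load is tied to an explicit lifecycle call rather than to a timer.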

}
}

private void reloadZkValues() {
try {
Map<K, V> newZkValues = supplyMap.get();
zkValues.set(newZkValues);
} catch (Exception e) {
LOG.warn("Reloading ApiCache failed: {}", e.getMessage());
}
}

public V get(K key) {
return this.zkValues.get().get(key);
}

public Map<K, V> getAll() {
Map<K, V> allValues = this.zkValues.get();
if (allValues.isEmpty()) {
LOG.debug("ApiCache getAll returned empty");
}
return allValues;
}

public Map<K, V> getAll(Collection<K> keys) {
Map<K, V> allValues = this.zkValues.get();
Map<K, V> filteredValues = keys
.stream()
.filter(allValues::containsKey)
.collect(Collectors.toMap(Function.identity(), allValues::get));

if (filteredValues.isEmpty()) {
LOG.debug("ApiCache getAll returned empty for {}", keys);
}

return filteredValues;
}

public boolean isEnabled() {
return isEnabled;
}
}
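
A note on getAll(Collection<K> keys) above: Collectors.toMap without a merge function throws IllegalStateException when the stream yields the same key twice, so a keys collection containing duplicates would fail. Below is a minimal sketch of a duplicate-tolerant filter. SnapshotFilter and filterByKeys are hypothetical names for illustration and are not part of this PR.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR.
class SnapshotFilter {

  // Copies the entries of `snapshot` whose key appears in `keys`.
  // Keys missing from the snapshot are skipped; repeated keys are harmless.
  static <K, V> Map<K, V> filterByKeys(Map<K, V> snapshot, Collection<K> keys) {
    Map<K, V> filtered = new HashMap<>();
    for (K key : keys) {
      V value = snapshot.get(key);
      if (value != null) {
        // A repeated key simply overwrites the entry with the same value.
        filtered.put(key, value);
      }
    }
    return filtered;
  }
}
```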
@@ -17,6 +17,7 @@
import com.hubspot.singularity.SingularityDeployStatistics;
import com.hubspot.singularity.SingularityDeployUpdate;
import com.hubspot.singularity.SingularityDeployUpdate.DeployEventType;
import com.hubspot.singularity.SingularityManagedScheduledExecutorServiceFactory;
import com.hubspot.singularity.SingularityPendingDeploy;
import com.hubspot.singularity.SingularityRequest;
import com.hubspot.singularity.SingularityRequestDeployState;
@@ -70,6 +71,8 @@ public class DeployManager extends CuratorAsyncManager {
private static final String DEPLOY_STATISTICS_KEY = "STATISTICS";
private static final String DEPLOY_RESULT_KEY = "RESULT_STATE";

private final ApiCache<String, SingularityRequestDeployState> deployCache;

@Inject
public DeployManager(
CuratorFramework curator,
@@ -85,7 +88,8 @@ public DeployManager(
IdTranscoder<SingularityDeployKey> deployKeyTranscoder,
Transcoder<SingularityUpdatePendingDeployRequest> updateRequestTranscoder,
ZkCache<SingularityDeploy> deploysCache,
SingularityLeaderCache leaderCache
SingularityLeaderCache leaderCache,
SingularityManagedScheduledExecutorServiceFactory executorServiceFactory
) {
super(curator, configuration, metricRegistry);
this.singularityEventListener = singularityEventListener;
@@ -99,6 +103,13 @@ public DeployManager(
this.updateRequestTranscoder = updateRequestTranscoder;
this.deploysCache = deploysCache;
this.leaderCache = leaderCache;
this.deployCache =
new ApiCache<>(
configuration.useApiCacheInDeployManager(),
configuration.getDeployCacheTtl(),
this::fetchAllDeployStates,
executorServiceFactory.get("deploy-api-cache-reloader")
Member

I think there is a version of this that returns a single-threaded executor, which is all we should need for this case.

Contributor Author
@rosalind210 Jul 28, 2021

We don't have an option for a single-threaded ScheduledExecutorService in SingularityManagedScheduledExecutorServiceFactory or SingularityManagedThreadPoolFactory, but Executors has newSingleThreadScheduledExecutor, which I can add.
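
For reference, a minimal sketch of what a named single-threaded scheduled executor built on Executors.newSingleThreadScheduledExecutor might look like. The class SingleThreadReloaderExecutors and both of its methods are hypothetical and are not part of Singularity's factories. The daemon flag is also an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;

// Hypothetical helper, not part of Singularity's executor factories.
class SingleThreadReloaderExecutors {

  // Builds a scheduled executor backed by exactly one named thread.
  static ScheduledExecutorService newReloader(String threadName) {
    ThreadFactory factory = runnable -> {
      Thread thread = new Thread(runnable, threadName);
      thread.setDaemon(true); // assumption: the reloader should not block JVM exit
      return thread;
    };
    return Executors.newSingleThreadScheduledExecutor(factory);
  }

  // Runs a trivial task on the executor and reports which thread ran it.
  static String reloaderThreadName(ScheduledExecutorService executor) {
    Callable<String> task = () -> Thread.currentThread().getName();
    try {
      return executor.submit(task).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A single thread is enough here because each reload replaces the whole map, so overlapping reloads would only repeat the same work.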

);
}

public List<SingularityDeployKey> getDeployIdsFor(String requestId) {
@@ -116,6 +127,11 @@ public List<SingularityDeployKey> getAllDeployIds() {
return getChildrenAsIdsForParents("getAllDeployIds", paths, deployKeyTranscoder);
}

public Map<String, SingularityRequestDeployState> fetchAllDeployStates() {
final List<String> requestIds = getChildren(BY_REQUEST_ROOT);
return fetchDeployStatesByRequestIds(requestIds);
}

@Timed
public Map<String, SingularityRequestDeployState> getRequestDeployStatesByRequestIds(
Collection<String> requestIds
@@ -124,6 +140,15 @@ public Map<String, SingularityRequestDeployState> getRequestDeployStatesByReques
return leaderCache.getRequestDeployStateByRequestId(requestIds);
}

Map<String, SingularityRequestDeployState> deployStatesByRequestIds;

if (deployCache.isEnabled()) {
deployStatesByRequestIds = deployCache.getAll(requestIds);
if (!deployStatesByRequestIds.isEmpty()) {
return deployStatesByRequestIds;
}
}

return fetchDeployStatesByRequestIds(requestIds);
}

@@ -13,6 +13,7 @@
import com.hubspot.singularity.SingularityCreateResult;
import com.hubspot.singularity.SingularityDeleteResult;
import com.hubspot.singularity.SingularityDeployKey;
import com.hubspot.singularity.SingularityManagedScheduledExecutorServiceFactory;
import com.hubspot.singularity.SingularityPendingRequest;
import com.hubspot.singularity.SingularityPendingRequest.PendingType;
import com.hubspot.singularity.SingularityRequest;
@@ -32,11 +33,13 @@
import com.hubspot.singularity.expiring.SingularityExpiringScale;
import com.hubspot.singularity.expiring.SingularityExpiringSkipHealthchecks;
import com.hubspot.singularity.scheduler.SingularityLeaderCache;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.utils.ZKPaths;
@@ -90,6 +93,7 @@ public class RequestManager extends CuratorAsyncManager {
);

private final Map<Class<? extends SingularityExpiringRequestActionParent<? extends SingularityExpiringRequestParent>>, Transcoder<? extends SingularityExpiringRequestActionParent<? extends SingularityExpiringRequestParent>>> expiringTranscoderMap;
private final ApiCache<String, SingularityRequestWithState> requestsCache;
Member

Should getRequests(List<String> requestIds) and the singular getRequest(String requestId) also use the cache? It doesn't look like they do at the moment.

Contributor Author

I haven't updated the other calls yet because I wasn't sure that's what we wanted, since we were most concerned about the endpoint that gets all requests.

Member

Since it all pulls from the same place, and we are constantly updating everything, I think it'd be worth updating those too. I believe the individual request endpoint was also pretty high up in usage. It can always be a follow-up PR if we want to check how effective this is first.

Contributor Author
@rosalind210 Jul 30, 2021

I added the cache to getRequests(List<String> requestIds) and the singular getRequest(String requestId). The singular request now has a non-cache flag for Orion usage.
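
The single-item lookup with a non-cache flag described above is not visible in this part of the diff. A minimal sketch of the general pattern, assuming a hypothetical CachedLookup class and a skipCache parameter (the real method signatures may differ), could be:

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of a cache-first lookup with an opt-out flag.
class CachedLookup<K, V> {
  // Stands in for the current cache snapshot.
  private final Map<K, V> snapshot;
  // Stands in for the authoritative read (for example, a ZooKeeper fetch).
  private final Function<K, Optional<V>> fetchFromSource;

  CachedLookup(Map<K, V> snapshot, Function<K, Optional<V>> fetchFromSource) {
    this.snapshot = snapshot;
    this.fetchFromSource = fetchFromSource;
  }

  Optional<V> get(K key, boolean skipCache) {
    if (!skipCache) {
      V cached = snapshot.get(key);
      if (cached != null) {
        return Optional.of(cached);
      }
    }
    // Either the caller asked for a fresh read or the cache had no entry,
    // so fall back to the authoritative source.
    return fetchFromSource.apply(key);
  }
}
```

Callers that cannot tolerate data up to one TTL old pass skipCache = true. Everyone else gets the cached value and only hits the source on a miss.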


@Inject
public RequestManager(
@@ -108,7 +112,8 @@ public RequestManager(
Transcoder<SingularityExpiringSkipHealthchecks> expiringSkipHealthchecksTranscoder,
SingularityWebCache webCache,
SingularityLeaderCache leaderCache,
Transcoder<CrashLoopInfo> crashLoopInfoTranscoder
Transcoder<CrashLoopInfo> crashLoopInfoTranscoder,
SingularityManagedScheduledExecutorServiceFactory scheduledExecutorServiceFactory
) {
super(curator, configuration, metricRegistry);
this.requestTranscoder = requestTranscoder;
@@ -133,6 +138,16 @@ public RequestManager(

this.leaderCache = leaderCache;
this.webCache = webCache;
this.requestsCache =
new ApiCache<>(
configuration.useApiCacheInRequestManager(),
configuration.getRequestCacheTtl(),
() ->
fetchRequests()
.stream()
.collect(Collectors.toMap(r -> r.getRequest().getId(), Function.identity())),
scheduledExecutorServiceFactory.get("request-api-cache-reloader")
);
}

private String getRequestPath(String requestId) {
@@ -632,11 +647,22 @@ public List<SingularityRequestWithState> getRequests(boolean useWebCache) {
if (useWebCache && webCache.useCachedRequests()) {
return webCache.getRequests();
}

if (requestsCache.isEnabled()) {
List<SingularityRequestWithState> requests = new ArrayList<>(
(requestsCache.getAll()).values()
);
if (!requests.isEmpty()) {
return requests;
}
}

List<SingularityRequestWithState> requests = fetchRequests();

if (useWebCache) {
webCache.cacheRequests(requests);
}

return requests;
}
