
Significantly improves startup performance by asynchronously building… #203

Merged (7 commits) on Oct 4, 2017

Conversation

garrettmoon (Collaborator):

… known state on startup.

appleguy (Contributor) commented Oct 2, 2017:

Excited for this PR :). Let me know when it's ready for a full review — I'll try to look in depth within a day or two.

appleguy (Contributor) left a comment:

This is a big advancement for PINCache! I'm very excited to integrate this in my team's app.

I think the most important feedback in my comments relates to the spinlock / sleep behavior. I thought about it for a bit and think the suggestion should work pretty well!

PINDiskCacheSerializerBlock _serializer;
PINDiskCacheDeserializerBlock _deserializer;

PINDiskCacheKeyEncoderBlock _keyEncoder;
PINDiskCacheKeyDecoderBlock _keyDecoder;
}

@property (assign, nonatomic) pthread_mutex_t mutex;
appleguy (Contributor):

Nice, this should be a solid improvement!

if (date && key)
[_dates setObject:date forKey:key];
// Do not continue to hold the lock while processing files.
[self lock];
appleguy (Contributor) commented Oct 3, 2017:

The comment doesn't seem to match the code - should this be re-locking or unlocking?

Suggestion that would avoid the many lock / unlock cycles:

  • Build up a separate NSMutableDictionary instance (easier if merged into a shared _cacheItemMetadata, but can be done with each _sizes, _dates and _knownKeys)
  • No locking within the loop
  • At the end of the loop, lock and merge into the new instances any keys and values that have been set on the _sizes / _dates / _knownKeys while loading from disk has been going on.
  • Overwrite the instance variables with the new collections.

This would be the only way to avoid contention in the common case of a busy cache (e.g. many images loading at startup) while loading a large disk cache (e.g. a 200MB+ cache of 10-100KB images).
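A rough sketch of the build-then-merge idea (locateFileURLs and keyForEncodedFileURL: are assumptions, and _knownKeys is elided for brevity):

// Hypothetical sketch, not the PR's actual code.
- (void)initializeDiskPropertiesByMerging
{
    NSMutableDictionary *localSizes = [[NSMutableDictionary alloc] init];
    NSMutableDictionary *localDates = [[NSMutableDictionary alloc] init];

    // No lock is held anywhere in this loop.
    for (NSURL *fileURL in [self locateFileURLs]) {
        NSString *key = [self keyForEncodedFileURL:fileURL];
        NSDictionary *attributes = [[NSFileManager defaultManager] attributesOfItemAtPath:fileURL.path
                                                                                    error:NULL];
        if (key && attributes[NSFileModificationDate])
            localDates[key] = attributes[NSFileModificationDate];
        if (key && attributes[NSFileSize])
            localSizes[key] = attributes[NSFileSize];
    }

    [self lock];
    // Anything set through the public API while we were reading wins over
    // the on-disk snapshot, so merge the current ivars in on top.
    [localSizes addEntriesFromDictionary:self->_sizes];
    [localDates addEntriesFromDictionary:self->_dates];
    self->_sizes = localSizes;
    self->_dates = localDates;
    [self unlock];
}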

garrettmoon (Collaborator, Author):

Ah, the comment is supposed to mean: do not hold the lock the entire time while processing files.

The real issue around this locking is not the in-memory state of the ivars but disk access. PINDiskCache manages access to the disk solely through the lock. Without it, we can't prevent another operation from overwriting our file while we're trying to get its attributes.

In the future, I really want to investigate using NSFileCoordinator instead of locking to manage disk access.
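For reference, a minimal sketch of what coordinated access could look like (the fileURL variable here is an assumption):

NSFileCoordinator *coordinator = [[NSFileCoordinator alloc] initWithFilePresenter:nil];
NSError *coordinationError = nil;
[coordinator coordinateReadingItemAtURL:fileURL
                                options:NSFileCoordinatorReadingWithoutChanges
                                  error:&coordinationError
                             byAccessor:^(NSURL *readURL) {
    // The accessor runs while the coordinator guarantees no coordinated
    // writer is mutating the file, without holding our own lock.
    NSDictionary *attributes = [[NSFileManager defaultManager] attributesOfItemAtPath:readURL.path
                                                                                error:NULL];
    NSLog(@"size: %@", attributes[NSFileSize]);
}];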

appleguy (Contributor):

Very interesting! I wonder if we should use a separate lock for disk access and for protecting the shared state? There are probably a bunch of methods that don't need to block on file access...

I think this explains why I've seen more lock contention in this layer than I would expect from just protecting the shared state. The good news is that, unlike some hierarchical objects like nodes where multiple locks create deadlocks extremely easily, this class might be ideally structured for a clean multi-lock strategy.

_byteCount = byteCount;

if (self->_byteLimit > 0 && self->_byteCount > self->_byteLimit)
[self trimToSizeByDateAsync:self->_byteLimit completion:nil];
appleguy (Contributor):

It looks like we may need a way to avoid scheduling a large number of trim operations while this is processing. It probably makes sense to have a _trimPending / scheduled variable, so that a series of setObject: calls during this time doesn't schedule a lot of them.

Since the operation queue is concurrent, it looks like there may be an issue with multiple trim operations starting in parallel too — this could be addressed by using a barrier block on the underlying queue, which GCD will ensure is run serially.
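A rough sketch of the pending-flag plus barrier idea (_trimPending, _asyncQueue, and trimDiskToSizeByDate: are assumptions standing in for the real ivars and trim method):

- (void)scheduleTrimToSizeByDateAsync:(NSUInteger)byteLimit
{
    [self lock];
    if (_trimPending) {
        // A trim is already scheduled; don't pile up more.
        [self unlock];
        return;
    }
    _trimPending = YES;
    [self unlock];

    // The barrier guarantees the trim runs alone on the concurrent queue.
    dispatch_barrier_async(_asyncQueue, ^{
        [self lock];
        self->_trimPending = NO;
        [self unlock];
        [self trimDiskToSizeByDate:byteLimit];
    });
}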

garrettmoon (Collaborator, Author) commented Oct 3, 2017:

@appleguy the trims are actually coalesced, thanks to @nguyenhuy's improvement in PINOperationQueue! We could also skip trimming until we have a known state.

@@ -974,7 +1005,7 @@ - (void)setObject:(id <NSCoding>)object forKey:(NSString *)key fileURL:(NSURL **
return;
}

- [self lock];
+ [self lockUntilWritable];
appleguy (Contributor):

Is it necessary to block until the setup is done? If so, won't the common path that triggers cache init (e.g. an image download starting from PINRemoteImage) still block displaying the image until after the cache is fully read in?

I may be missing something, like perhaps PINRemoteImage performs the setObject: asynchronously and we won't block.

garrettmoon (Collaborator, Author):

lockUntilWritable only waits until the cache directory is confirmed to exist or is created if it doesn't. lockUntilKnownState (I think that's what I called it) guarantees everything is created, iterated and populated.

if (self->_ttlCache) {
// We actually need to know the entire disk state if we're a TTL cache.
[self unlock];
[self lockUntilDiskStateKnown];
appleguy (Contributor):

Why is that? Can't we just return nil for any objectForKey: before initialization is done, including for a TTL cache?

I think I recall there being a default TTL of 30 days. We should consider turning this off by default if it incurs a meaningful performance penalty; blocking on cache init would be a pretty big cost if the developer hasn't specifically indicated they want TTL behavior.

garrettmoon (Collaborator, Author):

TTL is off by default. It was a feature added by a community member that enforces the TTL (i.e. the cache won't return an object, even if it has it, once the TTL is exceeded). The separate time limits are probably the default you're thinking of, but they only relate to trimming; they're not a guarantee that something won't be returned. To be frank, I'm a bit regretful we merged this feature; it's not really in the spirit of the framework and it adds a lot of complexity.

[self unlock];
usleep(100);

__unused int result = pthread_mutex_lock(&_mutex);
appleguy (Contributor):

Why not call [self lock] here and in the method below?

garrettmoon (Collaborator, Author):

Good point.

// spinlock if the disk isn't writable
while (_diskWritable == NO) {
[self unlock];
usleep(100);
appleguy (Contributor):

Using this technique could cause some other problems, such as causing a lot of threads to be spawned. We should be able to rely on the lock being released in order to let the threads proceed.

How about a separate read-write mutex, just for cache init? At initialization of the class, the writer lock would be acquired by the code doing the read from disk. Any code calling -lockUntilWritable would first acquire the read lock (which can be grabbed by many threads at the same time, as long as the write lock isn't held). Then it would acquire the main [self lock].

This should allow all of them to wait on the initialization, without busying the main cache lock for operations that don't depend on _diskWritable.

As an alternative, I think there is a pthread version of the condition lock or a semaphore which can be signalled.
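A rough sketch of the read-write-lock idea (the _initRWLock ivar is an assumption; in a real implementation the write lock would need to be taken in -init, before the async disk read is scheduled):

pthread_rwlock_t _initRWLock; // pthread_rwlock_init(&_initRWLock, NULL) at creation

// Runs once, off the main thread, holding the write lock for the whole read.
- (void)initializeDiskProperties
{
    pthread_rwlock_wrlock(&_initRWLock);
    // ... enumerate the cache directory and populate _sizes / _dates ...
    pthread_rwlock_unlock(&_initRWLock);
}

- (void)lockUntilWritable
{
    // Blocks only while the writer holds the lock; any number of readers
    // can pass through at once after initialization completes.
    pthread_rwlock_rdlock(&_initRWLock);
    pthread_rwlock_unlock(&_initRWLock);
    [self lock]; // then take the main cache lock as usual
}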

garrettmoon (Collaborator, Author):

Interesting, I'll play around with this.

garrettmoon (Collaborator, Author):

I think the pthread condition might be the best option. I'm guessing performance isn't any better for read/write locks, and they're far less indicative of what we're doing. I need to actually figure out how to use pthread conditions though; there aren't a lot of great docs…
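For reference, the usual condition pattern looks roughly like this (the _diskWritableCondition name is an assumption, mirroring the _diskStateKnownCondition that appears later in this PR):

- (void)lockUntilWritable
{
    [self lock];
    // Re-check the predicate in a loop: pthread_cond_wait can wake spuriously.
    while (_diskWritable == NO) {
        // Atomically releases _mutex while waiting and re-acquires it on wake.
        pthread_cond_wait(&_diskWritableCondition, &_mutex);
    }
    // _mutex is held here and the disk is known to be writable.
}

- (void)markDiskWritable
{
    [self lock];
    _diskWritable = YES;
    // Wake every thread currently waiting on the condition.
    pthread_cond_broadcast(&_diskWritableCondition);
    [self unlock];
}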

garrettmoon force-pushed the improveStartupPerformance branch from cdf4bd2 to c673afc on October 4, 2017 02:25
garrettmoon changed the base branch from improveCacheMissPerformance to master on October 4, 2017 20:49
garrettmoon force-pushed the improveStartupPerformance branch from 46704b2 to f2ac652 on October 4, 2017 20:53
Adlai-Holler left a comment:

This is super duper good. Async all day!

attributes:nil
error:&error];
PINDiskCacheError(error);
created = success;

Adlai-Holler:

Nit: Let's assign directly to created and remove success

if (_diskStateKnown == NO) {
pthread_cond_wait(&_diskStateKnownCondition, &_mutex);
}
}

Adlai-Holler:

Two alternate names might be lockForWriting and lockAndWaitForKnownDiskState or something. lockUntil isn't quite on the money.

garrettmoon (Collaborator, Author):

I like those better.

includingPropertiesForKeys:keys
options:NSDirectoryEnumerationSkipsHiddenFiles
error:&error];
[self unlock];

Adlai-Holler:

What's the motivation behind this lock? cacheURL is immutable and I think it's safe to use NSFileManager from multiple threads.

garrettmoon (Collaborator, Author):

You're right that it's not strictly necessary, and I think we should audit this behavior to further improve performance. However, in its current state, all access to the filesystem is done with the lock held, and I want to keep that behavior intact until we decide on a cohesive replacement strategy.

@@ -859,17 +895,22 @@ - (id)objectForKeyedSubscript:(NSString *)key
NSDate *now = [[NSDate alloc] init];

Adlai-Holler:

Unrelated to this diff, but I just noticed we can save some work by moving this date creation down.

@@ -830,7 +866,7 @@ - (void)enumerateObjectsWithBlockAsync:(PINDiskCacheFileURLEnumerationBlock)bloc
- (void)synchronouslyLockFileAccessWhileExecutingBlock:(PINCacheBlock)block
{
if (block) {
- [self lock];
+ [self lockUntilWritable];
block(self);
[self unlock];
}

Adlai-Holler:

Should we also modify containsObjectForKey:?

garrettmoon (Collaborator, Author):

Good catch!

- (void)dealloc
{
__unused int result = pthread_mutex_destroy(&_mutex);
NSCAssert(result == 0, @"Failed to destroy lock in PINMemoryCache %p. Code: %d", (void *)self, result);

Adlai-Holler:

Add pthread_cond_destroy here
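Something like this, right after the mutex teardown (the condition names are assumptions):

// Conditions need the same explicit teardown as the mutex.
pthread_cond_destroy(&_diskWritableCondition);
pthread_cond_destroy(&_diskStateKnownCondition);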

garrettmoon (Collaborator, Author):

I knew there had to be something like that (but couldn't find docs)

Adlai-Holler left a comment:

Bingo!
