Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to set an object level TTL #209

Merged
merged 39 commits into from
May 11, 2018

Conversation

mjlazar
Copy link
Contributor

@mjlazar mjlazar commented Dec 16, 2017

I think this partially addresses #31.

Copy link
Collaborator

@garrettmoon garrettmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really useful change. Can you please look over the comments and add some unit tests to PINCacheTests.m?


@param object An object to store in the cache.
@param key A key to associate with the object. This string will be copied.
@param ageLimit The age limit (in seconds) to associate with the object.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention 0 being no age limit here and in the other doc headers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -485,6 +491,11 @@ - (void)initializeDiskProperties
_metadata[key].size = fileSize;
byteCount += [fileSize unsignedIntegerValue];
}

NSTimeInterval ageLimit;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any guesses as to the performance cost of this?

Separately, I believe this method returns -1 on failure? That should be handled here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added code to handle the failure.

So, I've spent some time doing some analysis and trying out different performance tweaks (which is why I haven't responded to your comments yet). Since PINDiskCache.ttlCache is not known at disk initialization, we cannot skip over the code which reads the extended file system attributes from disk.

There is a performance hit that will be incurred on cache initialization and is a function of the number of files in the cache. I tested adding 1000 objects into the disk cache and then compared initialization times between the current code and the code within this pull request. On average it took ~24ms to initialize the cache with the current code. On average it took ~35ms to initialize after the code changes, which is ~11ms/44% slower.

The slowdown might not be worth it for users that do not take advantage of the object-level age limits, so I have a proposal for mitigation:

  • Make PINDiskCache.ttlCache (and PINDiskCache.ageLimit?) a readonly property (which will obviously break people who are currently using it).
  • Add a ttlCache parameter to PINDiskCache's designated initializer.
  • Skip over reading the extended file system attributes if ttlCache == NO.

This change will basically have zero effect on users who are not using ttlCache.


- (BOOL)_locked_setAgeLimit:(NSTimeInterval)ageLimit forURL:(NSURL *)fileURL
{
if (ageLimit <= 0.0 || !fileURL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ageLimit being <= 0 early returns, is there a way to clear the ageLimit on an object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Added code here to clear the ageLimit if 0 is passed in.

}

NSError *error = nil;
if (setxattr([fileURL fileSystemRepresentation], PINDiskCacheAgeLimitAttributeName, &ageLimit, sizeof(NSTimeInterval), 0, 0) != 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method thread safe? The fact that it can set the global errno variable leads me to believe it may not be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is thread safe. errno is actually a macro that expands to a function that returns the calling thread's errno value. See /usr/include/sys/errno.h.

@@ -845,8 +898,13 @@ - (BOOL)containsObjectForKey:(NSString *)key
{
[self lock];
if (_metadata[key] != nil || _diskStateKnown == NO) {
BOOL objectExpired = NO;
if (self->_ttlCache && _metadata[key].date != nil) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an || ? Or does the ageLimit option only apply if ttlCache is true? If that's the case, should we assert if someone attempts to set an ageLimit on a cache which isn't marked ttlCache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ageLimit only applies if ttlCache is true. I added an assertion as you suggested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assert got removed in a follow up commit?

[self unlock];
return ([self fileURLForKey:key updateFileModificationDate:NO] != nil);
return ([self fileURLForKey:key updateFileModificationDate:NO] != nil && !objectExpired);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flip these around for performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if (!self->_ttlCache || self->_ageLimit <= 0 || fabs([_metadata[key].date timeIntervalSinceDate:now]) < self->_ageLimit) {

NSTimeInterval ageLimit = _metadata[key].hasAgeLimit ? _metadata[key].ageLimit : self->_ageLimit;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obsolete. hasAgeLimit has been removed.

if (!self->_ttlCache || self->_ageLimit <= 0 || fabs([_metadata[key].date timeIntervalSinceDate:now]) < self->_ageLimit) {

NSTimeInterval ageLimit = _metadata[key].hasAgeLimit ? _metadata[key].ageLimit : self->_ageLimit;
if (!self->_ttlCache || ageLimit <= 0 || fabs([_metadata[key].date timeIntervalSinceDate:now]) < ageLimit) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this different behavior than above? I.e. it behaves like a ttlCache if there's an age limit or it's marked as a ttlCache here but not above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is basically saying:

  • If ttlCache == NO, return the object.
  • If ttlCache == YES and ageLimit is not set, return the file.
  • If ttlCache == YES, ageLimit is set and object is not expired return the object.

@@ -1025,7 +1094,9 @@ - (void)setObject:(id <NSCoding>)object forKey:(NSString *)key fileURL:(NSURL **
if (date) {
self->_metadata[key].date = date;
}

if (ageLimit > 0.0) {
[self asynchronouslySetAgeLimit:ageLimit forURL:fileURL];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the answer to this: will there be implications to this being asynchronous? Specifically that _metadata[key].ageLimit won't be set immediately and checks against it could happen before it is?

I'm guessing the answer is no, but maybe there could be a comment here suggesting it's been thought through and why it's safe?


- (void)setAgeLimit:(NSTimeInterval)ageLimit {
_ageLimit = ageLimit;
_hasAgeLimit = YES;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to just make a getter for hasAgeLimit which checks _ageLimit > 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts were if ageLimit were set to anything (including zero), it would override the cache's ageLimit. I realize this is actually confusing, so I removed hasAgeLimit; the object's ageLimit must now be greater than zero in order to override the cache's ageLimit.

@@ -455,6 +455,7 @@ - (BOOL)_locked_createCacheDirectory
- (void)initializeDiskProperties
{
NSUInteger byteCount = 0;
BOOL hasAtLeastOneAgeLimit = NO;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be initialized to the value of isTTLCache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of isTTLCache is unset at this point. See my other comment about adding this as a param to the designated initializer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the hasAtLeastOneAgeLimit var all together since the isTTLCache check is all we need.

@@ -553,15 +566,26 @@ - (void)asynchronouslySetAgeLimit:(NSTimeInterval)ageLimit forURL:(NSURL *)fileU

- (BOOL)_locked_setAgeLimit:(NSTimeInterval)ageLimit forURL:(NSURL *)fileURL
{
if (ageLimit <= 0.0 || !fileURL) {
if (!fileURL) {
return NO;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should throw an assert here? Any valid reason to call this with a nil URL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure what the implications of asserting here. -_locked_setFileModificationDate does a similar nil check/return early. WDTY?

}
}];

for (NSString *key in expiredObjectKeys) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just unlock before this loop since expiredObjectKeys is created in this method? Then you don't need to unlock and relock in each loop iteration? Hmm, I bet I can do this elsewhere…

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -673,7 +697,7 @@ - (void)trimDiskToDate:(NSDate *)trimDate

for (NSString *key in keysSortedByDate) { // oldest files first
NSDate *accessDate = _metadata[key].date;
if (!accessDate)
if (!accessDate || _metadata[key].ageLimit > 0.0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit confusing to me… Should trimDiskToDate do nothing to an object if an ageLimit is set? Perhaps this is correct… but trimToAgeLimitRecursively should call removeExpiredObjectsAsync? Or maybe trimToAgeLimitRecursively should be removed in favor of removeExpiredObjects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree it is sort of confusing. My thoughts here are that the trimDate that is passed in here is based on the cache-level age limit and we don't want to remove objects that have overridden the age limit.

I don't know what the correct thing to do here would be. -trimToAgeLimitRecursively calling -removeExpiredObjectsAsync would certainly be an option, but I wanted to avoid over-calling it since it has to iterate over all of the objects in the cache to determine if they are expired or not. Although, I don't think there is much harm in letting these expired objects lay dormant on the filesystem until the next initialization; the getters/iterators will treat them as if they don't exist.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense, I think if we address the issue I mention below regarding trimDiskToSizeByDate and trimToCostLimitByDate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -196,6 +200,27 @@ - (void)trimMemoryToDate:(NSDate *)trimDate
}
}

- (void)removeExpiredObjects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this comment is in the wrong place (you're using dates[key] below which gets updated on access in objectForKey), but, should objectForKey: not set the access date if an ageLimit is set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. The disk cache checks if object && !self->_ttlCache before setting the access date. This logic currently exists; I wonder if that's a bug. Shall I update the memory cache to do the same check?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering which is actually the bug…

trimDiskToSizeByDate and trimToCostLimitByDate both use last access to trim to order which objects they trim first. If ttlCache is enabled or there are age limits, objects which exceed the age limit should be trimmed first and then objects should be trimmed by last access. Does that seem right? In which case the two methods will need to be updated and the diskCache will need to remove its check of object && !self->_ttlCache

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the updates you suggested. PTAL.

@garrettmoon
Copy link
Collaborator

Thanks for writing the tests!

- (void)setObjectAsync:(id <NSCoding>)object forKey:(NSString *)key withCost:(NSUInteger)cost completion:(PINCacheObjectBlock)block
{
[self setObjectAsync:object forKey:key withCost:cost ageLimit:0.0 completion:block];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this lands, I wonder if we could reduce the complexity of everything by changing this line to:
[self setObjectAsync:object forKey:key withCost:cost ageLimit:self.isTTLCache ? self.ageLimit : 0.0 completion:block]

Then we could get rid of any other logic that deals with isTTLCache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that could certainly be done, but it would have to be in conjunction with some changes to the trim-to-date methods. Since this would essentially set object-level age limits all of the time, we could combine the current trimming logic with the new -removeExpiredObjects logic I'm proposing in this PR.

@mjlazar
Copy link
Contributor Author

mjlazar commented Feb 24, 2018

Warnings should be fixed now.

@mjlazar
Copy link
Contributor Author

mjlazar commented Feb 25, 2018

@garrettmoon In one of my previous comments, I proposed passing the ttlCache switch to the initializer(s) and making the property readonly. I just pushed a commit that implements that. PTAL; this would be a breaking change to folks that are utilizing ttlCache.

@ghost
Copy link

ghost commented Feb 25, 2018

🚫 CI failed with log

@mjlazar
Copy link
Contributor Author

mjlazar commented Feb 26, 2018

@garrettmoon is there any chance the latest CI failure is due to a flaky test? I can't reproduce locally.

@garrettmoon
Copy link
Collaborator

@mjlazar I wasn't able to repro either so I'll re-run. I've never seen that one fail due to flakiness though…

@@ -280,6 +292,29 @@ PIN_SUBCLASSING_RESTRICTED
*/
- (instancetype)initWithName:(nonnull NSString *)name rootPath:(nonnull NSString *)rootPath serializer:(nullable PINDiskCacheSerializerBlock)serializer deserializer:(nullable PINDiskCacheDeserializerBlock)deserializer operationQueue:(nonnull PINOperationQueue *)operationQueue __attribute__((deprecated));

/**
The designated initializer allowing you to override default NSKeyedArchiver/NSKeyedUnarchiver serialization.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is no longer the designated initializer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed.

@@ -455,6 +455,7 @@ - (BOOL)_locked_createCacheDirectory
- (void)initializeDiskProperties
{
NSUInteger byteCount = 0;
BOOL hasAtLeastOneAgeLimit = NO;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer true?

}];

[self unlock];
for (NSString *key in expiredObjectKeys) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this for-loop be moved outside the lock? Or the [self lock] below should be changed to [self lockForWriting]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Moved this outside the lock.

@@ -196,6 +200,27 @@ - (void)trimMemoryToDate:(NSDate *)trimDate
}
}

- (void)removeExpiredObjects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering which is actually the bug…

trimDiskToSizeByDate and trimToCostLimitByDate both use last access to trim to order which objects they trim first. If ttlCache is enabled or there are age limits, objects which exceed the age limit should be trimmed first and then objects should be trimmed by last access. Does that seem right? In which case the two methods will need to be updated and the diskCache will need to remove its check of object && !self->_ttlCache

@@ -845,8 +898,13 @@ - (BOOL)containsObjectForKey:(NSString *)key
{
[self lock];
if (_metadata[key] != nil || _diskStateKnown == NO) {
BOOL objectExpired = NO;
if (self->_ttlCache && _metadata[key].date != nil) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assert got removed in a follow up commit?

NSDate *now = [NSDate date];
for (NSString *key in ageLimits) {
NSDate *accessDate = dates[key];
NSTimeInterval ageLimit = [_ageLimits[key] doubleValue] ?: self->_ageLimit;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessing these ivars is unsafe, gather self->_ageLimit as globalAgeLimit in the lock above and use ageLimits here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -673,7 +697,7 @@ - (void)trimDiskToDate:(NSDate *)trimDate

for (NSString *key in keysSortedByDate) { // oldest files first
NSDate *accessDate = _metadata[key].date;
if (!accessDate)
if (!accessDate || _metadata[key].ageLimit > 0.0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense, I think if we address the issue I mention below regarding trimDiskToSizeByDate and trimToCostLimitByDate

mjlazar and others added 3 commits March 17, 2018 12:13
Specifically, this commit does the following:
- Remove designated initializer language from non-designated initializers in header documentation
- Remove `hasAtLeastOneAgeLimit` logic from `-initializeDiskProperties`
- Prune expired objects first in `-[PINDiskCache trimDiskToDate:]` and `-[PINMemoryCache trimToCostLimitByDate:]`.
- Set file modification date regardless if TTL cache or not.
- Fix unsafe ivar access in PINMemoryCache
- In tests that need `ttlCache` set to YES, do so before calling `-removeAllObjects` to avoid race condition.
NSMutableArray<NSString *> *expiredObjectKeys = [NSMutableArray array];
[_metadata enumerateKeysAndObjectsUsingBlock:^(NSString * _Nonnull key, PINDiskCacheMetadata * _Nonnull obj, BOOL * _Nonnull stop) {
NSTimeInterval ageLimit = obj.ageLimit > 0.0 ? obj.ageLimit : self->_ageLimit;
NSDate *expirationDate = [obj.date dateByAddingTimeInterval:ageLimit];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I think there's a problem: now we're always setting date. It should probably be renamed lastModifiedDate and a new property createdDate should be added? Because objects shouldn't have their expiration lengthened when they're accessed, but they should live longer, in relation to other objects, than other objects which haven't expired but were accessed later? This behavior will have to be added / updated in PINMemoryCache as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I modified the code using your suggestions. PTAL when you get a chance.

mjlazar and others added 3 commits April 16, 2018 10:06
Specifically, this commit does the following:
- In disk cache, rename 'date' -> 'lastModifiedDate'.
- In in-mem cache, rename 'dates' -> 'accessDates'.
- Add new 'created date' in both in-mem and disk caches.
- Use 'created date' to determine object expiration in both caches.
- Use 'accessDates' when trimming in-mem cache to cost limit by date.
- Use 'lastModifiedDate' when trimming disk cache to size.
@garrettmoon garrettmoon mentioned this pull request Apr 26, 2018
@garrettmoon
Copy link
Collaborator

This is looking great! One final request: would you be willing to update the tests to exercise our new creation / modification logic?

@mjlazar
Copy link
Contributor Author

mjlazar commented May 8, 2018

@garrettmoon, I added some additional tests that exercise the new logic. Let me know what you think. Thanks!

@ghost
Copy link

ghost commented May 8, 2018

🚫 CI failed with log

@garrettmoon
Copy link
Collaborator

@mjlazar looks good to me! Looks like there's an assertion failure in one of the tests though?

@mjlazar
Copy link
Contributor Author

mjlazar commented May 9, 2018

Seems to be a race condition. I'll push a fix shortly.

@mjlazar
Copy link
Contributor Author

mjlazar commented May 9, 2018

@garrettmoon, would you mind trying the tests again? Thanks.

@ghost
Copy link

ghost commented May 9, 2018

1 Warning
⚠️ This is a big PR, please consider splitting it up to ease code review.

Generated by 🚫 Danger

Copy link
Collaborator

@garrettmoon garrettmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this! Thanks so much for the contribution and for addressing all my feedback!

@garrettmoon garrettmoon merged commit d886490 into pinterest:master May 11, 2018
@mjlazar
Copy link
Contributor Author

mjlazar commented May 14, 2018

Awesome! Thanks for reviewing.

@mjlazar mjlazar deleted the object-level-ttl-override branch May 14, 2018 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants