Skip to content

Conversation

@tlrx
Copy link
Member

@tlrx tlrx commented Jan 30, 2020

Today cache files are identified in cache using a string representing an absolute path to a file on disk. This path is a sub directory of the current shard data path and as such already contains identification bits like the current index id and the shard id. It also contains the snapshot id that is passed at CacheDirectory creation time.

While this has been done for quick prototyping and already been improved in #51520, it feels wrong to rely on a path converted to a string as cache keys. Instead we should have a distinct CacheKey object to identify CacheFile in cache.

Relates #50693 (comment)

@tlrx tlrx added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jan 30, 2020
@tlrx tlrx requested a review from DaveCTurner January 30, 2020 12:52
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left a few minor comments/questions.

private final String fileName;
private final long fileLength;

CacheKey(SnapshotId snapshotId, IndexId indexId, ShardId shardId, Path cacheDir, String fileName, long fileLength) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have an EqualsHashCodeTestUtils#checkEqualsAndHashCode-based test for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️ Of course!

private final ShardId shardId;
private final Path cacheDir;
private final String fileName;
private final long fileLength;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels strange to have the length as part of the key. I kinda see that it's here to make some plumbing a bit simpler, but I think I'd still rather move it back out again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I look at it again I agree it feels strange. I reverted this.

private final SnapshotId snapshotId;
private final IndexId indexId;
private final ShardId shardId;
private final Path cacheDir;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly cacheDir. In my mind the thing in the cache is the Lucene file, which is uniquely identified by snapshot/index/shard/filename, and cacheDir is a bit of an implementation detail that I feel should be hidden here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, thanks

@tlrx tlrx requested a review from DaveCTurner January 30, 2020 19:39
@tlrx
Copy link
Member Author

tlrx commented Jan 30, 2020

Thanks @DaveCTurner, I've applied your feedback.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; I left one additional optional request.

// directory has been closed.
cacheService.removeFromCache(key -> key.startsWith(cacheDir.toString()));
cacheService.removeFromCache(key ->
Objects.equals(key.getSnapshotId(), snapshotId) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this lambda to a method CacheKey#belongsTo(SnapshotId, IndexId, ShardId) (not sure about the name but that's the best I've got)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I kept the name and added a test in e8afbe7

@tlrx tlrx merged commit 101c419 into elastic:feature/searchable-snapshots Jan 31, 2020
@tlrx tlrx deleted the add-cache-key branch January 31, 2020 09:52
@tlrx
Copy link
Member Author

tlrx commented Jan 31, 2020

Thanks David!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants