-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Add cache directory low-level instrumentation #51637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache directory low-level instrumentation #51637
Conversation
|
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I like the simplicity of tracking these stats directly within the CacheDirectory. I left a few comments and thoughts.
...apshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/IndexInputStats.java
Outdated
Show resolved
Hide resolved
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
...apshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/IndexInputStats.java
Outdated
Show resolved
Hide resolved
|
Thanks @DaveCTurner, I've applied your feedback and pushed some changes. I'm sure you have more comments, can you please have another look? Thanks! |
| final String fileName = cacheFileReference.getFileName(); | ||
| final byte[] copyBuffer = new byte[Math.toIntExact(Math.min(COPY_BUFFER_SIZE, end - start))]; | ||
| try (IndexInput input = in.openInput(cacheFileReference.getFileName(), ioContext)) { | ||
| logger.trace(() -> new ParameterizedMessage("writing range [{}-{}] of file [{}] to cache file", start, end, fileName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think this will not include the shard ID so we might struggle to interpret the logs when there are lots of shards to search. I think logging cacheFileReference itself gives us everything we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I pushed 9de8453
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One request for a bit more logging detail, otherwise LGTM.
| private int readDirectly(long start, long end, byte[] buffer, int offset) throws IOException { | ||
| final String fileName = cacheFileReference.getFileName(); | ||
| final byte[] copyBuffer = new byte[Math.toIntExact(Math.min(COPY_BUFFER_SIZE, end - start))]; | ||
| logger.trace(() -> new ParameterizedMessage("direct reading of range [{}-{}] from file [{}]", start, end, fileName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, here too :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
|
@elasticmachine run elasticsearch-ci/2 A CCR test failed previously with what seems a connection issue |
|
Thanks David! |
This commit builds on elastic#51637, adding tracking of the total time spent fetching data from the blob store and a linear regression model for these fetches.
This commit adds a REST API that exposes the various CacheDirectory stats added in #51637. It adds the necessary action, transport action and request and response objects as well as a new qa:rest project for REST tests. The REST endpoint is _searchable_snapshots/stats and can be filtered by index names.
This pull request is a first step to add instrumentation to the
CacheDirectoryadded in #50693.It adds a new mutable
IndexInputStatsobject that allows to track various information about howCacheBufferedIndexInputinteracts with the underlying cache file to satisfy the read operations. It keep tracks of small/large forward/backward seekings as well as the total number of bytes read from theIndexInputand the total number of bytes read/written from/to theCacheFile.Note that the stats do not reflect the exact usage that Lucene does of the
IndexInputs opened through aCacheDirectory:IndexInputStatsis not aware of the read operations that are directly served at a higher level by the internalBufferedIndexInput's buffer. Instead it tracks what really hit the disk which is, I think, what is the most interesting for us at this stage.This pull request does not expose the information through a REST API, this will be done in a follow up pull request once the low level instrumentation is validated.