Improve searchable snapshot mount time #66198
Merged
henningandersen merged 3 commits into elastic:master on Dec 14, 2020
Reduce the range sizes we fetch during mounting to shorten the time from mount until the shard is started. On resource-constrained setups (rate limiter, disk, or network), the time to mount multiple shards is proportional to the amount of data fetched, and for most files in a snapshot we need to fetch only a small piece of the file to start the shard.
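The proportionality described above can be made concrete with a back-of-the-envelope sketch. Every number here is an assumption chosen for illustration (a 40MB/s throttle, 200 files touched while starting shards, 32MB versus 128KB ranges); none of them are measurements from this PR, and the sketch assumes each file is at least one range long.

```java
// Rough estimate of how long a rate limiter makes shard start-up wait.
// All numbers used in main are illustrative assumptions, not measurements.
final class MountTimeEstimate {

    /** Seconds needed to fetch one range per file through a limiter capped at bytesPerSecond. */
    static double throttledSeconds(long files, long rangeBytes, long bytesPerSecond) {
        return (double) (files * rangeBytes) / bytesPerSecond;
    }

    public static void main(String[] args) {
        long kb = 1024L;
        long mb = 1024L * kb;
        long limit = 40 * mb; // assumed throttle rate
        long files = 200;     // assumed number of files touched while starting shards

        System.out.println("32MB ranges:  " + throttledSeconds(files, 32 * mb, limit) + " s");
        System.out.println("128KB ranges: " + throttledSeconds(files, 128 * kb, limit) + " s");
    }
}
```

Under these assumptions the fetch volume, and therefore the throttled wait, shrinks by the same factor as the range size.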
Collaborator
Pinging @elastic/es-distributed (Team:Distributed)
tlrx
approved these changes
Dec 14, 2020
Member
tlrx
left a comment
LGTM (sorry for the conflicting file)
DaveCTurner
approved these changes
Dec 14, 2020
Contributor
DaveCTurner
left a comment
LGTM2, I suggested a couple of comments.
    MAX_SNAPSHOT_CACHE_RANGE_SIZE, // max
    Setting.Property.NodeScope
);
public static final Setting<ByteSizeValue> SNAPSHOT_CACHE_RECOVERY_RANGE_SIZE_SETTING = Setting.byteSizeSetting(
Contributor
Suggest a comment so we remember why we are doing this:
Suggested change
- public static final Setting<ByteSizeValue> SNAPSHOT_CACHE_RECOVERY_RANGE_SIZE_SETTING = Setting.byteSizeSetting(
+ /**
+  * Starting up a shard involves reading small parts of some files from the repository, independently of the pre-warming process. If we
+  * expand those ranges using {@link CacheService#SNAPSHOT_CACHE_RANGE_SIZE_SETTING} then we end up reading quite a few 32MB ranges. If
+  * we read enough of these ranges for the restore throttling rate limiter to kick in then all the read threads will end up waiting on
+  * the throttle, blocking subsequent reads. By using a smaller read size during restore we avoid clogging up the rate limiter so much.
+  */
+ public static final Setting<ByteSizeValue> SNAPSHOT_CACHE_RECOVERY_RANGE_SIZE_SETTING = Setting.byteSizeSetting(
Also suggest a similar comment on the other setting since this came up as a question in the investigation that led to this PR.
/**
* If a search needs data from the repository then we expand it to a larger contiguous range whose size is determined by this setting,
* in anticipation of needing nearby data in subsequent reads. Repository reads typically have quite high latency (think ~100ms) and
* the default of 32MB for this setting represents the approximate point at which size starts to matter. In other words, reads of
* ranges smaller than 32MB don't usually happen much quicker, so we may as well expand all the way to 32MB ranges.
*/
public static final Setting<ByteSizeValue> SNAPSHOT_CACHE_RANGE_SIZE_SETTING = Setting.byteSizeSetting(
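As a minimal sketch of what "expanding a read to a larger contiguous range" could look like, the snippet below aligns a read position to a fixed-size range and caps it at the file length. The alignment rule, the class and method names, and the 128KB recovery size are assumptions for illustration; the PR text does not show the actual computation or the new setting's default.

```java
// Sketch of expanding a small read into an aligned, fixed-size range.
// The alignment rule and the 128KB figure are assumptions, not taken from the PR.
final class RangeSketch {

    /** Returns {start, end} (end exclusive) of the rangeSize-aligned range containing position, capped at fileLength. */
    static long[] alignedRange(long position, long rangeSize, long fileLength) {
        long start = (position / rangeSize) * rangeSize;
        long end = Math.min(start + rangeSize, fileLength);
        return new long[] { start, end };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLength = 100 * mb;
        long position = 70 * mb + 5; // a small read somewhere inside a large file

        long[] search = alignedRange(position, 32 * mb, fileLength);
        long[] recovery = alignedRange(position, 128 * 1024L, fileLength);
        System.out.println("search range bytes:   " + (search[1] - search[0]));
        System.out.println("recovery range bytes: " + (recovery[1] - recovery[0]));
    }
}
```

The same read position costs a full 32MB fetch with the search-time range size but only 128KB with the hypothetical smaller recovery size, which is the difference the two comments above describe.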
…snapshot_mount_time
henningandersen
added a commit
that referenced
this pull request
Dec 14, 2020
Contributor
Author
Thanks Tanguy and David!
jasontedor
added a commit
to jasontedor/elasticsearch
that referenced
this pull request
Dec 14, 2020
* elastic/master: (33 commits)
  Add searchable snapshot cache folder to NodeEnvironment (elastic#66297)
  [DOCS] Add dynamic runtime fields to docs (elastic#66194)
  Add HDFS searchable snapshot integration (elastic#66185)
  Support canceling cross-clusters search requests (elastic#66206)
  Mute testCacheSurviveRestart (elastic#66289)
  Fix cat tasks api params in spec and handler (elastic#66272)
  Snapshot of a searchable snapshot should be empty (elastic#66162)
  [ML] DFA _explain API should not fail when none field is included (elastic#66281)
  Add action to decommission legacy monitoring cluster alerts (elastic#64373)
  move rollup_index param out of RollupActionConfig (elastic#66139)
  Improve FieldFetcher retrieval of fields (elastic#66160)
  Remove unsed fields in `RestAnalyzeAction` (elastic#66215)
  Simplify searchable snapshot CacheKey (elastic#66263)
  Autoscaling remove feature flags (elastic#65973)
  Improve searchable snapshot mount time (elastic#66198)
  [ML] Report cause when datafeed extraction encounters error (elastic#66167)
  Remove suggest reference in some API specs (elastic#66180)
  Fix warning when installing a plugin for different ESversion (elastic#66146)
  [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)
  [DOCS] Add `require_alias` to Bulk API (elastic#66259)
  ...
henningandersen
added a commit
to henningandersen/elasticsearch
that referenced
this pull request
Dec 21, 2020
In elastic#66198 a setting was introduced to reduce the range size used for searchable snapshots during recovery. Unfortunately, it was not registered and is therefore not settable.
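To illustrate why a declared but unregistered setting cannot be set, here is a toy model of a settings registry that rejects any key it has not been told about. This is not the Elasticsearch API: the class, its methods, and the setting keys are hypothetical and only mirror the general idea that a setting must be registered before a node will accept it.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of "a setting must be registered before it can be set".
// NOT the Elasticsearch API; the class and the setting keys are hypothetical.
final class ToySettingsRegistry {
    private final Set<String> registeredKeys = new HashSet<>();

    /** Makes a setting key known to the registry. */
    void register(String key) {
        registeredKeys.add(key);
    }

    /** Throws if the configuration uses a key that was never registered. */
    void validate(Map<String, String> configured) {
        for (String key : configured.keySet()) {
            if (!registeredKeys.contains(key)) {
                throw new IllegalArgumentException("unknown setting [" + key + "]");
            }
        }
    }

    public static void main(String[] args) {
        ToySettingsRegistry registry = new ToySettingsRegistry();
        registry.register("example.cache.range_size"); // registered
        // "example.cache.recovery_range_size" is declared in code but never registered.
        try {
            registry.validate(Map.of("example.cache.recovery_range_size", "128kb"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model the fix is a single `register` call for the missing key, which is analogous in spirit to the follow-up commits referenced here.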
henningandersen
added a commit
that referenced
this pull request
Dec 29, 2020
In #66198 a setting was introduced to reduce the range size used for searchable snapshots during recovery. Unfortunately, it was not registered and is therefore not settable.
henningandersen
added a commit
that referenced
this pull request
Dec 29, 2020
In #66198 a setting was introduced to reduce the range size used for searchable snapshots during recovery. Unfortunately, it was not registered and is therefore not settable.
henningandersen
added a commit
that referenced
this pull request
Dec 29, 2020
In #66198 a setting was introduced to reduce the range size used for searchable snapshots during recovery. Unfortunately, it was not registered and is therefore not settable.
Reduce the range sizes we fetch during recovery to shorten the time from
mount until the shard is started.
On resource-constrained setups (rate limiter, disk, or network), the time
to mount multiple shards is proportional to the amount of data fetched,
and for most files in a snapshot we need to fetch only a small piece of
the file to start the shard.