Commit 01fdbfe

berglhnik9000 authored and committed
Docs disambiguate reindex's requests_per_second (#26185)
Reindex's docs were somewhere between unclear and inaccurate around `requests_per_second`. This makes them much more clear and accurate.
1 parent be59387 commit 01fdbfe

1 file changed: +15 -8 lines changed

docs/reference/docs/reindex.asciidoc

Lines changed: 15 additions & 8 deletions
@@ -534,14 +534,21 @@ shards to become available. Both work exactly how they work in the
 <<docs-bulk,Bulk API>>.

 `requests_per_second` can be set to any positive decimal number (`1.4`, `6`,
-`1000`, etc) and throttles the number of requests per second that the reindex
-issues or it can be set to `-1` to disabled throttling. The throttling is done
-waiting between bulk batches so that it can manipulate the scroll timeout. The
-wait time is the difference between the time it took the batch to complete and
-the time `requests_per_second * requests_in_the_batch`. Since the batch isn't
-broken into multiple bulk requests large batch sizes will cause Elasticsearch
-to create many requests and then wait for a while before starting the next set.
-This is "bursty" instead of "smooth". The default is `-1`.
+`1000`, etc) and throttles the number of batches that the reindex issues by
+padding each batch with a wait time. The throttling can be disabled by
+setting `requests_per_second` to `-1`.
+
+The throttling is done waiting between bulk batches so that it can manipulate the
+scroll timeout. The wait time is the difference between the request scroll search
+size divided by the `requests_per_second` and the `batch_write_time`. By default
+the scroll batch size is `1000`, so if the `requests_per_second` is set to `500`:
+
+`target_total_time` = `1000` / `500 per second` = `2 seconds` +
+`wait_time` = `target_total_time` - `batch_write_time` = `2 seconds` - `.5 seconds` = `1.5 seconds`
+
+Since the batch isn't broken into multiple bulk requests large batch sizes will
+cause Elasticsearch to create many requests and then wait for a while before
+starting the next set. This is "bursty" instead of "smooth". The default is `-1`.

 [float]
 [[docs-reindex-response-body]]
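
The wait-time arithmetic described in the added documentation can be checked with a short sketch. This is an illustration only, not Elasticsearch's implementation: the function name `throttle_wait_seconds` and its parameters are hypothetical, and clamping the wait to zero when a batch takes longer than its target time is an assumption that the documentation text does not state.

```python
def throttle_wait_seconds(batch_size, requests_per_second, batch_write_seconds):
    """Return the padding wait after one batch, per the documented formula.

    Hypothetical helper for illustration; names are not part of any
    Elasticsearch API.
    """
    # -1 disables throttling, as the documentation states.
    if requests_per_second == -1:
        return 0.0
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive or -1")

    # target_total_time = batch size / requests_per_second
    target_total_seconds = batch_size / requests_per_second

    # wait_time = target_total_time - batch_write_time
    # Assumption: a batch slower than its target gets no wait (never negative).
    return max(0.0, target_total_seconds - batch_write_seconds)


# The documented example: default scroll batch size 1000, 500 requests per
# second, and a batch that took 0.5 seconds to write.
print(throttle_wait_seconds(1000, 500, 0.5))  # 1.5
```

With the default batch size of `1000`, a larger `requests_per_second` shortens the target time per batch, so the wait shrinks toward zero once the batch write time alone exceeds the target.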

0 commit comments