Commit 0b0dbad

Keep cumulative elapsed scroll time in microseconds
Today we internally accumulate elapsed scroll time in nanoseconds. The problem is that this sum can reasonably overflow. For example, on a system with scrolls that are open for ten minutes on average, the largest value that can be represented by a long will be exceeded after about sixteen million scrolls.

To address this, we switch to internally accumulating elapsed scroll time in microseconds. With the same number of scrolls, this accommodates scrolls that are open for seven days on average. With the same average elapsed time, it accommodates sixteen billion scrolls, which will never happen (at one scroll per second, it would take more than five hundred years to execute sixteen billion scrolls).

Relates #27068
1 parent ae2a11f commit 0b0dbad
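
The headroom figures in the commit message can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the commit; the class and method names are hypothetical and chosen only for illustration. It divides `Long.MAX_VALUE` by the per-scroll elapsed time expressed in each unit, assuming every scroll is open for exactly the average duration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the commit: estimates how many scrolls of a
// given average duration can be accumulated before a long-based sum overflows.
public class ScrollOverflowHeadroom {

    // Largest number of scrolls whose summed elapsed time, in the given unit, still fits in a long.
    static long maxScrolls(long averageSeconds, TimeUnit unit) {
        long perScroll = unit.convert(averageSeconds, TimeUnit.SECONDS);
        return Long.MAX_VALUE / perScroll;
    }

    static long maxScrollsInNanos(long averageSeconds) {
        return maxScrolls(averageSeconds, TimeUnit.NANOSECONDS);
    }

    static long maxScrollsInMicros(long averageSeconds) {
        return maxScrolls(averageSeconds, TimeUnit.MICROSECONDS);
    }

    public static void main(String[] args) {
        long tenMinutes = TimeUnit.MINUTES.toSeconds(10);
        // About 15.4 million scrolls when accumulating nanoseconds
        System.out.println("nanoseconds:  " + maxScrollsInNanos(tenMinutes));
        // About 15.4 billion scrolls when accumulating microseconds
        System.out.println("microseconds: " + maxScrollsInMicros(tenMinutes));
    }
}
```

With ten-minute scrolls, the nanosecond sum runs out of room slightly before sixteen million scrolls, and the microsecond sum holds one thousand times as many.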

1 file changed, +9 -2 lines changed

core/src/main/java/org/elasticsearch/index/search/stats/ShardSearchStats.java

Lines changed: 9 additions & 2 deletions
@@ -180,12 +180,19 @@ public void onNewScrollContext(SearchContext context) {
     public void onFreeScrollContext(SearchContext context) {
         totalStats.scrollCurrent.dec();
         assert totalStats.scrollCurrent.count() >= 0;
-        totalStats.scrollMetric.inc(System.nanoTime() - context.getOriginNanoTime());
+        totalStats.scrollMetric.inc(TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - context.getOriginNanoTime()));
     }
 
     static final class StatsHolder {
         public final MeanMetric queryMetric = new MeanMetric();
         public final MeanMetric fetchMetric = new MeanMetric();
+        /* We store scroll statistics in microseconds because with nanoseconds we run the risk of overflowing the total stats if there are
+         * many scrolls. For example, on a system with 2^24 scrolls that have been executed, each executing for 2^10 seconds, then using
+         * nanoseconds would require a numeric representation that can represent at least 2^24 * 2^10 * 10^9 > 2^24 * 2^10 * 2^29 = 2^63
+         * which exceeds the largest value that can be represented by a long. By using microseconds, we enable capturing one-thousand
+         * times as many scrolls (i.e., billions of scrolls which at one per second would take 32 years to occur), or scrolls that execute
+         * for one-thousand times as long (i.e., scrolls that execute for almost twelve days on average).
+         */
         public final MeanMetric scrollMetric = new MeanMetric();
         public final MeanMetric suggestMetric = new MeanMetric();
         public final CounterMetric queryCurrent = new CounterMetric();
@@ -197,7 +204,7 @@ public SearchStats.Stats stats() {
             return new SearchStats.Stats(
                     queryMetric.count(), TimeUnit.NANOSECONDS.toMillis(queryMetric.sum()), queryCurrent.count(),
                     fetchMetric.count(), TimeUnit.NANOSECONDS.toMillis(fetchMetric.sum()), fetchCurrent.count(),
-                    scrollMetric.count(), TimeUnit.NANOSECONDS.toMillis(scrollMetric.sum()), scrollCurrent.count(),
+                    scrollMetric.count(), TimeUnit.MICROSECONDS.toMillis(scrollMetric.sum()), scrollCurrent.count(),
                     suggestMetric.count(), TimeUnit.NANOSECONDS.toMillis(suggestMetric.sum()), suggestCurrent.count()
             );
         }
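
The scenario in the code comment (2^24 scrolls, each open for 2^10 seconds) can be reproduced in isolation. The sketch below is illustrative only: the class `ScrollTimeUnits` and its methods are hypothetical and do not exist in Elasticsearch, and `Math.multiplyExact` stands in for the running sum so that an overflow surfaces as an `ArithmeticException` instead of silently wrapping. It also mirrors the reporting step in `stats()`, where the microsecond total is converted to milliseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Elasticsearch code: models the total elapsed scroll
// time for the scenario described in the StatsHolder comment.
public class ScrollTimeUnits {
    static final long SCROLLS = 1L << 24;            // 2^24 scrolls
    static final long SECONDS_PER_SCROLL = 1L << 10; // each open for 2^10 seconds

    // Total elapsed time in the given unit; throws ArithmeticException on long overflow.
    static long totalIn(TimeUnit unit) {
        long perScroll = unit.convert(SECONDS_PER_SCROLL, TimeUnit.SECONDS);
        return Math.multiplyExact(SCROLLS, perScroll);
    }

    static boolean overflowsIn(TimeUnit unit) {
        try {
            totalIn(unit);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    static boolean overflowsInNanos() {
        return overflowsIn(TimeUnit.NANOSECONDS);
    }

    static boolean overflowsInMicros() {
        return overflowsIn(TimeUnit.MICROSECONDS);
    }

    // Mirrors the reporting conversion in the diff: microsecond total to milliseconds.
    static long reportedMillis() {
        return TimeUnit.MICROSECONDS.toMillis(totalIn(TimeUnit.MICROSECONDS));
    }

    public static void main(String[] args) {
        System.out.println("overflows in nanoseconds:  " + overflowsInNanos());
        System.out.println("overflows in microseconds: " + overflowsInMicros());
        System.out.println("reported millis:           " + reportedMillis());
    }
}
```

One trade-off worth noting: `TimeUnit.NANOSECONDS.toMicros` truncates, so each scroll contributes up to just under one microsecond less than its measured time. Because the total is only ever reported in milliseconds, that loss is unlikely to be visible in practice.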
