Trigger refresh when shard becomes search active #96321


@martijnvg martijnvg commented May 24, 2023

This change invokes Engine#maybeRefresh() in IndexShard#ensureShardSearchActive(...) (previously named waitShardSearchActive(...)) when a search-idle shard becomes search-active.

Prior to this change, shard-level search execution on a search-idle shard had to wait until the scheduled refresh had been executed. That wait includes the time it takes for the refresh to be scheduled (which is a full second), so it unnecessarily increases the query time of a search request.
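
To make the idea concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the code from this PR: the class `SearchIdleShardSketch`, its fields, the explicit `nowMillis` parameter, and the 30-second threshold used in `main` are all hypothetical, and `maybeRefresh()` only counts invocations as a stand-in for Engine#maybeRefresh().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified stand-in for a shard; hypothetical, not the actual IndexShard. */
class SearchIdleShardSketch {
    // Hypothetical threshold after which a shard counts as search-idle.
    private final long searchIdleAfterMillis;
    // Time of the most recent search access, in milliseconds.
    private final AtomicLong lastSearchAccessMillis;
    // Number of times the stand-in refresh has been invoked.
    private final AtomicInteger refreshCount = new AtomicInteger();

    SearchIdleShardSketch(long searchIdleAfterMillis, long nowMillis) {
        this.searchIdleAfterMillis = searchIdleAfterMillis;
        this.lastSearchAccessMillis = new AtomicLong(nowMillis);
    }

    /** True if no search has touched the shard for at least the idle threshold. */
    boolean isSearchIdle(long nowMillis) {
        return nowMillis - lastSearchAccessMillis.get() >= searchIdleAfterMillis;
    }

    /** Stand-in for Engine#maybeRefresh(); here it only counts invocations. */
    void maybeRefresh() {
        refreshCount.incrementAndGet();
    }

    /**
     * Marks the shard as search-active. If the shard was idle, a refresh is
     * triggered right away instead of waiting for the next scheduled refresh.
     * Returns true if a refresh was triggered by this call.
     */
    boolean ensureShardSearchActive(long nowMillis) {
        boolean wasIdle = isSearchIdle(nowMillis);
        lastSearchAccessMillis.set(nowMillis);
        if (wasIdle) {
            maybeRefresh();
        }
        return wasIdle;
    }

    int refreshCount() {
        return refreshCount.get();
    }

    public static void main(String[] args) {
        // Assumed threshold of 30 seconds, chosen only for this illustration.
        SearchIdleShardSketch shard = new SearchIdleShardSketch(30_000L, 0L);
        System.out.println(shard.ensureShardSearchActive(10_000L)); // still active
        System.out.println(shard.ensureShardSearchActive(50_000L)); // idle for 40s
        System.out.println(shard.refreshCount());
    }
}
```

In this sketch only the first search that arrives after an idle period triggers the refresh; later searches inside the active window skip it.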

Closes #95544

@martijnvg martijnvg added the :StorageEngine/TSDB You know, for Metrics label May 24, 2023
@martijnvg martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch from a22b725 to b82b9f5 Compare May 25, 2023 09:29
@martijnvg martijnvg marked this pull request as ready for review May 25, 2023 13:39
@martijnvg martijnvg added the WIP label May 25, 2023
@martijnvg
Member Author

(accidentally pushed 'ready for review' on this PR... too many open browser tabs)

@martijnvg martijnvg added the :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. label May 26, 2023
@martijnvg martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch 4 times, most recently from 0b75a31 to b24857e Compare May 26, 2023 19:54
@martijnvg martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch from b24857e to 266aa36 Compare May 27, 2023 13:08
@martijnvg
Member Author

Did a quick benchmark run to see what impact this change has, using the k8s query benchmark (it tests the performance of queries on search-idle shards). Depending on the query, a 15% to 75% reduction in query time was observed:

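|                                                        Metric |                                          Task |     Baseline     |    Contender     |      Diff     |   Unit |   Diff % |
|--------------------------------------------------------------:|----------------------------------------------:|-----------------:|-----------------:|--------------:|-------:|---------:|
|                                                Min Throughput |                  cpu_usage_per_pod_15_minutes |      0.251864    |      0.251819    |      -4e-05   |  ops/s |   -0.02% |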
|                                               Mean Throughput |                  cpu_usage_per_pod_15_minutes |      0.252179    |      0.252127    |      -5e-05   |  ops/s |   -0.02% |
|                                             Median Throughput |                  cpu_usage_per_pod_15_minutes |      0.25216     |      0.252107    |      -5e-05   |  ops/s |   -0.02% |
|                                                Max Throughput |                  cpu_usage_per_pod_15_minutes |      0.252566    |      0.252504    |      -6e-05   |  ops/s |   -0.02% |
|                                       50th percentile latency |                  cpu_usage_per_pod_15_minutes |   2583.08        |   2184.99        |    -398.082   |     ms |  -15.41% |
|                                       90th percentile latency |                  cpu_usage_per_pod_15_minutes |   2964.72        |   2262.39        |    -702.333   |     ms |  -23.69% |
|                                      100th percentile latency |                  cpu_usage_per_pod_15_minutes |   3004.16        |   2361.76        |    -642.396   |     ms |  -21.38% |
|                                  50th percentile service time |                  cpu_usage_per_pod_15_minutes |   2579.88        |   2181.38        |    -398.499   |     ms |  -15.45% |
|                                  90th percentile service time |                  cpu_usage_per_pod_15_minutes |   2962.34        |   2259.26        |    -703.084   |     ms |  -23.73% |
|                                 100th percentile service time |                  cpu_usage_per_pod_15_minutes |   3000.96        |   2358.62        |    -642.336   |     ms |  -21.40% |
|                                                    error rate |                  cpu_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                     cpu_usage_per_pod_2_hours |      0.25148     |      0.251291    |      -0.00019 |  ops/s |   -0.08% |
|                                               Mean Throughput |                     cpu_usage_per_pod_2_hours |      0.251729    |      0.251508    |      -0.00022 |  ops/s |   -0.09% |
|                                             Median Throughput |                     cpu_usage_per_pod_2_hours |      0.251713    |      0.251494    |      -0.00022 |  ops/s |   -0.09% |
|                                                Max Throughput |                     cpu_usage_per_pod_2_hours |      0.252035    |      0.251773    |      -0.00026 |  ops/s |   -0.10% |
|                                       50th percentile latency |                     cpu_usage_per_pod_2_hours |   2761.73        |   2367.94        |    -393.791   |     ms |  -14.26% |
|                                       90th percentile latency |                     cpu_usage_per_pod_2_hours |   3198.37        |   2419.47        |    -778.897   |     ms |  -24.35% |
|                                      100th percentile latency |                     cpu_usage_per_pod_2_hours |   3374           |   2437.34        |    -936.666   |     ms |  -27.76% |
|                                  50th percentile service time |                     cpu_usage_per_pod_2_hours |   2759.26        |   2365.13        |    -394.134   |     ms |  -14.28% |
|                                  90th percentile service time |                     cpu_usage_per_pod_2_hours |   3195.96        |   2416.62        |    -779.344   |     ms |  -24.39% |
|                                 100th percentile service time |                     cpu_usage_per_pod_2_hours |   3372.06        |   2433.49        |    -938.57    |     ms |  -27.83% |
|                                                    error rate |                     cpu_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                    cpu_usage_per_pod_24_hours |      0.22164     |      0.234675    |       0.01304 |  ops/s |   +5.88% |
|                                               Mean Throughput |                    cpu_usage_per_pod_24_hours |      0.222373    |      0.235279    |       0.01291 |  ops/s |   +5.80% |
|                                             Median Throughput |                    cpu_usage_per_pod_24_hours |      0.222413    |      0.235353    |       0.01294 |  ops/s |   +5.82% |
|                                                Max Throughput |                    cpu_usage_per_pod_24_hours |      0.222829    |      0.235666    |       0.01284 |  ops/s |   +5.76% |
|                                       50th percentile latency |                    cpu_usage_per_pod_24_hours |  33997.8         |  18731.7         |  -15266.1     |     ms |  -44.90% |
|                                       90th percentile latency |                    cpu_usage_per_pod_24_hours |  38302.6         |  20389.4         |  -17913.1     |     ms |  -46.77% |
|                                      100th percentile latency |                    cpu_usage_per_pod_24_hours |  40222.6         |  20725.8         |  -19496.8     |     ms |  -48.47% |
|                                  50th percentile service time |                    cpu_usage_per_pod_24_hours |   4382.54        |   4202.18        |    -180.364   |     ms |   -4.12% |
|                                  90th percentile service time |                    cpu_usage_per_pod_24_hours |   5327.91        |   4288.09        |   -1039.82    |     ms |  -19.52% |
|                                 100th percentile service time |                    cpu_usage_per_pod_24_hours |   5554.2         |   4362.8         |   -1191.4     |     ms |  -21.45% |
|                                                    error rate |                    cpu_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               memory_usage_per_pod_15_minutes |      0.251437    |      0.251316    |      -0.00012 |  ops/s |   -0.05% |
|                                               Mean Throughput |               memory_usage_per_pod_15_minutes |      0.25168     |      0.251538    |      -0.00014 |  ops/s |   -0.06% |
|                                             Median Throughput |               memory_usage_per_pod_15_minutes |      0.251664    |      0.251524    |      -0.00014 |  ops/s |   -0.06% |
|                                                Max Throughput |               memory_usage_per_pod_15_minutes |      0.251977    |      0.25181     |      -0.00017 |  ops/s |   -0.07% |
|                                       50th percentile latency |               memory_usage_per_pod_15_minutes |   2663.03        |   2080.63        |    -582.403   |     ms |  -21.87% |
|                                       90th percentile latency |               memory_usage_per_pod_15_minutes |   3049.68        |   2258.18        |    -791.502   |     ms |  -25.95% |
|                                      100th percentile latency |               memory_usage_per_pod_15_minutes |   3099.55        |   2264.2         |    -835.359   |     ms |  -26.95% |
|                                  50th percentile service time |               memory_usage_per_pod_15_minutes |   2660.11        |   2077.28        |    -582.833   |     ms |  -21.91% |
|                                  90th percentile service time |               memory_usage_per_pod_15_minutes |   3046.56        |   2255.07        |    -791.498   |     ms |  -25.98% |
|                                 100th percentile service time |               memory_usage_per_pod_15_minutes |   3096.97        |   2260.76        |    -836.204   |     ms |  -27.00% |
|                                                    error rate |               memory_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                  memory_usage_per_pod_2_hours |      0.250517    |      0.251418    |       0.0009  |  ops/s |   +0.36% |
|                                               Mean Throughput |                  memory_usage_per_pod_2_hours |      0.250604    |      0.251657    |       0.00105 |  ops/s |   +0.42% |
|                                             Median Throughput |                  memory_usage_per_pod_2_hours |      0.250598    |      0.251642    |       0.00104 |  ops/s |   +0.42% |
|                                                Max Throughput |                  memory_usage_per_pod_2_hours |      0.250711    |      0.251951    |       0.00124 |  ops/s |   +0.49% |
|                                       50th percentile latency |                  memory_usage_per_pod_2_hours |   2821.01        |   2302.84        |    -518.175   |     ms |  -18.37% |
|                                       90th percentile latency |                  memory_usage_per_pod_2_hours |   3246.79        |   2465.99        |    -780.8     |     ms |  -24.05% |
|                                      100th percentile latency |                  memory_usage_per_pod_2_hours |   3386.94        |   2483.41        |    -903.533   |     ms |  -26.68% |
|                                  50th percentile service time |                  memory_usage_per_pod_2_hours |   2818.02        |   2299.62        |    -518.4     |     ms |  -18.40% |
|                                  90th percentile service time |                  memory_usage_per_pod_2_hours |   3244.63        |   2463.18        |    -781.452   |     ms |  -24.08% |
|                                 100th percentile service time |                  memory_usage_per_pod_2_hours |   3384.75        |   2480.59        |    -904.166   |     ms |  -26.71% |
|                                                    error rate |                  memory_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                 memory_usage_per_pod_24_hours |      0.201418    |      0.227171    |       0.02575 |  ops/s |  +12.79% |
|                                               Mean Throughput |                 memory_usage_per_pod_24_hours |      0.202045    |      0.228049    |       0.026   |  ops/s |  +12.87% |
|                                             Median Throughput |                 memory_usage_per_pod_24_hours |      0.202065    |      0.228158    |       0.02609 |  ops/s |  +12.91% |
|                                                Max Throughput |                 memory_usage_per_pod_24_hours |      0.202751    |      0.228862    |       0.02611 |  ops/s |  +12.88% |
|                                       50th percentile latency |                 memory_usage_per_pod_24_hours |  61491.4         |  26922.9         |  -34568.5     |     ms |  -56.22% |
|                                       90th percentile latency |                 memory_usage_per_pod_24_hours |  70293.7         |  29227.1         |  -41066.6     |     ms |  -58.42% |
|                                      100th percentile latency |                 memory_usage_per_pod_24_hours |  72404.6         |  29826.8         |  -42577.8     |     ms |  -58.81% |
|                                  50th percentile service time |                 memory_usage_per_pod_24_hours |   5302.49        |   4255.55        |   -1046.94    |     ms |  -19.74% |
|                                  90th percentile service time |                 memory_usage_per_pod_24_hours |   5437.01        |   4480.18        |    -956.828   |     ms |  -17.60% |
|                                 100th percentile service time |                 memory_usage_per_pod_24_hours |   5506.58        |   4497.5         |   -1009.09    |     ms |  -18.33% |
|                                                    error rate |                 memory_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                     status_per_pod_15_minutes |      0.25043     |      0.251278    |       0.00085 |  ops/s |   +0.34% |
|                                               Mean Throughput |                     status_per_pod_15_minutes |      0.250503    |      0.251493    |       0.00099 |  ops/s |   +0.40% |
|                                             Median Throughput |                     status_per_pod_15_minutes |      0.250499    |      0.251479    |       0.00098 |  ops/s |   +0.39% |
|                                                Max Throughput |                     status_per_pod_15_minutes |      0.250591    |      0.251757    |       0.00117 |  ops/s |   +0.47% |
|                                       50th percentile latency |                     status_per_pod_15_minutes |   2670.72        |   2149.71        |    -521.004   |     ms |  -19.51% |
|                                       90th percentile latency |                     status_per_pod_15_minutes |   3222.46        |   2259.39        |    -963.071   |     ms |  -29.89% |
|                                      100th percentile latency |                     status_per_pod_15_minutes |   3355.41        |   2357.84        |    -997.577   |     ms |  -29.73% |
|                                  50th percentile service time |                     status_per_pod_15_minutes |   2668.16        |   2146.27        |    -521.885   |     ms |  -19.56% |
|                                  90th percentile service time |                     status_per_pod_15_minutes |   3219.56        |   2256.43        |    -963.133   |     ms |  -29.91% |
|                                 100th percentile service time |                     status_per_pod_15_minutes |   3353.64        |   2354.74        |    -998.901   |     ms |  -29.79% |
|                                                    error rate |                     status_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                        status_per_pod_2_hours |      0.251233    |      0.251149    |      -8e-05   |  ops/s |   -0.03% |
|                                               Mean Throughput |                        status_per_pod_2_hours |      0.25144     |      0.251343    |      -0.0001  |  ops/s |   -0.04% |
|                                             Median Throughput |                        status_per_pod_2_hours |      0.251427    |      0.25133     |      -0.0001  |  ops/s |   -0.04% |
|                                                Max Throughput |                        status_per_pod_2_hours |      0.251696    |      0.25158     |      -0.00012 |  ops/s |   -0.05% |
|                                       50th percentile latency |                        status_per_pod_2_hours |   2706.18        |   2171.04        |    -535.146   |     ms |  -19.77% |
|                                       90th percentile latency |                        status_per_pod_2_hours |   3205.59        |   2289.01        |    -916.578   |     ms |  -28.59% |
|                                      100th percentile latency |                        status_per_pod_2_hours |   3422.65        |   2363.43        |   -1059.22    |     ms |  -30.95% |
|                                  50th percentile service time |                        status_per_pod_2_hours |   2702.69        |   2167.76        |    -534.929   |     ms |  -19.79% |
|                                  90th percentile service time |                        status_per_pod_2_hours |   3203.12        |   2285.58        |    -917.537   |     ms |  -28.65% |
|                                 100th percentile service time |                        status_per_pod_2_hours |   3420.84        |   2359.42        |   -1061.42    |     ms |  -31.03% |
|                                                    error rate |                        status_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                       status_per_pod_24_hours |      0.250983    |      0.251456    |       0.00047 |  ops/s |   +0.19% |
|                                               Mean Throughput |                       status_per_pod_24_hours |      0.251149    |      0.251702    |       0.00055 |  ops/s |   +0.22% |
|                                             Median Throughput |                       status_per_pod_24_hours |      0.251138    |      0.251687    |       0.00055 |  ops/s |   +0.22% |
|                                                Max Throughput |                       status_per_pod_24_hours |      0.251352    |      0.252003    |       0.00065 |  ops/s |   +0.26% |
|                                       50th percentile latency |                       status_per_pod_24_hours |   2786.62        |   2280.12        |    -506.506   |     ms |  -18.18% |
|                                       90th percentile latency |                       status_per_pod_24_hours |   3219.01        |   2352.98        |    -866.026   |     ms |  -26.90% |
|                                      100th percentile latency |                       status_per_pod_24_hours |   3440.12        |   2461.41        |    -978.709   |     ms |  -28.45% |
|                                  50th percentile service time |                       status_per_pod_24_hours |   2784.09        |   2276.69        |    -507.403   |     ms |  -18.23% |
|                                  90th percentile service time |                       status_per_pod_24_hours |   3216.36        |   2350.07        |    -866.288   |     ms |  -26.93% |
|                                 100th percentile service time |                       status_per_pod_24_hours |   3438.24        |   2458.86        |    -979.384   |     ms |  -28.49% |
|                                                    error rate |                       status_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |           tx_network_usage_per_pod_15_minutes |      0.251105    |      0.251311    |       0.00021 |  ops/s |   +0.08% |
|                                               Mean Throughput |           tx_network_usage_per_pod_15_minutes |      0.25129     |      0.251532    |       0.00024 |  ops/s |   +0.10% |
|                                             Median Throughput |           tx_network_usage_per_pod_15_minutes |      0.251278    |      0.251518    |       0.00024 |  ops/s |   +0.10% |
|                                                Max Throughput |           tx_network_usage_per_pod_15_minutes |      0.251517    |      0.251803    |       0.00029 |  ops/s |   +0.11% |
|                                       50th percentile latency |           tx_network_usage_per_pod_15_minutes |   3221.24        |   2368.32        |    -852.926   |     ms |  -26.48% |
|                                       90th percentile latency |           tx_network_usage_per_pod_15_minutes |   3522.43        |   2485.3         |   -1037.13    |     ms |  -29.44% |
|                                      100th percentile latency |           tx_network_usage_per_pod_15_minutes |   3734.87        |   2563.46        |   -1171.41    |     ms |  -31.36% |
|                                  50th percentile service time |           tx_network_usage_per_pod_15_minutes |   3218.83        |   2364.76        |    -854.07    |     ms |  -26.53% |
|                                  90th percentile service time |           tx_network_usage_per_pod_15_minutes |   3520.64        |   2482.31        |   -1038.33    |     ms |  -29.49% |
|                                 100th percentile service time |           tx_network_usage_per_pod_15_minutes |   3732.54        |   2560.59        |   -1171.95    |     ms |  -31.40% |
|                                                    error rate |           tx_network_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              tx_network_usage_per_pod_2_hours |      0.236835    |      0.250111    |       0.01328 |  ops/s |   +5.61% |
|                                               Mean Throughput |              tx_network_usage_per_pod_2_hours |      0.237451    |      0.250128    |       0.01268 |  ops/s |   +5.34% |
|                                             Median Throughput |              tx_network_usage_per_pod_2_hours |      0.23746     |      0.250127    |       0.01267 |  ops/s |   +5.33% |
|                                                Max Throughput |              tx_network_usage_per_pod_2_hours |      0.238222    |      0.25015     |       0.01193 |  ops/s |   +5.01% |
|                                       50th percentile latency |              tx_network_usage_per_pod_2_hours |  17126.2         |   2765.37        |  -14360.9     |     ms |  -83.85% |
|                                       90th percentile latency |              tx_network_usage_per_pod_2_hours |  19627.8         |   3100.87        |  -16526.9     |     ms |  -84.20% |
|                                      100th percentile latency |              tx_network_usage_per_pod_2_hours |  20132.7         |   3158.21        |  -16974.5     |     ms |  -84.31% |
|                                  50th percentile service time |              tx_network_usage_per_pod_2_hours |   4323.14        |   2763.48        |   -1559.66    |     ms |  -36.08% |
|                                  90th percentile service time |              tx_network_usage_per_pod_2_hours |   4494.92        |   3098.53        |   -1396.39    |     ms |  -31.07% |
|                                 100th percentile service time |              tx_network_usage_per_pod_2_hours |   4587.42        |   3155           |   -1432.42    |     ms |  -31.22% |
|                                                    error rate |              tx_network_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |             tx_network_usage_per_pod_24_hours |      0.14825     |      0.185135    |       0.03688 |  ops/s |  +24.88% |
|                                               Mean Throughput |             tx_network_usage_per_pod_24_hours |      0.149072    |      0.185535    |       0.03646 |  ops/s |  +24.46% |
|                                             Median Throughput |             tx_network_usage_per_pod_24_hours |      0.149053    |      0.185585    |       0.03653 |  ops/s |  +24.51% |
|                                                Max Throughput |             tx_network_usage_per_pod_24_hours |      0.149848    |      0.185771    |       0.03592 |  ops/s |  +23.97% |
|                                       50th percentile latency |             tx_network_usage_per_pod_24_hours | 167261           |  87792.7         |  -79468.1     |     ms |  -47.51% |
|                                       90th percentile latency |             tx_network_usage_per_pod_24_hours | 186835           |  99058.8         |  -87775.8     |     ms |  -46.98% |
|                                      100th percentile latency |             tx_network_usage_per_pod_24_hours | 190669           | 101284           |  -89384.4     |     ms |  -46.88% |
|                                  50th percentile service time |             tx_network_usage_per_pod_24_hours |   6473.44        |   5390.16        |   -1083.29    |     ms |  -16.73% |
|                                  90th percentile service time |             tx_network_usage_per_pod_24_hours |   6640.46        |   5562.26        |   -1078.2     |     ms |  -16.24% |
|                                 100th percentile service time |             tx_network_usage_per_pod_24_hours |   6876.85        |   5688.9         |   -1187.95    |     ms |  -17.27% |
|                                                    error rate |             tx_network_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              average_container_cpu_15_minutes |      0.252175    |      0.25307     |       0.00089 |  ops/s |   +0.35% |
|                                               Mean Throughput |              average_container_cpu_15_minutes |      0.252544    |      0.253594    |       0.00105 |  ops/s |   +0.42% |
|                                             Median Throughput |              average_container_cpu_15_minutes |      0.252521    |      0.253561    |       0.00104 |  ops/s |   +0.41% |
|                                                Max Throughput |              average_container_cpu_15_minutes |      0.252996    |      0.254234    |       0.00124 |  ops/s |   +0.49% |
|                                       50th percentile latency |              average_container_cpu_15_minutes |    886.277       |    297.352       |    -588.926   |     ms |  -66.45% |
|                                       90th percentile latency |              average_container_cpu_15_minutes |   1281.68        |    320.909       |    -960.771   |     ms |  -74.96% |
|                                      100th percentile latency |              average_container_cpu_15_minutes |   1320.95        |    344.926       |    -976.022   |     ms |  -73.89% |
|                                  50th percentile service time |              average_container_cpu_15_minutes |    883.169       |    294.133       |    -589.036   |     ms |  -66.70% |
|                                  90th percentile service time |              average_container_cpu_15_minutes |   1278.34        |    317.579       |    -960.762   |     ms |  -75.16% |
|                                 100th percentile service time |              average_container_cpu_15_minutes |   1318.13        |    341.646       |    -976.486   |     ms |  -74.08% |
|                                                    error rate |              average_container_cpu_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                 average_container_cpu_2_hours |      0.252408    |      0.253231    |       0.00082 |  ops/s |   +0.33% |
|                                               Mean Throughput |                 average_container_cpu_2_hours |      0.252816    |      0.253783    |       0.00097 |  ops/s |   +0.38% |
|                                             Median Throughput |                 average_container_cpu_2_hours |      0.25279     |      0.253748    |       0.00096 |  ops/s |   +0.38% |
|                                                Max Throughput |                 average_container_cpu_2_hours |      0.253317    |      0.254458    |       0.00114 |  ops/s |   +0.45% |
|                                       50th percentile latency |                 average_container_cpu_2_hours |    865.976       |    404.275       |    -461.701   |     ms |  -53.32% |
|                                       90th percentile latency |                 average_container_cpu_2_hours |   1306.58        |    446.614       |    -859.967   |     ms |  -65.82% |
|                                      100th percentile latency |                 average_container_cpu_2_hours |   1427.34        |    454.313       |    -973.025   |     ms |  -68.17% |
|                                  50th percentile service time |                 average_container_cpu_2_hours |    862.771       |    401.192       |    -461.579   |     ms |  -53.50% |
|                                  90th percentile service time |                 average_container_cpu_2_hours |   1303.56        |    443.351       |    -860.211   |     ms |  -65.99% |
|                                 100th percentile service time |                 average_container_cpu_2_hours |   1424.58        |    450.492       |    -974.091   |     ms |  -68.38% |
|                                                    error rate |                 average_container_cpu_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                average_container_cpu_24_hours |      0.252256    |      0.252456    |       0.0002  |  ops/s |   +0.08% |
|                                               Mean Throughput |                average_container_cpu_24_hours |      0.252638    |      0.252873    |       0.00023 |  ops/s |   +0.09% |
|                                             Median Throughput |                average_container_cpu_24_hours |      0.252614    |      0.252846    |       0.00023 |  ops/s |   +0.09% |
|                                                Max Throughput |                average_container_cpu_24_hours |      0.253105    |      0.253382    |       0.00028 |  ops/s |   +0.11% |
|                                       50th percentile latency |                average_container_cpu_24_hours |   1818.13        |   1289.9         |    -528.229   |     ms |  -29.05% |
|                                       90th percentile latency |                average_container_cpu_24_hours |   2212.73        |   1312.99        |    -899.744   |     ms |  -40.66% |
|                                      100th percentile latency |                average_container_cpu_24_hours |   2287.82        |   1337.76        |    -950.057   |     ms |  -41.53% |
|                                  50th percentile service time |                average_container_cpu_24_hours |   1814.41        |   1286.96        |    -527.448   |     ms |  -29.07% |
|                                  90th percentile service time |                average_container_cpu_24_hours |   2209.02        |   1309.88        |    -899.138   |     ms |  -40.70% |
|                                 100th percentile service time |                average_container_cpu_24_hours |   2284.33        |   1334.9         |    -949.429   |     ms |  -41.56% |
|                                                    error rate |                average_container_cpu_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |     average_container_memory_usage_15_minutes |      0.252848    |      0.25314     |       0.00029 |  ops/s |   +0.12% |
|                                               Mean Throughput |     average_container_memory_usage_15_minutes |      0.253332    |      0.253675    |       0.00034 |  ops/s |   +0.14% |
|                                             Median Throughput |     average_container_memory_usage_15_minutes |      0.253301    |      0.253641    |       0.00034 |  ops/s |   +0.13% |
|                                                Max Throughput |     average_container_memory_usage_15_minutes |      0.253925    |      0.254331    |       0.00041 |  ops/s |   +0.16% |
|                                       50th percentile latency |     average_container_memory_usage_15_minutes |    860.337       |    306.072       |    -554.265   |     ms |  -64.42% |
|                                       90th percentile latency |     average_container_memory_usage_15_minutes |   1263.74        |    354.066       |    -909.679   |     ms |  -71.98% |
|                                      100th percentile latency |     average_container_memory_usage_15_minutes |   1280.12        |    363.9         |    -916.217   |     ms |  -71.57% |
|                                  50th percentile service time |     average_container_memory_usage_15_minutes |    857.434       |    303.19        |    -554.244   |     ms |  -64.64% |
|                                  90th percentile service time |     average_container_memory_usage_15_minutes |   1260.62        |    351.062       |    -909.559   |     ms |  -72.15% |
|                                 100th percentile service time |     average_container_memory_usage_15_minutes |   1277.07        |    361.073       |    -915.999   |     ms |  -71.73% |
|                                                    error rate |     average_container_memory_usage_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                    error rate |                     touch-container-2-2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |        average_container_memory_usage_2_hours |      0.252496    |      0.253194    |       0.0007  |  ops/s |   +0.28% |
|                                               Mean Throughput |        average_container_memory_usage_2_hours |      0.252919    |      0.253738    |       0.00082 |  ops/s |   +0.32% |
|                                             Median Throughput |        average_container_memory_usage_2_hours |      0.252892    |      0.253703    |       0.00081 |  ops/s |   +0.32% |
|                                                Max Throughput |        average_container_memory_usage_2_hours |      0.253438    |      0.254405    |       0.00097 |  ops/s |   +0.38% |
|                                       50th percentile latency |        average_container_memory_usage_2_hours |    888.903       |    422.546       |    -466.357   |     ms |  -52.46% |
|                                       90th percentile latency |        average_container_memory_usage_2_hours |   1293.86        |    441.364       |    -852.492   |     ms |  -65.89% |
|                                      100th percentile latency |        average_container_memory_usage_2_hours |   1347.94        |    442.487       |    -905.458   |     ms |  -67.17% |
|                                  50th percentile service time |        average_container_memory_usage_2_hours |    885.332       |    418.892       |    -466.44    |     ms |  -52.69% |
|                                  90th percentile service time |        average_container_memory_usage_2_hours |   1290.66        |    438.092       |    -852.571   |     ms |  -66.06% |
|                                 100th percentile service time |        average_container_memory_usage_2_hours |   1345.03        |    439.718       |    -905.311   |     ms |  -67.31% |
|                                                    error rate |        average_container_memory_usage_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |       average_container_memory_usage_24_hours |      0.251839    |      0.252336    |       0.0005  |  ops/s |   +0.20% |
|                                               Mean Throughput |       average_container_memory_usage_24_hours |      0.252149    |      0.252732    |       0.00058 |  ops/s |   +0.23% |
|                                             Median Throughput |       average_container_memory_usage_24_hours |      0.25213     |      0.252706    |       0.00058 |  ops/s |   +0.23% |
|                                                Max Throughput |       average_container_memory_usage_24_hours |      0.252529    |      0.253216    |       0.00069 |  ops/s |   +0.27% |
|                                       50th percentile latency |       average_container_memory_usage_24_hours |   1847.43        |   1285.11        |    -562.323   |     ms |  -30.44% |
|                                       90th percentile latency |       average_container_memory_usage_24_hours |   2248.64        |   1329.95        |    -918.689   |     ms |  -40.86% |
|                                      100th percentile latency |       average_container_memory_usage_24_hours |   2342.96        |   1344.27        |    -998.685   |     ms |  -42.62% |
|                                  50th percentile service time |       average_container_memory_usage_24_hours |   1843.34        |   1281.95        |    -561.386   |     ms |  -30.45% |
|                                  90th percentile service time |       average_container_memory_usage_24_hours |   2245.16        |   1326.77        |    -918.397   |     ms |  -40.91% |
|                                 100th percentile service time |       average_container_memory_usage_24_hours |   2339.35        |   1339.61        |    -999.74    |     ms |  -42.74% |
|                                                    error rate |       average_container_memory_usage_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            cpu_usage_per_container_15_minutes |      0.253262    |      0.252845    |      -0.00042 |  ops/s |   -0.16% |
|                                               Mean Throughput |            cpu_usage_per_container_15_minutes |      0.253819    |      0.253329    |      -0.00049 |  ops/s |   -0.19% |
|                                             Median Throughput |            cpu_usage_per_container_15_minutes |      0.253783    |      0.253299    |      -0.00048 |  ops/s |   -0.19% |
|                                                Max Throughput |            cpu_usage_per_container_15_minutes |      0.2545      |      0.253922    |      -0.00058 |  ops/s |   -0.23% |
|                                       50th percentile latency |            cpu_usage_per_container_15_minutes |    907.57        |    302.183       |    -605.387   |     ms |  -66.70% |
|                                       90th percentile latency |            cpu_usage_per_container_15_minutes |   1284.38        |    332.503       |    -951.878   |     ms |  -74.11% |
|                                      100th percentile latency |            cpu_usage_per_container_15_minutes |   1329.71        |    340.18        |    -989.534   |     ms |  -74.42% |
|                                  50th percentile service time |            cpu_usage_per_container_15_minutes |    904.298       |    299.2         |    -605.099   |     ms |  -66.91% |
|                                  90th percentile service time |            cpu_usage_per_container_15_minutes |   1280.96        |    328.693       |    -952.265   |     ms |  -74.34% |
|                                 100th percentile service time |            cpu_usage_per_container_15_minutes |   1326.24        |    337.015       |    -989.226   |     ms |  -74.59% |
|                                                    error rate |            cpu_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               cpu_usage_per_container_2_hours |      0.253181    |      0.253238    |       6e-05   |  ops/s |   +0.02% |
|                                               Mean Throughput |               cpu_usage_per_container_2_hours |      0.253723    |      0.25379     |       7e-05   |  ops/s |   +0.03% |
|                                             Median Throughput |               cpu_usage_per_container_2_hours |      0.253689    |      0.253755    |       7e-05   |  ops/s |   +0.03% |
|                                                Max Throughput |               cpu_usage_per_container_2_hours |      0.254387    |      0.254467    |       8e-05   |  ops/s |   +0.03% |
|                                       50th percentile latency |               cpu_usage_per_container_2_hours |    825.138       |    429.022       |    -396.116   |     ms |  -48.01% |
|                                       90th percentile latency |               cpu_usage_per_container_2_hours |   1201.43        |    441.922       |    -759.513   |     ms |  -63.22% |
|                                      100th percentile latency |               cpu_usage_per_container_2_hours |   1232.65        |    447.399       |    -785.256   |     ms |  -63.70% |
|                                  50th percentile service time |               cpu_usage_per_container_2_hours |    822.643       |    425.594       |    -397.049   |     ms |  -48.27% |
|                                  90th percentile service time |               cpu_usage_per_container_2_hours |   1199.24        |    438.72        |    -760.524   |     ms |  -63.42% |
|                                 100th percentile service time |               cpu_usage_per_container_2_hours |   1229.44        |    444.341       |    -785.096   |     ms |  -63.86% |
|                                                    error rate |               cpu_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              cpu_usage_per_container_24_hours |      0.251532    |      0.252392    |       0.00086 |  ops/s |   +0.34% |
|                                               Mean Throughput |              cpu_usage_per_container_24_hours |      0.25179     |      0.252798    |       0.00101 |  ops/s |   +0.40% |
|                                             Median Throughput |              cpu_usage_per_container_24_hours |      0.251773    |      0.252772    |       0.001   |  ops/s |   +0.40% |
|                                                Max Throughput |              cpu_usage_per_container_24_hours |      0.252107    |      0.253295    |       0.00119 |  ops/s |   +0.47% |
|                                       50th percentile latency |              cpu_usage_per_container_24_hours |   1856.69        |   1312.19        |    -544.506   |     ms |  -29.33% |
|                                       90th percentile latency |              cpu_usage_per_container_24_hours |   2331.52        |   1345.49        |    -986.027   |     ms |  -42.29% |
|                                      100th percentile latency |              cpu_usage_per_container_24_hours |   2363.02        |   1380.2         |    -982.823   |     ms |  -41.59% |
|                                  50th percentile service time |              cpu_usage_per_container_24_hours |   1853.28        |   1308.19        |    -545.093   |     ms |  -29.41% |
|                                  90th percentile service time |              cpu_usage_per_container_24_hours |   2328.74        |   1342.04        |    -986.707   |     ms |  -42.37% |
|                                 100th percentile service time |              cpu_usage_per_container_24_hours |   2360.13        |   1377.04        |    -983.094   |     ms |  -41.65% |
|                                                    error rate |              cpu_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |         memory_usage_per_container_15_minutes |      0.25254     |      0.25331     |       0.00077 |  ops/s |   +0.31% |
|                                               Mean Throughput |         memory_usage_per_container_15_minutes |      0.252971    |      0.253875    |       0.0009  |  ops/s |   +0.36% |
|                                             Median Throughput |         memory_usage_per_container_15_minutes |      0.252944    |      0.253839    |       0.0009  |  ops/s |   +0.35% |
|                                                Max Throughput |         memory_usage_per_container_15_minutes |      0.2535      |      0.254566    |       0.00107 |  ops/s |   +0.42% |
|                                       50th percentile latency |         memory_usage_per_container_15_minutes |    758.164       |    313.34        |    -444.824   |     ms |  -58.67% |
|                                       90th percentile latency |         memory_usage_per_container_15_minutes |   1162.07        |    344.578       |    -817.496   |     ms |  -70.35% |
|                                      100th percentile latency |         memory_usage_per_container_15_minutes |   1196.3         |    348.813       |    -847.484   |     ms |  -70.84% |
|                                  50th percentile service time |         memory_usage_per_container_15_minutes |    754.9         |    311.079       |    -443.821   |     ms |  -58.79% |
|                                  90th percentile service time |         memory_usage_per_container_15_minutes |   1158.3         |    341.361       |    -816.943   |     ms |  -70.53% |
|                                 100th percentile service time |         memory_usage_per_container_15_minutes |   1193.44        |    347.17        |    -846.271   |     ms |  -70.91% |
|                                                    error rate |         memory_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            memory_usage_per_container_2_hours |      0.253013    |      0.253186    |       0.00017 |  ops/s |   +0.07% |
|                                               Mean Throughput |            memory_usage_per_container_2_hours |      0.253526    |      0.253728    |       0.0002  |  ops/s |   +0.08% |
|                                             Median Throughput |            memory_usage_per_container_2_hours |      0.253494    |      0.253693    |       0.0002  |  ops/s |   +0.08% |
|                                                Max Throughput |            memory_usage_per_container_2_hours |      0.254155    |      0.254393    |       0.00024 |  ops/s |   +0.09% |
|                                       50th percentile latency |            memory_usage_per_container_2_hours |   1001.99        |    433.239       |    -568.747   |     ms |  -56.76% |
|                                       90th percentile latency |            memory_usage_per_container_2_hours |   1428.45        |    441.926       |    -986.527   |     ms |  -69.06% |
|                                      100th percentile latency |            memory_usage_per_container_2_hours |   1469.76        |    442.126       |   -1027.63    |     ms |  -69.92% |
|                                  50th percentile service time |            memory_usage_per_container_2_hours |    998.843       |    430.401       |    -568.442   |     ms |  -56.91% |
|                                  90th percentile service time |            memory_usage_per_container_2_hours |   1426.08        |    438.602       |    -987.473   |     ms |  -69.24% |
|                                 100th percentile service time |            memory_usage_per_container_2_hours |   1466.65        |    439.358       |   -1027.29    |     ms |  -70.04% |
|                                                    error rate |            memory_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |           memory_usage_per_container_24_hours |      0.25223     |      0.252362    |       0.00013 |  ops/s |   +0.05% |
|                                               Mean Throughput |           memory_usage_per_container_24_hours |      0.252608    |      0.252762    |       0.00015 |  ops/s |   +0.06% |
|                                             Median Throughput |           memory_usage_per_container_24_hours |      0.252583    |      0.252735    |       0.00015 |  ops/s |   +0.06% |
|                                                Max Throughput |           memory_usage_per_container_24_hours |      0.253072    |      0.253252    |       0.00018 |  ops/s |   +0.07% |
|                                       50th percentile latency |           memory_usage_per_container_24_hours |   1782.58        |   1262.98        |    -519.597   |     ms |  -29.15% |
|                                       90th percentile latency |           memory_usage_per_container_24_hours |   2268.7         |   1314.38        |    -954.318   |     ms |  -42.06% |
|                                      100th percentile latency |           memory_usage_per_container_24_hours |   2409.73        |   1330.83        |   -1078.9     |     ms |  -44.77% |
|                                  50th percentile service time |           memory_usage_per_container_24_hours |   1779.05        |   1259.46        |    -519.588   |     ms |  -29.21% |
|                                  90th percentile service time |           memory_usage_per_container_24_hours |   2265.67        |   1310.93        |    -954.748   |     ms |  -42.14% |
|                                 100th percentile service time |           memory_usage_per_container_24_hours |   2406.34        |   1327.95        |   -1078.39    |     ms |  -44.81% |
|                                                    error rate |           memory_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            unique_deployment_count_15_minutes |      0.252456    |      0.253325    |       0.00087 |  ops/s |   +0.34% |
|                                               Mean Throughput |            unique_deployment_count_15_minutes |      0.252873    |      0.253893    |       0.00102 |  ops/s |   +0.40% |
|                                             Median Throughput |            unique_deployment_count_15_minutes |      0.252846    |      0.253857    |       0.00101 |  ops/s |   +0.40% |
|                                                Max Throughput |            unique_deployment_count_15_minutes |      0.253384    |      0.254587    |       0.0012  |  ops/s |   +0.47% |
|                                       50th percentile latency |            unique_deployment_count_15_minutes |    776.125       |    272.052       |    -504.074   |     ms |  -64.95% |
|                                       90th percentile latency |            unique_deployment_count_15_minutes |   1228.58        |    312.831       |    -915.752   |     ms |  -74.54% |
|                                      100th percentile latency |            unique_deployment_count_15_minutes |   1316.96        |    325.933       |    -991.025   |     ms |  -75.25% |
|                                  50th percentile service time |            unique_deployment_count_15_minutes |    772.745       |    268.975       |    -503.77    |     ms |  -65.19% |
|                                  90th percentile service time |            unique_deployment_count_15_minutes |   1225.15        |    310.09        |    -915.059   |     ms |  -74.69% |
|                                 100th percentile service time |            unique_deployment_count_15_minutes |   1313.57        |    322.715       |    -990.86    |     ms |  -75.43% |
|                                                    error rate |            unique_deployment_count_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               unique_deployment_count_2_hours |      0.252706    |      0.252908    |       0.0002  |  ops/s |   +0.08% |
|                                               Mean Throughput |               unique_deployment_count_2_hours |      0.253167    |      0.253404    |       0.00024 |  ops/s |   +0.09% |
|                                             Median Throughput |               unique_deployment_count_2_hours |      0.253137    |      0.253372    |       0.00024 |  ops/s |   +0.09% |
|                                                Max Throughput |               unique_deployment_count_2_hours |      0.25373     |      0.25401     |       0.00028 |  ops/s |   +0.11% |
|                                       50th percentile latency |               unique_deployment_count_2_hours |    747.126       |    281.38        |    -465.746   |     ms |  -62.34% |
|                                       90th percentile latency |               unique_deployment_count_2_hours |   1169.22        |    321.966       |    -847.256   |     ms |  -72.46% |
|                                      100th percentile latency |               unique_deployment_count_2_hours |   1295.91        |    326.926       |    -968.986   |     ms |  -74.77% |
|                                  50th percentile service time |               unique_deployment_count_2_hours |    743.84        |    277.928       |    -465.912   |     ms |  -62.64% |
|                                  90th percentile service time |               unique_deployment_count_2_hours |   1166.06        |    318.631       |    -847.429   |     ms |  -72.67% |
|                                 100th percentile service time |               unique_deployment_count_2_hours |   1293.22        |    324.07        |    -969.153   |     ms |  -74.94% |
|                                                    error rate |               unique_deployment_count_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              unique_deployment_count_24_hours |      0.2526      |      0.253301    |       0.0007  |  ops/s |   +0.28% |
|                                               Mean Throughput |              unique_deployment_count_24_hours |      0.253041    |      0.253864    |       0.00082 |  ops/s |   +0.33% |
|                                             Median Throughput |              unique_deployment_count_24_hours |      0.253012    |      0.253829    |       0.00082 |  ops/s |   +0.32% |
|                                                Max Throughput |              unique_deployment_count_24_hours |      0.253579    |      0.254553    |       0.00097 |  ops/s |   +0.38% |
|                                       50th percentile latency |              unique_deployment_count_24_hours |    886.754       |    444.61        |    -442.144   |     ms |  -49.86% |
|                                       90th percentile latency |              unique_deployment_count_24_hours |   1348.74        |    451.835       |    -896.91    |     ms |  -66.50% |
|                                      100th percentile latency |              unique_deployment_count_24_hours |   1394.04        |    454.048       |    -939.996   |     ms |  -67.43% |
|                                  50th percentile service time |              unique_deployment_count_24_hours |    883.133       |    441.901       |    -441.232   |     ms |  -49.96% |
|                                  90th percentile service time |              unique_deployment_count_24_hours |   1345.91        |    448.906       |    -897       |     ms |  -66.65% |
|                                 100th percentile service time |              unique_deployment_count_24_hours |   1390.68        |    450.67        |    -940.011   |     ms |  -67.59% |
|                                                    error rate |              unique_deployment_count_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput | percentile_cpu_usage_per_container_15_minutes |      0.252941    |      0.253293    |       0.00035 |  ops/s |   +0.14% |
|                                               Mean Throughput | percentile_cpu_usage_per_container_15_minutes |      0.253441    |      0.253855    |       0.00041 |  ops/s |   +0.16% |
|                                             Median Throughput | percentile_cpu_usage_per_container_15_minutes |      0.25341     |      0.253819    |       0.00041 |  ops/s |   +0.16% |
|                                                Max Throughput | percentile_cpu_usage_per_container_15_minutes |      0.254054    |      0.254543    |       0.00049 |  ops/s |   +0.19% |
|                                       50th percentile latency | percentile_cpu_usage_per_container_15_minutes |    923.589       |    481.065       |    -442.524   |     ms |  -47.91% |
|                                       90th percentile latency | percentile_cpu_usage_per_container_15_minutes |   1422.86        |    509.918       |    -912.942   |     ms |  -64.16% |
|                                      100th percentile latency | percentile_cpu_usage_per_container_15_minutes |   1539.82        |    519.568       |   -1020.25    |     ms |  -66.26% |
|                                  50th percentile service time | percentile_cpu_usage_per_container_15_minutes |    919.393       |    471.503       |    -447.891   |     ms |  -48.72% |
|                                  90th percentile service time | percentile_cpu_usage_per_container_15_minutes |   1419.31        |    501.1         |    -918.208   |     ms |  -64.69% |
|                                 100th percentile service time | percentile_cpu_usage_per_container_15_minutes |   1536.02        |    511.188       |   -1024.83    |     ms |  -66.72% |
|                                                    error rate | percentile_cpu_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |    percentile_cpu_usage_per_container_2_hours |      0.250952    |      0.251612    |       0.00066 |  ops/s |   +0.26% |
|                                               Mean Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251112    |      0.251885    |       0.00077 |  ops/s |   +0.31% |
|                                             Median Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251102    |      0.251866    |       0.00076 |  ops/s |   +0.30% |
|                                                Max Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251308    |      0.252219    |       0.00091 |  ops/s |   +0.36% |
|                                       50th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2485.09        |   1747.13        |    -737.959   |     ms |  -29.70% |
|                                       90th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2944.03        |   1773.05        |   -1170.97    |     ms |  -39.77% |
|                                      100th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2964.93        |   1807.24        |   -1157.68    |     ms |  -39.05% |
|                                  50th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2482.02        |   1743.35        |    -738.662   |     ms |  -29.76% |
|                                  90th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2928.58        |   1769.7         |   -1158.88    |     ms |  -39.57% |
|                                 100th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2940.87        |   1803.39        |   -1137.48    |     ms |  -38.68% |
|                                                    error rate |    percentile_cpu_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0418876   |      0.0485755   |       0.00669 |  ops/s |  +15.97% |
|                                               Mean Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419224   |      0.0486642   |       0.00674 |  ops/s |  +16.08% |
|                                             Median Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419186   |      0.0486697   |       0.00675 |  ops/s |  +16.11% |
|                                                Max Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419766   |      0.0487154   |       0.00674 |  ops/s |  +16.05% |
|                                       50th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.20657e+06 |      1.0054e+06  | -201174       |     ms |  -16.67% |
|                                       90th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.36512e+06 |      1.13851e+06 | -226615       |     ms |  -16.60% |
|                                      100th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.39627e+06 |      1.16332e+06 | -232946       |     ms |  -16.68% |
|                                  50th percentile service time |   percentile_cpu_usage_per_container_24_hours |  24190.1         |  20773.7         |   -3416.4     |     ms |  -14.12% |
|                                  90th percentile service time |   percentile_cpu_usage_per_container_24_hours |  24906.6         |  20943.4         |   -3963.24    |     ms |  -15.91% |
|                                 100th percentile service time |   percentile_cpu_usage_per_container_24_hours |  25055.8         |  21652.2         |   -3403.58    |     ms |  -13.58% |
|                                                    error rate |   percentile_cpu_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
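
For anyone re-checking the numbers: the last column appears to be the relative change of the second value column against the first. A minimal sketch of that arithmetic, assuming this reading of the columns (the names `baseline`, `contender`, and the class `PercentChange` are illustrative and not part of the benchmark tooling):

```java
public class PercentChange {
    /** Relative change of contender against baseline, in percent (hypothetical helper). */
    static double percentChange(double baseline, double contender) {
        return (contender - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // "Min Throughput" row for percentile_cpu_usage_per_container_24_hours
        System.out.printf("%+.2f%%%n", percentChange(0.0418876, 0.0485755));
        // "50th percentile service time" row for the same query
        System.out.printf("%+.2f%%%n", percentChange(24190.1, 20773.7));
    }
}
```

Rounded to two decimals, both values agree with the corresponding rows in the table (+15.97% and -14.12%).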

assertFalse(primary.scheduledRefresh());
assertEquals(lastSearchAccess, primary.getLastSearcherAccess());
// wait until the thread-pool has moved the timestamp otherwise we can't assert on this below
assertBusy(() -> assertThat(primary.getThreadPool().relativeTimeInMillis(), greaterThan(lastSearchAccess)));
CountDownLatch latch = new CountDownLatch(10);
for (int i = 0; i < 10; i++) {
Member Author

Because makeShardSearchActive(...) now triggers a refresh, we can no longer invoke this method many times without a refresh happening. Previously the test could do exactly that, then invoke scheduledRefresh(...) once, and all the listeners were invoked.

Contributor

@henningandersen henningandersen left a comment

I think this looks fine.

I do worry slightly about a small storm of such searches causing additional refreshes. But that seems unlikely enough that we don't need to add more protection against it.

@@ -3831,14 +3831,17 @@ public void afterRefresh(boolean didRefresh) {
* @param listener the listener to invoke once the pending refresh location is visible. The listener will be called with
* <code>true</code> if the listener was registered to wait for a refresh.
*/
- public final void awaitShardSearchActive(Consumer<Boolean> listener) {
+ public final void makeShardSearchActive(Consumer<Boolean> listener) {
Contributor

I'd probably call this ensureShardSearchActive, though not too important.

markSearcherAccessed(); // move the shard into non-search idle
final Translog.Location location = pendingRefreshLocation.get();
if (location != null) {
addRefreshListener(location, (result) -> {
pendingRefreshLocation.compareAndSet(location, null);
listener.accept(true);
});
// TODO: maybe just invoke getEngine().maybeRefresh(...) here?
// (scheduled refresh does a few more things that I don't think are necessary here?)
scheduledRefresh();
Contributor

I think it will work fine, but one problem with maybeRefresh is that it will not trigger a refresh if one is already ongoing, and there is no certainty that the ongoing refresh will contain the location. However, such a refresh would have to be an explicit one. I find the risk small and, regardless, this is an improvement over the current state.
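
To illustrate the non-blocking behaviour described here: Lucene's `ReferenceManager#maybeRefresh` is documented to return immediately when another thread is already refreshing. The following is only a toy sketch of that try-lock pattern, not the Engine code; the class `MaybeRefreshSketch` and its members are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of a "maybe refresh" (assumption: a plain try-lock, nothing Elasticsearch-specific). */
public class MaybeRefreshSketch {
    private final ReentrantLock refreshLock = new ReentrantLock();
    final AtomicInteger refreshes = new AtomicInteger();

    /** Non-blocking: returns false immediately if another thread holds the refresh lock. */
    boolean maybeRefresh(Runnable doRefresh) {
        if (!refreshLock.tryLock()) {
            return false; // someone else is refreshing; the caller gets no guarantee about what that refresh covers
        }
        try {
            doRefresh.run();
            refreshes.incrementAndGet();
            return true;
        } finally {
            refreshLock.unlock();
        }
    }
}
```

A caller that receives `false` only knows that some refresh is in flight, which is the small risk mentioned above.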

I think we can switch to maybeRefresh instead, though I'd like to check that pendingRefreshLocation is still location then, i.e.:

Suggested change
- scheduledRefresh();
+ if (pendingRefreshLocation.get() == location) {
+     getEngine().maybeRefresh();
+ }

both to handle racy cases and the case where there are no more refresh listener slots.
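
The guard suggested above can be sketched in isolation. This is a simplified toy model and not the IndexShard implementation: `Location`, `index(...)`, `refresh()`, and the listener bookkeeping are hypothetical stand-ins, and calling the listener with `false` when nothing is pending is an assumption based on the javadoc quoted earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Toy model of the guarded refresh; none of these types are the real Elasticsearch classes. */
public class GuardedRefreshSketch {
    /** Hypothetical stand-in for Translog.Location. */
    static final class Location {
        final long offset;
        Location(long offset) { this.offset = offset; }
    }

    private final AtomicReference<Location> pendingRefreshLocation = new AtomicReference<>();
    private final List<Consumer<Boolean>> refreshListeners = new ArrayList<>();
    final AtomicInteger refreshCount = new AtomicInteger();

    /** Simulates a write that is not yet visible to searches. */
    void index(Location location) {
        pendingRefreshLocation.set(location);
    }

    /** Simulates a refresh: counts it and notifies every registered listener. */
    synchronized void refresh() {
        refreshCount.incrementAndGet();
        List<Consumer<Boolean>> toNotify = new ArrayList<>(refreshListeners);
        refreshListeners.clear();
        toNotify.forEach(l -> l.accept(true));
    }

    synchronized void addRefreshListener(Consumer<Boolean> listener) {
        refreshListeners.add(listener);
    }

    void ensureShardSearchActive(Consumer<Boolean> listener) {
        final Location location = pendingRefreshLocation.get();
        if (location != null) {
            addRefreshListener(result -> {
                pendingRefreshLocation.compareAndSet(location, null);
                listener.accept(true);
            });
            // Refresh only if nothing cleared the pending location in the meantime.
            if (pendingRefreshLocation.get() == location) {
                refresh();
            }
        } else {
            listener.accept(false); // assumption: nothing pending, so no refresh was awaited
        }
    }
}
```

In this model a second call right after the first sees no pending location, so it neither registers a listener nor triggers another refresh.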

Contributor

@henningandersen henningandersen left a comment

Forgot one thing.

@martijnvg martijnvg added >enhancement and removed WIP labels Jun 1, 2023
@elasticsearchmachine elasticsearchmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jun 1, 2023
@elasticsearchmachine
Collaborator

Hi @martijnvg, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@martijnvg martijnvg requested a review from henningandersen June 7, 2023 08:22
Contributor

@henningandersen henningandersen left a comment

LGTM.

markSearcherAccessed(); // move the shard into non-search idle
final Translog.Location location = pendingRefreshLocation.get();
if (location != null) {
addRefreshListener(location, (result) -> {
pendingRefreshLocation.compareAndSet(location, null);
listener.accept(true);
});
// trigger a refresh to avoid waiting for scheduledRefresh(...) to be invoked from index level refresh scheduler.
// (The if statement should avoid doing an additional refresh if scheduled refresh was invoked between getting
// the current refresh location and adding a refresh listener.)
Contributor

In particular, addRefreshListener might have performed the refresh already in edge cases.

Co-authored-by: Henning Andersen <[email protected]>
@martijnvg
Member Author

@elasticmachine update branch

Contributor

@henningandersen henningandersen left a comment

LGTM.

Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.9.0
Development

Successfully merging this pull request may close these issues.

Trigger a refresh when a shard becomes search active instead of waiting for it.
4 participants