Trigger refresh when shard becomes search active #96321

martijnvg · 2023-05-24T11:44:23Z

This change invokes Engine#maybeRefresh() in IndexShard#ensureShardSearchActive(...) (previously named waitShardSearchActive(...)) when a search-idle shard becomes search-active.

Prior to this change, shard-level search execution on a search-idle shard waited until the scheduled refresh had been executed. That wait includes the time it takes for the refresh to be scheduled (which is a full second), and it unnecessarily increases the query time of a search request.
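
To illustrate the idea, here is a minimal sketch of the idle-to-active transition. It is not the actual IndexShard code: the class `SearchIdleSketch`, its fields, and `markSearchIdle()` are hypothetical, and the private `maybeRefresh()` is only a counter standing in for Engine#maybeRefresh(). The point it tries to show is that the caller which flips the shard from idle to active triggers the refresh right away instead of leaving it to the next scheduled refresh.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shard; NOT the real IndexShard class.
public class SearchIdleSketch {
    // Assumption: a single flag models the search-idle state.
    private final AtomicBoolean searchIdle = new AtomicBoolean(true);
    private final AtomicInteger refreshCount = new AtomicInteger();

    // Stand-in for Engine#maybeRefresh(); here it only counts invocations.
    private void maybeRefresh() {
        refreshCount.incrementAndGet();
    }

    // Called at the start of a shard-level search.
    public void ensureShardSearchActive() {
        // Only the caller that flips idle -> active triggers the refresh,
        // so concurrent or repeated searches do not refresh again.
        if (searchIdle.compareAndSet(true, false)) {
            maybeRefresh();
        }
    }

    // Hypothetical hook: the shard has not been searched for a while.
    public void markSearchIdle() {
        searchIdle.set(true);
    }

    public int refreshCount() {
        return refreshCount.get();
    }

    public static void main(String[] args) {
        SearchIdleSketch shard = new SearchIdleSketch();
        shard.ensureShardSearchActive(); // idle -> active: refresh now
        shard.ensureShardSearchActive(); // already active: no extra refresh
        System.out.println(shard.refreshCount()); // prints 1
    }
}
```

In this sketch a search on an already active shard pays no extra cost, while the first search after an idle period no longer waits for a scheduler tick.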

Closes #95544

martijnvg · 2023-05-25T13:41:29Z

(accidentally pushed 'ready for review' on this PR... too many open browser tabs)

…ot register refresh listener.

martijnvg · 2023-05-30T07:05:50Z

Did a quick benchmark run to see what impact this change has, using the k8s query benchmark (it tests the performance of queries on search-idle shards). Depending on the query, a 15% to 75% reduction in query time was observed:

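|                                                        Metric |                                          Task |         Baseline |        Contender |          Diff |   Unit |   Diff % |
|--------------------------------------------------------------:|----------------------------------------------:|-----------------:|-----------------:|--------------:|-------:|---------:|
|                                                Min Throughput |                  cpu_usage_per_pod_15_minutes |      0.251864    |      0.251819    |      -4e-05   |  ops/s |   -0.02% |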
|                                               Mean Throughput |                  cpu_usage_per_pod_15_minutes |      0.252179    |      0.252127    |      -5e-05   |  ops/s |   -0.02% |
|                                             Median Throughput |                  cpu_usage_per_pod_15_minutes |      0.25216     |      0.252107    |      -5e-05   |  ops/s |   -0.02% |
|                                                Max Throughput |                  cpu_usage_per_pod_15_minutes |      0.252566    |      0.252504    |      -6e-05   |  ops/s |   -0.02% |
|                                       50th percentile latency |                  cpu_usage_per_pod_15_minutes |   2583.08        |   2184.99        |    -398.082   |     ms |  -15.41% |
|                                       90th percentile latency |                  cpu_usage_per_pod_15_minutes |   2964.72        |   2262.39        |    -702.333   |     ms |  -23.69% |
|                                      100th percentile latency |                  cpu_usage_per_pod_15_minutes |   3004.16        |   2361.76        |    -642.396   |     ms |  -21.38% |
|                                  50th percentile service time |                  cpu_usage_per_pod_15_minutes |   2579.88        |   2181.38        |    -398.499   |     ms |  -15.45% |
|                                  90th percentile service time |                  cpu_usage_per_pod_15_minutes |   2962.34        |   2259.26        |    -703.084   |     ms |  -23.73% |
|                                 100th percentile service time |                  cpu_usage_per_pod_15_minutes |   3000.96        |   2358.62        |    -642.336   |     ms |  -21.40% |
|                                                    error rate |                  cpu_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                     cpu_usage_per_pod_2_hours |      0.25148     |      0.251291    |      -0.00019 |  ops/s |   -0.08% |
|                                               Mean Throughput |                     cpu_usage_per_pod_2_hours |      0.251729    |      0.251508    |      -0.00022 |  ops/s |   -0.09% |
|                                             Median Throughput |                     cpu_usage_per_pod_2_hours |      0.251713    |      0.251494    |      -0.00022 |  ops/s |   -0.09% |
|                                                Max Throughput |                     cpu_usage_per_pod_2_hours |      0.252035    |      0.251773    |      -0.00026 |  ops/s |   -0.10% |
|                                       50th percentile latency |                     cpu_usage_per_pod_2_hours |   2761.73        |   2367.94        |    -393.791   |     ms |  -14.26% |
|                                       90th percentile latency |                     cpu_usage_per_pod_2_hours |   3198.37        |   2419.47        |    -778.897   |     ms |  -24.35% |
|                                      100th percentile latency |                     cpu_usage_per_pod_2_hours |   3374           |   2437.34        |    -936.666   |     ms |  -27.76% |
|                                  50th percentile service time |                     cpu_usage_per_pod_2_hours |   2759.26        |   2365.13        |    -394.134   |     ms |  -14.28% |
|                                  90th percentile service time |                     cpu_usage_per_pod_2_hours |   3195.96        |   2416.62        |    -779.344   |     ms |  -24.39% |
|                                 100th percentile service time |                     cpu_usage_per_pod_2_hours |   3372.06        |   2433.49        |    -938.57    |     ms |  -27.83% |
|                                                    error rate |                     cpu_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                    cpu_usage_per_pod_24_hours |      0.22164     |      0.234675    |       0.01304 |  ops/s |   +5.88% |
|                                               Mean Throughput |                    cpu_usage_per_pod_24_hours |      0.222373    |      0.235279    |       0.01291 |  ops/s |   +5.80% |
|                                             Median Throughput |                    cpu_usage_per_pod_24_hours |      0.222413    |      0.235353    |       0.01294 |  ops/s |   +5.82% |
|                                                Max Throughput |                    cpu_usage_per_pod_24_hours |      0.222829    |      0.235666    |       0.01284 |  ops/s |   +5.76% |
|                                       50th percentile latency |                    cpu_usage_per_pod_24_hours |  33997.8         |  18731.7         |  -15266.1     |     ms |  -44.90% |
|                                       90th percentile latency |                    cpu_usage_per_pod_24_hours |  38302.6         |  20389.4         |  -17913.1     |     ms |  -46.77% |
|                                      100th percentile latency |                    cpu_usage_per_pod_24_hours |  40222.6         |  20725.8         |  -19496.8     |     ms |  -48.47% |
|                                  50th percentile service time |                    cpu_usage_per_pod_24_hours |   4382.54        |   4202.18        |    -180.364   |     ms |   -4.12% |
|                                  90th percentile service time |                    cpu_usage_per_pod_24_hours |   5327.91        |   4288.09        |   -1039.82    |     ms |  -19.52% |
|                                 100th percentile service time |                    cpu_usage_per_pod_24_hours |   5554.2         |   4362.8         |   -1191.4     |     ms |  -21.45% |
|                                                    error rate |                    cpu_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               memory_usage_per_pod_15_minutes |      0.251437    |      0.251316    |      -0.00012 |  ops/s |   -0.05% |
|                                               Mean Throughput |               memory_usage_per_pod_15_minutes |      0.25168     |      0.251538    |      -0.00014 |  ops/s |   -0.06% |
|                                             Median Throughput |               memory_usage_per_pod_15_minutes |      0.251664    |      0.251524    |      -0.00014 |  ops/s |   -0.06% |
|                                                Max Throughput |               memory_usage_per_pod_15_minutes |      0.251977    |      0.25181     |      -0.00017 |  ops/s |   -0.07% |
|                                       50th percentile latency |               memory_usage_per_pod_15_minutes |   2663.03        |   2080.63        |    -582.403   |     ms |  -21.87% |
|                                       90th percentile latency |               memory_usage_per_pod_15_minutes |   3049.68        |   2258.18        |    -791.502   |     ms |  -25.95% |
|                                      100th percentile latency |               memory_usage_per_pod_15_minutes |   3099.55        |   2264.2         |    -835.359   |     ms |  -26.95% |
|                                  50th percentile service time |               memory_usage_per_pod_15_minutes |   2660.11        |   2077.28        |    -582.833   |     ms |  -21.91% |
|                                  90th percentile service time |               memory_usage_per_pod_15_minutes |   3046.56        |   2255.07        |    -791.498   |     ms |  -25.98% |
|                                 100th percentile service time |               memory_usage_per_pod_15_minutes |   3096.97        |   2260.76        |    -836.204   |     ms |  -27.00% |
|                                                    error rate |               memory_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                  memory_usage_per_pod_2_hours |      0.250517    |      0.251418    |       0.0009  |  ops/s |   +0.36% |
|                                               Mean Throughput |                  memory_usage_per_pod_2_hours |      0.250604    |      0.251657    |       0.00105 |  ops/s |   +0.42% |
|                                             Median Throughput |                  memory_usage_per_pod_2_hours |      0.250598    |      0.251642    |       0.00104 |  ops/s |   +0.42% |
|                                                Max Throughput |                  memory_usage_per_pod_2_hours |      0.250711    |      0.251951    |       0.00124 |  ops/s |   +0.49% |
|                                       50th percentile latency |                  memory_usage_per_pod_2_hours |   2821.01        |   2302.84        |    -518.175   |     ms |  -18.37% |
|                                       90th percentile latency |                  memory_usage_per_pod_2_hours |   3246.79        |   2465.99        |    -780.8     |     ms |  -24.05% |
|                                      100th percentile latency |                  memory_usage_per_pod_2_hours |   3386.94        |   2483.41        |    -903.533   |     ms |  -26.68% |
|                                  50th percentile service time |                  memory_usage_per_pod_2_hours |   2818.02        |   2299.62        |    -518.4     |     ms |  -18.40% |
|                                  90th percentile service time |                  memory_usage_per_pod_2_hours |   3244.63        |   2463.18        |    -781.452   |     ms |  -24.08% |
|                                 100th percentile service time |                  memory_usage_per_pod_2_hours |   3384.75        |   2480.59        |    -904.166   |     ms |  -26.71% |
|                                                    error rate |                  memory_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                 memory_usage_per_pod_24_hours |      0.201418    |      0.227171    |       0.02575 |  ops/s |  +12.79% |
|                                               Mean Throughput |                 memory_usage_per_pod_24_hours |      0.202045    |      0.228049    |       0.026   |  ops/s |  +12.87% |
|                                             Median Throughput |                 memory_usage_per_pod_24_hours |      0.202065    |      0.228158    |       0.02609 |  ops/s |  +12.91% |
|                                                Max Throughput |                 memory_usage_per_pod_24_hours |      0.202751    |      0.228862    |       0.02611 |  ops/s |  +12.88% |
|                                       50th percentile latency |                 memory_usage_per_pod_24_hours |  61491.4         |  26922.9         |  -34568.5     |     ms |  -56.22% |
|                                       90th percentile latency |                 memory_usage_per_pod_24_hours |  70293.7         |  29227.1         |  -41066.6     |     ms |  -58.42% |
|                                      100th percentile latency |                 memory_usage_per_pod_24_hours |  72404.6         |  29826.8         |  -42577.8     |     ms |  -58.81% |
|                                  50th percentile service time |                 memory_usage_per_pod_24_hours |   5302.49        |   4255.55        |   -1046.94    |     ms |  -19.74% |
|                                  90th percentile service time |                 memory_usage_per_pod_24_hours |   5437.01        |   4480.18        |    -956.828   |     ms |  -17.60% |
|                                 100th percentile service time |                 memory_usage_per_pod_24_hours |   5506.58        |   4497.5         |   -1009.09    |     ms |  -18.33% |
|                                                    error rate |                 memory_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                     status_per_pod_15_minutes |      0.25043     |      0.251278    |       0.00085 |  ops/s |   +0.34% |
|                                               Mean Throughput |                     status_per_pod_15_minutes |      0.250503    |      0.251493    |       0.00099 |  ops/s |   +0.40% |
|                                             Median Throughput |                     status_per_pod_15_minutes |      0.250499    |      0.251479    |       0.00098 |  ops/s |   +0.39% |
|                                                Max Throughput |                     status_per_pod_15_minutes |      0.250591    |      0.251757    |       0.00117 |  ops/s |   +0.47% |
|                                       50th percentile latency |                     status_per_pod_15_minutes |   2670.72        |   2149.71        |    -521.004   |     ms |  -19.51% |
|                                       90th percentile latency |                     status_per_pod_15_minutes |   3222.46        |   2259.39        |    -963.071   |     ms |  -29.89% |
|                                      100th percentile latency |                     status_per_pod_15_minutes |   3355.41        |   2357.84        |    -997.577   |     ms |  -29.73% |
|                                  50th percentile service time |                     status_per_pod_15_minutes |   2668.16        |   2146.27        |    -521.885   |     ms |  -19.56% |
|                                  90th percentile service time |                     status_per_pod_15_minutes |   3219.56        |   2256.43        |    -963.133   |     ms |  -29.91% |
|                                 100th percentile service time |                     status_per_pod_15_minutes |   3353.64        |   2354.74        |    -998.901   |     ms |  -29.79% |
|                                                    error rate |                     status_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                        status_per_pod_2_hours |      0.251233    |      0.251149    |      -8e-05   |  ops/s |   -0.03% |
|                                               Mean Throughput |                        status_per_pod_2_hours |      0.25144     |      0.251343    |      -0.0001  |  ops/s |   -0.04% |
|                                             Median Throughput |                        status_per_pod_2_hours |      0.251427    |      0.25133     |      -0.0001  |  ops/s |   -0.04% |
|                                                Max Throughput |                        status_per_pod_2_hours |      0.251696    |      0.25158     |      -0.00012 |  ops/s |   -0.05% |
|                                       50th percentile latency |                        status_per_pod_2_hours |   2706.18        |   2171.04        |    -535.146   |     ms |  -19.77% |
|                                       90th percentile latency |                        status_per_pod_2_hours |   3205.59        |   2289.01        |    -916.578   |     ms |  -28.59% |
|                                      100th percentile latency |                        status_per_pod_2_hours |   3422.65        |   2363.43        |   -1059.22    |     ms |  -30.95% |
|                                  50th percentile service time |                        status_per_pod_2_hours |   2702.69        |   2167.76        |    -534.929   |     ms |  -19.79% |
|                                  90th percentile service time |                        status_per_pod_2_hours |   3203.12        |   2285.58        |    -917.537   |     ms |  -28.65% |
|                                 100th percentile service time |                        status_per_pod_2_hours |   3420.84        |   2359.42        |   -1061.42    |     ms |  -31.03% |
|                                                    error rate |                        status_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                       status_per_pod_24_hours |      0.250983    |      0.251456    |       0.00047 |  ops/s |   +0.19% |
|                                               Mean Throughput |                       status_per_pod_24_hours |      0.251149    |      0.251702    |       0.00055 |  ops/s |   +0.22% |
|                                             Median Throughput |                       status_per_pod_24_hours |      0.251138    |      0.251687    |       0.00055 |  ops/s |   +0.22% |
|                                                Max Throughput |                       status_per_pod_24_hours |      0.251352    |      0.252003    |       0.00065 |  ops/s |   +0.26% |
|                                       50th percentile latency |                       status_per_pod_24_hours |   2786.62        |   2280.12        |    -506.506   |     ms |  -18.18% |
|                                       90th percentile latency |                       status_per_pod_24_hours |   3219.01        |   2352.98        |    -866.026   |     ms |  -26.90% |
|                                      100th percentile latency |                       status_per_pod_24_hours |   3440.12        |   2461.41        |    -978.709   |     ms |  -28.45% |
|                                  50th percentile service time |                       status_per_pod_24_hours |   2784.09        |   2276.69        |    -507.403   |     ms |  -18.23% |
|                                  90th percentile service time |                       status_per_pod_24_hours |   3216.36        |   2350.07        |    -866.288   |     ms |  -26.93% |
|                                 100th percentile service time |                       status_per_pod_24_hours |   3438.24        |   2458.86        |    -979.384   |     ms |  -28.49% |
|                                                    error rate |                       status_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |           tx_network_usage_per_pod_15_minutes |      0.251105    |      0.251311    |       0.00021 |  ops/s |   +0.08% |
|                                               Mean Throughput |           tx_network_usage_per_pod_15_minutes |      0.25129     |      0.251532    |       0.00024 |  ops/s |   +0.10% |
|                                             Median Throughput |           tx_network_usage_per_pod_15_minutes |      0.251278    |      0.251518    |       0.00024 |  ops/s |   +0.10% |
|                                                Max Throughput |           tx_network_usage_per_pod_15_minutes |      0.251517    |      0.251803    |       0.00029 |  ops/s |   +0.11% |
|                                       50th percentile latency |           tx_network_usage_per_pod_15_minutes |   3221.24        |   2368.32        |    -852.926   |     ms |  -26.48% |
|                                       90th percentile latency |           tx_network_usage_per_pod_15_minutes |   3522.43        |   2485.3         |   -1037.13    |     ms |  -29.44% |
|                                      100th percentile latency |           tx_network_usage_per_pod_15_minutes |   3734.87        |   2563.46        |   -1171.41    |     ms |  -31.36% |
|                                  50th percentile service time |           tx_network_usage_per_pod_15_minutes |   3218.83        |   2364.76        |    -854.07    |     ms |  -26.53% |
|                                  90th percentile service time |           tx_network_usage_per_pod_15_minutes |   3520.64        |   2482.31        |   -1038.33    |     ms |  -29.49% |
|                                 100th percentile service time |           tx_network_usage_per_pod_15_minutes |   3732.54        |   2560.59        |   -1171.95    |     ms |  -31.40% |
|                                                    error rate |           tx_network_usage_per_pod_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              tx_network_usage_per_pod_2_hours |      0.236835    |      0.250111    |       0.01328 |  ops/s |   +5.61% |
|                                               Mean Throughput |              tx_network_usage_per_pod_2_hours |      0.237451    |      0.250128    |       0.01268 |  ops/s |   +5.34% |
|                                             Median Throughput |              tx_network_usage_per_pod_2_hours |      0.23746     |      0.250127    |       0.01267 |  ops/s |   +5.33% |
|                                                Max Throughput |              tx_network_usage_per_pod_2_hours |      0.238222    |      0.25015     |       0.01193 |  ops/s |   +5.01% |
|                                       50th percentile latency |              tx_network_usage_per_pod_2_hours |  17126.2         |   2765.37        |  -14360.9     |     ms |  -83.85% |
|                                       90th percentile latency |              tx_network_usage_per_pod_2_hours |  19627.8         |   3100.87        |  -16526.9     |     ms |  -84.20% |
|                                      100th percentile latency |              tx_network_usage_per_pod_2_hours |  20132.7         |   3158.21        |  -16974.5     |     ms |  -84.31% |
|                                  50th percentile service time |              tx_network_usage_per_pod_2_hours |   4323.14        |   2763.48        |   -1559.66    |     ms |  -36.08% |
|                                  90th percentile service time |              tx_network_usage_per_pod_2_hours |   4494.92        |   3098.53        |   -1396.39    |     ms |  -31.07% |
|                                 100th percentile service time |              tx_network_usage_per_pod_2_hours |   4587.42        |   3155           |   -1432.42    |     ms |  -31.22% |
|                                                    error rate |              tx_network_usage_per_pod_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |             tx_network_usage_per_pod_24_hours |      0.14825     |      0.185135    |       0.03688 |  ops/s |  +24.88% |
|                                               Mean Throughput |             tx_network_usage_per_pod_24_hours |      0.149072    |      0.185535    |       0.03646 |  ops/s |  +24.46% |
|                                             Median Throughput |             tx_network_usage_per_pod_24_hours |      0.149053    |      0.185585    |       0.03653 |  ops/s |  +24.51% |
|                                                Max Throughput |             tx_network_usage_per_pod_24_hours |      0.149848    |      0.185771    |       0.03592 |  ops/s |  +23.97% |
|                                       50th percentile latency |             tx_network_usage_per_pod_24_hours | 167261           |  87792.7         |  -79468.1     |     ms |  -47.51% |
|                                       90th percentile latency |             tx_network_usage_per_pod_24_hours | 186835           |  99058.8         |  -87775.8     |     ms |  -46.98% |
|                                      100th percentile latency |             tx_network_usage_per_pod_24_hours | 190669           | 101284           |  -89384.4     |     ms |  -46.88% |
|                                  50th percentile service time |             tx_network_usage_per_pod_24_hours |   6473.44        |   5390.16        |   -1083.29    |     ms |  -16.73% |
|                                  90th percentile service time |             tx_network_usage_per_pod_24_hours |   6640.46        |   5562.26        |   -1078.2     |     ms |  -16.24% |
|                                 100th percentile service time |             tx_network_usage_per_pod_24_hours |   6876.85        |   5688.9         |   -1187.95    |     ms |  -17.27% |
|                                                    error rate |             tx_network_usage_per_pod_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              average_container_cpu_15_minutes |      0.252175    |      0.25307     |       0.00089 |  ops/s |   +0.35% |
|                                               Mean Throughput |              average_container_cpu_15_minutes |      0.252544    |      0.253594    |       0.00105 |  ops/s |   +0.42% |
|                                             Median Throughput |              average_container_cpu_15_minutes |      0.252521    |      0.253561    |       0.00104 |  ops/s |   +0.41% |
|                                                Max Throughput |              average_container_cpu_15_minutes |      0.252996    |      0.254234    |       0.00124 |  ops/s |   +0.49% |
|                                       50th percentile latency |              average_container_cpu_15_minutes |    886.277       |    297.352       |    -588.926   |     ms |  -66.45% |
|                                       90th percentile latency |              average_container_cpu_15_minutes |   1281.68        |    320.909       |    -960.771   |     ms |  -74.96% |
|                                      100th percentile latency |              average_container_cpu_15_minutes |   1320.95        |    344.926       |    -976.022   |     ms |  -73.89% |
|                                  50th percentile service time |              average_container_cpu_15_minutes |    883.169       |    294.133       |    -589.036   |     ms |  -66.70% |
|                                  90th percentile service time |              average_container_cpu_15_minutes |   1278.34        |    317.579       |    -960.762   |     ms |  -75.16% |
|                                 100th percentile service time |              average_container_cpu_15_minutes |   1318.13        |    341.646       |    -976.486   |     ms |  -74.08% |
|                                                    error rate |              average_container_cpu_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                 average_container_cpu_2_hours |      0.252408    |      0.253231    |       0.00082 |  ops/s |   +0.33% |
|                                               Mean Throughput |                 average_container_cpu_2_hours |      0.252816    |      0.253783    |       0.00097 |  ops/s |   +0.38% |
|                                             Median Throughput |                 average_container_cpu_2_hours |      0.25279     |      0.253748    |       0.00096 |  ops/s |   +0.38% |
|                                                Max Throughput |                 average_container_cpu_2_hours |      0.253317    |      0.254458    |       0.00114 |  ops/s |   +0.45% |
|                                       50th percentile latency |                 average_container_cpu_2_hours |    865.976       |    404.275       |    -461.701   |     ms |  -53.32% |
|                                       90th percentile latency |                 average_container_cpu_2_hours |   1306.58        |    446.614       |    -859.967   |     ms |  -65.82% |
|                                      100th percentile latency |                 average_container_cpu_2_hours |   1427.34        |    454.313       |    -973.025   |     ms |  -68.17% |
|                                  50th percentile service time |                 average_container_cpu_2_hours |    862.771       |    401.192       |    -461.579   |     ms |  -53.50% |
|                                  90th percentile service time |                 average_container_cpu_2_hours |   1303.56        |    443.351       |    -860.211   |     ms |  -65.99% |
|                                 100th percentile service time |                 average_container_cpu_2_hours |   1424.58        |    450.492       |    -974.091   |     ms |  -68.38% |
|                                                    error rate |                 average_container_cpu_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |                average_container_cpu_24_hours |      0.252256    |      0.252456    |       0.0002  |  ops/s |   +0.08% |
|                                               Mean Throughput |                average_container_cpu_24_hours |      0.252638    |      0.252873    |       0.00023 |  ops/s |   +0.09% |
|                                             Median Throughput |                average_container_cpu_24_hours |      0.252614    |      0.252846    |       0.00023 |  ops/s |   +0.09% |
|                                                Max Throughput |                average_container_cpu_24_hours |      0.253105    |      0.253382    |       0.00028 |  ops/s |   +0.11% |
|                                       50th percentile latency |                average_container_cpu_24_hours |   1818.13        |   1289.9         |    -528.229   |     ms |  -29.05% |
|                                       90th percentile latency |                average_container_cpu_24_hours |   2212.73        |   1312.99        |    -899.744   |     ms |  -40.66% |
|                                      100th percentile latency |                average_container_cpu_24_hours |   2287.82        |   1337.76        |    -950.057   |     ms |  -41.53% |
|                                  50th percentile service time |                average_container_cpu_24_hours |   1814.41        |   1286.96        |    -527.448   |     ms |  -29.07% |
|                                  90th percentile service time |                average_container_cpu_24_hours |   2209.02        |   1309.88        |    -899.138   |     ms |  -40.70% |
|                                 100th percentile service time |                average_container_cpu_24_hours |   2284.33        |   1334.9         |    -949.429   |     ms |  -41.56% |
|                                                    error rate |                average_container_cpu_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |     average_container_memory_usage_15_minutes |      0.252848    |      0.25314     |       0.00029 |  ops/s |   +0.12% |
|                                               Mean Throughput |     average_container_memory_usage_15_minutes |      0.253332    |      0.253675    |       0.00034 |  ops/s |   +0.14% |
|                                             Median Throughput |     average_container_memory_usage_15_minutes |      0.253301    |      0.253641    |       0.00034 |  ops/s |   +0.13% |
|                                                Max Throughput |     average_container_memory_usage_15_minutes |      0.253925    |      0.254331    |       0.00041 |  ops/s |   +0.16% |
|                                       50th percentile latency |     average_container_memory_usage_15_minutes |    860.337       |    306.072       |    -554.265   |     ms |  -64.42% |
|                                       90th percentile latency |     average_container_memory_usage_15_minutes |   1263.74        |    354.066       |    -909.679   |     ms |  -71.98% |
|                                      100th percentile latency |     average_container_memory_usage_15_minutes |   1280.12        |    363.9         |    -916.217   |     ms |  -71.57% |
|                                  50th percentile service time |     average_container_memory_usage_15_minutes |    857.434       |    303.19        |    -554.244   |     ms |  -64.64% |
|                                  90th percentile service time |     average_container_memory_usage_15_minutes |   1260.62        |    351.062       |    -909.559   |     ms |  -72.15% |
|                                 100th percentile service time |     average_container_memory_usage_15_minutes |   1277.07        |    361.073       |    -915.999   |     ms |  -71.73% |
|                                                    error rate |     average_container_memory_usage_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                    error rate |                     touch-container-2-2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |        average_container_memory_usage_2_hours |      0.252496    |      0.253194    |       0.0007  |  ops/s |   +0.28% |
|                                               Mean Throughput |        average_container_memory_usage_2_hours |      0.252919    |      0.253738    |       0.00082 |  ops/s |   +0.32% |
|                                             Median Throughput |        average_container_memory_usage_2_hours |      0.252892    |      0.253703    |       0.00081 |  ops/s |   +0.32% |
|                                                Max Throughput |        average_container_memory_usage_2_hours |      0.253438    |      0.254405    |       0.00097 |  ops/s |   +0.38% |
|                                       50th percentile latency |        average_container_memory_usage_2_hours |    888.903       |    422.546       |    -466.357   |     ms |  -52.46% |
|                                       90th percentile latency |        average_container_memory_usage_2_hours |   1293.86        |    441.364       |    -852.492   |     ms |  -65.89% |
|                                      100th percentile latency |        average_container_memory_usage_2_hours |   1347.94        |    442.487       |    -905.458   |     ms |  -67.17% |
|                                  50th percentile service time |        average_container_memory_usage_2_hours |    885.332       |    418.892       |    -466.44    |     ms |  -52.69% |
|                                  90th percentile service time |        average_container_memory_usage_2_hours |   1290.66        |    438.092       |    -852.571   |     ms |  -66.06% |
|                                 100th percentile service time |        average_container_memory_usage_2_hours |   1345.03        |    439.718       |    -905.311   |     ms |  -67.31% |
|                                                    error rate |        average_container_memory_usage_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |       average_container_memory_usage_24_hours |      0.251839    |      0.252336    |       0.0005  |  ops/s |   +0.20% |
|                                               Mean Throughput |       average_container_memory_usage_24_hours |      0.252149    |      0.252732    |       0.00058 |  ops/s |   +0.23% |
|                                             Median Throughput |       average_container_memory_usage_24_hours |      0.25213     |      0.252706    |       0.00058 |  ops/s |   +0.23% |
|                                                Max Throughput |       average_container_memory_usage_24_hours |      0.252529    |      0.253216    |       0.00069 |  ops/s |   +0.27% |
|                                       50th percentile latency |       average_container_memory_usage_24_hours |   1847.43        |   1285.11        |    -562.323   |     ms |  -30.44% |
|                                       90th percentile latency |       average_container_memory_usage_24_hours |   2248.64        |   1329.95        |    -918.689   |     ms |  -40.86% |
|                                      100th percentile latency |       average_container_memory_usage_24_hours |   2342.96        |   1344.27        |    -998.685   |     ms |  -42.62% |
|                                  50th percentile service time |       average_container_memory_usage_24_hours |   1843.34        |   1281.95        |    -561.386   |     ms |  -30.45% |
|                                  90th percentile service time |       average_container_memory_usage_24_hours |   2245.16        |   1326.77        |    -918.397   |     ms |  -40.91% |
|                                 100th percentile service time |       average_container_memory_usage_24_hours |   2339.35        |   1339.61        |    -999.74    |     ms |  -42.74% |
|                                                    error rate |       average_container_memory_usage_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            cpu_usage_per_container_15_minutes |      0.253262    |      0.252845    |      -0.00042 |  ops/s |   -0.16% |
|                                               Mean Throughput |            cpu_usage_per_container_15_minutes |      0.253819    |      0.253329    |      -0.00049 |  ops/s |   -0.19% |
|                                             Median Throughput |            cpu_usage_per_container_15_minutes |      0.253783    |      0.253299    |      -0.00048 |  ops/s |   -0.19% |
|                                                Max Throughput |            cpu_usage_per_container_15_minutes |      0.2545      |      0.253922    |      -0.00058 |  ops/s |   -0.23% |
|                                       50th percentile latency |            cpu_usage_per_container_15_minutes |    907.57        |    302.183       |    -605.387   |     ms |  -66.70% |
|                                       90th percentile latency |            cpu_usage_per_container_15_minutes |   1284.38        |    332.503       |    -951.878   |     ms |  -74.11% |
|                                      100th percentile latency |            cpu_usage_per_container_15_minutes |   1329.71        |    340.18        |    -989.534   |     ms |  -74.42% |
|                                  50th percentile service time |            cpu_usage_per_container_15_minutes |    904.298       |    299.2         |    -605.099   |     ms |  -66.91% |
|                                  90th percentile service time |            cpu_usage_per_container_15_minutes |   1280.96        |    328.693       |    -952.265   |     ms |  -74.34% |
|                                 100th percentile service time |            cpu_usage_per_container_15_minutes |   1326.24        |    337.015       |    -989.226   |     ms |  -74.59% |
|                                                    error rate |            cpu_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               cpu_usage_per_container_2_hours |      0.253181    |      0.253238    |       6e-05   |  ops/s |   +0.02% |
|                                               Mean Throughput |               cpu_usage_per_container_2_hours |      0.253723    |      0.25379     |       7e-05   |  ops/s |   +0.03% |
|                                             Median Throughput |               cpu_usage_per_container_2_hours |      0.253689    |      0.253755    |       7e-05   |  ops/s |   +0.03% |
|                                                Max Throughput |               cpu_usage_per_container_2_hours |      0.254387    |      0.254467    |       8e-05   |  ops/s |   +0.03% |
|                                       50th percentile latency |               cpu_usage_per_container_2_hours |    825.138       |    429.022       |    -396.116   |     ms |  -48.01% |
|                                       90th percentile latency |               cpu_usage_per_container_2_hours |   1201.43        |    441.922       |    -759.513   |     ms |  -63.22% |
|                                      100th percentile latency |               cpu_usage_per_container_2_hours |   1232.65        |    447.399       |    -785.256   |     ms |  -63.70% |
|                                  50th percentile service time |               cpu_usage_per_container_2_hours |    822.643       |    425.594       |    -397.049   |     ms |  -48.27% |
|                                  90th percentile service time |               cpu_usage_per_container_2_hours |   1199.24        |    438.72        |    -760.524   |     ms |  -63.42% |
|                                 100th percentile service time |               cpu_usage_per_container_2_hours |   1229.44        |    444.341       |    -785.096   |     ms |  -63.86% |
|                                                    error rate |               cpu_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              cpu_usage_per_container_24_hours |      0.251532    |      0.252392    |       0.00086 |  ops/s |   +0.34% |
|                                               Mean Throughput |              cpu_usage_per_container_24_hours |      0.25179     |      0.252798    |       0.00101 |  ops/s |   +0.40% |
|                                             Median Throughput |              cpu_usage_per_container_24_hours |      0.251773    |      0.252772    |       0.001   |  ops/s |   +0.40% |
|                                                Max Throughput |              cpu_usage_per_container_24_hours |      0.252107    |      0.253295    |       0.00119 |  ops/s |   +0.47% |
|                                       50th percentile latency |              cpu_usage_per_container_24_hours |   1856.69        |   1312.19        |    -544.506   |     ms |  -29.33% |
|                                       90th percentile latency |              cpu_usage_per_container_24_hours |   2331.52        |   1345.49        |    -986.027   |     ms |  -42.29% |
|                                      100th percentile latency |              cpu_usage_per_container_24_hours |   2363.02        |   1380.2         |    -982.823   |     ms |  -41.59% |
|                                  50th percentile service time |              cpu_usage_per_container_24_hours |   1853.28        |   1308.19        |    -545.093   |     ms |  -29.41% |
|                                  90th percentile service time |              cpu_usage_per_container_24_hours |   2328.74        |   1342.04        |    -986.707   |     ms |  -42.37% |
|                                 100th percentile service time |              cpu_usage_per_container_24_hours |   2360.13        |   1377.04        |    -983.094   |     ms |  -41.65% |
|                                                    error rate |              cpu_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |         memory_usage_per_container_15_minutes |      0.25254     |      0.25331     |       0.00077 |  ops/s |   +0.31% |
|                                               Mean Throughput |         memory_usage_per_container_15_minutes |      0.252971    |      0.253875    |       0.0009  |  ops/s |   +0.36% |
|                                             Median Throughput |         memory_usage_per_container_15_minutes |      0.252944    |      0.253839    |       0.0009  |  ops/s |   +0.35% |
|                                                Max Throughput |         memory_usage_per_container_15_minutes |      0.2535      |      0.254566    |       0.00107 |  ops/s |   +0.42% |
|                                       50th percentile latency |         memory_usage_per_container_15_minutes |    758.164       |    313.34        |    -444.824   |     ms |  -58.67% |
|                                       90th percentile latency |         memory_usage_per_container_15_minutes |   1162.07        |    344.578       |    -817.496   |     ms |  -70.35% |
|                                      100th percentile latency |         memory_usage_per_container_15_minutes |   1196.3         |    348.813       |    -847.484   |     ms |  -70.84% |
|                                  50th percentile service time |         memory_usage_per_container_15_minutes |    754.9         |    311.079       |    -443.821   |     ms |  -58.79% |
|                                  90th percentile service time |         memory_usage_per_container_15_minutes |   1158.3         |    341.361       |    -816.943   |     ms |  -70.53% |
|                                 100th percentile service time |         memory_usage_per_container_15_minutes |   1193.44        |    347.17        |    -846.271   |     ms |  -70.91% |
|                                                    error rate |         memory_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            memory_usage_per_container_2_hours |      0.253013    |      0.253186    |       0.00017 |  ops/s |   +0.07% |
|                                               Mean Throughput |            memory_usage_per_container_2_hours |      0.253526    |      0.253728    |       0.0002  |  ops/s |   +0.08% |
|                                             Median Throughput |            memory_usage_per_container_2_hours |      0.253494    |      0.253693    |       0.0002  |  ops/s |   +0.08% |
|                                                Max Throughput |            memory_usage_per_container_2_hours |      0.254155    |      0.254393    |       0.00024 |  ops/s |   +0.09% |
|                                       50th percentile latency |            memory_usage_per_container_2_hours |   1001.99        |    433.239       |    -568.747   |     ms |  -56.76% |
|                                       90th percentile latency |            memory_usage_per_container_2_hours |   1428.45        |    441.926       |    -986.527   |     ms |  -69.06% |
|                                      100th percentile latency |            memory_usage_per_container_2_hours |   1469.76        |    442.126       |   -1027.63    |     ms |  -69.92% |
|                                  50th percentile service time |            memory_usage_per_container_2_hours |    998.843       |    430.401       |    -568.442   |     ms |  -56.91% |
|                                  90th percentile service time |            memory_usage_per_container_2_hours |   1426.08        |    438.602       |    -987.473   |     ms |  -69.24% |
|                                 100th percentile service time |            memory_usage_per_container_2_hours |   1466.65        |    439.358       |   -1027.29    |     ms |  -70.04% |
|                                                    error rate |            memory_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |           memory_usage_per_container_24_hours |      0.25223     |      0.252362    |       0.00013 |  ops/s |   +0.05% |
|                                               Mean Throughput |           memory_usage_per_container_24_hours |      0.252608    |      0.252762    |       0.00015 |  ops/s |   +0.06% |
|                                             Median Throughput |           memory_usage_per_container_24_hours |      0.252583    |      0.252735    |       0.00015 |  ops/s |   +0.06% |
|                                                Max Throughput |           memory_usage_per_container_24_hours |      0.253072    |      0.253252    |       0.00018 |  ops/s |   +0.07% |
|                                       50th percentile latency |           memory_usage_per_container_24_hours |   1782.58        |   1262.98        |    -519.597   |     ms |  -29.15% |
|                                       90th percentile latency |           memory_usage_per_container_24_hours |   2268.7         |   1314.38        |    -954.318   |     ms |  -42.06% |
|                                      100th percentile latency |           memory_usage_per_container_24_hours |   2409.73        |   1330.83        |   -1078.9     |     ms |  -44.77% |
|                                  50th percentile service time |           memory_usage_per_container_24_hours |   1779.05        |   1259.46        |    -519.588   |     ms |  -29.21% |
|                                  90th percentile service time |           memory_usage_per_container_24_hours |   2265.67        |   1310.93        |    -954.748   |     ms |  -42.14% |
|                                 100th percentile service time |           memory_usage_per_container_24_hours |   2406.34        |   1327.95        |   -1078.39    |     ms |  -44.81% |
|                                                    error rate |           memory_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |            unique_deployment_count_15_minutes |      0.252456    |      0.253325    |       0.00087 |  ops/s |   +0.34% |
|                                               Mean Throughput |            unique_deployment_count_15_minutes |      0.252873    |      0.253893    |       0.00102 |  ops/s |   +0.40% |
|                                             Median Throughput |            unique_deployment_count_15_minutes |      0.252846    |      0.253857    |       0.00101 |  ops/s |   +0.40% |
|                                                Max Throughput |            unique_deployment_count_15_minutes |      0.253384    |      0.254587    |       0.0012  |  ops/s |   +0.47% |
|                                       50th percentile latency |            unique_deployment_count_15_minutes |    776.125       |    272.052       |    -504.074   |     ms |  -64.95% |
|                                       90th percentile latency |            unique_deployment_count_15_minutes |   1228.58        |    312.831       |    -915.752   |     ms |  -74.54% |
|                                      100th percentile latency |            unique_deployment_count_15_minutes |   1316.96        |    325.933       |    -991.025   |     ms |  -75.25% |
|                                  50th percentile service time |            unique_deployment_count_15_minutes |    772.745       |    268.975       |    -503.77    |     ms |  -65.19% |
|                                  90th percentile service time |            unique_deployment_count_15_minutes |   1225.15        |    310.09        |    -915.059   |     ms |  -74.69% |
|                                 100th percentile service time |            unique_deployment_count_15_minutes |   1313.57        |    322.715       |    -990.86    |     ms |  -75.43% |
|                                                    error rate |            unique_deployment_count_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |               unique_deployment_count_2_hours |      0.252706    |      0.252908    |       0.0002  |  ops/s |   +0.08% |
|                                               Mean Throughput |               unique_deployment_count_2_hours |      0.253167    |      0.253404    |       0.00024 |  ops/s |   +0.09% |
|                                             Median Throughput |               unique_deployment_count_2_hours |      0.253137    |      0.253372    |       0.00024 |  ops/s |   +0.09% |
|                                                Max Throughput |               unique_deployment_count_2_hours |      0.25373     |      0.25401     |       0.00028 |  ops/s |   +0.11% |
|                                       50th percentile latency |               unique_deployment_count_2_hours |    747.126       |    281.38        |    -465.746   |     ms |  -62.34% |
|                                       90th percentile latency |               unique_deployment_count_2_hours |   1169.22        |    321.966       |    -847.256   |     ms |  -72.46% |
|                                      100th percentile latency |               unique_deployment_count_2_hours |   1295.91        |    326.926       |    -968.986   |     ms |  -74.77% |
|                                  50th percentile service time |               unique_deployment_count_2_hours |    743.84        |    277.928       |    -465.912   |     ms |  -62.64% |
|                                  90th percentile service time |               unique_deployment_count_2_hours |   1166.06        |    318.631       |    -847.429   |     ms |  -72.67% |
|                                 100th percentile service time |               unique_deployment_count_2_hours |   1293.22        |    324.07        |    -969.153   |     ms |  -74.94% |
|                                                    error rate |               unique_deployment_count_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |              unique_deployment_count_24_hours |      0.2526      |      0.253301    |       0.0007  |  ops/s |   +0.28% |
|                                               Mean Throughput |              unique_deployment_count_24_hours |      0.253041    |      0.253864    |       0.00082 |  ops/s |   +0.33% |
|                                             Median Throughput |              unique_deployment_count_24_hours |      0.253012    |      0.253829    |       0.00082 |  ops/s |   +0.32% |
|                                                Max Throughput |              unique_deployment_count_24_hours |      0.253579    |      0.254553    |       0.00097 |  ops/s |   +0.38% |
|                                       50th percentile latency |              unique_deployment_count_24_hours |    886.754       |    444.61        |    -442.144   |     ms |  -49.86% |
|                                       90th percentile latency |              unique_deployment_count_24_hours |   1348.74        |    451.835       |    -896.91    |     ms |  -66.50% |
|                                      100th percentile latency |              unique_deployment_count_24_hours |   1394.04        |    454.048       |    -939.996   |     ms |  -67.43% |
|                                  50th percentile service time |              unique_deployment_count_24_hours |    883.133       |    441.901       |    -441.232   |     ms |  -49.96% |
|                                  90th percentile service time |              unique_deployment_count_24_hours |   1345.91        |    448.906       |    -897       |     ms |  -66.65% |
|                                 100th percentile service time |              unique_deployment_count_24_hours |   1390.68        |    450.67        |    -940.011   |     ms |  -67.59% |
|                                                    error rate |              unique_deployment_count_24_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput | percentile_cpu_usage_per_container_15_minutes |      0.252941    |      0.253293    |       0.00035 |  ops/s |   +0.14% |
|                                               Mean Throughput | percentile_cpu_usage_per_container_15_minutes |      0.253441    |      0.253855    |       0.00041 |  ops/s |   +0.16% |
|                                             Median Throughput | percentile_cpu_usage_per_container_15_minutes |      0.25341     |      0.253819    |       0.00041 |  ops/s |   +0.16% |
|                                                Max Throughput | percentile_cpu_usage_per_container_15_minutes |      0.254054    |      0.254543    |       0.00049 |  ops/s |   +0.19% |
|                                       50th percentile latency | percentile_cpu_usage_per_container_15_minutes |    923.589       |    481.065       |    -442.524   |     ms |  -47.91% |
|                                       90th percentile latency | percentile_cpu_usage_per_container_15_minutes |   1422.86        |    509.918       |    -912.942   |     ms |  -64.16% |
|                                      100th percentile latency | percentile_cpu_usage_per_container_15_minutes |   1539.82        |    519.568       |   -1020.25    |     ms |  -66.26% |
|                                  50th percentile service time | percentile_cpu_usage_per_container_15_minutes |    919.393       |    471.503       |    -447.891   |     ms |  -48.72% |
|                                  90th percentile service time | percentile_cpu_usage_per_container_15_minutes |   1419.31        |    501.1         |    -918.208   |     ms |  -64.69% |
|                                 100th percentile service time | percentile_cpu_usage_per_container_15_minutes |   1536.02        |    511.188       |   -1024.83    |     ms |  -66.72% |
|                                                    error rate | percentile_cpu_usage_per_container_15_minutes |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |    percentile_cpu_usage_per_container_2_hours |      0.250952    |      0.251612    |       0.00066 |  ops/s |   +0.26% |
|                                               Mean Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251112    |      0.251885    |       0.00077 |  ops/s |   +0.31% |
|                                             Median Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251102    |      0.251866    |       0.00076 |  ops/s |   +0.30% |
|                                                Max Throughput |    percentile_cpu_usage_per_container_2_hours |      0.251308    |      0.252219    |       0.00091 |  ops/s |   +0.36% |
|                                       50th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2485.09        |   1747.13        |    -737.959   |     ms |  -29.70% |
|                                       90th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2944.03        |   1773.05        |   -1170.97    |     ms |  -39.77% |
|                                      100th percentile latency |    percentile_cpu_usage_per_container_2_hours |   2964.93        |   1807.24        |   -1157.68    |     ms |  -39.05% |
|                                  50th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2482.02        |   1743.35        |    -738.662   |     ms |  -29.76% |
|                                  90th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2928.58        |   1769.7         |   -1158.88    |     ms |  -39.57% |
|                                 100th percentile service time |    percentile_cpu_usage_per_container_2_hours |   2940.87        |   1803.39        |   -1137.48    |     ms |  -38.68% |
|                                                    error rate |    percentile_cpu_usage_per_container_2_hours |      0           |      0           |       0       |      % |    0.00% |
|                                                Min Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0418876   |      0.0485755   |       0.00669 |  ops/s |  +15.97% |
|                                               Mean Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419224   |      0.0486642   |       0.00674 |  ops/s |  +16.08% |
|                                             Median Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419186   |      0.0486697   |       0.00675 |  ops/s |  +16.11% |
|                                                Max Throughput |   percentile_cpu_usage_per_container_24_hours |      0.0419766   |      0.0487154   |       0.00674 |  ops/s |  +16.05% |
|                                       50th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.20657e+06 |      1.0054e+06  | -201174       |     ms |  -16.67% |
|                                       90th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.36512e+06 |      1.13851e+06 | -226615       |     ms |  -16.60% |
|                                      100th percentile latency |   percentile_cpu_usage_per_container_24_hours |      1.39627e+06 |      1.16332e+06 | -232946       |     ms |  -16.68% |
|                                  50th percentile service time |   percentile_cpu_usage_per_container_24_hours |  24190.1         |  20773.7         |   -3416.4     |     ms |  -14.12% |
|                                  90th percentile service time |   percentile_cpu_usage_per_container_24_hours |  24906.6         |  20943.4         |   -3963.24    |     ms |  -15.91% |
|                                 100th percentile service time |   percentile_cpu_usage_per_container_24_hours |  25055.8         |  21652.2         |   -3403.58    |     ms |  -13.58% |
|                                                    error rate |   percentile_cpu_usage_per_container_24_hours |      0           |      0           |       0       |      % |    0.00% |
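For reading the table: the last column appears to be the difference between the two numeric result columns, relative to the first one. A minimal sketch of that arithmetic, assuming the first numeric column is the baseline and the second is the contender (the class and method names are hypothetical, not part of the benchmark tooling):

```java
// Hypothetical helper for sanity-checking rows of the comparison table.
class PercentChangeCheck {

    // Relative change of contender versus baseline, in percent.
    // Negative values mean the contender number is lower than the baseline.
    static double percentChange(double baseline, double contender) {
        return (contender - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // 50th percentile latency for average_container_cpu_24_hours, values taken from the table
        double change = percentChange(1818.13, 1289.9);
        System.out.println("change in percent: " + change);
    }
}
```

For latency and service time a negative percentage is an improvement; for throughput a positive one is.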

…_becoming_search_active

of adding a new method to IndexShard

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

…_becoming_search_active

updated the testScheduledRefresh(...) test.

martijnvg · 2023-06-01T09:32:43Z

server/src/test/java/org/elasticsearch/index/shard/IndexShardTests.java

        assertFalse(primary.scheduledRefresh());
        assertEquals(lastSearchAccess, primary.getLastSearcherAccess());
        // wait until the thread-pool has moved the timestamp otherwise we can't assert on this below
        assertBusy(() -> assertThat(primary.getThreadPool().relativeTimeInMillis(), greaterThan(lastSearchAccess)));
-        CountDownLatch latch = new CountDownLatch(10);
-        for (int i = 0; i < 10; i++) {


Because makeShardSearchActive(...) now triggers a refresh, we can no longer invoke this method many times without a refresh happening. Previously this test could do that and then invoke scheduledRefresh(...) once, at which point all the listeners were invoked.

henningandersen

I think this looks fine.

I do worry slightly about a small storm of such searches causing additional refreshes. But that seems unlikely enough that we do not need to add more protection against it.

henningandersen · 2023-06-01T11:07:36Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

@@ -3831,14 +3831,17 @@ public void afterRefresh(boolean didRefresh) {
     * @param listener the listener to invoke once the pending refresh location is visible. The listener will be called with
     *                 <code>true</code> if the listener was registered to wait for a refresh.
     */
-    public final void awaitShardSearchActive(Consumer<Boolean> listener) {
+    public final void makeShardSearchActive(Consumer<Boolean> listener) {


I'd probably call this ensureShardSearchActive, though not too important.

henningandersen · 2023-06-01T11:15:49Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

        markSearcherAccessed(); // move the shard into non-search idle
        final Translog.Location location = pendingRefreshLocation.get();
        if (location != null) {
            addRefreshListener(location, (result) -> {
                pendingRefreshLocation.compareAndSet(location, null);
                listener.accept(true);
            });
+            // TODO: maybe just invoke getEngine().maybeRefresh(...) here?
+            // (scheduledRefresh() does a few more things that I don't think are necessary here?)
+            scheduledRefresh();


I think it will work fine, but one problem with maybeRefresh is that it will not provoke a refresh if one is already ongoing, and there is no certainty that the ongoing refresh will contain the location. However, such a refresh would have to be an explicit one. I find the risk small and, regardless, this is an improvement over the current state.

I think we can switch to maybeRefresh instead, though I'd like to check that pendingRefreshLocation is still location then, i.e.:

Suggested change

-            scheduledRefresh();
+            if (pendingRefreshLocation.get() == location) {
+                getEngine().maybeRefresh();
+            }

both to handle racy cases and the case where there are no more refresh listener slots.
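The guard can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in (`ToyEngine`, `Location`, the `outOfListenerSlots` flag) rather than the real Elasticsearch classes. The `else` branch that calls the listener with `false` is an assumption inferred from the javadoc quoted earlier, and the inline refresh on a full listener queue only models the "no more refresh listener slots" case mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Toy model of the guard; none of these types are the real Elasticsearch classes. */
public class EnsureSearchActiveSketch {

    /** Hypothetical stand-in for Translog.Location; compared by reference. */
    public static final class Location {
        public final long generation;

        public Location(long generation) {
            this.generation = generation;
        }
    }

    /** Hypothetical engine that only counts refreshes and fires listeners. */
    public static final class ToyEngine {
        private final List<Runnable> listeners = new ArrayList<>();
        public boolean outOfListenerSlots = false; // assumption: models a full listener queue
        public int refreshCount = 0;

        public void addRefreshListener(Runnable listener) {
            listeners.add(listener);
            if (outOfListenerSlots) {
                refresh(); // edge case: adding the listener already performs the refresh
            }
        }

        public void maybeRefresh() {
            refresh();
        }

        private void refresh() {
            refreshCount++;
            List<Runnable> toFire = new ArrayList<>(listeners);
            listeners.clear();
            toFire.forEach(Runnable::run);
        }
    }

    public final AtomicReference<Location> pendingRefreshLocation = new AtomicReference<>();
    public final ToyEngine engine = new ToyEngine();

    public void ensureShardSearchActive(Consumer<Boolean> listener) {
        final Location location = pendingRefreshLocation.get();
        if (location != null) {
            engine.addRefreshListener(() -> {
                pendingRefreshLocation.compareAndSet(location, null);
                listener.accept(true);
            });
            // Skip the extra refresh if the pending location was already cleared or replaced.
            if (pendingRefreshLocation.get() == location) {
                engine.maybeRefresh();
            }
        } else {
            listener.accept(false); // assumption: nothing was pending, so nothing was waited on
        }
    }
}
```

With `outOfListenerSlots` set, the listener fires while it is being added and clears `pendingRefreshLocation`, so the reference check fails and only one refresh runs in the model instead of two.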

henningandersen

Forgot one thing.

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

elasticsearchmachine · 2023-06-01T13:51:08Z

Hi @martijnvg, I've created a changelog YAML for you.

elasticsearchmachine · 2023-06-01T13:51:08Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

elasticsearchmachine · 2023-06-01T13:51:09Z

Pinging @elastic/es-distributed (Team:Distributed)


henningandersen

LGTM.

henningandersen · 2023-06-09T13:04:39Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

        markSearcherAccessed(); // move the shard into non-search idle
        final Translog.Location location = pendingRefreshLocation.get();
        if (location != null) {
            addRefreshListener(location, (result) -> {
                pendingRefreshLocation.compareAndSet(location, null);
                listener.accept(true);
            });
+            // trigger a refresh to avoid waiting for scheduledRefresh(...) to be invoked from index level refresh scheduler.
+            // (The if statement should avoid doing an additional refresh if scheduled refresh was invoked between getting
+            // the current refresh location and adding a refresh listener.)


In particular, addRefreshListener might have performed the refresh already in edge cases.

server/src/test/java/org/elasticsearch/index/shard/IndexShardTests.java

Co-authored-by: Henning Andersen <[email protected]>

martijnvg · 2023-06-09T13:37:07Z

@elasticmachine update branch


henningandersen

LGTM.

martijnvg added the :StorageEngine/TSDB You know, for Metrics label May 24, 2023

elasticsearchmachine added the v8.9.0 label May 24, 2023

martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch from a22b725 to b82b9f5 Compare May 25, 2023 09:29

martijnvg marked this pull request as ready for review May 25, 2023 13:39

martijnvg added the WIP label May 25, 2023

martijnvg added the :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. label May 26, 2023

iter

40a5b2a

martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch 4 times, most recently from 0b75a31 to b24857e Compare May 26, 2023 19:54

different approach: perform refersh forcefully on search thread and n…

266aa36

…ot register refresh listener.

martijnvg force-pushed the tsdb/trigger_refresh_when_becoming_search_active branch from b24857e to 266aa36 Compare May 27, 2023 13:08

martijnvg added 5 commits May 30, 2023 09:13

iter

cb27754

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

a6e0180

…_becoming_search_active

also check for pending refreshes

7a3f342

muted test

60d92ce

just keep using awaitShardSearchActive(...) instead

d2f4b43

of adding a new method to IndexShard

martijnvg commented May 31, 2023

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

martijnvg added 2 commits June 1, 2023 10:59

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

c9a8192

…_becoming_search_active

Renamed awaitShardSearchActive(...) to makeShardSearchActive(...) and

055c31b

updated the testScheduledRefresh(...) test.

martijnvg commented Jun 1, 2023


henningandersen reviewed Jun 1, 2023

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

martijnvg added 2 commits June 1, 2023 15:04

Renamed makeShardSearchActive(...) to ensureShardSearchActive(...).

14a5a44

update jdocs and use maybeRefresh() instead of scheduledRefresh()

3eb8075

martijnvg added >enhancement and removed WIP labels Jun 1, 2023

elasticsearchmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jun 1, 2023

Update docs/changelog/96321.yaml

fb9ebfe

martijnvg added 5 commits June 1, 2023 16:01

fix changelog

329620d

added assertion that only allows search and get threads.

d427641

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

a8e18c3

…_becoming_search_active

always perform the refresh with a refresh thread.

803832e

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

109d335

…_becoming_search_active

martijnvg requested a review from henningandersen June 7, 2023 08:22

henningandersen approved these changes Jun 9, 2023


fix spelling

00329a6

Co-authored-by: Henning Andersen <[email protected]>

elasticmachine and others added 5 commits June 9, 2023 23:37

Merge branch 'main' into tsdb/trigger_refresh_when_becoming_search_ac…

c405461

…tive

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

afef5e1

…_becoming_search_active

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

410e6fb

…_becoming_search_active

Merge remote-tracking branch 'es/main' into tsdb/trigger_refresh_when…

7134f1d

…_becoming_search_active

update docs to reflect the new idle shard refresh behaviour

7dc8173

henningandersen approved these changes Jun 14, 2023


henningandersen mentioned this pull request Jun 14, 2023

Increase concurrent request of opening point-in-time #96782

Merged

martijnvg merged commit 31a4786 into elastic:main Jun 15, 2023

kingherc mentioned this pull request Jun 27, 2023

[CI] IndexShardTests testScheduledRefresh failing #96920

Closed

