Framed thread pool utilization benchmark hacking #2

Merged
mhl-b merged 2 commits into mhl-b:framed-thread-pool-utilization from nicktindall:framed-thread-pool-utilization_bm
Jul 29, 2025

Conversation

@nicktindall commented Jul 28, 2025

I micro-ized the benchmark to see what effect concurrency has on the time taken to call startTask, endTask, and previousFrameTime from multiple threads.
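For context, the API shape the benchmark exercises could look something like the sketch below. This is a hedged illustration only: the class name, the LongAdder-based accumulation, and the rotateFrame hook are assumptions for the sake of the example, not the actual implementation on the framed-thread-pool-utilization branch.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of a framed utilization tracker: startTask/endTask
// accumulate per-task execution time into the current frame, and
// previousFrameTime reads the last completed frame. Names and structure
// are assumptions for illustration.
class FramedUtilizationTracker {
    private final LongAdder currentFrameNanos = new LongAdder();
    private volatile long previousFrameNanos;

    // Called when a task starts; returns the start timestamp.
    long startTask() {
        return System.nanoTime();
    }

    // Called when a task ends; adds its execution time to the open frame.
    void endTask(long startNanos) {
        currentFrameNanos.add(System.nanoTime() - startNanos);
    }

    // Total task time recorded in the last completed frame.
    long previousFrameTime() {
        return previousFrameNanos;
    }

    // Would be driven by a scheduler once per utilizationIntervalMs,
    // closing the current frame and starting a new one.
    void rotateFrame() {
        previousFrameNanos = currentFrameNanos.sumThenReset();
    }
}
```

Under this shape, a LongAdder keeps concurrent endTask calls cheap under contention (the ReadAndWrite case below), while previousFrameTime is a single volatile read.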

Play around with:

```shell
./gradlew -p benchmarks run --args 'ThreadPoolUtilizationBenchmark -t $NUM_THREADS'
```

I think it looks very cheap. If you turn on sampling you see some outliers on the order of 10ms with 12 threads.

Also, the amount of contention is probably much higher than what we'd see in the real world. I tried to add some "work" between calls; that's what callIntervalTicks does. But even in the worst case we're only adding ~0.2ms of work between calls.

I'm not sure what the largest core counts we see are, but it probably makes sense to check what happens on a bigger machine, perhaps increasing callIntervalTicks to something representative (the baseline results show how long a given tick count takes).

(12-thread run on my machine; deduct the baseline for the corresponding callIntervalTicks)

```
Benchmark                                                      (callIntervalTicks)  (utilizationIntervalMs)  Mode  Cnt    Score    Error  Units
ThreadPoolUtilizationBenchmark.JustWrite                                         0                       10  avgt    5    1.819 ±  0.126  us/op
ThreadPoolUtilizationBenchmark.JustWrite                                     10000                       10  avgt    5   20.762 ±  0.927  us/op
ThreadPoolUtilizationBenchmark.JustWrite                                    100000                       10  avgt    5  202.935 ± 10.831  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite                                      0                       10  avgt    5    0.939 ±  0.252  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:readPrevious                         0                       10  avgt    5    1.097 ±  0.387  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:startAndStopTasks                    0                       10  avgt    5    0.780 ±  0.160  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite                                  10000                       10  avgt    5   20.605 ±  0.838  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:readPrevious                     10000                       10  avgt    5   20.518 ±  1.397  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:startAndStopTasks                10000                       10  avgt    5   20.693 ±  0.632  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite                                 100000                       10  avgt    5  201.309 ±  8.873  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:readPrevious                    100000                       10  avgt    5  200.402 ±  9.548  us/op
ThreadPoolUtilizationBenchmark.ReadAndWrite:startAndStopTasks               100000                       10  avgt    5  202.216 ± 11.474  us/op
ThreadPoolUtilizationBenchmark.baseline                                          0                       10  avgt    5    0.002 ±  0.001  us/op
ThreadPoolUtilizationBenchmark.baseline                                      10000                       10  avgt    5   20.078 ±  0.845  us/op
ThreadPoolUtilizationBenchmark.baseline                                     100000                       10  avgt    5  201.026 ± 11.471  us/op
```
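As a worked example of the baseline deduction, subtracting the baseline score from the JustWrite score at each callIntervalTicks isolates the overhead attributable to the utilization calls themselves (numbers copied from the table above; OverheadCalc is just a throwaway name for this sketch):

```java
// Worked example: overhead = measured score minus baseline "work".
// Scores (us/op) are copied from the JustWrite and baseline rows above.
class OverheadCalc {
    static double overhead(double score, double baseline) {
        return score - baseline;
    }

    public static void main(String[] args) {
        long[] ticks       = {0, 10_000, 100_000};
        double[] justWrite = {1.819, 20.762, 202.935};
        double[] baseline  = {0.002, 20.078, 201.026};
        for (int i = 0; i < ticks.length; i++) {
            System.out.printf("callIntervalTicks=%d overhead=%.3f us/op%n",
                    ticks[i], overhead(justWrite[i], baseline[i]));
        }
        // Overheads come out to roughly 1.817, 0.684, and 1.909 us/op,
        // i.e. the utilization calls add at most ~2us per operation here.
    }
}
```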

You can also run for a specific callIntervalTicks, e.g.:

```shell
./gradlew -p benchmarks run --args 'ThreadPoolUtilizationBenchmark -t 12 -pcallIntervalTicks=1000000'
```

@nicktindall nicktindall changed the base branch from main to framed-thread-pool-utilization July 28, 2025 09:00
@mhl-b mhl-b merged commit 872d7cd into mhl-b:framed-thread-pool-utilization Jul 29, 2025
4 checks passed
mhl-b pushed a commit that referenced this pull request Jul 29, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=flat updateType=bbq_disk #2} elastic#132130
mhl-b pushed a commit that referenced this pull request Jul 30, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=bbq_hnsw updateType=bbq_disk #2} elastic#132152
mhl-b pushed a commit that referenced this pull request Jul 30, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=bbq_flat updateType=bbq_disk #2} elastic#132184
mhl-b pushed a commit that referenced this pull request Jul 30, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=int8_flat updateType=bbq_disk #2} elastic#132189
mhl-b pushed a commit that referenced this pull request Jul 30, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=int8_hnsw updateType=bbq_disk #2} elastic#132213
mhl-b pushed a commit that referenced this pull request Jul 31, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=int4_hnsw updateType=bbq_disk #2} elastic#132228
mhl-b pushed a commit that referenced this pull request Aug 6, 2025
…UpdateIT testDenseVectorMappingUpdate {initialType=int4_flat updateType=bbq_disk #2} elastic#132234
@nicktindall nicktindall deleted the framed-thread-pool-utilization_bm branch September 3, 2025 04:20
mhl-b pushed a commit that referenced this pull request Jan 22, 2026
…tic#140027)

This PR fixes the issue where `INLINE STATS GROUP BY null` was being
incorrectly pruned by `PruneLeftJoinOnNullMatchingField`.

Fixes elastic#139887

## Problem

For the query:

```
FROM employees
| INLINE STATS c = COUNT(*) BY n = null
| KEEP c, n
| LIMIT 3
```

During `LogicalPlanOptimizer`:

```
Limit[3[INTEGER],false,false]
\_EsqlProject[[c{r}#2, n{r}elastic#4]]
  \_InlineJoin[LEFT,[n{r}elastic#4],[n{r}elastic#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}elastic#7]
    \_Aggregate[[n{r}elastic#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}elastic#4]]
      \_StubRelation[[<no-fields>{r$}elastic#7, n{r}elastic#4]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}elastic#4],[n{r}elastic#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}elastic#7]
\_Aggregate[[n{r}elastic#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}elastic#4]]
  \_StubRelation[[<no-fields>{r$}elastic#7, n{r}elastic#4]]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is an `Aggregate` (originating from `INLINE STATS`). Since
`STATS` supports `GROUP BY null`, the join key being null is a valid use
case. Pruning this join would incorrectly eliminate the aggregation
results, changing the query semantics.

During `LocalLogicalPlanOptimizer`:

```
ProjectExec[[c{r}#2, n{r}elastic#4]]
\_LimitExec[3[INTEGER],null]
  \_ExchangeExec[[c{r}#2, n{r}elastic#4],false]
    \_FragmentExec[filter=null, estimatedRowSize=0, reducer=[], fragment=[<>
Project[[c{r}#2, n{r}elastic#4]]
\_Limit[3[INTEGER],false,false]
  \_InlineJoin[LEFT,[n{r}elastic#4],[n{r}elastic#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}elastic#7]
    \_LocalRelation[[c{r}#2, n{r}elastic#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]<>]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}elastic#4],[n{r}elastic#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}elastic#7]
\_LocalRelation[[c{r}#2, n{r}elastic#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is a `LocalRelation` (the `Aggregate` was optimized into a
`LocalRelation` containing the pre-computed aggregation results).
Pruning this join when the join key is null would discard the valid
aggregation results stored in the `LocalRelation`, incorrectly producing
null values instead of the expected count.

## Solution

The fix ensures that `PruneLeftJoinOnNullMatchingField` only
applies to `LOOKUP JOIN` nodes, where `join.right()` is an `EsRelation`.
For `INLINE STATS` joins, the right side can be:

 - `Aggregate` (before optimization), or
 - `LocalRelation` (after the aggregate is optimized)

By checking `join.right() instanceof EsRelation`, we correctly skip the
pruning optimization for `INLINE STATS` joins, preserving the expected
query results when grouping by null.
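The guard described above can be illustrated with a minimal self-contained sketch. The stub types below only mirror the names used in this description (LogicalPlan, EsRelation, Aggregate, LocalRelation, Join, and the PruningGuard wrapper); they are placeholders for illustration, not the real Elasticsearch classes.

```java
// Stub plan node types mirroring the names used above; these are
// illustrative placeholders, not the actual Elasticsearch classes.
interface LogicalPlan {}

class EsRelation implements LogicalPlan {}     // right side of LOOKUP JOIN
class Aggregate implements LogicalPlan {}      // INLINE STATS, pre-optimization
class LocalRelation implements LogicalPlan {}  // INLINE STATS, post-optimization

class Join implements LogicalPlan {
    private final LogicalPlan right;
    Join(LogicalPlan right) { this.right = right; }
    LogicalPlan right() { return right; }
}

class PruningGuard {
    // Pruning on a null join key is only safe for LOOKUP JOIN, whose right
    // side is an EsRelation. INLINE STATS joins (Aggregate or LocalRelation
    // on the right) must be skipped, since GROUP BY null is valid there.
    static boolean mayPrune(Join join) {
        return join.right() instanceof EsRelation;
    }
}
```

The single instanceof check covers both problem cases from above, since neither `Aggregate` nor `LocalRelation` is an `EsRelation`.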
mhl-b pushed a commit that referenced this pull request Feb 2, 2026
…tic#140027) (elastic#141095)

(cherry picked from commit f3ccb70)

Co-authored-by: kanoshiou <uiaao@tuta.io>