Use high speed strategy for LuceneTopNSourceOperator (#142128)
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics)

Pinging @elastic/es-search-relevance (Team:Search Relevance)
Drive by, I don't know how the per-segment collection parallelism works, but the query path will combine very tiny segments together, as the cost of simply using threading for those tiny chunks of work isn't worth it. Is something like that at work here? If not, expect the …
@benwtrent there is indeed a grouping of smaller segments, please check `DataPartitioning`:

```java
/**
 * Make one partition per shard. This is generally the slowest option, but it
 * has the lowest CPU overhead.
 */
SHARD,
/**
 * Partition on segment boundaries; this doesn't allow forking to as many CPUs
 * as {@link #DOC} but it has much lower overhead.
 * <p>
 * It packs segments smaller than {@link LuceneSliceQueue#MAX_DOCS_PER_SLICE}
 * docs together into a partition. Larger segments get their own partition.
 * Each slice contains no more than {@link LuceneSliceQueue#MAX_SEGMENTS_PER_SLICE}.
 */
SEGMENT,
/**
 * Partitions into dynamic-sized slices to improve CPU utilization while keeping overhead low.
 * This approach is more flexible than {@link #SEGMENT} and works as follows:
 *
 * <ol>
 *   <li>The slice size starts from a desired size based on {@code task_concurrency} but is capped
 *       at around {@link LuceneSliceQueue#MAX_DOCS_PER_SLICE}. This prevents poor CPU usage when
 *       matching documents are clustered together.</li>
 *   <li>For small and medium segments (less than five times the desired slice size), it uses a
 *       slightly different {@link #SEGMENT} strategy, which also splits segments that are larger
 *       than the desired size. See {@link org.apache.lucene.search.IndexSearcher#slices(List, int, int, boolean)}.</li>
 *   <li>For very large segments, multiple segments are not combined into a single slice. This allows
 *       one driver to process an entire large segment until other drivers steal the work after finishing
 *       their own tasks. See {@link LuceneSliceQueue#nextSlice(LuceneSlice)}.</li>
 * </ol>
 */
DOC;
```
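The packing behavior described for `SEGMENT` can be sketched roughly like this. This is a simplified illustration, not the actual Lucene/Elasticsearch code; `maxDocsPerSlice` and `maxSegmentsPerSlice` stand in for the real `LuceneSliceQueue` constants, and slices are modeled as lists of segment doc counts:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of SEGMENT-style packing: segments smaller than
// maxDocsPerSlice are grouped together into one slice; larger segments
// get a slice of their own. A slice never exceeds maxSegmentsPerSlice.
class SegmentPacker {
    static List<List<Integer>> pack(List<Integer> segmentDocCounts,
                                    int maxDocsPerSlice, int maxSegmentsPerSlice) {
        List<List<Integer>> slices = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int docsInCurrent = 0;
        for (int docs : segmentDocCounts) {
            if (docs >= maxDocsPerSlice) {
                // Large segment: its own slice.
                slices.add(List.of(docs));
                continue;
            }
            // Flush the current slice if adding this segment would overflow it.
            if (!current.isEmpty()
                    && (docsInCurrent + docs > maxDocsPerSlice
                        || current.size() >= maxSegmentsPerSlice)) {
                slices.add(current);
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
            current.add(docs);
            docsInCurrent += docs;
        }
        if (!current.isEmpty()) {
            slices.add(current);
        }
        return slices;
    }
}
```

With `maxDocsPerSlice = 1000`, segments of 100, 200, and 300 docs end up in one slice while a 5000-doc segment is partitioned on its own.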
nik9000 left a comment:
If this is faster for y'all I'm all for it.
It's probably worth explaining why you'd never use the lowOverheadAutoStrategy: you don't want it because top-n has to scan all the documents.
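The point above — that a top-n still has to visit every matching document in every slice, so splitting slices across drivers pays off — can be illustrated with a small sketch. This is plain Java, not Elasticsearch code; slices are modeled as lists of integer sort keys:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustration only: a top-n over several slices must scan every document,
// but each slice's scan is independent, so slices could run on separate
// drivers and their results be merged at the end.
class SliceTopN {
    static List<Integer> topN(List<List<Integer>> slices, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap of the n largest so far
        for (List<Integer> slice : slices) {     // each slice is independent work
            for (int value : slice) {            // every doc must be visited
                heap.offer(value);
                if (heap.size() > n) {
                    heap.poll();                 // drop the smallest, keep the n largest
                }
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder());
        return result;
    }
}
```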
```java
private static final Logger logger = LogManager.getLogger(EsPhysicalOperationProviders.class);

// LuceneTopNSourceOperator auto strategy
private static final DataPartitioning.AutoStrategy TOP_N_AUTO_STRATEGY = unusedLimit -> {
```
I'd probably make this a static method and reference it.
Done, and documented the use of the high speed strategy in 45f70c3.
Commits:

- …ne-top-n-data-partition-strategy
- …p-n-data-partition-strategy' into enhancement/esql-lucene-top-n-data-partition-strategy
- …lastic#142128)" This reverts commit f1ed358.
- …lastic#142128)" (elastic#142453) This reverts commit f1ed358.
- …142128)
  * First version, use highSpeedAutoStrategy
  * Use highSpeedAutoStrategy
  * Fix tests to take into account new partitioning
  * Fix tests
  * [CI] Auto commit changes from spotless
  * Use a static method for strategy and document it
  Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Currently, LuceneTopNSourceOperator uses SHARD as its auto strategy. This makes performance worse than the Query DSL when multiple segments are used, as SHARD does not parallelize queries. This change uses LuceneSourceOperator::highSpeedAutoStrategy to allow parallelism based on the rules for the high speed strategy described there.

Closes #141770
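The shape of the change can be sketched as follows. This is a hedged illustration, not the actual Elasticsearch code: the method names `lowOverheadAutoStrategy` and `highSpeedAutoStrategy` come from the discussion above, but the signatures, the nested enum, and the mapping to partitioning values are simplified assumptions for this sketch:

```java
// Hypothetical sketch: resolving DataPartitioning.AUTO for a source operator.
// Before this change, top-n resolved AUTO to SHARD (lowest overhead, but a
// single partition per shard, so no parallelism). After it, top-n delegates
// to the high-speed strategy so multiple drivers can work slices in parallel.
class StrategySketch {
    enum DataPartitioning { SHARD, SEGMENT, DOC }

    // Old behavior for top-n: one partition per shard, no forking.
    static DataPartitioning lowOverheadAutoStrategy(int limit) {
        return DataPartitioning.SHARD;
    }

    // New behavior: dynamic-sized slices that fork across CPUs
    // (stand-in for the real high-speed partitioning rules).
    static DataPartitioning highSpeedAutoStrategy(int limit) {
        return DataPartitioning.DOC;
    }
}
```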