Add retriever to the query rewrite phase#110482
Conversation
|
Pinging @elastic/es-search (Team:Search) |
| if (request.source() != null && request.source().pointInTimeBuilder() != null) { | ||
| if (request.source() != null | ||
| && request.source().pointInTimeBuilder() != null | ||
| && request.source().pointInTimeBuilder().singleSession() == false) { |
There was a problem hiding this comment.
Is it safe to rely on the keepAlive to determine explicit releasing in this case? Could this possible cause any side effects with any user provided requests with -1 as keep_alive ?
There was a problem hiding this comment.
This is just about returning the pit id in the response. The logic to release the pit automatically is restricted to the retriever builder case. It might be surprising for existing pit users though since -1 is a valid keep-alive value. We should be able to restrict this behavior to retrievers entirely.
|
Thanks so much for this @jimczi ! Added some minor initial comments! |
| @Override | ||
| public void onResponse(SearchResponse response) { | ||
| String query = response.getHits().getAt(0).field(QUERY_FIELD).getValue(); | ||
| StandardRetrieverBuilder standard = new StandardRetrieverBuilder(); |
There was a problem hiding this comment.
I assume in the main code we want retrievers to be rewritten into some kind of ScoreDocQuery, but not to another retriever?
There was a problem hiding this comment.
Retrievers will always rewrite to a retriever, that's enforced by the design. Using some kind of ScoreDocQuery will be internally done by the rewritten retriever when extractToSourceBuilder is called (when the retriever is guaranteed to be fully rewritten).
mayya-sharipova
left a comment
There was a problem hiding this comment.
@jimczi Thanks Jim, makes sense
javanna
left a comment
There was a problem hiding this comment.
I left some nits and suggestions around splitting this change up and making it a bit easier to follow.
This change moves the handling of the retriever to the rewrite phase. It also adds validation of the search source builder after extracting the retriever into the source builder. Relates elastic#110482
This change moves the handling of the retriever to the rewrite phase. It also adds validation of the search source builder after extracting the retriever into the source builder. Relates #110482
|
Hey @javanna! I'm picking up this PR, so whenever you have some time I'd appreciate some feedback after the latest changes :) |
|
@pmpailis I haven't forgotten about this. If you think its ready, I will go through a review. I will likely ask for more tests :) |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
Closing as a fix for this has been merged through #111153. |
|
Please ignore the above! Wrong PR to close 😅 |
Hey @benwtrent ; added a take on this in 29266a7. The idea now is that if we specify a compound retriever (which would require opening a PIT) and explicitly define We haven't updated PIT's logic yet, so if any shards are missing we'd throw an exception, the message of which has also been slightly updated to not suggest to use |
|
@elasticmachine update branch |
| @Override | ||
| protected void doCheckNoMissingShards( | ||
| String phaseName, | ||
| SearchRequest request, | ||
| GroupShardsIterator<SearchShardIterator> shardsIts | ||
| ) { | ||
| final StringBuilder missingShards = new StringBuilder(); | ||
| // Fail-fast verification of all shards being available | ||
| for (int index = 0; index < shardsIts.size(); index++) { | ||
| final SearchShardIterator shardRoutings = shardsIts.get(index); | ||
| if (shardRoutings.size() == 0) { | ||
| if (missingShards.isEmpty() == false) { | ||
| missingShards.append(", "); | ||
| } | ||
| missingShards.append(shardRoutings.shardId()); | ||
| } | ||
| } | ||
| if (missingShards.isEmpty() == false) { | ||
| // Status red - shard is missing all copies and would produce partial results for an index search | ||
| final String msg = "Cannot execute [open_point_in_time] action due to missing shards [" + missingShards + "]."; | ||
| throw new SearchPhaseExecutionException(phaseName, msg, null, ShardSearchFailure.EMPTY_ARRAY); | ||
| } | ||
| } |
There was a problem hiding this comment.
woah, I am unsure about this. This is a fairly big change in behavior for PIT isn't it?
There was a problem hiding this comment.
Tbh I think that the change is not that big since we're overriding the (previously static)SearchPhase#doCheckNoMissingShards with just a couple of changes:
- we don't have the
if (request.allowPartialSearchResults() == false)check anymore as this is explicitly set by the PIT action (inTransportOpenPointInTimeAction#doExecute) - we have a more specific error message, w/o suggesting the
allow_partial_search_resultsas a solution anymore
The only reason for this addition was to update the error message, so if you think that this isn't needed right now (as this is something that we plan to revisit soon-ish) I can omit this from the PR.
There was a problem hiding this comment.
@pmpailis I would prefer this change be in a separate, focused PR. This way foundations, et. al. can focus their review on it.
| public ActionRequestValidationException validate() { | ||
| ActionRequestValidationException validationException = null; | ||
| boolean scroll = scroll() != null; | ||
| boolean allowPartialSearchResults = allowPartialSearchResults() != null && allowPartialSearchResults(); |
|
@elasticmachine update branch |
|
@elasticmachine update branch |
This change moves the handling of the retriever to the rewrite phase. It also adds validation of the search source builder after extracting the retriever into the source builder. Relates #110482
This change moves the handling of the retriever to the rewrite phase. It also adds validation of the search source builder after extracting the retriever into the source builder. Relates #110482
This change moves the extraction of retrievers into the search source builder during the coordinator rewrite phase.
By delaying the extraction, retrievers can apply asynchronous actions during the rewrite before executing the search request.
Additionally, a compound mode for retrievers is introduced, creating an internal point in time for the duration of the request. This allows retrievers to perform queries during the rewrite on the same reader that will be used for the main search request.
In a subsequent PR, this capability will be used to implement retrievers that combine multiple retrievers.
This PR consists of three commits. The last commit contains the main change to the search action, which involves creating and deleting a PIT before search execution if a compound retriever is present.