Add rerank phase in coordinate node after reduced #60946

@demonatic

We currently run friend search on a search engine developed in-house at our company, and we now want to try migrating it to Elasticsearch. However, we have run into a problem that seems impossible to solve without modifying Elasticsearch's source code.

The basic idea of our friend search is simple. We have external key-value pairs that map a friend ID to an intimacy value, and we can pass them to Elasticsearch. We first run the query and get the matching documents in descending order of text relevance score. From the top-N matching documents we then select the 3 friends with the highest intimacy, according to the external key-value pairs, as recommendation items. These 3 items are sorted by their text relevance score and placed first. All other, non-recommended documents are also sorted by their text relevance score and placed after the 3 recommendation items.
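To make the intended ordering concrete, here is a minimal sketch in plain Python (not Elasticsearch code). The function name `rank_friends`, its parameters, and the choice to treat a friend with no intimacy entry as intimacy 0 are illustrative assumptions, not part of our actual plugin.

```python
def rank_friends(hits, intimacy, window_size, top_k=3):
    """Order hits as described above.

    hits:        list of (doc_id, text_score) pairs (hypothetical input shape)
    intimacy:    dict doc_id -> intimacy value (the external key-value pairs)
    window_size: the N in "top-N matching documents"
    """
    # 1. Sort all matches by text relevance, highest first.
    by_score = sorted(hits, key=lambda h: h[1], reverse=True)

    # 2. Within the top-N window, pick the top_k docs with the highest intimacy.
    #    Assumption: a doc missing from the intimacy map counts as 0.
    window = by_score[:window_size]
    recommended = sorted(
        window, key=lambda h: intimacy.get(h[0], 0.0), reverse=True
    )[:top_k]

    # 3. The recommendation items themselves are ordered by text relevance.
    recommended.sort(key=lambda h: h[1], reverse=True)

    # 4. Everything else follows, still ordered by text relevance.
    recommended_ids = {doc_id for doc_id, _ in recommended}
    rest = [h for h in by_score if h[0] not in recommended_ids]
    return recommended + rest
```

With a single index and no sharding this is all that is needed; the difficulty described next only appears once the top-N window is evaluated separately on every shard.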

We have developed some plugins for our own search engine. I'll describe their basic logic in terms of the Elasticsearch counterparts to explain why I think Elasticsearch can't achieve this. After the documents are sorted by text relevance on the data node, we use a rescore script to lift the score of the 3 documents with the highest intimacy within a window of N documents (window_size). Suppose the original text relevance score is xxx: we lift the _score of these 3 documents to 100000xxx, and the scores of the other, non-recommended documents remain the same. After the rescore, the shard result is therefore: the 3 documents with the highest intimacy come first, in descending order of text relevance, followed by the others, also in descending order of text relevance.

On the coordinating node, since each shard (suppose we have 4 shards) has lifted 3 documents, 3x4-3 = 9 candidate recommendation documents lose out after the merge. Their _score needs to be downgraded again, and they need to be re-sorted by their original text relevance. We have implemented this logic in our own engine. However, we read through the Elasticsearch source code and found that Elasticsearch seems to run all kinds of query scripts only on the data nodes, and there is no way to hook into the coordinating node's reduce and merge process (resultConsumer.reduce()) to downgrade the scores of the recommendation candidates that lost. So adding a rerank phase on the coordinating node (as our current search engine does) would be nice.
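The two steps can be simulated outside Elasticsearch with a short Python sketch. Everything here is hypothetical and for illustration only: the `Hit` class, the `shard_rescore` and `coordinator_rerank` functions, and the additive `BOOST` offset are my own stand-ins, not Elasticsearch APIs, and the sketch assumes every real text relevance score is far below `BOOST`.

```python
from dataclasses import dataclass

BOOST = 100000.0  # assumed offset; must exceed any real text relevance score


@dataclass
class Hit:
    doc_id: str
    text_score: float      # original text relevance, kept for the downgrade
    score: float           # score used for sorting (may include BOOST)
    boosted: bool = False  # True if a shard lifted this doc


def shard_rescore(hits, intimacy, window_size, top_k=3):
    """Data-node step: lift the top_k highest-intimacy docs in the window."""
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    result = [Hit(doc_id, s, s) for doc_id, s in ranked]
    window = result[:window_size]
    # Assumption: a doc missing from the intimacy map counts as 0.
    by_intimacy = sorted(
        window, key=lambda h: intimacy.get(h.doc_id, 0.0), reverse=True
    )
    for hit in by_intimacy[:top_k]:
        hit.score = BOOST + hit.text_score
        hit.boosted = True
    return sorted(result, key=lambda h: h.score, reverse=True)


def coordinator_rerank(shard_results, intimacy, top_k=3):
    """Coordinating-node step: keep only the global top_k lifted docs."""
    merged = [hit for shard in shard_results for hit in shard]
    boosted = sorted(
        (h for h in merged if h.boosted),
        key=lambda h: intimacy.get(h.doc_id, 0.0),
        reverse=True,
    )
    # Candidates that lose globally fall back to their text relevance score.
    for loser in boosted[top_k:]:
        loser.score = loser.text_score
        loser.boosted = False
    return sorted(merged, key=lambda h: h.score, reverse=True)
```

The `coordinator_rerank` step is the part that has no counterpart today: it needs the merged hits from all shards plus the intimacy map, which is only possible on the coordinating node after the reduce.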
