Deprecate the LegacyBM25Similarity class and default to BM25Similarity #17315

@prudhvigodithi

Is your feature request related to a problem? Please describe

This follows from LegacyBM25Similarity and from @msfroh's comment in #17241 (comment): the class already includes a note advising users to use Lucene's BM25Similarity. Since work on the 3.0.0 release has begun, this is the perfect opportunity to make BM25Similarity the default for scoring.

The change moves from the legacy implementation to the current Lucene BM25Similarity, which provides a cleaner and more standardized way of scoring. Switching from LegacyBM25Similarity to BM25Similarity should not change the core ranking behavior in a way that significantly impacts search results.

More details on apache/lucene#9609.
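
To make the "ranking should not change" claim concrete, here is a minimal sketch in Python. It assumes, based on my reading of the Lucene change, that the only difference between the two similarities is a constant (k1 + 1) factor in the numerator, which the legacy class keeps and the current class drops. The function name, the sample documents, and the collection statistics are all hypothetical and exist only for illustration; this is not the Lucene or OpenSearch code.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75, legacy=False):
    """Score a single term in a single document with BM25.

    legacy=True keeps the (k1 + 1) factor in the numerator (assumed to
    match LegacyBM25Similarity); legacy=False drops it (assumed to match
    the current BM25Similarity).
    """
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    score = idf * tf / (tf + norm)
    return score * (k1 + 1.0) if legacy else score


# Hypothetical documents: name -> (term frequency, document length).
docs = {"doc_a": (3, 120), "doc_b": (1, 40), "doc_c": (5, 400)}
avg_len = sum(length for _, length in docs.values()) / len(docs)

# Assumed collection statistics: 1000 documents, 50 containing the term.
idf = math.log(1 + (1000 - 50 + 0.5) / (50 + 0.5))

current = {n: bm25_term_score(tf, dl, avg_len, idf) for n, (tf, dl) in docs.items()}
legacy = {n: bm25_term_score(tf, dl, avg_len, idf, legacy=True) for n, (tf, dl) in docs.items()}

ranking_current = sorted(current, key=current.get, reverse=True)
ranking_legacy = sorted(legacy, key=legacy.get, reverse=True)

print("same ranking:", ranking_current == ranking_legacy)
for name in ranking_current:
    print(name, round(current[name], 4), round(legacy[name], 4))
```

Under that assumption the legacy score is the current score multiplied by k1 + 1 (2.2 with the default k1 of 1.2), so the order of results for a pure BM25 query is preserved while the absolute scores shrink. Anything that depends on absolute score values, such as a minimum score threshold or a query that mixes BM25 scores with other scoring functions, may still need a second look.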

Describe the solution you'd like

Default to Lucene's BM25Similarity, while allowing users to choose LegacyBM25Similarity if required.

Example:

curl -X PUT "http://localhost:9200/test-index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "LegacyBM25",
          "k1": 1.2,
          "b": 0.75
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      }
    }
  }
}
'

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Labels

Search:Performance
enhancement: Enhancement or improvement to existing feature or request
v3.0.0: Issues and PRs related to version 3.0.0
