21 changes: 16 additions & 5 deletions docs/reference/search/rank-eval.asciidoc
@@ -1,14 +1,16 @@
[[search-rank-eval]]
== Ranking Evaluation API

experimental[The ranking evaluation API is experimental and may be changed or removed completely in a future release,
or changed in non-backwards compatible ways in minor version updates. Elastic will take a best effort
approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.]

The ranking evaluation API allows you to evaluate the quality of ranked search
results over a set of typical search queries. Given this set of queries and a
list of manually rated documents, the `_rank_eval` endpoint calculates and
returns typical information retrieval metrics like _mean reciprocal rank_,
_precision_ or _discounted cumulative gain_.

experimental[The ranking evaluation API is new and may change in non-backwards compatible ways in the future, even on minor versions updates.]

[float]
=== Overview

@@ -41,7 +43,7 @@ GET /my_index/_rank_eval
{
"requests": [ ... ], <1>
"metric": { <2>
"reciprocal_rank": { ... } <3>
"mean_reciprocal_rank": { ... } <3>
}
}
------------------------------
@@ -85,7 +87,7 @@ The request section contains several search requests typical to your application
<3> a list of document ratings, each entry containing the document's `_index` and `_id` together with
the rating of the document's relevance with regard to this search request

A document `rating` can be any integer value that expresses the relevance of the document on a user-defined scale. For some of the metrics a binary rating (e.g. `0` for irrelevant and `1` for relevant) is sufficient, while other metrics can use a more fine-grained scale.
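To illustrate how a threshold relates the two kinds of scales, here is a minimal Python sketch (purely illustrative, not part of the API) that collapses a graded rating scale to a binary one:

```python
def to_binary(ratings, relevant_rating_threshold=1):
    """Collapse graded ratings (doc id -> integer rating) to binary relevance.

    Any rating at or above the threshold becomes 1 (relevant), else 0.
    """
    return {doc_id: int(r >= relevant_rating_threshold)
            for doc_id, r in ratings.items()}
```

This mirrors what the threshold-based metrics below do internally when they decide which rated documents count as "relevant".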

[float]
=== Template based ranking evaluation
@@ -158,6 +160,7 @@ GET /twitter/_rank_eval
}],
"metric": {
"precision": {
"k" : 20,
"relevant_rating_threshold": 1,
"ignore_unlabeled": false
}
@@ -172,7 +175,9 @@ The `precision` metric takes the following optional parameters
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`relevant_rating_threshold` |Sets the rating threshold above which documents are considered to be
|`k` |sets the maximum number of documents retrieved per query. This value will act in place of the usual `size` parameter
in the query. Defaults to 10.
|`relevant_rating_threshold` |sets the rating threshold above which documents are considered to be
"relevant". Defaults to `1`.
|`ignore_unlabeled` |controls how unlabeled documents in the search results are counted.
If set to `true`, unlabeled documents are ignored and count as neither relevant nor irrelevant. If set to `false` (the default), they are treated as irrelevant.
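The interplay of `k`, `relevant_rating_threshold`, and `ignore_unlabeled` can be sketched in Python. This is an illustrative model of the metric's semantics, not the actual Elasticsearch implementation:

```python
def precision_at_k(hits, ratings, k=10, relevant_rating_threshold=1,
                   ignore_unlabeled=False):
    """Precision over the top k hits of one query.

    hits: ranked list of doc ids returned by the query.
    ratings: dict mapping doc id -> integer rating; docs absent from the
    dict are "unlabeled".
    """
    relevant = 0
    considered = 0
    for doc_id in hits[:k]:
        rating = ratings.get(doc_id)
        if rating is None:
            if ignore_unlabeled:
                continue          # unlabeled docs are skipped entirely
            considered += 1       # unlabeled docs count as irrelevant
            continue
        considered += 1
        if rating >= relevant_rating_threshold:
            relevant += 1
    return relevant / considered if considered else 0.0
```

For example, with two relevant and two unlabeled documents in the top four hits, precision is 0.5 by default but 1.0 with `ignore_unlabeled` set to `true`.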
@@ -198,6 +203,7 @@ GET /twitter/_rank_eval
}],
"metric": {
"mean_reciprocal_rank": {
"k" : 20,
"relevant_rating_threshold" : 1
}
}
@@ -211,6 +217,8 @@ The `mean_reciprocal_rank` metric takes the following optional parameters
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`k` |sets the maximum number of documents retrieved per query. This value will act in place of the usual `size` parameter
in the query. Defaults to 10.
|`relevant_rating_threshold` |Sets the rating threshold above which documents are considered to be
"relevant". Defaults to `1`.
|=======================================================================
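The metric itself can be sketched in Python (illustrative only): for each query, take the reciprocal of the rank of the first hit at or above the threshold within the top `k`, then average across queries.

```python
def mean_reciprocal_rank(results, ratings, k=10, relevant_rating_threshold=1):
    """results: one ranked list of doc ids per query.
    ratings: one dict (doc id -> integer rating) per query, aligned with results.
    """
    reciprocal_ranks = []
    for hits, query_ratings in zip(results, ratings):
        rr = 0.0  # stays 0 if no relevant doc appears in the top k
        for rank, doc_id in enumerate(hits[:k], start=1):
            if query_ratings.get(doc_id, 0) >= relevant_rating_threshold:
                rr = 1.0 / rank
                break
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks) if reciprocal_ranks else 0.0
```

So a relevant document at rank 2 for one query and rank 1 for another yields (1/2 + 1) / 2 = 0.75.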
@@ -234,6 +242,7 @@ GET /twitter/_rank_eval
}],
"metric": {
"dcg": {
"k" : 20,
"normalize": false
}
}
@@ -247,6 +256,8 @@ The `dcg` metric takes the following optional parameters:
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`k` |sets the maximum number of documents retrieved per query. This value will act in place of the usual `size` parameter
in the query. Defaults to 10.
|`normalize` | If set to `true`, this metric will calculate the https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG[Normalized DCG].
|=======================================================================
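A Python sketch of DCG over the top `k` hits follows. It is illustrative only and assumes the common exponential gain form `2^rating - 1`; the server-side formula may differ in detail:

```python
import math

def dcg_at_k(hits, ratings, k=10, normalize=False):
    """Discounted cumulative gain over the top k hits of one query.

    hits: ranked list of doc ids; ratings: dict of doc id -> integer rating.
    Unrated documents contribute a gain of 0.
    """
    def gain(rating):
        return (2 ** rating) - 1  # assumed exponential gain form

    dcg = sum(gain(ratings.get(doc_id, 0)) / math.log2(rank + 1)
              for rank, doc_id in enumerate(hits[:k], start=1))
    if not normalize:
        return dcg
    # Ideal DCG: the known ratings re-ranked best-first.
    ideal = sorted(ratings.values(), reverse=True)[:k]
    idcg = sum(gain(r) / math.log2(rank + 1)
               for rank, r in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0
```

With `normalize` set to `true`, a result list that is already in ideal rating order scores exactly 1.0.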
