24 changes: 24 additions & 0 deletions docs/reference/glossary.asciidoc
@@ -61,6 +61,16 @@
`object`. The mapping also allows you to define (amongst other things)
how the value for a field should be analyzed.

[[glossary-filter]] filter ::

A filter is a query. It is a kind of query which does not give a score, so it is
Member

Maybe you can join the first two sentences to make it more fluent, e.g. "A filter is a <<glossary-query,scoring queries>> that ..." and then join the second sentence.
Maybe use something like "produces a score" or "scores documents" instead of "give a score"?

known as a "non-scoring" query.It is only concerned about answering the question -
Member

nit: whitespace between "query.It". Although the difference won't show up in the docs I think, it improves readability here.

Contributor Author

Thanks, @cbuescher, I will work on your review. As a beginner to Elasticsearch myself, I appreciate the importance of getting the glossary right, and I'll be happy to do so.

"Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made
Member

Could you wrap this (and the following lines) roughly at the same position as the previous lines? We have different line lengths in the docs throughout the documentation but at least in one paragraph it would be nice if it was roughly similar. Maybe it makes sense to do the wrapping in the end just before the PR is ready to be merged though, to just go through the trouble once.

in a "filtering" context (e.g. Is the created date in the range 2013 - 2014?), hence it is called a filter. Filters are
Member

Maybe we should link to https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html when mentioning the filter context. It would also be great to make it clearer that "filter context" refers to specific locations of a query clause within a combination of queries. If this is too much detail for the glossary, I think linking to the "filter context" paragraph and removing the details here is fine as well.

simple checks for set inclusion or exclusion. The goal of filtering is to reduce the
Member

"The goal of filtering is to reduce the number of documents"

This might be true most of the time, but e.g. constant_score isn't really about reducing the document set, it's just about saying you don't care about the score. I'd be okay with adding "most of the time" somewhere, and maybe also leaving out the "instead of..." part, since reducing the document set size might also happen in scoring queries.
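
To illustrate the constant_score point, here is a minimal sketch of such a request body built as a plain Python dict. The helper name, the field `status`, and the value `published` are hypothetical and chosen only for illustration; no client library is assumed.

```python
import json


def constant_score_body(field, value, boost=1.0):
    """Wrap a term filter in a constant_score query.

    Every matching document receives the same score (the boost), so the
    clause expresses "I don't care about the score" rather than
    "shrink the document set".
    """
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}},
                "boost": boost,
            }
        }
    }


# Hypothetical field name and value, for illustration only.
constant_body = constant_score_body("status", "published")
print(json.dumps(constant_body, indent=2))
```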

number of documents that have to be examined, instead of what happens in the case of
<<glossary-query,scoring queries>>.
Member

I don't understand the reference to scoring queries in this sentence. Maybe you can explain or rephrase to make this clearer?

Contributor Author

The goal of filtering is to reduce the number of documents that have to be examined, instead of what happens in the case of <<glossary-query,scoring queries>>.

The reference is also meant to give the reader a link for comparison.


[[glossary-index]] index ::

An index is like a _table_ in a relational database. It has a
@@ -105,6 +115,20 @@
+
See also <<glossary-routing,routing>>

[[glossary-query]] query ::

A query is the basic component of a search. A search can be defined by one or more queries
which can be mixed and matched in endless combinations. The term Query refers to all queries
Member

I'd scratch this statement, I think of "Query" as a superset of both "scoring queries" and filters, so I wouldn't introduce a distinction here.

Contributor Author

I would like to clarify this. So you mean that whenever the term Query is used, it can mean either "filters" or "scoring queries"? So, unless it is clarified, it can be either (or, I guess, even both).

Member

Others might disagree, but my understanding is that we only have "queries" in Elasticsearch now (there used to be special filters in earlier versions). Some queries, if used in a "filtering context" as described in the link mentioned above, are also referred to as "filters" or "non-scoring queries", as opposed to "scoring queries" (when the queries produce scores for documents). Whether something is a filter or not depends on its location: e.g. a term_query is, generally speaking, always a query, but when it is used e.g. in a boolean query's filter clause, it is also referred to as a filter in our documentation.
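
As a rough illustration of "location decides", the sketch below builds a boolean query body as a plain Python dict. The clause under `must` is in query context and contributes to the score, while the clauses under `filter` only answer yes or no. The helper name and the field names `title`, `status`, and `created` are hypothetical; the date range mirrors the 2013 - 2014 example from the proposed glossary text.

```python
import json


def build_search_body(text, status):
    """Build a bool query mixing query context and filter context."""
    return {
        "query": {
            "bool": {
                # Query context: this clause is scored.
                "must": [{"match": {"title": text}}],
                # Filter context: the same kinds of queries, but they only
                # include or exclude documents and produce no score.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"created": {"gte": "2013-01-01", "lt": "2015-01-01"}}},
                ],
            }
        }
    }


# Hypothetical search text and status value, for illustration only.
body = build_search_body("quick brown fox", "published")
print(json.dumps(body, indent=2))
```

Moving the `term` clause from `filter` into `must` would leave the set of matching documents unchanged but make that clause contribute to the score, which is the sense in which the same query is sometimes called a filter.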

Member

The above explanation might be a bit verbose for the glossary however, so making it a bit more succinct would be the goal here.

which not only determine if a document matches, but also calculate how well the document matches.
Member

Maybe say "Queries that calculate how well the document matches are also known as "scoring queries", as opposed to queries that only determine if a document matches, which are also called <<glossary-filter,filters>>."

This calculation is refered to as scoring, hence these queries are also known as "scoring queries".
A scoring query calculates how relevant each document is to the query, and assigns
Member

I would leave out the notion of "relevance" here; it opens up yet another term that we'd need to define more carefully, I think. I would suggest using "how well a document matches a query", "score" and "sort by score" instead.

it a relevance score, which is later used to sort matching documents by relevance.
This concept of relevance is well suited to full-text search, where there is seldom a
Member

Just leave out "concept of relevance" if you agree with my former proposal.

completely “correct” answer. These queries are takes more resources than
Member

Maybe use "Scoring queries" at the beginning of this sentence here.

<<glossary-filter,non scoring queries>> and their query results are not cacheable.
As a general rule, use query clauses for full-text search or for any condition that should
affect the relevance score, and use filters for everything else.
Member

Maybe use "condition that requires scoring" if you agree on leaving out the notion of relevance here.


[[glossary-replica-shard]] replica shard ::

Each <<glossary-primary-shard,primary shard>> can have zero or more