@@ -55,9 +55,7 @@ For information about machine learning settings, see [ML Commons cluster setting

## Neural Search plugin settings

The Security Analytics plugin supports the following settings:

- `plugins.neural_search.hybrid_search_disabled` (Dynamic, Boolean): Disables hybrid search. Default is `false`.
For information about Neural Search plugin settings, see [Neural Search plugin settings]({{site.url}}{{site.baseurl}}/vector-search/settings/#neural-search-plugin-settings).

## Notifications plugin settings

187 changes: 185 additions & 2 deletions _vector-search/ai-search/hybrid-search/collapse.md
@@ -17,8 +17,7 @@

When using `collapse` in a hybrid query, note the following considerations:

- Inner hits are not supported.
- Performance may be impacted when working with large result sets.
- Performance may be impacted when working with large result sets. Starting with OpenSearch 3.2, the index-level [`index.neural_search.hybrid_collapse_docs_per_group_per_subquery`]({{site.url}}{{site.baseurl}}/vector-search/settings/#hybrid-collapse-docs-per-group) setting controls how many documents are stored per group per subquery.

Check failure on line 20 in _vector-search/ai-search/hybrid-search/collapse.md (GitHub Actions / style-job): [vale] [OpenSearch.SpacingPunctuation] There should be no space before and one space after the punctuation mark in 'sets. Starting'. Severity: ERROR.
- Aggregations run on pre-collapsed results, not the final output.
- Pagination behavior changes: Because `collapse` reduces the total number of results, it can affect how results are distributed across pages. To retrieve more results, consider increasing the pagination depth.
- Results may differ from those returned by the [`collapse` response processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/collapse-processor/), which applies collapse logic after the query is executed.
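
As a minimal sketch of the performance consideration above, the following request limits how many documents are stored per group per subquery for the `bakery-items` index. The value `50` is illustrative only and is not a recommendation from the documentation; it assumes the setting accepts values in the documented `0`--`1000` range and can be updated on an existing index using the Update Settings API:

```json
PUT /bakery-items/_settings
{
  "index.neural_search.hybrid_collapse_docs_per_group_per_subquery": 50
}
```
{% include copy-curl.html %}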
@@ -528,3 +527,187 @@
]
}
```

## Retrieving inner hits for collapsed hybrid query results
**Introduced 3.2**
{: .label .label-purple }

You can use the `inner_hits` parameter within the `collapse` parameter to retrieve additional documents from each collapsed group.

The following example uses the `bakery-items` index created earlier. It searches for cake items, collapses (groups) the results by the `item` field, and returns the two cheapest items for each collapsed value:

```json
GET /bakery-items/_search?search_pipeline=norm-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "item": "Chocolate Cake"
          }
        },
        {
          "bool": {
            "must": {
              "match": {
                "category": "cakes"
              }
            }
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "item",
    "inner_hits": [
      {
        "name": "cheapest_items",
        "size": 2,
        "sort": ["price"]
      }
    ]
  }
}
```
{% include copy-curl.html %}

In the response, the main `hits` contain the top-scoring document from each collapsed group. The `inner_hits` contain the two cheapest items from each group:

<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
  ...
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "bakery-items",
        "_id": "bIe6e5gBAB5HT6ixTd4F",
        "_score": 1,
        "_source": {
          "item": "Chocolate Cake",
          "category": "cakes",
          "price": 15,
          "baked_date": "2023-07-01T00:00:00Z"
        },
        "fields": {
          "item": [
            "Chocolate Cake"
          ]
        },
        "inner_hits": {
          "cheapest_items": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                {
                  "_index": "bakery-items",
                  "_id": "bIe6e5gBAB5HT6ixTd4F",
                  "_score": null,
                  "_source": {
                    "item": "Chocolate Cake",
                    "category": "cakes",
                    "price": 15,
                    "baked_date": "2023-07-01T00:00:00Z"
                  },
                  "sort": [
                    15
                  ]
                },
                {
                  "_index": "bakery-items",
                  "_id": "bYe6e5gBAB5HT6ixTd4F",
                  "_score": null,
                  "_source": {
                    "item": "Chocolate Cake",
                    "category": "cakes",
                    "price": 18,
                    "baked_date": "2023-07-04T00:00:00Z"
                  },
                  "sort": [
                    18
                  ]
                }
              ]
            }
          }
        }
      },
      {
        "_index": "bakery-items",
        "_id": "boe6e5gBAB5HT6ixTd4F",
        "_score": 0.5005,
        "_source": {
          "item": "Vanilla Cake",
          "category": "cakes",
          "price": 12,
          "baked_date": "2023-07-02T00:00:00Z"
        },
        "fields": {
          "item": [
            "Vanilla Cake"
          ]
        },
        "inner_hits": {
          "cheapest_items": {
            "hits": {
              "total": {
                "value": 3,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                {
                  "_index": "bakery-items",
                  "_id": "boe6e5gBAB5HT6ixTd4F",
                  "_score": null,
                  "_source": {
                    "item": "Vanilla Cake",
                    "category": "cakes",
                    "price": 12,
                    "baked_date": "2023-07-02T00:00:00Z"
                  },
                  "sort": [
                    12
                  ]
                },
                {
                  "_index": "bakery-items",
                  "_id": "b4e6e5gBAB5HT6ixTd4F",
                  "_score": null,
                  "_source": {
                    "item": "Vanilla Cake",
                    "category": "cakes",
                    "price": 16,
                    "baked_date": "2023-07-03T00:00:00Z"
                  },
                  "sort": [
                    16
                  ]
                }
              ]
            }
          }
        }
      }
    ]
  }
}
```

</details>
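
The `inner_hits` parameter in the preceding request is an array, so a variation might request more than one inner hit set per group. The following sketch assumes that hybrid queries accept the same `inner_hits` options as the standard `collapse` parameter (multiple named entries and an explicit sort order) and that `baked_date` is mapped as a date field. The `newest_items` name is hypothetical, and this behavior is not confirmed by the section above:

```json
GET /bakery-items/_search?search_pipeline=norm-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "item": "Chocolate Cake"
          }
        },
        {
          "bool": {
            "must": {
              "match": {
                "category": "cakes"
              }
            }
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "item",
    "inner_hits": [
      {
        "name": "cheapest_items",
        "size": 2,
        "sort": ["price"]
      },
      {
        "name": "newest_items",
        "size": 1,
        "sort": [{ "baked_date": "desc" }]
      }
    ]
  }
}
```
{% include copy-curl.html %}

If the request is accepted, each collapsed hit would be expected to carry one entry per name under `inner_hits`.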
22 changes: 22 additions & 0 deletions _vector-search/settings.md
@@ -98,3 +98,25 @@
{% include copy.html %}

You can reload the secure settings without restarting the node by using the [Nodes Reload Secure Settings API]({{site.url}}{{site.baseurl}}/api-reference/nodes-apis/nodes-reload-secure/).

## Neural Search plugin settings

The Neural Search plugin supports the following settings.

### Cluster settings

The following Neural Search plugin settings apply at the cluster level:

- `plugins.neural_search.hybrid_search_disabled` (Dynamic, Boolean): Disables hybrid search. Default is `false`.
Review comment (Member): Let's not mention this setting; it's on the path to deprecation. It was added at a very early stage, when the hybrid query was introduced as experimental.

Reply (Collaborator): Done, thanks.

- `plugins.neural_search.stats_enabled` (Dynamic, Boolean): Enables the [Neural Search Stats API]({{site.url}}{{site.baseurl}}/vector-search/api/neural/#stats). Default is `false`.
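
Because the cluster-level settings are dynamic, they can be changed without restarting the cluster. As a minimal sketch, the following request enables the Neural Search Stats API using the Cluster Settings API. Using `persistent` rather than `transient` is a choice made for this example:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.neural_search.stats_enabled": true
  }
}
```
{% include copy-curl.html %}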

### Index settings

The following Neural Search plugin settings apply at the index level:

- `index.neural_search.semantic_ingest_batch_size` (Dynamic, integer): Specifies the number of documents batched together when generating embeddings for `semantic` fields during ingestion. Default is `10`.

<p id="hybrid-collapse-docs-per-group"></p>

- `index.neural_search.hybrid_collapse_docs_per_group_per_subquery` (Dynamic, integer): Controls how many documents are stored per group per subquery. By default, the value is set to the `size` parameter specified in the query. Lower values prioritize latency, while higher values increase recall. Valid values are `0`--`1000`, inclusive. A value of `0` uses the `size` parameter from the query, not zero documents.
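
The index-level settings can be sketched in a single index creation request. The index name `my-semantic-index` is hypothetical, the batch size of `20` is illustrative, and the example assumes that both settings can be supplied when the index is created, as is typical for index-level settings. Setting the collapse value to `0` explicitly requests the documented behavior of using the query's `size` parameter:

```json
PUT /my-semantic-index
{
  "settings": {
    "index.neural_search.semantic_ingest_batch_size": 20,
    "index.neural_search.hybrid_collapse_docs_per_group_per_subquery": 0
  }
}
```
{% include copy-curl.html %}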

Check failure on line 122 in _vector-search/settings.md (GitHub Actions / style-job): [vale] [OpenSearch.SpacingPunctuation] There should be no space before and one space after the punctuation mark in 'recall. Valid'. Severity: ERROR.