opensearch-project · aetter · May 6, 2021 · May 5, 2021
@@ -1,6 +1,6 @@
 ---
 layout: default
-title: Boolean Queries
+title: Boolean queries
 parent: OpenSearch
 nav_order: 11
 ---

@@ -1,6 +1,6 @@
 ---
 layout: default
-title: Cluster Formation
+title: Cluster formation
 parent: OpenSearch
 nav_order: 2
 ---
@@ -13,17 +13,17 @@ OpenSearch can operate as a single-node or multi-node cluster. The steps to conf
 
 To create and deploy an OpenSearch cluster according to your requirements, it’s important to understand how node discovery and cluster formation work and what settings govern them.
 
-There are many ways that you can design a cluster. The following illustration shows a basic architecture.
+There are many ways to design a cluster. The following illustration shows a basic architecture:
 
 ![multi-node cluster architecture diagram](../../images/cluster.png)
 
 This is a four-node cluster that has one dedicated master node, one dedicated coordinating node, and two data nodes that are master-eligible and also used for ingesting data.
 
-The following table provides brief descriptions of the node types.
+The following table provides brief descriptions of the node types:
 
 Node type | Description | Best practices for production
 :--- | :--- | :---
-`Master` | Manages the overall operation of a cluster and keeps track of the cluster state. This includes creating and deleting indices, keeping track of the nodes that join and leave the cluster, checking the health of each node in the cluster (by running ping requests), and allocating shards to nodes. | Three dedicated master nodes in three different zones is the right approach for almost all production use cases. This makes sure your cluster never loses quorum. Two nodes will be idle for most of the time except when one node goes down or needs some maintenance.
+`Master` | Manages the overall operation of a cluster and keeps track of the cluster state. This includes creating and deleting indices, keeping track of the nodes that join and leave the cluster, checking the health of each node in the cluster (by running ping requests), and allocating shards to nodes. | Using three dedicated master nodes in three different zones is the right approach for almost all production use cases. This configuration ensures your cluster never loses quorum. Two nodes are idle most of the time, except when one node goes down or needs maintenance.
 `Master-eligible` | Elects one node among them as the master node through a voting process. | For production clusters, make sure you have dedicated master nodes. The way to achieve a dedicated node type is to mark all other node types as false. In this case, you have to mark all the other nodes as not master-eligible.
 `Data` | Stores and searches data. Performs all data-related operations (indexing, searching, aggregating) on local shards. These are the worker nodes of your cluster and need more disk space than any other node type. | As you add data nodes, keep them balanced between zones. For example, if you have three zones, add data nodes in multiples of three, one for each zone. We recommend using storage and RAM-heavy nodes.
 `Ingest` | Preprocesses data before storing it in the cluster. Runs an ingest pipeline that transforms your data before adding it to an index. | If you plan to ingest a lot of data and run complex ingest pipelines, we recommend you use dedicated ingest nodes. You can also optionally offload your indexing from the data nodes so that your data nodes are used exclusively for searching and aggregating.
@@ -37,11 +37,9 @@ This page demonstrates how to work with the different node types. It assumes tha
 
 ## Prerequisites
 
-Before you get started, you must install and configure OpenSearch on all of your nodes. For information about the available options, see [Install and Configure](../../install/).
+Before you get started, you must install and configure OpenSearch on all of your nodes. For information about the available options, see [Install and configure OpenSearch](../../install/).
 
-After you are done, use SSH to connect to each node, and then open the `config/opensearch.yml` file.
-
-You can set all configurations for your cluster in this file.
+After you're done, use SSH to connect to each node, then open the `config/opensearch.yml` file. You can set all configurations for your cluster in this file.
 
 ## Step 1: Name a cluster
 
@@ -132,7 +130,7 @@ node.ingest: false
 
 ## Step 3: Bind a cluster to specific IP addresses
 
-`network_host` defines the IP address that's used to bind the node. By default, OpenSearch listens on a local host, which limits the cluster to a single node. You can also use `_local_` and `_site_` to bind to any loopback or site-local address, whether IPv4 or IPv6:
+`network.host` defines the IP address used to bind the node. By default, OpenSearch listens on localhost, which limits the cluster to a single node. You can also use `_local_` and `_site_` to bind to any loopback or site-local address, whether IPv4 or IPv6:
 
 ```yml
 network.host: [_local_, _site_]
@@ -154,7 +152,7 @@ Now that you've configured the network hosts, you need to configure the discover
 
 Zen Discovery is the built-in, default mechanism that uses [unicast](https://en.wikipedia.org/wiki/Unicast) to find other nodes in the cluster.
 
-You can generally just add all of your master-eligible nodes to the `discovery.seed_hosts` array. When a node starts up, it finds the other master-eligible nodes, determines which one is the master, and asks to join the cluster.
+You can generally just add all your master-eligible nodes to the `discovery.seed_hosts` array. When a node starts up, it finds the other master-eligible nodes, determines which one is the master, and asks to join the cluster.
 
 For example, for `opensearch-master` the line looks something like this:
 
@@ -165,7 +163,7 @@ discovery.seed_hosts: ["<private IP of opensearch-d1>", "<private IP of opensear
 
 ## Step 5: Start the cluster
 
-After you set the configurations, start OpenSearch on all nodes.
+After you set the configurations, start OpenSearch on all nodes:
 
 ```bash
 sudo systemctl start opensearch.service
@@ -220,9 +218,9 @@ PUT _cluster/settings
 }
 ```
 
-You can either use `persistent` or `transient` settings. We recommend the `persistent` setting because it persists through a cluster reboot. Transient settings do not persist through a cluster reboot.
+You can use either `persistent` or `transient` settings. We recommend `persistent` settings because they persist through a cluster reboot. Transient settings don't persist through a cluster reboot.
 
-Shard allocation awareness attempts to separate primary and replica shards across multiple zones. But, if only one zone is available (such as after a zone failure), OpenSearch allocates replica shards to the only remaining zone.
+Shard allocation awareness attempts to separate primary and replica shards across multiple zones. However, if only one zone is available (such as after a zone failure), OpenSearch allocates replica shards to the only remaining zone.
 
 Another option is to require that primary and replica shards are never allocated to the same zone. This is called forced awareness.
 
@@ -238,7 +236,7 @@ PUT _cluster/settings
 }
 ```
 
-Now, if a data node fails, forced awareness does not allocate the replicas to a node in the same zone. Instead, the cluster enters a yellow state and only allocates the replicas when nodes in another zone come online.
+Now, if a data node fails, forced awareness doesn't allocate the replicas to a node in the same zone. Instead, the cluster enters a yellow state and only allocates the replicas when nodes in another zone come online.
 
 In our two-zone architecture, we can use allocation awareness if `opensearch-d1` and `opensearch-d2` are less than 50% utilized, so that each of them has the storage capacity to allocate replicas in the same zone.
 If that is not the case, and `opensearch-d1` and `opensearch-d2` do not have the capacity to contain all primary and replica shards, we can use forced awareness. This approach helps to make sure that, in the event of a failure, OpenSearch doesn't overload your last remaining zone and lock up your cluster due to lack of storage.
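
The capacity reasoning above can be sketched as a small helper. This is only an illustration of the 50% rule of thumb from the text, not an OpenSearch API: the function name `suggest_awareness_mode` and the utilization figures are hypothetical.

```python
def suggest_awareness_mode(utilization_by_node, threshold=0.5):
    """Suggest an awareness mode from per-node disk utilization (0.0 to 1.0).

    Hypothetical helper: if every data node is below the threshold, a single
    surviving zone has room for the replicas too, so plain allocation
    awareness is enough. Otherwise, forced awareness avoids overloading
    the last remaining zone.
    """
    if all(used < threshold for used in utilization_by_node.values()):
        return "allocation awareness"
    return "forced awareness"


# Hypothetical utilization figures for the two data nodes on this page.
light_cluster = {"opensearch-d1": 0.40, "opensearch-d2": 0.30}
heavy_cluster = {"opensearch-d1": 0.70, "opensearch-d2": 0.45}
```

With these sample figures, `suggest_awareness_mode(light_cluster)` suggests allocation awareness, while `suggest_awareness_mode(heavy_cluster)` suggests forced awareness because one node is already above 50%.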

@@ -1,6 +1,6 @@
 ---
 layout: default
-title: Full-Text Queries
+title: Full-text queries
 parent: OpenSearch
 nav_order: 10
 ---
@@ -421,10 +421,10 @@ Option | Valid values | Description
 `fuzziness` | `AUTO`, `0`, or a positive integer | The number of character edits (insert, delete, substitute) that it takes to change one word to another when determining whether a term matched a value. For example, the distance between `wined` and `wind` is 1. The default, `AUTO`, chooses a value based on the length of each term and is a good choice for most use cases.
 `fuzzy_transpositions` | Boolean | Setting `fuzzy_transpositions` to true (default) adds swaps of adjacent characters to the insert, delete, and substitute operations of the `fuzziness` option. For example, the distance between `wind` and `wnid` is 1 if `fuzzy_transpositions` is true (swap "n" and "i") and 2 if it is false (delete "n", insert "n"). <br /><br />If `fuzzy_transpositions` is false, `rewind` and `wnid` have the same distance (2) from `wind`, despite the more human-centric opinion that `wnid` is an obvious typo. The default is a good choice for most use cases.
 `lenient` | Boolean | Setting `lenient` to true lets you ignore data type mismatches between the query and the document field. For example, a query string of "8.2" could match a field of type `float`. The default is false.
-`low_freq_operator` | `and, or` | The operator for low-frequency terms. The default is `or`. See [Common Terms](#common-terms) queries and `operator` in this table.
+`low_freq_operator` | `and, or` | The operator for low-frequency terms. The default is `or`. See [Common terms](#common-terms) queries and `operator` in this table.
 `max_determinized_states` | Positive integer | The maximum number of "[states](https://lucene.apache.org/core/8_4_0/core/org/apache/lucene/util/automaton/Operations.html#DEFAULT_MAX_DETERMINIZED_STATES)" (a measure of complexity) that Lucene can create for query strings that contain regular expressions (e.g. `"query": "/wind.+?/"`). Larger numbers allow for queries that use more memory. The default is 10,000.
 `max_expansions` | Positive integer | Fuzzy queries "expand to" a number of matching terms that are within the distance specified in `fuzziness`. Then OpenSearch tries to match those terms against its indices. `max_expansions` specifies the maximum number of terms that the fuzzy query expands to. The default is 50.
-`minimum_should_match` | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you used the `or` operator, the number of terms that need to match for the document to be considered a match. For example, if `minimum_should_match` is 2, "wind often rising" does not match "The Wind Rises." If `minimum_should_match` is 1, it matches. This option also has `low_freq` and `high_freq` properties for [Common Terms](#common-terms) queries.
+`minimum_should_match` | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you used the `or` operator, the number of terms that need to match for the document to be considered a match. For example, if `minimum_should_match` is 2, "wind often rising" does not match "The Wind Rises." If `minimum_should_match` is 1, it matches. This option also has `low_freq` and `high_freq` properties for [Common terms](#common-terms) queries.
 `operator` | `or, and` | If the query string contains multiple search terms, whether all terms need to match (`and`) or only one term needs to match (`or`) for a document to be considered a match.
 `phrase_slop` | `0` (default) or a positive integer | See `slop`.
 `prefix_length` | `0` (default) or a positive integer | The number of leading characters that are not considered in fuzziness.
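
To make the `fuzziness` and `fuzzy_transpositions` descriptions in this table concrete, here is a minimal sketch of an edit distance with an optional adjacent-swap operation. It is a plain dynamic-programming illustration, not the automaton-based approach Lucene uses internally, and the function name `edit_distance` is an assumption made for this example.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Count the edits (insert, delete, substitute, and optionally a swap of
    two adjacent characters) needed to turn string a into string b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete
                d[i][j - 1] + 1,         # insert
                d[i - 1][j - 1] + cost,  # substitute (or match)
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]
```

Calling `edit_distance("wind", "wnid")` counts the swap as a single edit, while `edit_distance("wind", "wnid", transpositions=False)` needs two edits, which matches the distances given in the `fuzzy_transpositions` row.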

@@ -1,11 +1,11 @@
 ---
 layout: default
-title: Index Aliases
+title: Index aliases
 parent: OpenSearch
 nav_order: 4
 ---
 
-# Index alias
+# Index aliases
 
 An alias is a virtual index name that can point to one or more indices.
 

@@ -1,6 +1,6 @@
 ---
 layout: default
-title: Index Data
+title: Index data
 parent: OpenSearch
 nav_order: 3
 ---
@@ -16,9 +16,9 @@ For situations in which new data arrives incrementally (for example, customer or
 
 Before you can search data, you must *index* it. Indexing is the method by which search engines organize data for fast retrieval. The resulting structure is called, fittingly, an index.
 
-In OpenSearch, the basic unit of data is a JSON *document*. Within an index, OpenSearch identifies each document using a unique *ID*.
+In OpenSearch, the basic unit of data is a JSON *document*. Within an index, OpenSearch identifies each document using a unique ID.
 
-A request to the index API looks like the following:
+A request to the index API looks like this:
 
 ```json
 PUT <index>/_doc/<id>
@@ -31,7 +31,6 @@ A request to the `_bulk` API looks a little different, because you specify the i
 POST _bulk
 { "index": { "_index": "<index>", "_id": "<id>" } }
 { "A JSON": "document" }
-
 ```
 
 Bulk data must conform to a specific format, which requires a newline character (`\n`) at the end of every line, including the last line. This is the basic format:
@@ -41,10 +40,9 @@ Action and metadata\n
 Optional document\n
 Action and metadata\n
 Optional document\n
-
 ```
 
-The document is optional, because `delete` actions do not require a document. The other actions (`index`, `create`, and `update`) all require a document. If you specifically want the action to fail if the document already exists, use the `create` action instead of the `index` action.
+The document is optional, because `delete` actions don't require a document. The other actions (`index`, `create`, and `update`) all require a document. If you specifically want the action to fail if the document already exists, use the `create` action instead of the `index` action.
 {: .note }
 
 To index bulk data using the `curl` command, navigate to the folder where you have your file saved and run the following command:
@@ -55,14 +53,14 @@ curl -H "Content-Type: application/x-ndjson" -POST https://localhost:9200/data/_
 
 If any one of the actions in the `_bulk` API fails, OpenSearch continues to execute the other actions. Examine the `items` array in the response to figure out what went wrong. The entries in the `items` array are in the same order as the actions specified in the request.
 
-OpenSearch features automatic index creation when you add a document to an index that doesn't already exist. It also features automatic ID generation if you don't specify an ID in the request. This simple example automatically creates the movies index, indexes the document, and assigns it a unique ID:
+OpenSearch automatically creates an index when you add a document to an index that doesn't already exist. It also automatically generates an ID if you don't specify an ID in the request. This simple example automatically creates the movies index, indexes the document, and assigns it a unique ID:
 
 ```json
 POST movies/_doc
 { "title": "Spirited Away" }
 ```
 
-Automatic ID generation has a clear downside: because the indexing request didn't specify a document ID, you can't easily update the document at a later time. Also, if you run this request 10 times, OpenSearch indexes this document as 10 different documents with unique IDs. To specify an ID of 1, use the following request, and note the use of PUT instead of POST:
+Automatic ID generation has a clear downside: because the indexing request didn't specify a document ID, you can't easily update the document at a later time. Also, if you run this request 10 times, OpenSearch indexes this document as 10 different documents with unique IDs. To specify an ID of 1, use the following request (note the use of PUT instead of POST):
 
 ```json
 PUT movies/_doc/1
@@ -83,7 +81,7 @@ PUT more-movies
 OpenSearch indices have the following naming restrictions:
 
 - All letters must be lowercase.
-- Index names can't begin with `_` (underscore) or `-` (hyphen).
+- Index names can't begin with underscores (`_`) or hyphens (`-`).
 - Index names can't contain spaces, commas, or the following characters:
 
   `:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, or `<`
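
The naming restrictions above can be checked on the client side before sending a request. The following is a rough sketch that covers only the rules listed here; the helper name `index_name_problems` is hypothetical, and OpenSearch may enforce additional rules that this list does not mention.

```python
from typing import List

# Space, comma, and the special characters listed in the restrictions above.
FORBIDDEN_CHARS = set(' ,:"*+/\\|?#><')


def index_name_problems(name: str) -> List[str]:
    """Return the listed naming rules that the given index name violates."""
    problems = []
    if name != name.lower():
        problems.append("contains uppercase letters")
    if name.startswith(("_", "-")):
        problems.append("begins with an underscore or hyphen")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("contains forbidden characters: " + " ".join(repr(c) for c in bad))
    return problems
```

An empty list means the name passes these checks. For example, `index_name_problems("more-movies")` returns an empty list, while a name such as `_Logs` is reported for both its leading underscore and its uppercase letter.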

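The `_bulk` format described earlier in this file (an action-and-metadata line, an optional document line, and a newline after every line, including the last) can be assembled programmatically. This is a minimal sketch using only the Python standard library; the function name `build_bulk_body` and the sample actions are assumptions for illustration, and sending the body to a cluster is left out.

```python
import json


def build_bulk_body(actions):
    """Build a newline-delimited _bulk request body.

    actions: iterable of (action, metadata, document) tuples, where
    document is None for actions that take no document, such as delete.
    """
    lines = []
    for action, metadata, document in actions:
        lines.append(json.dumps({action: metadata}))  # action and metadata
        if document is not None:
            lines.append(json.dumps(document))        # optional document
    # Every line, including the last, must end with a newline character.
    return "".join(line + "\n" for line in lines)


# Hypothetical sample: index one document and delete another.
sample_body = build_bulk_body([
    ("index", {"_index": "movies", "_id": "1"}, {"title": "Spirited Away"}),
    ("delete", {"_index": "movies", "_id": "2"}, None),
])
```

The resulting string can be saved to a file and sent with the `curl` command shown in this file, using the `application/x-ndjson` content type.
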
@@ -1,17 +1,14 @@
 ---
 layout: default
-title: Index Templates
+title: Index templates
 parent: OpenSearch
 nav_order: 5
 ---
 
-# Index template
+# Index templates
 
 Index templates let you initialize new indices with predefined mappings and settings. For example, if you continuously index log data, you can define an index template so that all of these indices have the same number of shards and replicas.
 
-OpenSearch switched from `_template` to `_index_template` in version 7.8. Use `_template` for older versions of OpenSearch.
-{: .note }
-
 ---
 
 #### Table of contents
@@ -21,7 +18,7 @@ OpenSearch switched from `_template` to `_index_template` in version 7.8. Use `_
 
 ---
 
-## Create template
+## Create a template
 
 To create an index template, use a POST request:
 
@@ -110,7 +107,7 @@ GET logs-2020-01-01
 
 Any additional indices that match this pattern---`logs-2020-01-02`, `logs-2020-01-03`, and so on---will inherit the same mappings and settings.
 
-## Retrieve template
+## Retrieve a template
 
 To list all index templates:
 
@@ -148,7 +145,7 @@ You can create multiple index templates for your indices. If the index name matc
 
 The settings from the more recently created index templates override the settings of older index templates. So, you can first define a few common settings in a generic template that can act as a catch-all and then add more specialized settings as required.
 
-An even better approach is to explicitly specify template priority using the `order` parameter. OpenSearch applies templates with lower priority numbers first and then overrides them with templates that have higher priority numbers.
+An even better approach is to explicitly specify template priority using the `priority` parameter. OpenSearch applies templates with lower priority numbers first and then overrides them with templates with higher priority numbers.
 
 For example, say you have the following two templates that both match the `logs-2020-01-02` index and there’s a conflict in the `number_of_shards` field:
 
@@ -188,19 +185,19 @@ PUT _index_template/template-02
 
 Because `template-02` has a higher `priority` value, it takes precedence over `template-01`. The `logs-2020-01-02` index would have a `number_of_shards` value of 3.
 
-## Delete template
+## Delete a template
 
-You can delete an index template using its name, as shown in the following command:
+You can delete an index template using its name:
 
 ```json
 DELETE _index_template/daily_logs
 ```
 
 ## Index template options
 
-You can specify the options shown in the following table:
+You can specify the following template options:
 
 Option | Type | Description | Required
 :--- | :--- | :--- | :---
-`priority` | `Number` | Specify the priority of the index template.  | No
-`create` | `Boolean` | Specify whether this index template should replace an existing one. | No
+`priority` | `Number` | The priority of the index template. | No
+`create` | `Boolean` | Whether this index template should replace an existing one. | No
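
The `priority` conflict described earlier in this file can be modeled with a short sketch: among the templates whose patterns match an index name, the one with the highest `priority` wins. This only illustrates the precedence rule, not how OpenSearch resolves templates internally. The index patterns, the priority values, and the `number_of_shards` value for `template-01` are hypothetical, because the full template definitions are not shown here.

```python
from fnmatch import fnmatchcase


def winning_template(index_name, templates):
    """Return the matching template with the highest priority, or None."""
    matching = [
        t for t in templates
        if any(fnmatchcase(index_name, p) for p in t["index_patterns"])
    ]
    return max(matching, key=lambda t: t["priority"], default=None)


# Hypothetical stand-ins for template-01 and template-02.
templates = [
    {"name": "template-01", "index_patterns": ["logs-*"], "priority": 0,
     "settings": {"number_of_shards": 2}},
    {"name": "template-02", "index_patterns": ["logs-2020-01-*"], "priority": 1,
     "settings": {"number_of_shards": 3}},
]
```

For `logs-2020-01-02`, both templates match and `template-02` wins, so the index gets a `number_of_shards` value of 3. An index name that matches no pattern, such as `movies`, yields `None`.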