Commit 35911d8

Split the ingest processor docs into multiple files (#36887)
This commit breaks the single ingest docs file into multiple files, factoring out the processor docs into a documentation file per processor. This will help make this content easier to maintain.
1 parent 1a23417 commit 35911d8

29 files changed: 1846 additions and 1851 deletions

docs/reference/ingest/ingest-node.asciidoc

Lines changed: 26 additions & 1851 deletions
Large diffs are not rendered by default.
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+[[append-processor]]
+=== Append Processor
+Appends one or more values to an existing array if the field already exists and it is an array.
+Converts a scalar to an array and appends one or more values to it if the field exists and it is a scalar.
+Creates an array containing the provided values if the field doesn't exist.
+Accepts a single value or an array of values.
+
+[[append-options]]
+.Append Options
+[options="header"]
+|======
+| Name | Required | Default | Description
+| `field` | yes | - | The field to be appended to. Supports <<accessing-template-fields,template snippets>>.
+| `value` | yes | - | The value to be appended. Supports <<accessing-template-fields,template snippets>>.
+include::common-options.asciidoc[]
+|======
+
+[source,js]
+--------------------------------------------------
+{
+  "append": {
+    "field": "tags",
+    "value": ["production", "{{app}}", "{{owner}}"]
+  }
+}
+--------------------------------------------------
+// NOTCONSOLE
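
The three cases the Append Processor description distinguishes (existing array, existing scalar, missing field) can be sketched in a few lines of Python. This is only a model of the documented behaviour, not Elasticsearch's implementation: the `append` function and the plain `dict` standing in for a document are hypothetical, and template snippets such as `{{app}}` are not resolved.

```python
def append(doc, field, value):
    """Apply the documented append rules to a dict standing in for a document."""
    # The processor accepts a single value or an array of values.
    values = list(value) if isinstance(value, list) else [value]
    if field not in doc:
        # Missing field: create an array containing the provided values.
        doc[field] = values
    elif isinstance(doc[field], list):
        # Existing array: append the values to it.
        doc[field].extend(values)
    else:
        # Existing scalar: convert it to an array, then append.
        doc[field] = [doc[field]] + values
    return doc


print(append({"tags": ["web"]}, "tags", ["production", "eu"]))
print(append({"tags": "web"}, "tags", "production"))
print(append({}, "tags", "production"))
```

In all three cases the field ends up as an array, so later processors can treat `tags` uniformly.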
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+[[bytes-processor]]
+=== Bytes Processor
+Converts a human-readable byte value (e.g. 1kb) to its value in bytes (e.g. 1024).
+
+Supported human-readable units are "b", "kb", "mb", "gb", "tb", and "pb" (case insensitive). An error occurs if
+the field is not in a supported format or the resulting value exceeds 2^63.
+
+[[bytes-options]]
+.Bytes Options
+[options="header"]
+|======
+| Name | Required | Default | Description
+| `field` | yes | - | The field to convert
+| `target_field` | no | `field` | The field to assign the converted value to, by default `field` is updated in-place
+| `ignore_missing` | no | `false` | If `true` and `field` does not exist or is `null`, the processor quietly exits without modifying the document
+include::common-options.asciidoc[]
+|======
+
+[source,js]
+--------------------------------------------------
+{
+  "bytes": {
+    "field": "file.size"
+  }
+}
+--------------------------------------------------
+// NOTCONSOLE
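
To make the unit arithmetic concrete, here is a minimal Python sketch of the conversion the Bytes Processor description outlines. It is an illustration under stated assumptions rather than the processor's actual parsing code: the `to_bytes` helper is hypothetical, only whole numbers followed directly by a unit are accepted, and the multipliers are powers of 1024 to match the `1kb` to `1024` example.

```python
import re

# Binary multipliers, consistent with the documented 1kb -> 1024 example.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "pb": 1024**5}
LIMIT = 2**63  # upper bound quoted in the processor description


def to_bytes(text):
    """Convert a value such as '1kb' to a number of bytes."""
    # Assumption: an integer immediately followed by a unit, matched case-insensitively.
    match = re.fullmatch(r"(\d+)(b|kb|mb|gb|tb|pb)", text.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unsupported byte value: {text!r}")
    number, unit = match.groups()
    result = int(number) * UNITS[unit.lower()]
    if result > LIMIT:
        raise ValueError(f"{text!r} exceeds 2^63 bytes")
    return result


print(to_bytes("1kb"))
print(to_bytes("3Mb"))
```

Both error paths raise `ValueError`, mirroring the documented rule that an unsupported format or an oversized result is an error rather than a silent pass-through.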
File renamed without changes.
Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
+[[convert-processor]]
+=== Convert Processor
+Converts a field in the currently ingested document to a different type, such as converting a string to an integer.
+If the field value is an array, all members will be converted.
+
+The supported types include: `integer`, `long`, `float`, `double`, `string`, `boolean`, and `auto`.
+
+Specifying `boolean` will set the field to true if its string value is equal to `true` (ignoring case), to
+false if its string value is equal to `false` (ignoring case), or it will throw an exception otherwise.
+
+Specifying `auto` will attempt to convert the string-valued `field` into the closest non-string type.
+For example, a field whose value is `"true"` will be converted to its respective boolean type: `true`. Note
+that float takes precedence over double in `auto`. A value of `"242.15"` will "automatically" be converted to
+`242.15` of type `float`. If a provided field cannot be appropriately converted, the Convert Processor will
+still process successfully and leave the field value as-is. In such a case, `target_field` will
+still be updated with the unconverted field value.
+
+[[convert-options]]
+.Convert Options
+[options="header"]
+|======
+| Name | Required | Default | Description
+| `field` | yes | - | The field whose value is to be converted
+| `target_field` | no | `field` | The field to assign the converted value to, by default `field` is updated in-place
+| `type` | yes | - | The type to convert the existing value to
+| `ignore_missing` | no | `false` | If `true` and `field` does not exist or is `null`, the processor quietly exits without modifying the document
+include::common-options.asciidoc[]
+|======
+
+[source,js]
+--------------------------------------------------
+PUT _ingest/pipeline/my-pipeline-id
+{
+  "description": "converts the content of the id field to an integer",
+  "processors" : [
+    {
+      "convert" : {
+        "field" : "id",
+        "type": "integer"
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// NOTCONSOLE
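
The `boolean` and `auto` rules in the Convert Processor description can be modelled roughly as follows. This Python sketch is not the processor's code and makes several assumptions: the helper names are hypothetical, `auto` tries boolean, then integer, then floating point (the description only states that float is preferred over double), and Python has a single float type, so the float/double distinction is not represented. Python's `int` and `float` parsers are also more lenient than a strict parser would be.

```python
def to_boolean(text):
    """'true'/'false' in any letter case map to booleans; anything else is an error."""
    lowered = text.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"{text!r} is not convertible to boolean")


def auto(value):
    """Convert a string to the closest non-string type, or return it unchanged."""
    if not isinstance(value, str):
        return value
    # Assumed order of attempts: boolean, integer, floating point.
    for parse in (to_boolean, int, float):
        try:
            return parse(value)
        except ValueError:
            continue
    # Nothing matched: the value is left as-is instead of raising.
    return value


def convert(value, parse):
    """If the field value is an array, every member is converted."""
    if isinstance(value, list):
        return [parse(member) for member in value]
    return parse(value)


print(convert("242.15", auto))
print(convert(["1", "true", "n/a"], auto))
```

The key difference between the two modes is the failure behaviour: `to_boolean` raises on unexpected input, while `auto` falls back to the original string.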
Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
+[[date-index-name-processor]]
+=== Date Index Name Processor
+
+The purpose of this processor is to point documents to the right time-based index based
+on a date or timestamp field in a document by using the <<date-math-index-names, date math index name support>>.
+
+The processor sets the `_index` meta field with a date math index name expression based on the provided index name
+prefix, a date or timestamp field in the documents being processed and the provided date rounding.
+
+First, this processor fetches the date or timestamp from a field in the document being processed. Optionally,
+date formatting can be configured on how the field's value should be parsed into a date. Then this date,
+the provided index name prefix and the provided date rounding get formatted into a date math index name expression.
+Optionally, date formatting can also be specified for how the date should be formatted into a date math index name
+expression.
+
+An example pipeline that points documents to a monthly index that starts with a `myindex-` prefix based on a
+date in the `date1` field:
+
+[source,js]
+--------------------------------------------------
+PUT _ingest/pipeline/monthlyindex
+{
+  "description": "monthly date-time index naming",
+  "processors" : [
+    {
+      "date_index_name" : {
+        "field" : "date1",
+        "index_name_prefix" : "myindex-",
+        "date_rounding" : "M"
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// CONSOLE
+
+
+Using that pipeline for an index request:
+
+[source,js]
+--------------------------------------------------
+PUT /myindex/_doc/1?pipeline=monthlyindex
+{
+  "date1" : "2016-04-25T12:02:01.789Z"
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+[source,js]
+--------------------------------------------------
+{
+  "_index" : "myindex-2016-04-01",
+  "_type" : "_doc",
+  "_id" : "1",
+  "_version" : 1,
+  "result" : "created",
+  "_shards" : {
+    "total" : 2,
+    "successful" : 1,
+    "failed" : 0
+  },
+  "_seq_no" : 55,
+  "_primary_term" : 1
+}
+--------------------------------------------------
+// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
+
+
+The above request will not index this document into the `myindex` index, but into the `myindex-2016-04-01` index because
+the date was rounded down to the month. The Date Index Name processor overrides the `_index` property of the document.
+
+To see the date-math value of the index supplied in the actual index request which resulted in the above document being
+indexed into `myindex-2016-04-01`, we can inspect the effects of the processor using a simulate request.
+
+
+[source,js]
+--------------------------------------------------
+POST _ingest/pipeline/_simulate
+{
+  "pipeline" :
+  {
+    "description": "monthly date-time index naming",
+    "processors" : [
+      {
+        "date_index_name" : {
+          "field" : "date1",
+          "index_name_prefix" : "myindex-",
+          "date_rounding" : "M"
+        }
+      }
+    ]
+  },
+  "docs": [
+    {
+      "_source": {
+        "date1": "2016-04-25T12:02:01.789Z"
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// CONSOLE
+
+and the result:
+
+[source,js]
+--------------------------------------------------
+{
+  "docs" : [
+    {
+      "doc" : {
+        "_id" : "_id",
+        "_index" : "<myindex-{2016-04-25||/M{yyyy-MM-dd|UTC}}>",
+        "_type" : "_type",
+        "_source" : {
+          "date1" : "2016-04-25T12:02:01.789Z"
+        },
+        "_ingest" : {
+          "timestamp" : "2016-11-08T19:43:03.850+0000"
+        }
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// TESTRESPONSE[s/2016-11-08T19:43:03.850\+0000/$body.docs.0.doc._ingest.timestamp/]
+
+The above example shows that `_index` was set to `<myindex-{2016-04-25||/M{yyyy-MM-dd|UTC}}>`. Elasticsearch
+understands this to mean `2016-04-01`, as explained in the <<date-math-index-names, date math index name documentation>>.
+
+[[date-index-name-options]]
+.Date index name options
+[options="header"]
+|======
+| Name | Required | Default | Description
+| `field` | yes | - | The field to get the date or timestamp from.
+| `index_name_prefix` | no | - | A prefix of the index name to be prepended before the printed date. Supports <<accessing-template-fields,template snippets>>.
+| `date_rounding` | yes | - | How to round the date when formatting the date into the index name. Valid values are: `y` (year), `M` (month), `w` (week), `d` (day), `h` (hour), `m` (minute) and `s` (second). Supports <<accessing-template-fields,template snippets>>.
+| `date_formats` | no | yyyy-MM-dd'T'HH:mm:ss.SSSZ | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
+| `timezone` | no | UTC | The timezone to use when parsing the date and when date math index support resolves expressions into concrete index names.
+| `locale` | no | ENGLISH | The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.
+| `index_name_format` | no | yyyy-MM-dd | The format to be used when printing the parsed date into the index name. A valid Joda pattern is expected here. Supports <<accessing-template-fields,template snippets>>.
+include::common-options.asciidoc[]
+|======
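
The two stages in the Date Index Name walkthrough (building the date math expression, then resolving it to a concrete index name) can be reproduced with a short Python sketch. It is a model under stated assumptions, not Elasticsearch's date math implementation: the helper names are hypothetical, the default `yyyy-MM-dd` format and `UTC` timezone are hard-coded, and week rounding (`w`) is left out because the text here does not say on which day a week starts.

```python
from datetime import datetime


def round_down(date, unit):
    """Truncate a datetime to the start of the given unit (week rounding omitted)."""
    if unit == "y":
        return date.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "M":
        return date.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "d":
        return date.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "h":
        return date.replace(minute=0, second=0, microsecond=0)
    if unit == "m":
        return date.replace(second=0, microsecond=0)
    if unit == "s":
        return date.replace(microsecond=0)
    raise ValueError(f"unsupported rounding unit: {unit!r}")


def date_math_expression(prefix, date, unit):
    """Build the expression the processor stores in `_index` (default format and timezone)."""
    return "<" + prefix + "{" + date.strftime("%Y-%m-%d") + "||/" + unit + "{yyyy-MM-dd|UTC}}>"


def resolve(prefix, date, unit):
    """Model how the expression resolves to a concrete index name."""
    return prefix + round_down(date, unit).strftime("%Y-%m-%d")


# Same timestamp as the `date1` field in the example requests.
date1 = datetime.strptime("2016-04-25T12:02:01.789Z", "%Y-%m-%dT%H:%M:%S.%f%z")
print(date_math_expression("myindex-", date1, "M"))
print(resolve("myindex-", date1, "M"))
```

With `date_rounding` set to `M`, the expression keeps the original date (`2016-04-25`) and the rounding is applied only on resolution, which is why the simulate output and the final index name differ.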
Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
+[[date-processor]]
+=== Date Processor
+
+Parses dates from fields, and then uses the date or timestamp as the timestamp for the document.
+By default, the date processor adds the parsed date as a new field called `@timestamp`. You can specify a
+different field by setting the `target_field` configuration parameter. Multiple date formats are supported
+as part of the same date processor definition. They will be used sequentially to attempt parsing the date field,
+in the same order they were defined as part of the processor definition.
+
+[[date-options]]
+.Date options
+[options="header"]
+|======
+| Name | Required | Default | Description
+| `field` | yes | - | The field to get the date from.
+| `target_field` | no | @timestamp | The field that will hold the parsed date.
+| `formats` | yes | - | An array of the expected date formats. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
+| `timezone` | no | UTC | The timezone to use when parsing the date. Supports <<accessing-template-fields,template snippets>>.
+| `locale` | no | ENGLISH | The locale to use when parsing the date, relevant when parsing month names or week days. Supports <<accessing-template-fields,template snippets>>.
+include::common-options.asciidoc[]
+|======
+
+Here is an example that adds the parsed date to the `timestamp` field based on the `initial_date` field:
+
+[source,js]
+--------------------------------------------------
+{
+  "description" : "...",
+  "processors" : [
+    {
+      "date" : {
+        "field" : "initial_date",
+        "target_field" : "timestamp",
+        "formats" : ["dd/MM/yyyy HH:mm:ss"],
+        "timezone" : "Europe/Amsterdam"
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// NOTCONSOLE
+
+The `timezone` and `locale` processor parameters are templated. This means that their values can be
+extracted from fields within documents. The example below reads the timezone and locale from the
+`my_timezone` and `my_locale` fields of the ingested document.
+
+[source,js]
+--------------------------------------------------
+{
+  "description" : "...",
+  "processors" : [
+    {
+      "date" : {
+        "field" : "initial_date",
+        "target_field" : "timestamp",
+        "formats" : ["ISO8601"],
+        "timezone" : "{{my_timezone}}",
+        "locale" : "{{my_locale}}"
+      }
+    }
+  ]
+}
+--------------------------------------------------
+// NOTCONSOLE
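
The Date Processor description says the configured formats are tried sequentially, in the order they were defined. A minimal Python sketch of that strategy follows. It is illustrative only: `parse_first_match` is a hypothetical helper, it uses Python `strptime` patterns rather than Joda patterns, and it does not model the `timezone` or `locale` options.

```python
from datetime import datetime


def parse_first_match(text, formats):
    """Try each format in order and return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # This format did not match; fall through to the next one.
            continue
    raise ValueError(f"unable to parse {text!r} with any of {formats!r}")


formats = ["%Y-%m-%dT%H:%M:%S", "%d/%m/%Y %H:%M:%S"]
print(parse_first_match("25/04/2016 12:02:01", formats))
print(parse_first_match("2016-04-25T12:02:01", formats))
```

Because the first matching format wins, the order of the list matters when an input could be parsed by more than one of the configured formats.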
