You can use the `key_value` processor to parse the specified field into key-value pairs. You can customize the `key_value` processor to parse field information with the following options. The type for each of the following options is `string`.
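For example, with the default settings (`source` set to `message`, `destination` set to `parsed_message`, pairs split on `&`, and keys split from values on `=`), an illustrative event such as `{"message": "key1=value1&key2=value2"}` produces the following output event:

```json
{
  "message": "key1=value1&key2=value2",
  "parsed_message": {
    "key1": "value1",
    "key2": "value2"
  }
}
```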
## Examples
The following examples demonstrate several configurations you can use with this processor.
The examples don't use security and are for demonstration purposes only. We strongly recommend configuring SSL before using these examples in production.
{: .warning}
### Key-value parsing, normalization, and deduplication
The following example parses the `message` field into key-value pairs, normalizes and cleans the keys, prefixes them with `meta_`, deduplicates repeated values, drops keys that have no value, and writes the results to the `parsed_kv` object:
```yaml
kv-basic-pipeline:
  source:
    http:
      path: /logs
      ssl: false

  processor:
    - key_value:
        # Read key=value pairs from the "message" field (the default source)
        source: message
        # Write parsed pairs into a nested object named "parsed_kv"
        destination: parsed_kv

        # Split pairs on '&' and split keys from values on '='
        field_split_characters: "&"
        value_split_characters: "="

        # Normalize keys and trim stray whitespace around keys and values
        transform_key: lowercase
        delete_key_regex: "\\s+"           # remove spaces from keys
        delete_value_regex: "^\\s+|\\s+$"  # trim leading/trailing spaces from values

        # Add a prefix to every key (after normalization and delete_key_regex)
        prefix: meta_

        # Keep a single unique value for duplicate keys
        skip_duplicate_values: true

        # Drop keys that have no value at all (for example, `novalue` with no delimiter)
        drop_keys_with_no_value: true

  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: admin
        password: admin_password
        index_type: custom
        index: kv-basic-%{yyyy.MM.dd}
```
{% include copy.html %}
You can test this pipeline using the following command:
```bash
curl -sS -X POST "http://localhost:2021/logs" \
  -H "Content-Type: application/json" \
  -d '[
    {"message":"key1=value1&key1=value1&Key Two = value two & empty=&novalue"},
    {"message":"ENV = prod & TEAM = core & owner = alice "}
  ]'
```
{% include copy.html %}
The documents stored in OpenSearch contain the following information:
```json
{
  ...
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "kv-basic-2025.10.14",
        "_id": "M6d84pkB3P3jd6EROH_f",
        "_score": 1,
        "_source": {
          "message": "key1=value1&key1=value1&Key Two = value two & empty=&novalue",
          "parsed_kv": {
            "meta_key1": "value1",
            "meta_empty": "",
            "meta_keytwo": "value two"
          }
        }
      },
      {
        "_index": "kv-basic-2025.10.14",
        "_id": "NKd84pkB3P3jd6EROH_f",
        "_score": 1,
        "_source": {
          "message": "ENV = prod & TEAM = core & owner = alice ",
          "parsed_kv": {
            "meta_owner": "alice",
            "meta_team": "core",
            "meta_env": "prod"
          }
        }
      }
    ]
  }
}
```
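The preceding response is the result of searching the sink index. As an illustrative assumption, if the cluster from the sink configuration is reachable on `localhost:9200` from your machine, a query similar to the following retrieves the parsed documents:

```bash
curl -sk -u admin:admin_password "https://localhost:9200/kv-basic-*/_search?pretty"
```
{% include copy.html %}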
### Grouped values to root
The following example parses the `payload` field by using `&&` to separate pairs and `==` to separate keys and values. It preserves bracketed groups as single values, writes the parsed results to the event root without overwriting existing fields, and records any unmatched tokens as `null`:
```yaml
kv-grouping-pipeline:
  source:
    http:
      path: /logs
      ssl: false

  processor:
    - key_value:
        source: "payload"
        destination: null

        field_split_characters: "&&"     # pair delimiter (works with value grouping)
        value_split_characters: null     # disable the default "="
        key_value_delimiter_regex: "=="  # match the literal '==' between key and value

        value_grouping: true
        remove_brackets: false
        overwrite_if_destination_exists: false
        non_match_value: null

  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: admin
        password: "admin_pass"
        index_type: custom
        index: "kv-regex-%{yyyy.MM.dd}"
```
{% include copy.html %}
You can test this pipeline using the following command:
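The `payload` values below are illustrative assumptions chosen to exercise the `==` key-value delimiter, the `&&` pair separator, bracketed value grouping, and an unmatched token:

```bash
curl -sS -X POST "http://localhost:2021/logs" \
  -H "Content-Type: application/json" \
  -d '[
    {"payload":"service==checkout&&tags==[a,b,c]&&orphan"},
    {"payload":"region==eu-west-1&&owner==alice"}
  ]'
```
{% include copy.html %}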
### Conditional recursive parsing

The following example parses bracketed, nested `key=value` structures from the `body` field into `parsed.*`, but only when `/type == "nested"`. It preserves the group hierarchy, enforces strict grouping rules, applies default fields, and leaves non-nested events unchanged:
```yaml
kv-conditional-recursive-pipeline:
  source:
    http:
      path: /logs
      ssl: false

  processor:
    - key_value:
        source: "body"
        destination: "parsed"

        # Parse only events whose "type" field equals "nested"
        key_value_when: '/type == "nested"'
        recursive: true

        # Split rules (plain strings, not regexes)
        field_split_characters: "&"
        value_split_characters: "="

        # Grouping and quoting
        value_grouping: true
        string_literal_character: "\""
        remove_brackets: false

        # Keep only some top-level keys, then set defaults
        include_keys: ["item1", "item2", "owner"]
        default_values:
          owner: "unknown"
          region: "eu-west-1"

        strict_grouping: true
        tags_on_failure: ["keyvalueprocessor_failure"]

  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: admin
        password: "admin_pass"
        index_type: custom
        index: "kv-recursive-%{yyyy.MM.dd}"
```
{% include copy.html %}
You can test this pipeline using the following command:
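The `type` and `body` values below are illustrative assumptions: the first event matches the `/type == "nested"` condition and contains bracketed nested pairs, while the second does not match and should pass through unchanged:

```bash
curl -sS -X POST "http://localhost:2021/logs" \
  -H "Content-Type: application/json" \
  -d '[
    {"type":"nested","body":"item1=[a=1&b=2]&item2=(x=y)&owner=alice&extra=ignored"},
    {"type":"flat","body":"item1=[a=1&b=2]"}
  ]'
```
{% include copy.html %}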
Option | Description | Example
:--- | :--- | :---
`source` | The message field to be parsed. Optional. Default value is `message`. | If `source` is `"message1"`, `{"message1": "key1=value1", "message2": "key2=value2"}` parses into `{"message1": "key1=value1", "message2": "key2=value2", "parsed_message": {"key1": "value1"}}`.