@@ -96,7 +96,7 @@ These templates are loaded when the integration is installed, and are used to co
 
 [discrete]
 [[data-streams-ilm]]
-== Configure an {ilm} ({ilm-init}) policy
+== {ilm} ({ilm-init})
 
 Use the {ref}/index-lifecycle-management.html[index lifecycle
 management] ({ilm-init}) feature in {es} to manage your {agent} data stream indices as they age.
@@ -108,9 +108,29 @@ By default, these data streams use an {ilm-init} policy that matches their data
 For example, the data stream `metrics-system.logs-*`
 uses the metrics {ilm-init} policy as defined in the `metrics-system.logs` index template.
 
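+To check which {ilm-init} policy a given data stream is using, you can call the
+{ref}/indices-get-data-stream.html[get data stream API]. A quick sketch, assuming a
+`metrics-system.cpu-default` data stream exists:
+
+[source,console]
+----
+GET _data_stream/metrics-system.cpu-default
+----
+
+The response includes an `ilm_policy` field naming the policy currently applied to the stream.
+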
+Want to customize your index lifecycle management? See <<data-streams-ilm-tutorial>>.
+
 [discrete]
+[[data-streams-pipelines]]
+== Ingest pipelines
+
+{agent} integration data streams ship with a default {ref}/ingest.html[ingest pipeline]
+that preprocesses and enriches data before indexing.
+The default pipeline should not be edited directly; changes can easily break the functionality of the integration.
+
+Starting in version 8.4, all default ingest pipelines call a non-existent and non-versioned "`@custom`" ingest pipeline.
+Until you create it, this pipeline has no effect on your data. Once created, however,
+it can be used for custom data processing: adding fields, sanitizing data, and more.
+
+The full name of the `@custom` pipeline follows the pattern `<type>-<dataset>@custom`.
+The `@custom` pipeline can contain processors directly, or you can use the
+pipeline processor to call other pipelines that can be shared across multiple data streams or integrations.
+The `@custom` pipeline persists across version upgrades.
+
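+For example, a `@custom` pipeline for the `metrics-system.cpu` data stream could use the
+{ref}/pipeline-processor.html[pipeline processor] to call a pipeline shared across several
+integrations. A minimal sketch, assuming a shared pipeline named `shared-sanitize-fields`
+already exists:
+
+[source,console]
+----
+PUT _ingest/pipeline/metrics-system.cpu@custom
+{
+  "processors": [
+    {
+      "pipeline": {
+        "name": "shared-sanitize-fields"
+      }
+    }
+  ]
+}
+----
+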
+See <<data-streams-pipeline-tutorial>> to get started.
+
 [[data-streams-ilm-tutorial]]
-== Tutorial: Customize data retention for integrations
+== Tutorial: Customize data retention policies
 
 This tutorial explains how to apply a custom {ilm-init} policy to an integration's data stream.
 
@@ -240,3 +260,188 @@ or force a rollover using the {ref}/indices-rollover-index.html[{es} rollover AP
 ----
 POST /metrics-system.network-production/_rollover/
 ----
+
+[[data-streams-pipeline-tutorial]]
+== Tutorial: Transform data with custom ingest pipelines
+
+This tutorial explains how to add a custom ingest pipeline to an Elastic integration.
+Custom pipelines can be used for custom data processing,
+such as adding fields, obfuscating sensitive information, and more.
+
+**Scenario:** You have {agent}s collecting system metrics with the System integration.
+
+**Goal:** Add a custom ingest pipeline that adds a new field to each {es} document before it is indexed.
+
+[discrete]
+[[data-streams-pipeline-one]]
+=== Step 1: Create a custom ingest pipeline
+
+Create a custom ingest pipeline that will be called by the default integration pipeline.
+In this tutorial, we'll create a pipeline that adds a new field to our documents.
+
+. In {kib}, navigate to **Stack Management** -> **Ingest Pipelines** -> **Create pipeline** -> **New pipeline**.
+
+. Name your pipeline. We'll call this one `add_field`.
+
+. Select **Add a processor**. Fill out the following information:
++
+** Processor: "Set"
+** Field: `test`
+** Value: `true`
++
+The {ref}/set-processor.html[Set processor] sets a document field and associates it with the specified value.
+
+. Click **Add**.
+
+. Click **Create pipeline**.
+
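+The UI steps above correspond roughly to the following console request, shown here
+only as a sketch of what the wizard creates:
+
+[source,console]
+----
+PUT _ingest/pipeline/add_field
+{
+  "description": "Adds a test field to each document",
+  "processors": [
+    {
+      "set": {
+        "field": "test",
+        "value": true
+      }
+    }
+  ]
+}
+----
+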
+[discrete]
+[[data-streams-pipeline-two]]
+=== Step 2: Apply your ingest pipeline
+
+Add a custom pipeline to an integration by calling it from the default ingest pipeline.
+The custom pipeline will run after the default pipeline but before the final pipeline.
+
+[discrete]
+==== Edit integration
+
+Add a custom pipeline to an integration from the **Edit integration** workflow.
+The integration must already be configured and installed before a custom pipeline can be added.
+To enter this workflow, do the following:
+
+. Navigate to **{fleet}**.
+. Select the relevant {agent} policy.
+. Search for the integration you want to edit.
+. Select **Actions** -> **Edit integration**.
+
+[discrete]
+==== Select a data stream
+
+Most integrations write to multiple data streams.
+You'll need to add the custom pipeline to each data stream individually.
+
+. Find the first data stream you wish to edit and select **Change defaults**.
+For this tutorial, find the data stream configuration titled **Collect metrics from System instances**.
+
+. Scroll to **System CPU metrics** and under **Advanced options** select **Add custom pipeline**.
++
+This will take you to the **Create pipeline** workflow in **Stack Management**.
+
+[discrete]
+==== Add the pipeline
+
+Add the pipeline you created in step one.
+
+. Select **Add a processor**. Fill out the following information:
++
+** Processor: "Pipeline"
+** Pipeline name: "add_field"
+
+. Click **Create pipeline** to return to the **Edit integration** page.
+
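+Under the hood, this workflow creates the integration's `@custom` pipeline with a single
+pipeline processor. A rough sketch of the result for this tutorial (the pipeline name is
+confirmed after rollover, below):
+
+[source,console]
+----
+PUT _ingest/pipeline/metrics-system.cpu@custom
+{
+  "processors": [
+    {
+      "pipeline": {
+        "name": "add_field"
+      }
+    }
+  ]
+}
+----
+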
+[discrete]
+==== Roll over the data stream (optional)
+
+For pipeline changes to take effect immediately, you must roll over the data stream.
+If you do not, the changes will not take effect until the next scheduled rollover.
+Select **Apply now and rollover**.
+
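+Alternatively, you can force a rollover from the console with the
+{ref}/indices-rollover-index.html[{es} rollover API], using this tutorial's data stream:
+
+[source,console]
+----
+POST metrics-system.cpu-default/_rollover/
+----
+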
+After the data stream rolls over, note the name of the custom ingest pipeline.
+In this tutorial, it's `metrics-system.cpu@custom`.
+The name follows the pattern `<type>-<dataset>@custom`:
+
+* type: `metrics`
+* dataset: `system.cpu`
+* custom ingest pipeline designation: `@custom`
+
+[discrete]
+==== Repeat
+
+Add the custom ingest pipeline to any other data streams you wish to update.
+
+[discrete]
+[[data-streams-pipeline-three]]
+=== Step 3: Test the ingest pipeline (optional)
+
+Allow time for new data to be ingested before testing your pipeline.
+In a new window, open {kib} and navigate to **{kib} Dev tools**.
+
+Use an {ref}/query-dsl-exists-query.html[exists query] to ensure that the
+new field, "test", is being applied to documents.
+
+[source,console]
+----
+GET metrics-system.cpu-default/_search <1>
+{
+  "query": {
+    "exists": {
+      "field": "test" <2>
+    }
+  }
+}
+----
+<1> The data stream to search. In this tutorial, we've edited the `metrics-system.cpu` type and dataset.
+`default` is the default namespace.
+Combining all three of these gives us a data stream name of `metrics-system.cpu-default`.
+<2> The name of the field set in step one.
+
+If your custom pipeline is working correctly, this query will return at least one document.
+
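+A matching document carries the new field in its `_source`, along these lines
+(abridged, values illustrative):
+
+[source,json]
+----
+{
+  "_index": ".ds-metrics-system.cpu-default-2022.08.10-000002",
+  "_source": {
+    "test": true
+  }
+}
+----
+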
+[discrete]
+[[data-streams-pipeline-four]]
+=== Step 4: Add custom mappings
+
+Now that a new field is being set in your {es} documents, you'll want to assign a new mapping for that field.
+Use the `@custom` component template to apply custom mappings to an integration data stream.
+
+In the **Edit integration** workflow, do the following:
+
+. Under **Advanced options** select the pencil icon to edit the `@custom` component template.
+
+. Define the new field for your indexed documents. Select **Add field** and add the following information:
++
+* Field name: `test`
+* Field type: `Boolean`
+
+. Click **Add field**.
+
+. Click **Review** to fast-forward to the review step and click **Save component template** to return to the **Edit integration** workflow.
+
+. For changes to take effect immediately, select **Apply now and rollover**.
+
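+The UI steps above amount to adding a `test` mapping to the integration's `@custom`
+component template. A rough console equivalent, using this tutorial's template name; note
+that this request replaces the whole component template, so the UI flow is safer in practice:
+
+[source,console]
+----
+PUT _component_template/metrics-system.cpu@custom
+{
+  "template": {
+    "mappings": {
+      "properties": {
+        "test": {
+          "type": "boolean"
+        }
+      }
+    }
+  }
+}
+----
+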
+[discrete]
+[[data-streams-pipeline-five]]
+=== Step 5: Test the custom mappings (optional)
+
+Allow time for new data to be ingested before testing your mappings.
+In a new window, open {kib} and navigate to **{kib} Dev tools**.
+
+Use the {ref}/indices-get-field-mapping.html[Get field mapping API] to ensure that the
+custom mapping has been applied.
+
+[source,console]
+----
+GET metrics-system.cpu-default/_mapping/field/test <1>
+----
+<1> The data stream to query. In this tutorial, we've edited the `metrics-system.cpu` type and dataset.
+`default` is the default namespace.
+Combining all three of these gives us a data stream name of `metrics-system.cpu-default`.
+
+The result should include `type: "boolean"` for the specified field.
+
+[source,json]
+----
+".ds-metrics-system.cpu-default-2022.08.10-000002": {
+  "mappings": {
+    "test": {
+      "full_name": "test",
+      "mapping": {
+        "test": {
+          "type": "boolean"
+        }
+      }
+    }
+  }
+}
+----