Cherry-pick #8914 to 6.x: Accept multiple ingest pipelines in Filebeat #9811

Merged
ycombinator merged 1 commit into elastic:6.x from ycombinator:backport_8914_6.x
Dec 28, 2018

@ycombinator
Contributor

Cherry-pick of PR #8914 to 6.x branch. Original message:

Motivated by #8852 (comment).

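Starting with 6.5.0, Elasticsearch Ingest Pipelines have gained the ability to:

- run sub-pipelines via the pipeline processor, and
- conditionally run processors via an if field.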

These abilities combined present the opportunity for a fileset to ingest the same logical information presented in different formats, e.g. plaintext vs. JSON versions of the same log files. Imagine an entry point ingest pipeline that detects the format of a log entry and then conditionally delegates further processing of that log entry, depending on the format, to another pipeline.

This PR allows filesets to specify one or more ingest pipelines via the ingest_pipeline property in their manifest.yml. If more than one ingest pipeline is specified, the first one is taken to be the entry point ingest pipeline.

Example with multiple pipelines

ingest_pipeline:
  - pipeline-ze-boss.json
  - pipeline-plain.json
  - pipeline-json.json

Example with a single pipeline

This is just to show that the existing functionality will continue to work as-is.

ingest_pipeline: pipeline.json
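
The two manifest forms above can be handled by one normalization step. The following is only a minimal sketch, not the code from this PR: it assumes the `ingest_pipeline` value has already been decoded from `manifest.yml` into a generic Go value (a string or a list), and the function name `normalizePipelines` is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// normalizePipelines (hypothetical helper) accepts the decoded value of
// `ingest_pipeline`, which may be a single string or a list of strings,
// and returns a list of pipeline files. By the convention described in
// this PR, the first element is the entry point pipeline.
func normalizePipelines(v interface{}) ([]string, error) {
	switch val := v.(type) {
	case string:
		// Existing single-pipeline form: wrap it in a one-element list.
		return []string{val}, nil
	case []interface{}:
		if len(val) == 0 {
			return nil, errors.New("ingest_pipeline list is empty")
		}
		paths := make([]string, 0, len(val))
		for i, item := range val {
			s, ok := item.(string)
			if !ok {
				return nil, fmt.Errorf("ingest_pipeline[%d] is not a string", i)
			}
			paths = append(paths, s)
		}
		return paths, nil
	default:
		return nil, fmt.Errorf("unsupported ingest_pipeline type %T", v)
	}
}

func main() {
	multi := []interface{}{"pipeline-ze-boss.json", "pipeline-plain.json", "pipeline-json.json"}
	for _, v := range []interface{}{multi, "pipeline.json"} {
		paths, err := normalizePipelines(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("entry point: %s, all: %v\n", paths[0], paths)
	}
}
```

Treating the single-string form as a one-element list keeps existing manifests working without a separate code path.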

Now, if the root (entry point) pipeline wants to delegate processing to another pipeline, it must use a pipeline processor to do so. This processor's name field must reference the other pipeline by its name. To ensure correct referencing, specify the name field as follows:

{
  "pipeline" : {
    "name": "{< IngestPipeline "pipeline-plain" >}"
  }
}

This will ensure that the specified name gets correctly converted to the corresponding name in Elasticsearch, since Filebeat prefixes its "raw" Ingest pipeline names with filebeat-<version>-<module>-<fileset>- when loading them into Elasticsearch.
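
To illustrate the name conversion, here is a minimal sketch of how such a placeholder could be expanded using Go's `text/template` with custom `{<` and `>}` delimiters. It is not the implementation from this PR: the helper names `pipelineID` and `renderPipeline` are hypothetical, and the version, module, and fileset values (`6.6.0`, `nginx`, `access`) are made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"text/template"
)

// pipelineID (hypothetical helper) builds the fully qualified pipeline name
// following the filebeat-<version>-<module>-<fileset>-<name> scheme, where
// <name> is the pipeline file name without its extension.
func pipelineID(version, module, fileset, file string) string {
	base := filepath.Base(file)
	name := strings.TrimSuffix(base, filepath.Ext(base))
	return fmt.Sprintf("filebeat-%s-%s-%s-%s", version, module, fileset, name)
}

// renderPipeline (hypothetical helper) expands {< IngestPipeline "..." >}
// placeholders in a raw pipeline definition.
func renderPipeline(src, version, module, fileset string) (string, error) {
	tmpl, err := template.New("pipeline").
		Delims("{<", ">}").
		Funcs(template.FuncMap{
			"IngestPipeline": func(name string) string {
				return pipelineID(version, module, fileset, name)
			},
		}).
		Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, nil); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	raw := `{"pipeline": {"name": "{< IngestPipeline "pipeline-plain" >}"}}`
	rendered, err := renderPipeline(raw, "6.6.0", "nginx", "access")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rendered)
}
```

Using non-default delimiters means the placeholder cannot be confused with the `{{ }}` syntax that ingest pipeline definitions may themselves contain.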

exported method GetPipelines returns unexported type []fileset.pipeline, which can be annoying to use

Contributor

hound has a point here ;-)
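
For context on the lint warning above, here is a small single-file sketch of the pattern it flags and one common way to address it. Only the names `GetPipelines` and `pipeline` come from the warning; the struct fields, the `Fileset` layout, and the `GetExportedPipelines` variant are hypothetical and do not claim to show how the PR resolved the comment.

```go
package main

import "fmt"

// pipeline is unexported: code in other packages cannot name this type,
// so it cannot declare variables or function parameters of type []pipeline.
type pipeline struct {
	id string // hypothetical field
}

// Pipeline is an exported alternative that callers in other packages can name.
type Pipeline struct {
	ID string // hypothetical field
}

// Fileset is a stand-in for the real type; its layout is assumed.
type Fileset struct {
	ids []string
}

// GetPipelines has the shape the linter flags: an exported method
// returning a slice of an unexported type.
func (fs *Fileset) GetPipelines() []pipeline {
	out := make([]pipeline, 0, len(fs.ids))
	for _, id := range fs.ids {
		out = append(out, pipeline{id: id})
	}
	return out
}

// GetExportedPipelines (hypothetical) returns the exported type instead,
// which avoids the warning.
func (fs *Fileset) GetExportedPipelines() []Pipeline {
	out := make([]Pipeline, 0, len(fs.ids))
	for _, id := range fs.ids {
		out = append(out, Pipeline{ID: id})
	}
	return out
}

func main() {
	fs := &Fileset{ids: []string{"pipeline-ze-boss", "pipeline-plain"}}
	fmt.Println(len(fs.GetPipelines()), len(fs.GetExportedPipelines()))
}
```

The other usual fix is to unexport the method itself if nothing outside the package needs it.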

@ycombinator
Contributor Author

This PR depends on #9813 being merged first. After that, this PR should be rebased on 6.x.

Motivated by #8852 (comment).

Starting with 6.5.0, Elasticsearch Ingest Pipelines have gained the ability to:
- run sub-pipelines via the [`pipeline` processor](https://www.elastic.co/guide/en/elasticsearch/reference/6.5/pipeline-processor.html), and
- conditionally run processors via an [`if` field](https://www.elastic.co/guide/en/elasticsearch/reference/6.5/ingest-processors.html).

These abilities combined present the opportunity for a fileset to ingest the same _logical_ information presented in different formats, e.g. plaintext vs. JSON versions of the same log files. Imagine an entry point ingest pipeline that detects the format of a log entry and then conditionally delegates further processing of that log entry, depending on the format, to another pipeline.

This PR allows filesets to specify one or more ingest pipelines via the `ingest_pipeline` property in their `manifest.yml`. If more than one ingest pipeline is specified, the first one is taken to be the entry point ingest pipeline.

```yaml
ingest_pipeline:
  - pipeline-ze-boss.json
  - pipeline-plain.json
  - pipeline-json.json
```

_A single pipeline can still be specified, so the existing functionality will continue to work as-is:_
```yaml
ingest_pipeline: pipeline.json
```

Now, if the root (entry point) pipeline wants to delegate processing to another pipeline, it must use a `pipeline` processor to do so. This processor's `name` field must reference the other pipeline by its name. To ensure correct referencing, specify the `name` field as follows:

```json
{
  "pipeline" : {
    "name": "{< IngestPipeline "pipeline-plain" >}"
  }
}
```

This will ensure that the specified name gets correctly converted to the corresponding name in Elasticsearch, since Filebeat prefixes its "raw" Ingest pipeline names with `filebeat-<version>-<module>-<fileset>-` when loading them into Elasticsearch.

(cherry picked from commit 5ba1f11)
@ycombinator
Contributor Author

jenkins, test this

@ycombinator requested review from ruflin and urso on December 28, 2018 07:14
@ycombinator merged commit 7e38917 into elastic:6.x on Dec 28, 2018
@ycombinator deleted the backport_8914_6.x branch on December 25, 2019 11:14