Cherry-pick #8914 to 6.x: Accept multiple ingest pipelines in Filebeat #9811
Merged
ycombinator merged 1 commit into elastic:6.x from ycombinator:backport_8914_6.x on Dec 28, 2018
houndci-bot
reviewed
Dec 27, 2018
filebeat/fileset/fileset.go
Outdated
exported method GetPipelines returns unexported type []fileset.pipeline, which can be annoying to use
Contributor
Author
This PR depends on #9813 to be merged first. Then this PR should be rebased on
Motivated by #8852 (comment).

Starting with 6.5.0, Elasticsearch Ingest Pipelines have gained the ability to:

- run sub-pipelines via the [`pipeline` processor](https://www.elastic.co/guide/en/elasticsearch/reference/6.5/pipeline-processor.html), and
- conditionally run processors via an [`if` field](https://www.elastic.co/guide/en/elasticsearch/reference/6.5/ingest-processors.html).

Combined, these abilities let a fileset ingest the same _logical_ information presented in different formats, e.g. plaintext vs. JSON versions of the same log files. Imagine an entry point ingest pipeline that detects the format of a log entry and then, depending on that format, conditionally delegates further processing of the entry to another pipeline.

This PR allows filesets to specify one or more ingest pipelines via the `ingest_pipeline` property in their `manifest.yml`. If more than one ingest pipeline is specified, the first one is taken to be the entry point ingest pipeline.

Example with multiple pipelines:

```yaml
ingest_pipeline:
  - pipeline-ze-boss.json
  - pipeline-plain.json
  - pipeline-json.json
```

Example with a single pipeline:

_This is just to show that the existing functionality will continue to work as-is._

```yaml
ingest_pipeline: pipeline.json
```

Now, if the root pipeline wants to delegate processing to another pipeline, it must use a `pipeline` processor to do so. This processor's `name` field must reference the other pipeline by its name. To ensure correct referencing, the `name` field must be specified as follows:

```json
{
  "pipeline": {
    "name": "{< IngestPipeline "pipeline-plain" >}"
  }
}
```

This ensures that the specified name is converted to the corresponding name in Elasticsearch, since Filebeat prefixes its "raw" ingest pipeline names with `filebeat-<version>-<module>-<fileset>-` when loading them into Elasticsearch.

(cherry picked from commit 5ba1f11)
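The name conversion described above can be illustrated with a minimal Go sketch. It is not the code from this PR: `pipelineID` and `renderPipeline` are hypothetical helpers, the version, module, and fileset values are made up, and the sketch assumes the `{< IngestPipeline "..." >}` reference is expanded by a Go `text/template` configured with `{<` and `>}` as delimiters.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
	"text/template"
)

// pipelineID builds the Elasticsearch pipeline name using the prefix from the
// PR description: filebeat-<version>-<module>-<fileset>-<raw name>.
// Hypothetical helper for illustration only.
func pipelineID(version, module, fileset, file string) string {
	base := filepath.Base(file)
	raw := strings.TrimSuffix(base, filepath.Ext(base)) // drop ".json" if present
	return fmt.Sprintf("filebeat-%s-%s-%s-%s", version, module, fileset, raw)
}

// renderPipeline expands {< IngestPipeline "name" >} references in a pipeline
// definition. Assumption: custom delimiters plus a template function are used.
func renderPipeline(src, version, module, fileset string) (string, error) {
	tmpl, err := template.New("pipeline").
		Delims("{<", ">}").
		Funcs(template.FuncMap{
			"IngestPipeline": func(name string) string {
				return pipelineID(version, module, fileset, name)
			},
		}).
		Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Made-up version, module, and fileset names.
	src := `{ "pipeline": { "name": "{< IngestPipeline "pipeline-plain" >}" } }`
	out, err := renderPipeline(src, "6.6.0", "mymodule", "myfileset")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With these inputs the `name` field resolves to `filebeat-6.6.0-mymodule-myfileset-pipeline-plain`, so the root pipeline and the sub-pipeline agree on the prefixed name without hard-coding the version.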
Contributor
Author
jenkins, test this
ruflin
approved these changes
Dec 28, 2018
Cherry-pick of PR #8914 to the 6.x branch.