Ingest ES structured audit logs #10352
ycombinator merged 9 commits into elastic:master from ycombinator:fb-es-structured-audit-log
Conversation
Pinging @elastic/stack-monitoring
While working on this PR I realized that we don't have sample lines for the structured elasticsearch audit log containing a request body (which is supposed to be parsed into the `http.request.body.content` field). I'm working with `@albertzaharovits` to get such a sample and will incorporate it into follow up PRs (for `master` and `6.x`).

jenkins, test this
ruflin left a comment:
Can you also update the ecs-migration.yml file?
Updated now. Do I need to run any …
This reverts commit ab7cf63.
@ruflin I'm not sure what you meant by #10352 (review):
Could you please clarify? Otherwise this PR is ready for review, IMHO.

jenkins, test this
webmat left a comment:
Noticed very minor things. Looking pretty good!
```json
},
{
  "dot_expander": {
    "field": "origin.address",
```
I would instead rename this field to source.address (and keep it around).
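Populating `source.address` while keeping the original field around could look something like the `set` processor below. This is only a sketch of the suggestion; the exact field path (`elasticsearch.audit.origin.address`) is an assumption based on the diff context, not the actual fileset code:

```json
{
  "set": {
    "field": "source.address",
    "value": "{{elasticsearch.audit.origin.address}}"
  }
}
```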
```json
{
  "rename": {
    "field": "elasticsearch.audit.user.name",
    "target_field": "user.name"
```
Can't we dot_expand in place? If the original key is "user.name", I would think that the output to "user": { "name": "..." } doesn't conflict.
It would simplify the code in a few places where you have the same pattern happening.
I'm not sure I follow what you mean by doing "dot_expand in place"? The dot_expander processor right above this one is necessary to go from:
```json
{
  "elasticsearch.audit.user.name": "foo"
}
```

to:

```json
{
  "elasticsearch.audit.user": {
    "name": "foo"
  }
}
```
That then allows us to call the rename processor as we are doing over here.
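The two-step pattern being described would look roughly like this (a sketch assembled from the diff fragments above, not the verbatim pipeline):

```json
[
  {
    "dot_expander": {
      "field": "elasticsearch.audit.user.name"
    }
  },
  {
    "rename": {
      "field": "elasticsearch.audit.user.name",
      "target_field": "user.name"
    }
  }
]
```

The `dot_expander` turns the single literal key `"elasticsearch.audit.user.name"` into a nested object path, which is what allows the subsequent `rename` to address it as a path.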
I haven't used dot_expander yet, so perhaps I'm misunderstanding it. But I was under the impression that the following did the equivalent of the 2 processors above:

```json
{ "dot_expander": { "field": "user.name" } }
```

And in cases where the output isn't the object equivalent of the dotted notation, you would use `path` this way, to get the equivalent of the two `node.name` processors above:

```json
{ "dot_expander": { "field": "node.name", "path": "elasticsearch.node" } }
```

If that's not the case, you can ignore this ;-)
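For context on the `path` option: per the Elasticsearch ingest documentation, `path` names the object that *contains* the dotted field, for expanding fields that don't live at the document root. So a processor like the one below (field names illustrative) expands the literal `"node.name"` key found inside the `elasticsearch` object:

```json
{
  "dot_expander": {
    "field": "node.name",
    "path": "elasticsearch"
  }
}
```

That is, it turns `{"elasticsearch": {"node.name": "foo"}}` into `{"elasticsearch": {"node": {"name": "foo"}}}`; it does not move the field outside its containing object.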
```diff
  {
    "date": {
-     "field": "elasticsearch.audit.timestamp",
+     "field": "elasticsearch.audit.@timestamp",
```
Nit: prior to grabbing the real timestamp from the log, could you populate event.created with Beat's @timestamp?
This is already being done in the very first processor in this pipeline. It's collapsed in the diff since nothing changed there:
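The processor in question presumably follows the common Beats-module pattern of copying the Beat's `@timestamp` into `event.created` before the `date` processor overwrites `@timestamp` with the value parsed from the log. A sketch of that pattern (not the exact pipeline source):

```json
{
  "set": {
    "field": "event.created",
    "value": "{{@timestamp}}"
  }
}
```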
Also, the … It's not the same format as the various … But in the ecs-migration.yml file, you should document all of the fields for which you have …
@webmat Thanks for the detailed explanation of how to use the `dot_expander` processor. I've addressed all your comments in the review. Only one of them resulted in a code change, however. So you might want to look at my replies to your other two comments. Thanks!
Follow up to #10352 per #10352 (comment):

> While working on this PR I realized that we don't have sample lines for the **structured** elasticsearch audit log containing a request body (which is supposed to be parsed into the `http.request.body.content` field). I'm working with `@albertzaharovits` to get such a sample and will incorporate it into follow up PRs (for `master` and `6.x`).

Accordingly, this PR adds sample lines to the structured and unstructured log file test fixtures for the `elasticsearch/audit` fileset and teaches the fileset to parse any new fields encountered in these sample lines.
webmat left a comment:
LGTM
My understanding of dot_expander may be flawed; I haven't used it before. So feel free to ignore this if what I'm saying in my response doesn't make sense ;-)
This PR teaches the `elasticsearch/slowlog` fileset to ingest structured Elasticsearch search and indexing slow logs. This PR takes the same approach as #10352, in that it creates an entrypoint pipeline, `pipeline.json`, that delegates further processing of a log entry depending on what it sees as the first character of the entry:

- If the first character is `{`, it assumes the log line is structured as JSON and delegates further processing to `pipeline-json.json`.
- Else, it assumes the log line is plaintext and delegates further processing to `pipeline-plaintext.json`.
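A dispatch pipeline of that shape could be sketched as follows, using conditional `pipeline` processors (available since Elasticsearch 6.5). The condition expression and the `pipeline-json` / `pipeline-plaintext` names here are illustrative, not the exact fileset code:

```json
{
  "description": "Entrypoint: route the log line by its first character",
  "processors": [
    {
      "pipeline": {
        "if": "ctx.message != null && ctx.message.startsWith('{')",
        "name": "pipeline-json"
      }
    },
    {
      "pipeline": {
        "if": "ctx.message == null || !ctx.message.startsWith('{')",
        "name": "pipeline-plaintext"
      }
    }
  ]
}
```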
This is a "forward port" of #8852.
In #8852, we taught Filebeat to ingest either structured or unstructured ES audit logs, but the resulting fields conformed to the 6.x mapping structure.
In this PR we also teach Filebeat to ingest either structured or unstructured ES audit logs, but the resulting fields conform to the 7.0 (ECS-based) mapping structure.