Fix system tests using logstash for ingest only write one event per data stream #2117

Merged 1 commit on Sep 23, 2024

aleksmaus

This change addresses the issue elastic/integrations#8530

Reported issue

When experimenting with using stack.logstash_enabled and running the panw system tests, only one event is indexed for any data stream checked:

Using Logstash for ingest:

2023/11/16 11:40:34 DEBUG checking for expected data in data stream...
2023/11/16 11:40:34 DEBUG found 0 hits in logs-panw.panos-ep data stream
...
2023/11/16 11:40:42 DEBUG found 1 hits in logs-panw.panos-ep data stream

Expected:

2023/11/16 11:55:28 DEBUG checking for expected data in data stream...
2023/11/16 11:55:28 DEBUG found 0 hits in logs-panw.panos-ep data stream
...
2023/11/16 11:55:36 DEBUG found 226 hits in logs-panw.panos-ep data stream

The Logstash logs show a version conflict error for each of the missing events:

{"type"=>"version_conflict_engine_exception", "reason"=>"[%{[@metadata][_ingest_document][id]}]: version conflict, document already exists (current version [9])", "index_uuid"=>"kHyWSHBOSfqankKX9IDHeg", "shard"=>"0", "index"=>".ds-logs-panw.panos-ep-2023.11.16-000001"}

Full error

[2023-11-16T17:40:40,513][WARN ][logstash.outputs.elasticsearch][main][211ae8873d1c1f484126be31c8966101f7c64e43c9dacc50e65ab9ac46725963] Failed action {:status=>409, :action=>["create", {:_id=>"%{[@metadata][_ingest_document][id]}", :_index=>"logs-panw.panos-ep", :routing=>nil, :pipeline=>"_none"}, {"ecs"=>{"version"=>"8.11.0"}, "message"=>"192.168.15.224,175.16.199.1,192.168.1.63,175.16.199.1,new_outbound_from_trust,,,dns,vsys1,trust,untrust,ethernet1/2,ethernet1/1,send_to_mac,2018/11/30 16:09:52,24243,1,5511,53,21643,53,0x400019,udp,allow,242,72,170,2,2018/11/30 16:09:19,0,any,0,32091208,0x0,192.168.0.0-192.168.255.255,United States,0,1,1,aged-out,0,0,0,0,,PA-220,from-policy,,,0,,0,,N/A,0,0,0,0", "network"=>{"application"=>"dns", "type"=>"ipv4", "community_id"=>["1:4RiaH+n0JwxG6zcL26BuXxb9VkY=", "1:tMlsHUEsYDQ3Vv3JAJSu15cqkNE="], "packets"=>2, "bytes"=>242, "transport"=>"udp"}, "panw"=>{"panos"=>{"network"=>{"nat"=>{"community_id"=>"1:tMlsHUEsYDQ3Vv3JAJSu15cqkNE="}}, "ruleset"=>"new_outbound_from_trust", "endreason"=>"aged-out", "imsi"=>"0", "action"=>"allow", "action_source"=>"from-policy", "sub_type"=>"end", "device_group_hierarchy1"=>"0", "sctp"=>{"chunks_sent"=>0, "assoc_id"=>"0", "chunks_received"=>0, "chunks"=>0}, "type"=>"TRAFFIC", "log_profile"=>"send_to_mac", "sequence_number"=>"32091208", "action_flags"=>"0x0", "device_group_hierarchy2"=>"0", "url"=>{"category"=>"any"}, "parent_session"=>{"id"=>"0"}, "device_group_hierarchy3"=>"0", "device_group_hierarchy4"=>"0", "repeat_count"=>1, "virtual_sys"=>"vsys1", "tunnel_type"=>"N/A", "flow_id"=>"24243"}}, "log"=>{"syslog"=>{"severity"=>{"name"=>"Informational", "code"=>6}, "version"=>"1", "facility"=>{"name"=>"user-level", "code"=>1}, "priority"=>14, "hostname"=>"PA-220"}, "source"=>{"address"=>"172.18.0.4:56854"}}, "data_stream"=>{"namespace"=>"ep", "dataset"=>"panw.panos", "type"=>"logs"}, "elastic_agent"=>{"snapshot"=>false, "id"=>"b610b7a8-df73-46de-ab1a-7fb5dcb51c9b", "version"=>"8.11.1"}, 
"rule"=>{"name"=>"new_outbound_from_trust"}, "related"=>{"hosts"=>["PA-220"], "ip"=>["192.168.15.224", "175.16.199.1", "192.168.1.63"]}, "destination"=>{"geo"=>{"name"=>"United States"}, "ip"=>"175.16.199.1", "port"=>53, "nat"=>{"port"=>53, "ip"=>"175.16.199.1"}, "packets"=>1, "bytes"=>170}, "event"=>{"timezone"=>"+00:00", "outcome"=>"success", "action"=>"flow_terminated", "duration"=>0, "start"=>"2018-11-30T16:09:19.000Z", "created"=>"2018-11-30T16:09:52.000Z", "category"=>["network"], "kind"=>"event", "end"=>2018-11-30T16:09:19.000Z, "dataset"=>"panw.panos", "original"=>"<14>1 2018-11-30T16:09:52Z PA-220 - - - - 1,2018/11/30 16:09:52,012801096514,TRAFFIC,end,2049,2018/11/30 16:09:52,192.168.15.224,175.16.199.1,192.168.1.63,175.16.199.1,new_outbound_from_trust,,,dns,vsys1,trust,untrust,ethernet1/2,ethernet1/1,send_to_mac,2018/11/30 16:09:52,24243,1,5511,53,21643,53,0x400019,udp,allow,242,72,170,2,2018/11/30 16:09:19,0,any,0,32091208,0x0,192.168.0.0-192.168.255.255,United States,0,1,1,aged-out,0,0,0,0,,PA-220,from-policy,,,0,,0,,N/A,0,0,0,0", "type"=>["allowed", "end", "connection"]}, "observer"=>{"ingress"=>{"zone"=>"trust", "interface"=>{"name"=>"ethernet1/2"}}, "serial_number"=>"012801096514", "type"=>"firewall", "hostname"=>"PA-220", "vendor"=>"Palo Alto Networks", "egress"=>{"zone"=>"untrust", "interface"=>{"name"=>"ethernet1/1"}}, "product"=>"PAN-OS"}, "source"=>{"geo"=>{"name"=>"192.168.0.0-192.168.255.255"}, "ip"=>"192.168.15.224", "port"=>5511, "nat"=>{"port"=>21643, "ip"=>"192.168.1.63"}, "packets"=>1, "bytes"=>72}, "tags"=>["preserve_original_event", "panw-panos", "forwarded", "beats_input_codec_plain_applied", "_geoip_database_unavailable_GeoLite2-City.mmdb", "_geoip_database_unavailable_GeoLite2-City.mmdb", "_geoip_database_unavailable_GeoLite2-City.mmdb", "_geoip_database_unavailable_GeoLite2-City.mmdb", "_geoip_database_unavailable_GeoLite2-ASN.mmdb", "_geoip_database_unavailable_GeoLite2-ASN.mmdb", "_geoip_database_unavailable_GeoLite2-ASN.mmdb", 
"_geoip_database_unavailable_GeoLite2-ASN.mmdb"], "agent"=>{"type"=>"filebeat", "ephemeral_id"=>"c547f492-8435-40c5-8395-04ad95adffce", "version"=>"8.11.1", "name"=>"docker-fleet-agent", "id"=>"b610b7a8-df73-46de-ab1a-7fb5dcb51c9b"}, "labels"=>{"nat_translated"=>true}, "input"=>{"type"=>"tcp"}, "@timestamp"=>2018-11-30T16:09:52.000Z}], :response=>{"create"=>{"status"=>409, "error"=>{"type"=>"version_conflict_engine_exception", "reason"=>"[%{[@metadata][_ingest_document][id]}]: version conflict, document already exists (current version [9])", "index_uuid"=>"kHyWSHBOSfqankKX9IDHeg", "shard"=>"0", "index"=>".ds-logs-panw.panos-ep-2023.11.16-000001"}}}}

Problem details

I found that the document_id is not resolved the way this template expects:
https://github.com/elastic/elastic-package/blob/main/internal/stack/_static/logstash.conf.tmpl#L31
when [@metadata][_ingest_document] is missing from the event.

So the first document was created with the unresolved reference as its literal ID:

 "_id": "%{[@metadata][_ingest_document][id]}",

and every remaining document carried the same string as its ID. That is why only one document was stored and the rest failed with document version conflicts.
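The collision can be reproduced with a small model. The sketch below is not Logstash or Elasticsearch code: `resolve_reference`, `FakeDataStream`, and `ingest` are hypothetical stand-ins. They assume two behaviours visible in the logs above: a `%{...}` field reference that cannot be resolved is left as literal text, and a `create` action is rejected with 409 when the `_id` already exists.

```python
import re

ID_TEMPLATE = "%{[@metadata][_ingest_document][id]}"


def resolve_reference(template, event):
    """Hypothetical stand-in for field-reference interpolation.

    Assumption: when the referenced field is absent, the reference text is
    returned unchanged, as seen in the `_id` of the failed action above.
    """
    def lookup(match):
        value = event
        for key in re.findall(r"\[([^\]]+)\]", match.group(1)):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # leave the literal reference in place
            value = value[key]
        return str(value)

    return re.sub(r"%\{([^}]+)\}", lookup, template)


class FakeDataStream:
    """Hypothetical in-memory model of create-only indexing."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, source):
        if doc_id in self.docs:
            return 409  # version conflict: document already exists
        self.docs[doc_id] = source
        return 201


def ingest(events):
    """Index every event with an ID derived from ID_TEMPLATE."""
    stream = FakeDataStream()
    statuses = [stream.create(resolve_reference(ID_TEMPLATE, e), e) for e in events]
    return stream, statuses
```

With five events that lack `[@metadata][_ingest_document]`, `ingest` stores a single document and returns 409 for the other four, which matches the single hit seen in the test logs.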

Solution

The solution is to add a condition that checks whether [@metadata][_ingest_document][id] exists and to switch the output configuration accordingly.
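As a rough illustration of that condition, the sketch below builds the metadata for a `create` action and sets `_id` only when the event carries `[@metadata][_ingest_document][id]`. The function name `build_create_action` is hypothetical, and the fallback (omitting `_id` so that Elasticsearch generates one) is an assumption about what the alternate output configuration does; the merged template may differ.

```python
def build_create_action(event, index):
    """Hypothetical helper: metadata for a bulk `create` action.

    Assumption: when no ingest document ID is present, `_id` is omitted so
    the ID is auto-generated instead of reusing an unresolved reference string.
    """
    action = {"_index": index}
    metadata = event.get("@metadata", {})
    doc_id = metadata.get("_ingest_document", {}).get("id")
    if doc_id:  # mirrors a check for the field's existence
        action["_id"] = doc_id
    return {"create": action}
```

Under this assumption, events without an ingest document ID no longer share one literal `_id`, so they cannot conflict with each other.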

@elasticmachine

💚 Build Succeeded

cc @aleksmaus

bhapas left a comment

LGTM

aleksmaus merged commit 3992988 into elastic:main on Sep 23, 2024
3 checks passed