Fluentd not picking up all container logs #2348

Closed
kasunt84 opened this issue Mar 27, 2019 · 9 comments

kasunt84 commented Mar 27, 2019

  • fluentd version: 1.4.1
  • Environment information:
    • Operating system: Debian 9
    • Kernel version: 4.14.65+
  • Container input configuration looks like below:
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>

    # Filter out stuff we don't care about
    <filter raw.kubernetes.**>
      @type grep
      <exclude>
        key log
        pattern POST\ \/api\/v2\/spans HTTP\/1\.1
      </exclude>
    </filter>
    <filter raw.kubernetes.**>
      @type grep
      <exclude>
        key log
        pattern GET\ \/management\/health HTTP\/1\.1
      </exclude>
    </filter>

    # Detect exceptions in the log output and forward them as one log entry.
    # Unfortunately this doesn't work well for Java/Spring, so look at the output to see how we deal with it
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  • On Kubernetes, I'm seeing strange behaviour with the tail plugin where not all logs are picked up. So far this only seems to affect 2 of the 8 nodes where the daemonset is running.

The fluentd log shows:

2019-03-27 06:30:32 +0000 [info]: fluent/log.rb:322:info: starting fluentd worker pid=8 ppid=1 worker=0
2019-03-27 06:30:32 +0000 [info]: fluent/log.rb:322:info: listening port port=24224 bind="0.0.0.0"
2019-03-27 06:30:32 +0000 [info]: [fluentd-containers.log] following tail of /var/log/containers/jg-jaeger-spark-1545781740-t6z5j_monitoring_jg-jaeger-spark-c0f112b01739ce6e4e4a12dc7220bc1d50e432c935c0713e72a37860d6a8cb11.log
2019-03-27 06:30:32 +0000 [info]: [fluentd-containers.log] following tail of /var/log/containers/jg-jaeger-cassandra-schema-fclxc_monitoring_jg-jaeger-cassandra-schema-13846a531a3310541f8559702f12b97954b713275ff87b17bb34c011d97cd159.log
2019-03-27 06:30:32 +0000 [info]: [fluentd-containers.log] following tail of /var/log/containers/kube-dns-fdfbdf56b-hk8qt_kube-system_sidecar-dddedab8cf95e103e7ebeae5670ba332e5e78b10b6e6d86656a9d1953c154970.log
2019-03-27 06:30:32 +0000 [info]: [fluentd-containers.log] following tail of /var/log/containers/istio-sidecar-578fcfb7-8pbpk_control_sidecar-bd37e19b8f21753ab831ba98cdadc2303151bfc9ea57e60c68c6a901e0d41c19.log
2019-03-27 06:30:32 +0000 [trace]: fluent/log.rb:281:trace: fetching pod metadata: control/istio-sidecar-578fcfb7-8pbpk

However, I can see there are more log files that fluentd should have discovered:

lrwxrwxrwx 1 root root 63 Dec  3 08:44 kube-proxy-gke-riro-cluster-control-pool-v3-432ba7fc-j44j_kube-system_kube-proxy-68bd3d94274b6a239c31be50c4bf7148d60ed2642ff58e17f950a8eb21c2b517.log -> /var/log/pods/fb28d7d42bfae6ea34eaf0e51127fdde/kube-proxy/0.log
lrwxrwxrwx 1 root root 69 Dec  3 08:54 op-oauth2-proxy-dcd86b57f-m7fn7_control_oauth2-proxy-98b7d84cc3cc357762279fafaf4e8707242bc3347155642927a9d55861fa237b.log -> /var/log/pods/0b1dba66-f6d9-11e8-a684-42010a9a0121/oauth2-proxy/0.log
lrwxrwxrwx 1 root root 68 Dec  3 10:29 wv-weave-probe-58r4w_monitoring_weave-agent-64aa43f39f626b4baf94b5f6329206440a9e62de180e0803054037f8f51f6d6a.log -> /var/log/pods/58842227-f6e6-11e8-a51b-42010a9a019f/weave-agent/0.log
lrwxrwxrwx 1 root root 63 Dec  3 10:33 kb-kibana-54b7575b89-qdkng_monitoring_kibana-811c3c94fecff46ab62b3eb386c10c3bb3fe7aae3e32553726be0a8d9b960227.log -> /var/log/pods/d2476768-f6e6-11e8-a51b-42010a9a019f/kibana/0.log
lrwxrwxrwx 1 root root 62 Dec  3 10:34 istio-mixer-telemetry-7b68cb5fb8-5rwc9_control_mixer-52d8a52fe25ab339e60d7d95a84dd26d0c9a6819a4ab841b3de7c69af0e44029.log -> /var/log/pods/f77bb71e-f6e6-11e8-a51b-42010a9a019f/mixer/0.log
lrwxrwxrwx 1 root root 68 Dec  3 10:34 istio-mixer-telemetry-7b68cb5fb8-5rwc9_control_istio-proxy-90b2094ac6f25a63edd7e3d85c148512abdf00fa27672031069ed4c7d8a3c81b.log -> /var/log/pods/f77bb71e-f6e6-11e8-a51b-42010a9a019f/istio-proxy/0.log
lrwxrwxrwx 1 root root 64 Dec  3 10:35 istio-sidecar-578fcfb7-8pbpk_control_sidecar-bd37e19b8f21753ab831ba98cdadc2303151bfc9ea57e60c68c6a901e0d41c19.log -> /var/log/pods/f7f6cdf6-f6e6-11e8-a51b-42010a9a019f/sidecar/0.log
lrwxrwxrwx 1 root root 83 Dec  3 10:37 jg-jaeger-cassandra-schema-fclxc_monitoring_jg-jaeger-cassandra-schema-13846a531a3310541f8559702f12b97954b713275ff87b17bb34c011d97cd159.log -> /var/log/pods/661532fa-f6e7-11e8-a51b-42010a9a019f/jg-jaeger-cassandra-schema/0.log
lrwxrwxrwx 1 root root 76 Dec  3 10:37 jg-jaeger-collector-8465876475-p5pg2_monitoring_jg-jaeger-collector-7d51ec49b43ac779821c733f54d6b01a5d1d783587c4493965f77ad15c6dc707.log -> /var/log/pods/661534c5-f6e7-11e8-a51b-42010a9a019f/jg-jaeger-collector/0.log
lrwxrwxrwx 1 root root 65 Dec 10 20:47 heapster-v1.6.0-beta.1-84b69f457-bg55z_kube-system_heapster-dcfb2ab090e6b90b62df93746cc22a3c4b2f18e98d1297097ce9a8d996343868.log -> /var/log/pods/c5471359-fcbc-11e8-9228-42010a9a0067/heapster/0.log
lrwxrwxrwx 1 root root 71 Dec 10 20:47 heapster-v1.6.0-beta.1-84b69f457-bg55z_kube-system_heapster-nanny-d8ef15ddaeecbc6dbffd476f34f41016e6d9ee15cd119b2f6162505be6e85b71.log -> /var/log/pods/c5471359-fcbc-11e8-9228-42010a9a0067/heapster-nanny/0.log
lrwxrwxrwx 1 root root 72 Dec 25 23:49 jg-jaeger-spark-1545781740-t6z5j_monitoring_jg-jaeger-spark-c0f112b01739ce6e4e4a12dc7220bc1d50e432c935c0713e72a37860d6a8cb11.log -> /var/log/pods/abc6e301-089f-11e9-b13c-42010a9a0143/jg-jaeger-spark/0.log
lrwxrwxrwx 1 root root 72 Dec 26 23:49 jg-jaeger-spark-1545868140-t5zk5_monitoring_jg-jaeger-spark-d7f483b895cfa4396d12f5e8bca438543b2c158827458c59986c111c8732b05c.log -> /var/log/pods/d591c51e-0968-11e9-b13c-42010a9a0143/jg-jaeger-spark/0.log
lrwxrwxrwx 1 root root 72 Jan  1 23:50 jg-jaeger-spark-1546386540-ctb5l_monitoring_jg-jaeger-spark-557ae2dc4dc80c251899d9af3383d8646a5a8645ed9c943a55ce765ee374f21f.log -> /var/log/pods/d2ad9b0b-0e1f-11e9-b13c-42010a9a0143/jg-jaeger-spark/1.log
lrwxrwxrwx 1 root root 77 Jan 14 06:34 l7-default-backend-7ff48cffd7-6gsfk_kube-system_default-http-backend-4fe905fbb3fd360e71439ba582b52213b41ae8635cad55331aa0428d812e5c3c.log -> /var/log/pods/61bb5a0d-17c6-11e9-8b3b-42010a9a01d1/default-http-backend/0.log
lrwxrwxrwx 1 root root 72 Jan 19 23:49 jg-jaeger-spark-1547941740-npnrj_monitoring_jg-jaeger-spark-40d0f3047bf583317cc1fddc79c6ab26cfe6abd3f5b0f732c7b767d99f0d795c.log -> /var/log/pods/cd9a619f-1c44-11e9-8b3b-42010a9a01d1/jg-jaeger-spark/0.log
lrwxrwxrwx 1 root root 72 Jan 23 23:49 jg-jaeger-spark-1548287340-9rgrs_monitoring_jg-jaeger-spark-85a9df5289867b6e927907c55ee1217a4ebcd0722c766c3fd32d384ab2eb0b3c.log -> /var/log/pods/74f7040f-1f69-11e9-95f3-42010a9a0fcf/jg-jaeger-spark/1.log
lrwxrwxrwx 1 root root 64 Jan 29 21:27 cm-manager-79478c4dcc-fqz5r_control_manager-4a4c4fc86daf0162959b0add4c03460eac915a765444397d22682fead4aea20b.log -> /var/log/pods/02d96fe0-f6de-11e8-a51b-42010a9a019f/manager/6.log
lrwxrwxrwx 1 root root 68 Feb 19 22:40 istio-egressgateway-789b7c4d75-j4ftx_control_istio-proxy-3426e3494a422f2e5b2f2cb7baef1e65ea1e595f5ee82d74f56356bb79e0fc72.log -> /var/log/pods/4e3bf790-3497-11e9-a565-42010a9a00c9/istio-proxy/0.log
lrwxrwxrwx 1 root root 81 Feb 21 03:46 ni-nginx-ingress-controller-pdqjb_control_nginx-ingress-controller-33f290731dcd593dc71f81860d126e0f7e950c0b3cfbc116e896916ca6f79b1a.log -> /var/log/pods/3db2a8d2-358b-11e9-a565-42010a9a00c9/nginx-ingress-controller/0.log
lrwxrwxrwx 1 root root 94 Feb 21 03:46 ni-nginx-ingress-custom-backend-handlers-7fb66cdc7c-hh7cg_control_nginx-ingress-custom-backend-handlers-8532b15c6cc48643e17339156b0293256ebcbb3372a2b023425a0853b1012a53.log -> /var/log/pods/3db3315b-358b-11e9-a565-42010a9a00c9/nginx-ingress-custom-backend-handlers/0.log
lrwxrwxrwx 1 root root 62 Feb 25 03:37 istio-mixer-telemetry-7b68cb5fb8-5rwc9_control_mixer-d6d5648c9f02da47074c2896d49b61064b1fb6a7bda55b374db1f18bc3c6627f.log -> /var/log/pods/f77bb71e-f6e6-11e8-a51b-42010a9a019f/mixer/1.log
lrwxrwxrwx 1 root root 68 Mar  8 10:54 istio-ingressgateway-6c89d564b7-rm6nf_control_istio-proxy-cb63de12aeb14e4af39fac50cacdbb73fe630a9c7703a85c9ef69895ac2cad05.log -> /var/log/pods/87c01fc7-4190-11e9-a565-42010a9a00c9/istio-proxy/0.log
lrwxrwxrwx 1 root root 63 Mar  9 06:32 tiller-deploy-57c574bfb8-kg6fn_kube-system_tiller-16a270f1e2ec207558747d486743a943037c0bfd0dace84e965de3afd261c638.log -> /var/log/pods/25b87df4-4235-11e9-a565-42010a9a00c9/tiller/0.log
lrwxrwxrwx 1 root root 64 Mar 19 22:32 cm-manager-79478c4dcc-fqz5r_control_manager-18f306b28148ae454e476f991aaf353ec5d443c73ec7a46b73792ce6c1d91b7b.log -> /var/log/pods/02d96fe0-f6de-11e8-a51b-42010a9a019f/manager/7.log
lrwxrwxrwx 1 root root 64 Mar 19 22:35 kube-dns-fdfbdf56b-hk8qt_kube-system_kubedns-7f95859704a26e26cd20c764d30e1a16f8bd7b6809c08127bc9eb1319cfd8d4f.log -> /var/log/pods/4fbe8291-4a97-11e9-a27e-42010a9a0110/kubedns/0.log
lrwxrwxrwx 1 root root 64 Mar 19 22:35 kube-dns-fdfbdf56b-hk8qt_kube-system_dnsmasq-9c159a2e7d19bf29ccb86c0d52171f1a7ae59e3a3dc20a19a3f58de6c55e3e74.log -> /var/log/pods/4fbe8291-4a97-11e9-a27e-42010a9a0110/dnsmasq/0.log
lrwxrwxrwx 1 root root 64 Mar 19 22:35 kube-dns-fdfbdf56b-hk8qt_kube-system_sidecar-dddedab8cf95e103e7ebeae5670ba332e5e78b10b6e6d86656a9d1953c154970.log -> /var/log/pods/4fbe8291-4a97-11e9-a27e-42010a9a0110/sidecar/0.log
lrwxrwxrwx 1 root root 73 Mar 19 22:35 kube-dns-fdfbdf56b-hk8qt_kube-system_prometheus-to-sd-df7c2f9de99bd11b5de16cc9f501fd09f0a6a65c20566b7f020e96b806202b73.log -> /var/log/pods/4fbe8291-4a97-11e9-a27e-42010a9a0110/prometheus-to-sd/0.log
lrwxrwxrwx 1 root root 66 Mar 25 23:57 cm-webhook-ca-sync-1553558220-vzttv_control_ca-helper-cffbee7690bfb82a97e4fa9172e931443d1514e5f28c954f9336a7a57ff53e31.log -> /var/log/pods/ae50af56-4f59-11e9-a27e-42010a9a0110/ca-helper/0.log
lrwxrwxrwx 1 root root 66 Mar 25 23:59 cm-webhook-ca-sync-1553558340-b9qlv_control_ca-helper-db0d2ed40a3fa2144b794c282611b8672a83625c54dcdd36e64d6a62dbfcb15e.log -> /var/log/pods/f62347de-4f59-11e9-a27e-42010a9a0110/ca-helper/0.log
lrwxrwxrwx 1 root root 74 Mar 26 17:21 spw-test-website-56f556b9bd-974n2_website_test-website-e43c607e50352e4102dbea1a4cc54544b7da009578f77977442f095922b9b507.log -> /var/log/pods/9a6c7901-4feb-11e9-a27e-42010a9a0110/test-website/0.log
lrwxrwxrwx 1 root root 67 Mar 27 06:30 fd-fluentd-f7mnv_monitoring_fd-fluentd-25c33fb80f74c032443b4885648e995006aa41276bee0b4b92ba2507306c386c.log -> /var/log/pods/c1f1fa4d-5059-11e9-a27e-42010a9a0110/fd-fluentd/0.log

Am I missing something here?


yusufgungor commented Mar 29, 2019

Hi, we are experiencing a similar situation. Logs are randomly missing.

  • fluentd version: 1.3.3
  • Environment information:
  • Operating system: Debian 9
  • Kernel version: 4.14.67

Our container input configuration looks like below:

<system>
  log_level debug
</system>
 
<source>
  @type tail
  tag fn.container
 
  <parse>
    @type multiline
     format_firstline /\d{4}-\d{1,2}-\d{1,2}/
     format1 /^(?<Timestamp>\d{4}-\d{1,2}-\d{1,2}(T|\s)\d{1,2}:\d{1,2}:\d{1,2}((,|\.)\d{1,3})?(\+\d{4})?) (?<Project>[^ ]*) (?<Service>[^ ]*) (?<Module>[^ ]*) (?<ProcessNo>[^ ]*)-(?<ThreadNo>[^ ]*) (?<Level>[^ ]*) (?<ServerIP>[^ ]*) (?<ClientIP>[^ ]*) (?<Log>.*)/
  </parse>
 
  path /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout
  pos_file /var/log/td-agent/fn.container.pos
  path_key "Log Path"
  limit_recently_modified 10m
  read_lines_limit 10000
  rotate_wait 30s
</source>

# Log Forwarding
<match fn.container>
  @type forward
 
  # primary host
  <server>
    host 192.168.8.220
    port 24224
  </server>

  <buffer>
    type file
    path /tmp
    chunk_limit_size 16m
    queued_chunks_limit_size 512
    overflow_action throw_exception
    retry_wait 15s
    flush_thread_count 32
    flush_interval 30s
  </buffer>
</match>

We are using DC/OS, which has a setup similar to Kubernetes for tailing log files.

All of our nodes are affected and some logs are missing. We print the logs to stdout on the aggregator side, but it seems the missing logs are never received by the aggregator.

Agent 1 Files:

-rw-r--r--. 1 root root 48448 Mar 29 00:24 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329002359S3Aga.f2c045c8-51b8-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 01:48 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329014759pXYE8.ae4f8e44-51c4-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 02:12 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329021159yEgW4.08a4ae5d-51c8-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 02:24 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329022359NNP8V.b5cf2adf-51c9-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 02:36 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329023559ixZ4N.62f90b25-51cb-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 02:48 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329024759tmK6h.102386a7-51cd-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 03:12 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329031159ZvAZW.6a8243b0-51d0-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 03:24 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329032359HJj0E.17a28702-51d2-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 52730 Mar 29 03:48 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329034759xSHJQ.71f757fa-51d5-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 04:12 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329041159CpEV7.cc4c02e3-51d8-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 04:24 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329042359JegGb.7975e325-51da-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 06:12 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329061159MnSDr.8ff48fea-51e9-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 06:48 /var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329064759WqZJg.97738f44-51ee-11e9-81c8-da0c25427daa/runs/latest/stdout

Agent 1 Pos File Status

/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329002359S3Aga.f2c045c8-51b8-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000343b0a
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329014759pXYE8.ae4f8e44-51c4-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000540f09
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329021159yEgW4.08a4ae5d-51c8-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000005e0aba
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329022359NNP8V.b5cf2adf-51c9-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	000000000060137b
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329023559ixZ4N.62f90b25-51cb-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	000000000062160f
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329024759tmK6h.102386a7-51cd-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000a070c
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329031159ZvAZW.6a8243b0-51d0-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	000000000014126c
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329032359HJj0E.17a28702-51d2-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000001c0b8a
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329034759xSHJQ.71f757fa-51d5-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000261065
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329041159CpEV7.cc4c02e3-51d8-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000360144
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329042359JegGb.7975e325-51da-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000003a028e
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329061159MnSDr.8ff48fea-51e9-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000005e114b
/var/lib/mesos/slave/slaves/slave01/frameworks/framework_id/executors/cron01_20190329064759WqZJg.97738f44-51ee-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000a1833

Agent 2 Files:

-rw-r--r--. 1 root root 48448 Mar 29 00:12 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329001159aONcF.4595c946-51b7-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 00:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329003600nnedi.9fea742e-51ba-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 00:48 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329004759JE2ta.4c7bbcf0-51bc-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49113 Mar 29 01:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329005959zitq3.f9a66080-51bd-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 01:12 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329011159Q9J30.a6d0dd0a-51bf-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 01:24 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_201903290123597bX3A.53fb327c-51c1-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 52438 Mar 29 01:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329013559rcSXq.012560e2-51c3-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49113 Mar 29 02:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329015959hvsBl.5b79e3b4-51c6-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49113 Mar 29 03:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329025959yYjWV.c04aef78-51ce-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 03:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329033559Xg7fX.c4cc6748-51d3-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49113 Mar 29 04:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329035959gRsFM.21edc5bc-51d7-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 04:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329043559yMteH.26bc732b-51dc-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 04:48 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329044759rZBp3.d3cde86d-51dd-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49113 Mar 29 05:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329045959OupXA.83cabed3-51df-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 05:12 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329051159AUlwW.2e263cd6-51e1-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 05:24 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329052359iGV6q.db6579d8-51e2-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 48448 Mar 29 05:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329053559rQ611.8875b79e-51e4-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 69676 Mar 29 05:48 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329054759wjYE3.359fbdf0-51e6-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 49908 Mar 29 06:00 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329055959tpc31.e5a2fcf7-51e7-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 50204 Mar 29 06:24 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_201903290623594cbkc.3d243c8c-51eb-11e9-81c8-da0c25427daa/runs/latest/stdout
-rw-r--r--. 1 root root 65991 Mar 29 06:36 /var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329063559X4gCu.ea4913c2-51ec-11e9-81c8-da0c25427daa/runs/latest/stdout

Agent 2 Pos File Status

/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329001159aONcF.4595c946-51b7-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000a0c16
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329003600nnedi.9fea742e-51ba-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000a1643
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329004759JE2ta.4c7bbcf0-51bc-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000a17bc
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329005959zitq3.f9a66080-51bd-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000c0029
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329011159Q9J30.a6d0dd0a-51bf-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000c042e
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_201903290123597bX3A.53fb327c-51c1-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000c0909
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329013559rcSXq.012560e2-51c3-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000c0f45
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329015959hvsBl.5b79e3b4-51c6-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000000c16e1
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329025959yYjWV.c04aef78-51ce-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	000000000014013b
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329033559Xg7fX.c4cc6748-51d3-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000002415cb
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329035959gRsFM.21edc5bc-51d7-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000002a1861
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329043559yMteH.26bc732b-51dc-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000003619a7
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329044759rZBp3.d3cde86d-51dd-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000003c0b39
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329045959OupXA.83cabed3-51df-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000004009a4
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329051159AUlwW.2e263cd6-51e1-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000460d53
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329052359iGV6q.db6579d8-51e2-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000004829ec
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329053559rQ611.8875b79e-51e4-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000004c0a1e
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329054759wjYE3.359fbdf0-51e6-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	00000000005011f9
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329055959tpc31.e5a2fcf7-51e7-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000563f00
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_201903290623594cbkc.3d243c8c-51eb-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000601012
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329063559X4gCu.ea4913c2-51ec-11e9-81c8-da0c25427daa/runs/latest/stdout	ffffffffffffffff	0000000000621676
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329065959tk7cf.477399f8-51f0-11e9-81c8-da0c25427daa/runs/latest/stdout	000000000000bd40	00000000001210d6

For these files, we only got logs from some of them, for example:

/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329013559rcSXq.012560e2-51c3-11e9-81c8-da0c25427daa/runs/latest/stdout
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329045959OupXA.83cabed3-51df-11e9-81c8-da0c25427daa/runs/latest/stdout
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329054759wjYE3.359fbdf0-51e6-11e9-81c8-da0c25427daa/runs/latest/stdout
/var/lib/mesos/slave/slaves/slave02/frameworks/framework_id/executors/cron01_20190329055959tpc31.e5a2fcf7-51e7-11e9-81c8-da0c25427daa/runs/latest/stdout

There are no warnings or errors in the forwarder or aggregator logs for this cron01 job.

However, we get a lot of "[warn]: #0 got incomplete line before first line from ...." warnings for other services and crons.

We have disabled the multiline parser; now fewer logs are missing, but some cron logs are still missing. (The cron runs every 3 minutes.)

Do you have any idea?


yusufgungor commented Apr 8, 2019

Hi,

@repeatedly , thanks for this great log collector.

Our problem is resolved. The cause was the cron's total run time. The cron finishes its job within 1 minute, while the forwarder's refresh_interval defaults to 60 seconds. The cron runs and creates a new log file; only later does the forwarder refresh its file list and start tailing that file, and by then no new lines are being written because the cron has already finished. Since we also did not have "read_from_head true" in our forwarder config, we got no logs at all for that cron. We only got some logs when the refresh_interval and the cron run times happened to intersect.
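
As a minimal sketch, the two knobs involved sit directly in the in_tail source (values here are illustrative; the full updated config is below):

<source>
  @type tail
  # ... path, pos_file, tag and <parse> as in the full config below ...
  # how often the path glob is re-scanned for newly created files (default 60s)
  refresh_interval 60s
  # also read lines that were written before the file was discovered
  read_from_head true
</source>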

The documentation suggests setting "enable_stat_watcher false" to prevent a possible stuck issue with inotify. However, we have set "enable_watch_timer false" and "enable_stat_watcher true". Does this cause any problem, @repeatedly?

Thanks.

We have updated our configs as below.

<source>
  log_level error

  @type tail
  tag fn.container

  <parse>
    @type multiline
     format_firstline /\d{4}-\d{1,2}-\d{1,2}/
     format1 /^(?<Timestamp>\d{4}-\d{1,2}-\d{1,2}(T|\s)\d{1,2}:\d{1,2}:\d{1,2}((,|\.)\d{1,3})?(\+\d{4})?) (?<Project>[^ ]*) (?<Service>[^ ]*) (?<Module>[^ ]*) (?<ProcessNo>[^ ]*)-(?<ThreadNo>[^ ]*) (?<Level>[^ ]*) (?<ServerIP>[^ ]*) (?<ClientIP>[^ ]*) (?<Log>.*)/
  </parse>
 
  path /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout
  pos_file "/fluentd/etc/files.pos"
  path_key "MesosLogPath"
  limit_recently_modified 60m
  read_lines_limit 100000
  rotate_wait 10s
  enable_watch_timer false
  enable_stat_watcher true
  read_from_head true
  refresh_interval 60s
</source>

@pandeyrahulgwl

I am trying to push my application container logs, which reside in a directory inside my application container (/logs/*.log), via fluentd. I have created fluentd as a service inside my ECS cluster, while my application services are running in parallel.

I have connected them by passing fluentd-address to the application, and I can even curl into my fluentd container running on the ECS EC2 host.

But somehow it is not able to send the logs to my CloudWatch stream.

Please find the attached fluent.conf:

Fluentconf_sample
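
One thing worth checking here: passing fluentd-address to a container configures the Docker fluentd log driver, which only forwards the container's stdout/stderr over the forward protocol; files written inside the container (such as /logs/*.log) never reach fluentd that way unless that directory is shared with the fluentd container and tailed there. A rough sketch of the two inputs, with /ecs/app-logs as a placeholder for a shared volume path:

# stdout/stderr forwarded by the Docker fluentd log driver
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# file logs, assuming /logs in the app container is mounted to a volume
# that is also mounted into the fluentd container at /ecs/app-logs
<source>
  @type tail
  path /ecs/app-logs/*.log
  pos_file /var/log/fluentd-app-logs.pos
  tag app.file
  read_from_head true
  <parse>
    @type none
  </parse>
</source>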

@juliohm1978

Hi @yusufgungor,

I ran into a similar issue collecting logs from our Kubernetes cluster. We kept missing some log messages from certain pods and CronJobs, until we realized that only pods that start and stop very quickly are affected.

I managed to pinpoint the issue to the scenario you described. The log files are created with a few lines, and because the source dies almost immediately, fluentd only starts tailing the log file a few moments after that.

I tried using read_from_head true, but it never made a difference here. Did that setting work for you?


juliohm1978 commented Aug 14, 2020

Never mind... I was using read_from_head true in the wrong place in our config file. I got it working as expected, and we no longer have missing log messages.
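
For anyone hitting the same mistake: read_from_head is a parameter of the in_tail <source> block itself, not of its <parse> sub-section. A minimal sketch of the placement (paths are illustrative):

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag raw.kubernetes.*
  # must sit directly under <source>, not inside <parse>
  read_from_head true
  <parse>
    @type json
  </parse>
</source>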

@subash20

I too have experienced Fluentd not reading log files generated by a cron job that runs every minute. Is there any solution for that? I read the comments above and tried them myself. Instead of the contents of my log file, I get the following message: unexpected error error_class=Errno::EACCES error="Permission denied @ rb_sysopen - /var/adump/adump.pos"
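
The Errno::EACCES error is a separate problem from the cron timing: the user fluentd runs as cannot write the pos file at /var/adump/adump.pos. Either make that directory writable for the fluentd/td-agent user, or point pos_file at a directory that user already owns. A sketch of the latter, assuming a td-agent install; the path and tag are placeholders:

<source>
  @type tail
  # placeholder path for the cron-generated log files
  path /var/adump/*.log
  # keep the pos file in a directory the td-agent user can write to
  pos_file /var/log/td-agent/adump.pos
  tag adump.*
  read_from_head true
  <parse>
    @type none
  </parse>
</source>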

@prashantkumashi

This seems to be an issue for which no direct solution is available. We have a similar configuration: only stdout and stderr are picked up, and other logs are ignored. Is there any solution?

Here is our configuration:

<source>
  @type tail
  @id in_tail_container_logs
  @label @containers
  path /var/log/containers/*.log
  exclude_path ["/var/log/containers/cloudwatch-agent*", "/var/log/containers/fluentd*"]
  pos_file /var/log/fluentd-containers.log.pos
  tag *
  read_from_head true
  <parse>
    @type none
    #@type json
    #time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

<label @containers>
  <filter **>
    @type kubernetes_metadata
    @id filter_kube_metadata
  </filter>

  <filter **>
    @type record_transformer
    @id filter_containers_stream_transformer
    <record>
      stream_name ${tag_parts[3]}
    </record>
  </filter>

  <filter **>
    @type concat
    key log
    multiline_start_regexp /^\S/
    separator ""
    flush_interval 5
    timeout_label @NORMAL
  </filter>

  <match **>
    @type relabel
    @label @NORMAL
  </match>
</label>
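
A likely explanation for this one: /var/log/containers/*.log only ever contains what containers write to stdout/stderr (the kubelet symlinks the runtime's log files there), so application log files written to other paths inside a container will never appear under that glob. Collecting those requires exposing the files on the node (hostPath, sidecar, etc.) and adding another tail source; a rough sketch, with /var/log/apps as a placeholder host path:

<source>
  @type tail
  @id in_tail_app_file_logs
  # placeholder: wherever the application log files are exposed on the node
  path /var/log/apps/*.log
  pos_file /var/log/fluentd-app-file.log.pos
  tag appfiles.*
  read_from_head true
  <parse>
    @type none
  </parse>
</source>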

@github-actions

This issue has been automatically marked as stale because it has been open for 90 days with no activity. Remove the stale label or comment, or this issue will be closed in 30 days.

@github-actions github-actions bot added the stale label Jan 31, 2021

github-actions bot commented Mar 2, 2021

This issue was automatically closed because it had been stale for 30 days.

@github-actions github-actions bot closed this as completed Mar 2, 2021