Symlinks aren't created for all buffers #4316

Closed
mrudawski opened this issue Oct 2, 2023 · 5 comments

mrudawski commented Oct 2, 2023

Describe the bug

Hey there,

Recently I noticed an issue with the symlink_path mechanism in the td-agent/Fluentd receiver agent. td-agent successfully receives two (or more) log streams from a k8s cluster and creates two (or more) buffers for them, but it always creates only one symlink in the configured directory.

The agent's logs show that td-agent recognized both log streams and created a buffer chunk for each of them:

2023-10-02 15:00:04 +0200 [debug]: #0 Created new chunk chunk_id="606bb5a4e90896a0fcff5b8d2d5c1b83" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1696251600, tag="k8s.cilium-sxjgc.stdout.log", variables={:"$.kubernetes.namespace_name"=>"kube-system", :"$.kubernetes.pod_name"=>"cilium-sxjgc", :stream_group=>"stdout", :"$.kubernetes.container_name"=>"fwlogs-prod"}, seq=0>
2023-10-02 15:00:04 +0200 [debug]: #0 Created new chunk chunk_id="606bb5a4e95bcad718a493023a4caa68" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1696251600, tag="k8s.cilium-sxjgc.stdout.log", variables={:"$.kubernetes.namespace_name"=>"kube-system", :"$.kubernetes.pod_name"=>"cilium-sxjgc", :stream_group=>"stdout", :"$.kubernetes.container_name"=>"fwlogs-tech"}, seq=0>
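
As a reading aid for the two log lines above, here is a minimal Ruby sketch of what they show: both chunks share the same tag and timekey and differ only in one buffer variable. ChunkMetadata is a hypothetical stand-in Struct with the same field names as in the log, not Fluentd's own class, and differing_variables is a helper made up for this illustration.

```ruby
# Hypothetical stand-in for the struct printed in the debug log;
# this is NOT Fluent::Plugin::Buffer::Metadata, only a plain Struct with the same fields.
ChunkMetadata = Struct.new(:timekey, :tag, :variables, :seq)

# Variables shared by both chunks, copied from the log lines.
COMMON_VARIABLES = {
  :"$.kubernetes.namespace_name" => "kube-system",
  :"$.kubernetes.pod_name" => "cilium-sxjgc",
  :stream_group => "stdout"
}.freeze

PROD_CHUNK = ChunkMetadata.new(
  1696251600, "k8s.cilium-sxjgc.stdout.log",
  COMMON_VARIABLES.merge(:"$.kubernetes.container_name" => "fwlogs-prod"), 0
)
TECH_CHUNK = ChunkMetadata.new(
  1696251600, "k8s.cilium-sxjgc.stdout.log",
  COMMON_VARIABLES.merge(:"$.kubernetes.container_name" => "fwlogs-tech"), 0
)

# Returns the variable keys whose values differ between two metadata structs.
def differing_variables(a, b)
  (a.variables.keys | b.variables.keys).reject { |key| a.variables[key] == b.variables[key] }
end

puts "same tag:      #{PROD_CHUNK.tag == TECH_CHUNK.tag}"
puts "same timekey:  #{PROD_CHUNK.timekey == TECH_CHUNK.timekey}"
puts "same metadata: #{PROD_CHUNK == TECH_CHUNK}"
puts "differs in:    #{differing_variables(PROD_CHUNK, TECH_CHUNK).inspect}"
```

Under these assumptions the two metadata values compare as different even though tag and timekey match, which is consistent with two separate chunks being created.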

When I checked the buffer directory, both files were present and were being written to continuously:

ls -la /opt/logs/.fluentd_buffers/1/buffer.b606bb5a4e95bcad718a493023a4caa68.log
-rw-r--r-- 1 td-agent td-agent 76023492 10-02 15:46 /opt/logs/.fluentd_buffers/1/buffer.b606bb5a4e95bcad718a493023a4caa68.log
ls -la /opt/logs/.fluentd_buffers/1/buffer.b606bb5a4e90896a0fcff5b8d2d5c1b83.log
-rw-r--r-- 1 td-agent td-agent 1696307 10-02 15:46 /opt/logs/.fluentd_buffers/1/buffer.b606bb5a4e90896a0fcff5b8d2d5c1b83.log

So far, so good. I also configured td-agent to create symlinks for me, and that works too, but only for one buffer:

tree /opt/logs/k8s/kube-system/current/cilium-sxjgc
/opt/logs/k8s/kube-system/current/cilium-sxjgc
└── ciliumlogs
    └── stdout
        └── stdout.log -> /opt/logs/.fluentd_buffers/1/buffer.b606bb5a4e95bcad718a493023a4caa68.log

The second buffer exists, but no symlink was created for it. I didn't find anything relevant in the td-agent logs; it just skips creating the symlink for the other buffer for some reason.
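
For anyone who wants to check this on their own host, here is a rough Ruby sketch that lists buffer files that no symlink points to. The function name unlinked_buffers is made up for this example, and the glob pattern buffer.b*.log is an assumption based only on the file names in the ls output above.

```ruby
require "find"

# Returns the buffer files in buffer_dir that no symlink under link_root points to.
# Assumes buffer files are named like "buffer.b<chunk_id>.log", as in the ls output above.
# Raises Errno::ENOENT if link_root does not exist.
def unlinked_buffers(buffer_dir, link_root)
  buffers = Dir.glob(File.join(buffer_dir, "buffer.b*.log")).map { |path| File.expand_path(path) }

  targets = []
  Find.find(link_root) do |path|
    next unless File.symlink?(path)
    # Resolve relative link targets against the directory that holds the link.
    targets << File.expand_path(File.readlink(path), File.dirname(path))
  end

  (buffers - targets).sort
end
```

With the paths from this report, it could be called as unlinked_buffers("/opt/logs/.fluentd_buffers/1", "/opt/logs/k8s/kube-system/current"); any path it returns is a buffer that has no symlink.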

In the meantime I recreated my k8s pods many times, and td-agent always creates only one symlink. It's definitely not related to my system or permissions; it looks like an issue in td-agent's internal logic.

To Reproduce

To reproduce this issue, use the td-agent.conf provided below.

Expected behavior

Symlinks should be created for every open buffer.
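
To make the expectation concrete, the following Ruby sketch expands the symlink_path template from the configuration in this report for each of the two chunks seen in the debug log. expand_placeholders is a simplified, hypothetical helper written for this illustration; it is not Fluentd's own placeholder logic and handles only ${tag}, ${tag[N]} and ${key}.

```ruby
# symlink_path template copied from the configuration in this report.
SYMLINK_PATH = '/opt/logs/${tag[0]}/${$.kubernetes.namespace_name}/current/' \
               '${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.log'

TAG = "k8s.cilium-sxjgc.stdout.log"

# Buffer variables shared by both chunks, taken from the debug log.
BASE_VARIABLES = {
  "$.kubernetes.namespace_name" => "kube-system",
  "$.kubernetes.pod_name" => "cilium-sxjgc",
  "stream_group" => "stdout"
}.freeze

# Hypothetical, simplified expander (NOT Fluentd's implementation).
def expand_placeholders(template, tag, variables)
  tag_parts = tag.split(".")
  template.gsub(/\$\{[^}]+\}/) do |placeholder|
    name = placeholder[2..-2] # strip the leading "${" and the trailing "}"
    if name == "tag"
      tag
    elsif (match = name.match(/\Atag\[(\d+)\]\z/))
      tag_parts.fetch(match[1].to_i)
    else
      variables.fetch(name)
    end
  end
end

%w[fwlogs-prod fwlogs-tech].each do |container|
  variables = BASE_VARIABLES.merge("$.kubernetes.container_name" => container)
  puts expand_placeholders(SYMLINK_PATH, TAG, variables)
end
```

Under these assumptions the two chunks expand to two different symlink paths, one per container name, so two symlinks would be expected.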

Your Environment

- Fluentd version: 1.16.2
- TD Agent version: 4.5.1
- Operating system: Oracle Linux Server release 7.9
- Kernel version: 5.4.17-2136.318.7.1.el7uek.x86_64

Your Configuration

# Enable RPC endpoint (control API)
<system>
  log_level debug
  rpc_endpoint 127.0.0.1:24444
</system>

# Enable monitor_agent (monitoring API)
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

# Define input (listening on 24224)
<source>
  @type forward
  @log_level debug
  port 24224
</source>

<match k8s.**>
  @type file
  path /opt/logs/${tag[0]}/${$.kubernetes.namespace_name}/%Y-%m/%d/${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.%Y-%m-%d-%H
  symlink_path /opt/logs/${tag[0]}/${$.kubernetes.namespace_name}/current/${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.log
  append true
  compress gzip
  <format>
    @type single_value
    message_key log
  </format>
  <buffer time,tag,$.kubernetes.namespace_name,$.kubernetes.pod_name,stream_group,$.kubernetes.container_name>
    @type file
    path /opt/logs/.fluentd_buffers/1
    timekey      1h
    timekey_wait 10s
    retry_max_interval 30
    retry_forever true
  </buffer>
</match>

Your Error Log

There isn't any error in the logs.

Additional context

No response

@mrudawski
Author

Well, it seems like buffers always use the tag as a unique ID. In my example, both log streams had the same tag (but different values for the other variables, which the buffer should also take into account and keep separate). Anyway, I removed 'tag' from the <buffer> section and now it works properly. But what's the point of defining custom variables for the buffer if it still uses 'tag' as the determinant of uniqueness?

@Shingo-Nakayama
Contributor

Thank you for your report!
I think that the symlink path should be unique for each buffer.
So you will get the expected result by changing the tag placeholder in symlink_path from ${tag[0]} to ${tag}.

<match k8s.**>
  @type file
  path /opt/logs/${tag[0]}/${$.kubernetes.namespace_name}/%Y-%m/%d/${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.%Y-%m-%d-%H
  symlink_path /opt/logs/${tag}/${$.kubernetes.namespace_name}/current/${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.log
  append true
  compress gzip
...
</match>

Shingo-Nakayama added the waiting-for-user label and removed the waiting-for-triage label on May 2, 2024
@Shingo-Nakayama
Contributor

But it may be a bug that there is no check for the setting in symlink_path.
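
One possible shape for such a check, sketched in Ruby: compare the buffer chunk keys with the placeholders that appear in symlink_path and report any key that is not fully represented. This is a hypothetical illustration only, not Fluentd's actual validation; the function name uncovered_chunk_keys is made up, and the time key is skipped on the assumption that the symlink is meant to follow the newest time slice.

```ruby
# Hypothetical check (NOT Fluentd's implementation): returns the chunk keys that
# have no exact ${key} placeholder in symlink_path. A partial placeholder such as
# ${tag[0]} does not count as covering the full "tag" key.
def uncovered_chunk_keys(chunk_keys, symlink_path)
  placeholders = symlink_path.scan(/\$\{([^}]+)\}/).flatten
  chunk_keys.reject { |key| key == "time" || placeholders.include?(key) }
end

# Chunk keys from the <buffer ...> line of the reported configuration.
CHUNK_KEYS = [
  "time", "tag", "$.kubernetes.namespace_name", "$.kubernetes.pod_name",
  "stream_group", "$.kubernetes.container_name"
].freeze

# symlink_path as reported, and as suggested in the comment above.
REPORTED_SYMLINK_PATH = '/opt/logs/${tag[0]}/${$.kubernetes.namespace_name}/current/' \
                        '${$.kubernetes.pod_name}/${$.kubernetes.container_name}/${stream_group}/${stream_group}.log'
SUGGESTED_SYMLINK_PATH = REPORTED_SYMLINK_PATH.sub('${tag[0]}', '${tag}')

p uncovered_chunk_keys(CHUNK_KEYS, REPORTED_SYMLINK_PATH)
p uncovered_chunk_keys(CHUNK_KEYS, SUGGESTED_SYMLINK_PATH)
```

Under this sketch's rules, the reported symlink_path leaves only the tag key uncovered, and the suggested one leaves none.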

github-actions bot commented Jun 1, 2024

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions bot added the stale label on Jun 1, 2024
@daipom
Contributor

daipom commented Jun 2, 2024

With this fix, a warning is logged when the symlink_path setting is insufficient.
It will be released in v1.17.1 and v1.16.6.

daipom removed the waiting-for-user and stale labels on Jun 2, 2024
daipom closed this as completed on Jun 2, 2024