root_dir/@id parameter documentation is not empirically correct #3552

Closed
brsolomon-deloitte opened this issue Nov 4, 2021 · 7 comments
Labels
document Improvements or additions to documentation

Comments

@brsolomon-deloitte

Describe the bug

From https://docs.fluentd.org/deployment/multi-process-workers#root_dir-id-parameter:

With multi-process workers, you cannot use the fixed path configuration for file buffer because it conflicts buffer file path between processes.

That does not seem to be true at all. Is this outdated?

I can specify path for a file buffer with multi-process workers and td-agent will not complain whatsoever, and will create directories for each worker automatically.

To Reproduce

Use the config below.

It results in:

ls -1 /var/log/td-agent/buffer/td/
worker5
worker6
worker7

Expected behavior

If I "cannot" configure this, then an error should be raised and td-agent should not start.

Your Environment

- Fluentd version:
- TD Agent version: td-agent 4.2.0 fluentd 1.13.3 (12de3b5a260a174fe4a419036d6e2b2e18fe7497)
- Operating system: Ubuntu Focal
- Kernel version: 5.4.0-89-generic

Your Configuration

<system>
  workers 8
</system>

<worker 0-3>
  <source>
    @type syslog
    port 42185
    bind 0.0.0.0
    <transport tcp>
    </transport>
    tag syslog
  </source>
</worker>

<worker 4>
  <source>
    @type forward
  </source>
</worker>

<worker 5-7>
  <filter syslog.**>
    @type grep
    <exclude>
      key message
      pattern /sda/
    </exclude>
  </filter>

  <match syslog.**>
    @type kafka2

    # The list of all seed brokers, with their host and port information
    brokers xx.xx.xx.xx:yyyy

    # Set fluentd event time to Kafka's CreateTime.
    use_event_time true

    # format of each message
    <format>
      @type json
    </format>

    # write events to this kafka topic
    default_topic syslog
    <buffer>
      @type file
      path /var/log/td-agent/buffer/td
      flush_interval 30s
      chunk_limit_size 250k
    </buffer>

    # producer settings
    required_acks -1
    compression_codec gzip
  </match>
</worker>


Your Error Log

2021-11-04 18:45:33 +0000 [info]: #5 adding filter pattern="syslog.**" type="grep"
2021-11-04 18:45:33 +0000 [info]: #3 adding source type="syslog"
2021-11-04 18:45:33 +0000 [info]: #1 adding source type="syslog"
2021-11-04 18:45:33 +0000 [info]: #2 adding source type="syslog"
2021-11-04 18:45:33 +0000 [info]: #4 adding source type="forward"
2021-11-04 18:45:33 +0000 [info]: #0 adding source type="syslog"
2021-11-04 18:45:33 +0000 [info]: #7 adding filter pattern="syslog.**" type="grep"
2021-11-04 18:45:33 +0000 [info]: #6 adding filter pattern="syslog.**" type="grep"
2021-11-04 18:45:33 +0000 [info]: #5 adding match pattern="syslog.**" type="kafka2"
2021-11-04 18:45:33 +0000 [info]: #4 starting fluentd worker pid=2980 ppid=2973 worker=4
2021-11-04 18:45:33 +0000 [info]: #4 listening port port=24224 bind="0.0.0.0"
2021-11-04 18:45:33 +0000 [info]: #4 fluentd worker is now running worker=4
2021-11-04 18:45:33 +0000 [info]: #7 adding match pattern="syslog.**" type="kafka2"
2021-11-04 18:45:33 +0000 [info]: #1 starting fluentd worker pid=2977 ppid=2973 worker=1
2021-11-04 18:45:33 +0000 [info]: #3 starting fluentd worker pid=2979 ppid=2973 worker=3
2021-11-04 18:45:33 +0000 [info]: #1 listening syslog socket on 0.0.0.0:42185 with tcp
2021-11-04 18:45:33 +0000 [info]: #3 listening syslog socket on 0.0.0.0:42185 with tcp
2021-11-04 18:45:33 +0000 [info]: #2 starting fluentd worker pid=2978 ppid=2973 worker=2
2021-11-04 18:45:33 +0000 [info]: #2 listening syslog socket on 0.0.0.0:42185 with tcp
2021-11-04 18:45:33 +0000 [info]: #0 starting fluentd worker pid=2976 ppid=2973 worker=0
2021-11-04 18:45:33 +0000 [info]: #0 listening syslog socket on 0.0.0.0:42185 with tcp
2021-11-04 18:45:33 +0000 [info]: #1 fluentd worker is now running worker=1
2021-11-04 18:45:33 +0000 [info]: #3 fluentd worker is now running worker=3
2021-11-04 18:45:33 +0000 [info]: #2 fluentd worker is now running worker=2
2021-11-04 18:45:33 +0000 [info]: #0 fluentd worker is now running worker=0
2021-11-04 18:45:33 +0000 [info]: #6 adding match pattern="syslog.**" type="kafka2"
2021-11-04 18:45:33 +0000 [info]: #5 brokers has been set: ["xx.xx.xx.xx:yyyy"]
2021-11-04 18:45:33 +0000 [info]: #5 starting fluentd worker pid=2981 ppid=2973 worker=5
2021-11-04 18:45:33 +0000 [info]: #7 brokers has been set: ["xx.xx.xx.xx:yyyy"]
2021-11-04 18:45:33 +0000 [info]: #5 initialized kafka producer: fluentd
2021-11-04 18:45:33 +0000 [info]: #5 fluentd worker is now running worker=5
2021-11-04 18:45:33 +0000 [info]: #7 starting fluentd worker pid=2983 ppid=2973 worker=7
2021-11-04 18:45:33 +0000 [info]: #7 initialized kafka producer: fluentd
2021-11-04 18:45:33 +0000 [info]: #7 fluentd worker is now running worker=7
2021-11-04 18:45:33 +0000 [info]: #6 brokers has been set: ["xx.xx.xx.xx:yyyy"]
2021-11-04 18:45:33 +0000 [info]: #6 starting fluentd worker pid=2982 ppid=2973 worker=6
2021-11-04 18:45:33 +0000 [info]: #6 initialized kafka producer: fluentd
2021-11-04 18:45:33 +0000 [info]: #6 fluentd worker is now running worker=6


Additional context

No response
@ashie
Member

ashie commented Dec 24, 2021

Hmm, you are probably right.
The plugin seems to add the worker ID automatically even when path is specified:

if using_plugin_root_dir || !multi_workers_configured
  @path = File.join(@path, "buffer.*#{@path_suffix}")
else
  @path = File.join(@path, "worker#{fluentd_worker_id}", "buffer.*#{@path_suffix}")
end

We should update the document.
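
To make the quoted branch concrete, here is a minimal standalone sketch of how a directory `path` could turn into a per-worker chunk pattern. The function name `buffer_chunk_glob` and its keyword arguments are hypothetical and not part of Fluentd. The `.log` suffix is an assumed default, and the sketch presumes the path has already been classified as a directory.

```ruby
# Hypothetical helper, not Fluentd API: mirrors only the directory-path
# branch quoted above. The ".log" suffix is an assumed default.
def buffer_chunk_glob(path, worker_id:, workers:, using_plugin_root_dir: false, path_suffix: ".log")
  multi_workers_configured = workers > 1
  if using_plugin_root_dir || !multi_workers_configured
    # Single worker, or root_dir already contains the worker id.
    File.join(path, "buffer.*#{path_suffix}")
  else
    # Multiple workers with an explicit directory path: add a worker subdirectory.
    File.join(path, "worker#{worker_id}", "buffer.*#{path_suffix}")
  end
end

# Workers 5 to 7 from the reported configuration (8 workers in total).
(5..7).each do |id|
  puts buffer_chunk_glob("/var/log/td-agent/buffer/td", worker_id: id, workers: 8)
end
```

Under these assumptions each of the three workers gets its own `worker5`, `worker6`, or `worker7` subdirectory, which matches the `ls` output in the report.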

@ashie ashie added the document Improvements or additions to documentation label Dec 24, 2021
@ashie
Member

ashie commented Dec 24, 2021

@ashie
Member

ashie commented Dec 27, 2021

When the specified path is a file path rather than a directory, the buffer is not available for multi workers:

else # specified path is file path
  if File.basename(@path).include?('.*.')
    # valid file path
  elsif File.basename(@path).end_with?('.*')
    @path = @path + @path_suffix
  else
    # existing file will be ignored
    @path = @path + ".*#{@path_suffix}"
  end
  @multi_workers_available = false
end

In this case, fluentd correctly rejects it:

def multi_workers_ready?
  unless @multi_workers_available
    log.error "file buffer with multi workers should be configured to use directory 'path', or system root_dir and plugin id"
  end
  @multi_workers_available
end

@ashie
Member

ashie commented Dec 27, 2021

Anyway, the document isn't correct, so we should fix it.

@github-actions

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

@github-actions github-actions bot added the stale label Mar 27, 2022
@github-actions

This issue was automatically closed because of stale in 30 days

@ashie ashie removed the stale label Jul 8, 2022
@ashie
Member

ashie commented Jul 8, 2022

Fixed in fluent/fluentd-docs-gitbook#386
