timekey in buffer config is not used for s3 output frequency, is there any way to change it? #348

Closed

NW-MLakin opened this issue Aug 28, 2020 · 5 comments

@NW-MLakin

It seems that the buffer timekey is not used at all to determine how frequently files are written to S3, despite what the fluentd documentation states:

The out_s3 Output plugin writes records into the Amazon S3 cloud object storage service. By default, it creates files on an hourly basis. This means that when you first import records using the plugin, no file is created immediately.
The file will be created when the timekey condition has been met. To change the output frequency, please modify the timekey value in buffer section.
(https://docs.fluentd.org/output/s3)

The README in this repo does seem to indicate that it writes logs as it receives them, but there isn't any mention of how to control how frequently logs are written to S3:

This plugin splits files exactly by using the time of event logs (not the time when the logs are received). For example, a log '2011-01-02 message B' is reached, and then another log '2011-01-03 message B' is reached in this order, the former one is stored in "20110102.gz" file, and latter one in "20110103.gz" file.
(README.md)

I have tried every combination of timekey, flush_interval, and chunk limits on both file and memory buffers, changed time_slice_format, and tried every other setting I could think of, but there seems to be no way to change how often files are written to S3. I even updated from 4.0.0 to 4.0.1, since its release notes mention a "timekey optimization", but that didn't help.

Is there a recommended method to receive a constant stream of logs and write them out to s3 at specific intervals, such as to create only one file per minute?
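To illustrate, here is a buffer section that, going by the fluentd documentation quoted above, should produce one file per minute. This is a sketch of the expected behavior, not a configuration confirmed to work; the path and values are illustrative:

<buffer time>
  @type file
  path /var/log/fluentd/buffer/
  timekey 60            # per the docs: one chunk, and thus one S3 object, per minute
  timekey_wait 10       # wait 10s for late-arriving events before flushing
  timekey_use_utc true
</buffer>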

@NW-MLakin (Author)

So, I finally found a bit of consistency in controlling the file output frequency using timekey, but it's pretty weird.

After testing a bunch of different timekey values, I noticed that it always writes out 11 files per timekey period. With a timekey of 60s, it creates 11 files per minute; at 120s, about 5-6 files per minute; at 240s, 2-3 files per minute. I then changed the timekey to 660, and as expected it now creates one file per minute (660s / 11 = 60s per file). The server continuously receives logs every second, and the number and size of incoming logs varies throughout the day, yet it consistently creates 1 file per minute (11 per timekey).

One file per minute is my desired setting, and this technically achieves it, but it's pretty gross. Also, having a timekey larger than 60 causes the %M variable to represent the minute of the chunk's timekey (0, 11, 22, etc.) instead of the actual current minute from the system clock, so I can't really use it in path or s3_object_key_format.

Am I the only one seeing this?

Here is the config I am using:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match stuff.**>
  @type rewrite_tag_filter
  <rule>
    key message
    pattern /.*/
    tag something
  </rule>
</match>

<match something>
  @type s3
  s3_bucket s3-bucket-name
  s3_region region-name
  path something/${host}
  s3_object_key_format %{path}/%Y/%m/%d/%H-%M_%{hms_slice}.%{file_extension}
  check_object false
  <buffer time,host>
    @type file
    # 660s = 11 minutes; with 11 flushes per timekey this yields one file per minute
    timekey 660
    path /var/log/fluentd/something/
  </buffer>
  <format>
    @type json
  </format>
  <inject>
    time_key log_time
    time_type string
    time_format %Y-%m-%dT%H:%M:%S
    utc true
  </inject>
</match>

I have tried many other settings, and none of them changed the observed behavior:

  • Buffer type (file/memory)
  • Buffer settings like flush_mode, timekey_wait, chunk_limit_size, total_limit_size, queued_chunks_limit_size, etc.
  • S3 settings like check_object, s3_object_key_format, path, acl, etc.
  • Buffer settings (timekey, output frequency) of the fluentd server that is forwarding logs to this server
  • time_slice_format, utc, timekey_use_utc, and system clock tz on both fluentd servers

@0x63lv commented Nov 6, 2020

We are also seeing this weirdness. We are using timekey 5m, which means it sends ~2 files per minute to S3 (11 per timekey).
Because of this, it also appeared to accumulate a backlog of un-flushed buffer files, since output to S3 happens at a lower rate than logs are generated; over time this exhausts the open file descriptor ulimit.
It seems that if more than 11 buffer files are created per timekey, whatever the timekey, this backlog is inevitable, as it apparently cannot send more than 11 files per timekey to S3.

Why 11? Where does it come from, and is it possible to adjust or configure it?

These other issues here might be because of the same reason: #339, #315, #237

@0x63lv commented Nov 6, 2020

It appears that the issue is not with this S3 plugin, but rather with the way the fluentd output buffer is configured.
The 11 comes from here: https://github.com/fluent/fluentd/blob/8dd83461d8f34317d6e0de189beeb6bcff01481d/lib/fluent/plugin/output.rb#L1354-L1365
Essentially, it takes the minimum of timekey and timekey_wait and divides it by 11, for some reason. If the resulting number is less than flush_thread_interval (default 1.0), it substitutes flush_thread_interval, and uses the result as the interval of the flush/enqueue loop. So a timekey of 5m (with the default timekey_wait of 600s) means the previously mentioned flush roughly every 30 seconds: min(300, 600) / 11 ≈ 27s.
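A minimal Ruby sketch of that calculation, simplified from the linked output.rb. The function name enqueue_interval is illustrative (not fluentd's actual method), and the "one S3 object per enqueue pass" reading is an assumption based on the file counts reported above:

# Simplified re-statement of the enqueue interval logic linked above.
# enqueue_interval is an illustrative name, not fluentd's actual API.
def enqueue_interval(timekey:, timekey_wait: 600, flush_thread_interval: 1.0)
  interval = [timekey, timekey_wait].min / 11.0
  # never run the loop faster than flush_thread_interval
  interval < flush_thread_interval ? flush_thread_interval : interval
end

enqueue_interval(timekey: 60)   # => ~5.45s -> ~11 passes/minute, matching "11 files per minute"
enqueue_interval(timekey: 300)  # => ~27.3s -> ~2 passes/minute, the "every 30sec" above
enqueue_interval(timekey: 660)  # => ~54.5s -> ~1 pass/minute, matching the timekey 660 observation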

When time is specified as one of the chunk keys, the solution could be to configure flush_mode to something other than the default lazy, or to set timekey_wait to something lower than timekey, to help it flush logs faster than they are generated. See the sketch after the link below.
https://docs.fluentd.org/configuration/buffer-section#flushing-parameters
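For example, a buffer section along those lines might look like this. It is a sketch of the two suggested workarounds, not a configuration confirmed by the maintainers; the path and values are illustrative:

<buffer time>
  @type file
  path /var/log/fluentd/buffer/
  timekey 60
  # min(timekey, timekey_wait) / 11 is now below flush_thread_interval (1s),
  # so the enqueue loop runs every second and chunks flush promptly
  timekey_wait 10
  # alternatively, switch away from the default lazy mode:
  # flush_mode interval
  # flush_interval 30s
</buffer>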

github-actions bot commented Jul 6, 2021

This issue has been automatically marked as stale because it has been open for 90 days with no activity. Remove the stale label or comment, or this issue will be closed in 30 days.

github-actions bot added the stale label Jul 6, 2021

github-actions bot commented Aug 5, 2021

This issue was automatically closed because it remained stale for 30 days.

github-actions bot closed this as completed Aug 5, 2021