
Large Memory Consumption when ignore_same_log_interval #4174

Closed
yangjiel opened this issue May 11, 2023 · 6 comments · Fixed by #4229
Labels: bug

Comments

@yangjiel

yangjiel commented May 11, 2023

Describe the bug

My colleague Lester Lu and I found an issue related to #3401.

The abnormally increasing memory usage is caused by the ignore-repeated-log feature.
The current implementation uses a dictionary that stores each message as a key. However, depending on the plugin implementation, the message can be the actual log that is being sent out. For example, in azure-loganalytics the log message is built like log.fatal "Exception occured in posting to DataCollector API: " + "'#{ex}', data=>" + Yajl.dump(records). The records/log can be large, around 1 KB per log, and may never repeat.

This causes the cached_log dictionary to keep growing; its memory is never released, so significant memory usage can be observed for some plugins.

cached_log[message] = time
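
To illustrate the growth with a standalone Ruby sketch (hypothetical numbers and message format; not fluentd code): when every message is unique, the cache gains one large key per log line and nothing is ever evicted.

# Hypothetical illustration of the unbounded cache growth.
# Every unique ~1 KB message becomes a new hash key that is never removed.
cached_log = {}
100_000.times do |i|
  message = "Exception in posting to DataCollector API: 'record-#{i}', data=>" + ("x" * 1024)
  cached_log[message] = Time.now
end
puts cached_log.size  # => 100000; roughly 100 MB of strings retained forever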

To Reproduce

Use the azure-loganalytics plugin, configured with an invalid account so that the logs will not be sent out. With large traffic (3k logs/s) and ignore_same_log_interval 60s, memory usage keeps increasing.

Expected behavior

Memory should not increase indefinitely. The cached_log needs to be freed once in a while.
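
One simple way to bound it (a hypothetical sketch, not the actual fix) would be to cap the number of cached messages and evict the oldest entry when the cap is hit:

# Hypothetical size cap; MAX_CACHED_MESSAGES is not an existing fluentd option.
# Ruby hashes preserve insertion order, so Hash#shift drops the oldest entry.
MAX_CACHED_MESSAGES = 10_000

def remember_message(cached_log, message, time)
  cached_log.shift if cached_log.size >= MAX_CACHED_MESSAGES
  cached_log[message] = time
end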

Your Environment

- Fluentd version: 1.15.2
- TD Agent version: 4.4.1
- Operating system: Ubuntu
- Kernel version:

Your Configuration

<system>
  log_level                      info
  ignore_repeated_log_interval   60s
  ignore_same_log_interval       60s
</system>

<match **>
  @type azure-loganalytics
  customer_id invalidid
  shared_key anykeyshouldbefine
  log_type ApacheAccessLog
  <buffer tag>
    @type                      memory
    chunk_limit_size           10M
    total_limit_size           50M
    flush_at_shutdown          true
    overflow_action            block
    retry_forever              true
    disable_chunk_backup       true
  </buffer>
</match>

Your Error Log

No errors, just abnormal memory usage.

Additional context

No response

@ashie added the bug label and removed waiting-for-triage on May 12, 2023
@ashie
Member

ashie commented May 12, 2023

Thanks for your report!
Yep, we need to purge old logs.

fluentd/lib/fluent/log.rb

Lines 464 to 483 in 0a6d706

def ignore_same_log?(time, message)
  cached_log = Thread.current[:last_same_log]
  if cached_log.nil?
    Thread.current[:last_same_log] = {message => time}
    return false
  end
  prev_time = cached_log[message]
  if prev_time
    if (time - prev_time) <= @ignore_same_log_interval
      true
    else
      cached_log[message] = time
      false
    end
  else
    cached_log[message] = time
    false
  end
end
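
One possible shape for the fix (a hedged sketch; not necessarily what #4229 implements) is to sweep expired entries out of the cache whenever it is consulted, so stale messages cannot accumulate:

# Sketch: evict entries older than the interval so the cache stays bounded.
# @ignore_same_log_interval is the existing config value; the sweep itself is
# hypothetical, and its O(n) cost could be amortized (e.g. run every N calls).
def ignore_same_log?(time, message)
  cached_log = (Thread.current[:last_same_log] ||= {})
  # Stale entries can no longer suppress anything, so drop them.
  cached_log.delete_if { |_msg, prev_time| (time - prev_time) > @ignore_same_log_interval }
  if cached_log.key?(message)
    true  # still within the interval after the sweep above
  else
    cached_log[message] = time
    false
  end
end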

@daipom
Contributor

daipom commented May 12, 2023

Wow, indeed.

@yangjiel
Author

This may also relate to #1657.

@daipom
Contributor

daipom commented Jul 6, 2023

I have created a PR to fix this!

Is it correct that this occurs only when using ignore_same_log_interval?

@yangjiel
Author

yangjiel commented Jul 6, 2023

Yes, correct. Thanks!

@daipom changed the title from Large Memory Consumption to Large Memory Consumption when ignore_same_log_interval on Jul 6, 2023
@daipom
Contributor

daipom commented Jul 6, 2023

I modified the title a bit!
Thanks for your valuable report @yangjiel!
