Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS listener source causes memory leak #1767

Closed
bungoume opened this issue Nov 30, 2017 · 9 comments · Fixed by #1771
Closed

TLS listener source causes memory leak #1767

bungoume opened this issue Nov 30, 2017 · 9 comments · Fixed by #1771
Assignees
Labels

Comments

@bungoume
Copy link

bungoume commented Nov 30, 2017

Description

Forward source listener with TLS causes memory leak when TCP connection closed by FIN.

Some healthcheck system (for example: AWS NLB) use TCP check, so
memory leaks occur frequently.

This problem does not occur with normal(not TLS) listener.

Config

Fluentd 0.14.24 (Docker fluent/fluentd:edge)

<source>
  @type  forward
  port  24228
  bind 0.0.0.0
  <transport tls>
    insecure true
  </transport>
</source>

<match **>
  @type null
</match>

Leak

Client - Fluentd Server
-> Syn
<- Syn+Ack
-> Ack
-> Fin+Ack
<- Rst+Ack

reproduction command:

nping -rate 30 -c 1000 --tcp-connect localhost -p 24228
(With this command, Fin is sent at the last)

Not leak

-> Syn
<- Syn+Ack
-> R

nping -rate 30 -c 1000 --tcp localhost -p 24228

Similar issue

#1678

(comment in Japanese)
https://twitter.com/bungoume/status/935780821515956225

@repeatedly
Copy link
Member

Thanks for the report.
I will check the problem.

@repeatedly repeatedly self-assigned this Nov 30, 2017
@repeatedly
Copy link
Member

I checked the behaviour and it is caused by SSLSocket#accept_nonblock or something.

@_handler_socket.accept_nonblock # this method call actually try to do handshake via TLS

For example, I commented out this line and memory usage becomes low.
Setup/start OpenSSL session seems high cost for healthcheck.

We will work on this issue later.

@repeatedly
Copy link
Member

In nping -rate 30 -c 1000 --tcp localhost -p 24228 case, cool.io doesn't create new socket so the problem doesn't happen.

@repeatedly
Copy link
Member

nurse suggests using accept_nonblock(exception: false) to reduce the number of exceptions but no luck.
I replaced @_handler_socket.accept_nonblock with raise OpenSSL::SSL::SSLError, "hoge" and it reduced memory usage significantly.
accept_nonblock returns self so internal routine consumes lots of memory I think.

@repeatedly
Copy link
Member

repeatedly commented Dec 8, 2017

I investigated the problem and it seems the memory fragmentation causes it.
Could you try debian version for checking this problem happens or not.
In my ubuntu environment, using jemalloc memory usage is decreased after GC.

@bungoume
Copy link
Author

I tested it with fluent/fluentd:v1.0.2-debian, but it was reproduced.
I seems that memory reduction caused by fluentd restart, not by GC.

2017-12-21 02:52:00 +0000 [info]: #0 fluentd worker is now running worker=0
2017-12-21 02:53:29 +0000 [warn]: #0 thread exited by unexpected error plugin=Fluent::Plugin::ForwardInput title=:event_loop error_class=Errno::EINVAL error="Invalid argument - SSL_accept"
2017-12-21 02:53:29 +0000 [error]: #0 unexpected error error_class=Errno::EINVAL error="Invalid argument - SSL_accept"
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/server.rb:667:in `accept_nonblock'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/server.rb:667:in `try_handshake'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/server.rb:683:in `try_tls_accept'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/server.rb:710:in `on_readable'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/cool.io-1.5.3/lib/cool.io/io.rb:186:in `on_readable'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run_once'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/event_loop.rb:84:in `block in start'
  2017-12-21 02:53:29 +0000 [error]: #0 /var/lib/gems/2.3.0/gems/fluentd-1.0.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2017-12-21 02:53:29 +0000 [error]: #0 unexpected error error_class=Errno::EINVAL error="Invalid argument - SSL_accept"
  2017-12-21 02:53:29 +0000 [error]: #0 suppressed same stacktrace
2017-12-21 02:53:29 +0000 [info]: Worker 0 finished unexpectedly with status 1
2017-12-21 02:53:30 +0000 [info]: gem 'fluentd' version '1.0.2'
2017-12-21 02:53:30 +0000 [info]: adding match pattern="**" type="null"
2017-12-21 02:53:30 +0000 [info]: adding source type="forward"
2017-12-21 02:53:30 +0000 [warn]: #0 insecure TLS communication server is configured (using 'insecure' mode)
2017-12-21 02:53:30 +0000 [info]: #0 starting fluentd worker pid=43 ppid=34 worker=0
2017-12-21 02:53:30 +0000 [info]: #0 listening port port=24228 bind="0.0.0.0"
2017-12-21 02:53:30 +0000 [warn]: #0 insecure TLS communication server is configured (using 'insecure' mode)
2017-12-21 02:53:30 +0000 [info]: #0 fluentd worker is now running worker=0

@repeatedly
Copy link
Member

I'm continue to investigate this problem.

@lachlanmunro
Copy link

I walked into something looking a lot like this in td-agent 3.2.0 on debian stretch:

2018-07-20 12:25:53 +0100 [warn]: #0 [inbound] thread exited by unexpected error plugin=Fluent::Plugin::ForwardInput title=:event_loop error_class=Errno::ECONNRESET error="Connection reset by peer - SSL_accept"
2018-07-20 12:25:53 +0100 [error]: #0 unexpected error error_class=Errno::ECONNRESET error="Connection reset by peer - SSL_accept"
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/server.rb:672:in `accept_nonblock'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/server.rb:672:in `try_handshake'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/server.rb:688:in `try_tls_accept'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/server.rb:715:in `on_readable'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/cool.io-1.5.3/lib/cool.io/io.rb:186:in `on_readable'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run_once'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
  2018-07-20 12:25:53 +0100 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-07-20 12:25:53 +0100 [error]: #0 unexpected error error_class=Errno::ECONNRESET error="Connection reset by peer - SSL_accept"
  2018-07-20 12:25:53 +0100 [error]: #0 suppressed same stacktrace
2018-07-20 12:25:54 +0100 [info]: Worker 0 finished unexpectedly with status 1

@bungoume
Copy link
Author

This problem seems to have been fixed with #2087

thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants