Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UDP socket should also disable dual stack mode. #2768

Closed
brycied00d opened this issue Jan 10, 2020 · 4 comments · Fixed by #2781
Closed

UDP socket should also disable dual stack mode. #2768

brycied00d opened this issue Jan 10, 2020 · 4 comments · Fixed by #2781
Assignees

Comments

@brycied00d
Copy link

Describe the bug
Following an upgrade from 1.7.4, to 1.8.1, my forward source no longer listens on both v4 and v6 when using bind ::.

Seeing #2682, I tried configuring two separate in_forward sources, one using bind :: for IPv6, the other using bind 0.0.0.0 for IPv4, however that configuration fails to start: [error]: fluent/log.rb:362:error: unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for \"0.0.0.0\" port 24224"

It seems that proper "dual stack" listening is broken.

To Reproduce

Use the configuration below. When using bind ::, try an IPv4 connection and it will come back as "connection refused." When using bind 0.0.0.0, try an IPv6 connection and it too will fail. When using separate bind statements for IPv4 and IPv6 listeners on the same port, fluentd will fail to start.

Expected behavior

I expected fluentd to continue its pre-1.8.0 behaviour where bind :: binds to both IPv4 and IPv6 addresses, as determined by the kernel's net.ipv6.bindv6only sysctl. Alternatively, if the position of the fluentd project is that :: implies v6only, then I expect to be able to define separate v4 and v6 listeners without an EADDRINUSE error.

Your Environment

  • Fluentd or td-agent version: 1.8.1. Also tested 1.8.0 and 1.7.4
  • Operating system: PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
  • Kernel version: 4.9.0-11-amd64

Your Configuration

This is a very simple configuration defining two separate listeners, one for v4 and one for v6. Nothing special is required to reproduce the bug.

<source>
	@type forward
	@id in_forward4
	@log_level debug
	bind 0.0.0.0
	port 1234
	<security>
		shared_key DoesntMatter
		self_hostname my.host.name
	</security>
</source>
<source>
	@type forward
	@id in_forward6
	@log_level debug
	bind ::
	port 1234
	<security>
		shared_key DoesntMatter
		self_hostname my.host.name
	</security>
</source>

<match *>
	@type stdout
	@id out_stdout
	<format>
		@type out_file
	</format>
</match>

Your Error Log

$ /opt/td-agent/embedded/bin/fluentd --config test.conf
2020-01-10 01:57:44 +0000 [info]: parsing config file is succeeded path="test.conf"
2020-01-10 01:57:44 +0000 [info]: gem 'fluentd' version '1.8.1'
2020-01-10 01:57:44 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type forward
    @id in_forward4
    @log_level "debug"
    bind "0.0.0.0"
    port 1234
    <security>
      shared_key xxxxxx
      self_hostname "my.host.name"
    </security>
  </source>
  <source>
    @type forward
    @id in_forward6
    @log_level "debug"
    bind "::"
    port 1234
    <security>
      shared_key xxxxxx
      self_hostname "my.host.name"
    </security>
  </source>
  <match *>
    @type stdout
    @id out_stdout
    <format>
      @type "out_file"
    </format>
  </match>
</ROOT>
2020-01-10 01:57:44 +0000 [info]: starting fluentd-1.8.1 pid=12651 ruby="2.4.9"
2020-01-10 01:57:44 +0000 [info]: spawn command to main:  cmdline=["/opt/td-agent/embedded/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/td-agent/embedded/bin/fluentd", "--config", "test.conf", "--under-supervisor"]
2020-01-10 01:57:45 +0000 [info]: adding match pattern="*" type="stdout"
2020-01-10 01:57:45 +0000 [info]: adding source type="forward"
2020-01-10 01:57:45 +0000 [info]: adding source type="forward"
2020-01-10 01:57:45 +0000 [info]: #0 starting fluentd worker pid=12658 ppid=12651 worker=0
2020-01-10 01:57:45 +0000 [info]: #0 [in_forward6] listening port port=1234 bind="::"
2020-01-10 01:57:45 +0000 [info]: #0 [in_forward4] listening port port=1234 bind="0.0.0.0"
2020-01-10 01:57:45 +0000 [error]: #0 unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for \"0.0.0.0\" port 1234"
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/plugin_helper/server.rb:375:in `bind'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/plugin_helper/server.rb:375:in `server_create_udp_socket'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/plugin_helper/server.rb:174:in `server_create'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/plugin/in_forward.rb:183:in `start'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:200:in `block in start'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:189:in `block (2 levels) in lifecycle'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:188:in `each'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:188:in `block in lifecycle'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:175:in `each'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:175:in `lifecycle'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/root_agent.rb:199:in `start'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/engine.rb:193:in `start'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/engine.rb:146:in `run'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/supervisor.rb:534:in `block in run_worker'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/supervisor.rb:740:in `main_process'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/supervisor.rb:528:in `run_worker'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/lib/fluent/command/fluentd.rb:338:in `<top (required)>'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:68:in `require'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:68:in `require'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.8.1/bin/fluentd:8:in `<top (required)>'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/bin/fluentd:23:in `load'
  2020-01-10 01:57:45 +0000 [error]: #0 /opt/td-agent/embedded/bin/fluentd:23:in `<main>'
2020-01-10 01:57:45 +0000 [error]: #0 unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for \"0.0.0.0\" port 1234"
  2020-01-10 01:57:45 +0000 [error]: #0 suppressed same stacktrace
2020-01-10 01:57:46 +0000 [info]: Worker 0 finished unexpectedly with status 1

Additional context

This is the above configuration with in_forward4 commented-out; only in_forward6 active.

$ netstat -an | fgrep 1234
tcp6       0      0 :::1234                 :::*                    LISTEN
udp6       0      0 :::1234                 :::*

$ telnet -4 localhost 1234
Trying 127.0.0.1...
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

$ telnet -6 localhost 1234
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HELOnonce_
Ruthy_}RkeepaliveCConnection closed by foreign host.

This is the above configuration with in_forward6 commented-out; only in_forward4 active.

$ netstat -an | fgrep 1234
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:1234            0.0.0.0:*

$ telnet -4 localhost 1234
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
HELOnonceL%8[Jw_authA8y_CkeepaliveCConnection closed by foreign host.

$ telnet -6 localhost 1234
Trying ::1...
telnet: Unable to connect to remote host: Connection refused

This is the above configuration with in_forward4 listening on 1234, and in_forward6 listening on 1236.

(Relevant fluentd log output confirming separate listeners)
2020-01-10 02:03:51 +0000 [info]: adding source type="forward"
2020-01-10 02:03:51 +0000 [info]: adding source type="forward"
2020-01-10 02:03:51 +0000 [info]: #0 starting fluentd worker pid=13848 ppid=13841 worker=0
2020-01-10 02:03:51 +0000 [info]: #0 [in_forward6] listening port port=1236 bind="::"
2020-01-10 02:03:51 +0000 [info]: #0 [in_forward4] listening port port=1234 bind="0.0.0.0"
2020-01-10 02:03:51 +0000 [info]: #0 fluentd worker is now running worker=0

$ netstat -an | egrep '123[46]'
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN
tcp6       0      0 :::1236                 :::*                    LISTEN
udp        0      0 0.0.0.0:1234            0.0.0.0:*
udp6       0      0 :::1236                 :::*

$ telnet -4 localhost 1234
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
HELOnonceSs/+_6M{5auth9>(0PqkeepaliveCConnection closed by foreign host.

$ telnet -6 localhost 1234
Trying ::1...
telnet: Unable to connect to remote host: Connection refused

$ telnet -4 localhost 1236
Trying 127.0.0.1...
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

$ telnet -6 localhost 1236
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HELOnonce{_     :
                 ~SauthFD_Z`_keepaliveCConnection closed by foreign host.

Finally, here's the original configuration working as-expected with a single listener under fluentd 1.7.4

$ /opt/td-agent/embedded/bin/fluentd --config test.conf
2020-01-10 02:08:14 +0000 [info]: parsing config file is succeeded path="test.conf"
2020-01-10 02:08:14 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type forward
    @id in_forward
    @log_level "debug"
    bind "::"
    port 1234
    <security>
      shared_key xxxxxx
      self_hostname "my.host.name"
    </security>
  </source>
  <match *>
    @type stdout
    @id out_stdout
    <format>
      @type "out_file"
    </format>
  </match>
</ROOT>
2020-01-10 02:08:14 +0000 [info]: starting fluentd-1.7.4 pid=14696 ruby="2.4.9"
2020-01-10 02:08:14 +0000 [info]: spawn command to main:  cmdline=["/opt/td-agent/embedded/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/td-agent/embedded/bin/fluentd", "--config", "test.conf", "--under-supervisor"]
2020-01-10 02:08:14 +0000 [info]: gem 'fluentd' version '1.7.4'
2020-01-10 02:08:14 +0000 [info]: adding match pattern="*" type="stdout"
2020-01-10 02:08:14 +0000 [info]: adding source type="forward"
2020-01-10 02:08:14 +0000 [info]: #0 starting fluentd worker pid=14703 ppid=14696 worker=0
2020-01-10 02:08:14 +0000 [info]: #0 [in_forward] listening port port=1234 bind="::"
2020-01-10 02:08:14 +0000 [info]: #0 fluentd worker is now running worker=0

$ telnet -4 localhost 1234
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
HELOnonceKy=\
             .bauth]inxv1*keepaliveCConnection closed by foreign host.
$ telnet -6 localhost 1234
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HELOnonceEc:_uthj'
'LY5A'keepaliveCConnection closed by foreign host.

$ netstat -an | fgrep 1234
tcp6       0      0 :::1234                 :::*                    LISTEN
udp6       0      0 :::1234                 :::*
@brycied00d
Copy link
Author

brycied00d commented Jan 10, 2020

Two more things I tried but forgot to mention:

  • The issue is reproducible with <transport tls> too, but I omitted it for keeping the "to reproduce" steps as simple as possible.
  • I tried the configuration from When using the same port number, the td-agent fails to bind to IPv4 alongside IPv6 #2682 which uses http instead of forward and it works fine. Separate IPv4 and IPv6 listeners on the same port do NOT conflict and there is no EADDRINUSE error. The key difference between these two plugins is that http is ONLY opening a TCP socket, while forward opens BOTH UDP and TCP sockets.

In summary, it appears that the fix for #2682 is incompatible with the forward input source.

@ganmacs
Copy link
Member

ganmacs commented Jan 14, 2020

@repeatedly can you check this?

@repeatedly
Copy link
Member

Yeah, I will check it. v1.8 didn't touch socket initialization, so need step-by-step debug.

@repeatedly repeatedly changed the title Regression in IPv6 behaviour, unable to listen on the same port on both v4 and v6 on a Linux dualstack host. UDP socket should also disable dual stack mode. Jan 17, 2020
@repeatedly
Copy link
Member

"regression" is misleading title. Changed.

repeatedly added a commit that referenced this issue Jan 17, 2020
repeatedly added a commit that referenced this issue Jan 17, 2020
repeatedly added a commit that referenced this issue Jan 20, 2020
server: Fix IPv6 dual stack mode issue for udp socket. fix #2768
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants