
Writing to stderr from out_exec_filter plugin, blocks #2609

Closed
kfdm opened this issue Sep 10, 2019 · 7 comments · Fixed by #2649
Labels: bug (Something isn't working)
kfdm commented Sep 10, 2019

Describe the bug

I'm using exec_filter to process some data using Python. My initial command looks like

<match logs.**>
    # https://docs.fluentd.org/output/exec_filter
    @type exec_filter

    command /path/to/python/script

    <format>
      @type json
    </format>
    <parse>
      @type json
    </parse>
</match>

For logging, my script is using the standard python3 logging library for debugging.
I noticed the script backing up and spent a while trying to debug it. I now suspect the lower-level child_process_execute in fluentd/lib/fluent/plugin_helper/child_process.rb.

Updating the above command to `command /path/to/python/script 2>> /var/log/td-agent/script.log` seems to unblock processing immediately.
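The blocking is easy to reproduce outside fluentd. In this hypothetical Ruby sketch (the command and sizes are mine, not from the issue), a child spawned with Open3.popen3 writes more to stderr than the pipe buffer can hold; since nothing reads the stderr pipe, the child stalls until the pipe is drained:

```ruby
require 'open3'
require 'timeout'

# Child writes ~256 KB to stderr, then prints "done" on stdout.
cmd = ['ruby', '-e', '$stderr.write("x" * 262_144); puts "done"']
stdin, stdout, stderr, wait = Open3.popen3(*cmd)

# Nobody reads stderr, so the child blocks once the pipe buffer fills
# and "done" never arrives within the timeout.
blocked = begin
  Timeout.timeout(2) { stdout.gets }
  false
rescue Timeout::Error
  true
end

stderr.read(262_144)   # draining the stderr pipe unblocks the child
line = stdout.gets
puts "blocked: #{blocked}, got: #{line}"
```

This mirrors the symptom in the issue: redirecting stderr elsewhere (or reading it) is exactly what unblocks the child.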

Expected behavior

Within out_exec_filter, a child process is spawned via

execute_child_process = ->(index){
    child_process_execute("out_exec_filter_child#{index}".to_sym, @command, on_exit_callback: exit_callback) do |readio, writeio|
        child_process_callback.call(index, readio, writeio)
    end
}

which means the stderr parameter defaults to :discard, and execution then reaches this block

if !mode.include?(:stderr) && !mode.include?(:read_with_stderr) && stderr != :discard # connect
    writeio, readio, wait_thread = *Open3.popen2(*spawn_args, spawn_opts)
elsif mode.include?(:read_with_stderr)
    writeio, readio, wait_thread = *Open3.popen2e(*spawn_args, spawn_opts)
else
    writeio, readio, stderrio, wait_thread = *Open3.popen3(*spawn_args, spawn_opts)
end
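Note the comparison in the first branch: for the default stderr: :discard, the "connect" test is false, so control falls through to the Open3.popen3 branch, which creates a stderr pipe that nothing ever reads. A quick check of the condition (the mode value is illustrative, assuming no stderr handling was requested):

```ruby
# With the defaults out_exec_filter uses, the "connect" condition is false:
mode = [:read, :write]   # illustrative mode: no stderr handling requested
stderr = :discard        # the parameter's default
connect = !mode.include?(:stderr) && !mode.include?(:read_with_stderr) && stderr != :discard
puts connect             # false, so Open3.popen3 is used and its stderr pipe is never drained
```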

While I have not debugged any further than this, I would expect that if fluentd's plugin is not handling stderr, it would at least default to sending it to /dev/null or some other sink.

Your Environment

fluentd --version 
fluentd 1.7.0
python -m platform
Linux-3.10.0-957.21.3.el7.x86_64-x86_64-with-centos-7.6.1810-Core
@kfdm kfdm added the bug Something isn't working label Sep 10, 2019
kfdm (Author) commented Sep 30, 2019

Several weeks have passed, and I still occasionally see oddities with the handling of stderr. Has anyone had a chance to double-check this?

michnmi commented Oct 3, 2019

I'm seeing the same thing as well.
In my debugging, stderr was filling up with HTTPS verification messages, at some point reaching 4096 bytes and then blocking.
It seems that something is expected to consume this output and isn't.
This is an extract from the strace (the full write was 8323 bytes):

11:33:20.070228 write(2, "InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings\n  InsecureRequestWarning,\n", 8323) = 4096 <208.369726>

and then it blocks.

Can we please have someone pick this up and comment?
MM

repeatedly self-assigned this Oct 3, 2019
repeatedly (Member) commented
Will check default and read_with_stderr behaviour to resolve this.

repeatedly (Member) commented
Here is a quick patch.

diff --git a/lib/fluent/plugin_helper/child_process.rb b/lib/fluent/plugin_helper/child_process.rb
index 437c435c..8569a11b 100644
--- a/lib/fluent/plugin_helper/child_process.rb
+++ b/lib/fluent/plugin_helper/child_process.rb
@@ -257,7 +257,7 @@ module Fluent
         readio = writeio = stderrio = wait_thread = nil
         readio_in_use = writeio_in_use = stderrio_in_use = false
 
-        if !mode.include?(:stderr) && !mode.include?(:read_with_stderr) && stderr != :discard # connect
+        if !mode.include?(:stderr) && !mode.include?(:read_with_stderr) && stderr == :discard # connect
           writeio, readio, wait_thread = *Open3.popen2(*spawn_args, spawn_opts)
         elsif mode.include?(:read_with_stderr)
           writeio, readio, wait_thread = *Open3.popen2e(*spawn_args, spawn_opts)
@@ -281,7 +281,7 @@ module Fluent
           stderrio.set_encoding(external_encoding, internal_encoding, encoding_options)
           stderrio_in_use = true
         else
-          stderrio.reopen(IO::NULL) if stderrio && stderrio == :discard
+          stderrio.reopen(IO::NULL) if stderrio && stderr == :discard
         end
 
         pid = wait_thread.pid # wait_thread => Process::Waiter

We need to check this condition change doesn't break existing code.

repeatedly (Member) commented
Above patch doesn't cover all cases. Need more time to fix the potential issue.

repeatedly added a commit that referenced this issue Oct 11, 2019
reopen with IO::NULL is not correct way to discard message.
Need to specify IO::NULL when launch process.

Signed-off-by: Masahiro Nakagawa <[email protected]>
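As the commit message says, the reliable way to discard stderr is to point it at IO::NULL when the process is launched, so no stderr pipe exists at all. A minimal sketch of that approach (not fluentd's actual code; the command is illustrative):

```ruby
require 'open3'

# Redirect the child's stderr to the null device at spawn time; even a
# large stderr write cannot block because there is no pipe to fill.
cmd = ['ruby', '-e', '$stderr.write("x" * 262_144); puts "done"']
stdin, stdout, wait = Open3.popen2(*cmd, err: IO::NULL)

line = stdout.gets     # arrives immediately, the child never stalls
status = wait.value
puts "got: #{line.inspect}, success: #{status.success?}"
```

Here `err: IO::NULL` is a standard Process.spawn redirection option passed through Open3.popen2, which is what "specify IO::NULL when launching the process" amounts to.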
repeatedly (Member) commented
Here is the patch: #2649
This patch should fix this problem.

repeatedly added a commit that referenced this issue Oct 15, 2019
child_process helper: fix stderr blocking for discard case. fix #2609
kfdm (Author) commented Oct 16, 2019

Thank you! 🙇
I look forward to testing out the fix in the next release.

repeatedly added a commit that referenced this issue Oct 25, 2019
child_process helper: fix stderr blocking for discard case. fix #2609
Signed-off-by: Masahiro Nakagawa <[email protected]>