Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor performance improvements #4302

Merged
merged 4 commits into from
Oct 10, 2023
Merged

Minor performance improvements #4302

merged 4 commits into from
Oct 10, 2023

Conversation

isimluk
Copy link
Contributor

@isimluk isimluk commented Sep 18, 2023

Minor, local, and inconsequential performance improvements.

@isimluk isimluk changed the title Minor perf Minor performance improvements Sep 18, 2023
Copy link
Contributor

@daipom daipom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I'm worried about one point.
I would like to hear other maintainers' opinions on the following points.

lib/fluent/plugin/in_tail.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/parser.rb Outdated Show resolved Hide resolved
lib/fluent/log.rb Outdated Show resolved Hide resolved
@isimluk
Copy link
Contributor Author

isimluk commented Sep 22, 2023

Thank You @daipom and @ashie for your careful review.

I have removed affected chunks.

@daipom
Copy link
Contributor

daipom commented Sep 25, 2023

@isimluk, @ashie Thanks!
This looks good to me.
Checking CI.

@isimluk
Copy link
Contributor Author

isimluk commented Sep 27, 2023

I am seeing segfault in the CI for Ruby head, although, I don't think it is related to this PR.

@daipom
Copy link
Contributor

daipom commented Sep 28, 2023

Yes, currently, the CI doesn't work on Ruby head.

I am concerned that the CI on macOS is unstable.
It does not appear related to this PR, but I will check it next week, just in case.

@isimluk
Copy link
Contributor Author

isimluk commented Sep 29, 2023

I have reviewed several failed macos jobs in CI and here I am presenting a summary of what I saw.

I have seen 6 different errors reported on macos. None of them is immediately deterministic.

no nodes available (4 occurences)

  • non-deterministic
  • seen twice on macos with ruby 2.7
  • seen on macos with ruby 3.2
  • seen on macos with ruby 3.0
#<Thread:0x00007fc4ae0818d8@flush_thread_0 /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:70 run> terminated with exception (report_on_exception is true):
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/load_balancer.rb:55:in `select_healthy_node': no nodes are available (Fluent::Plugin::ForwardOutput::NoNodesAvailable)
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/service_discovery/manager.rb:108:in `select_service'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/service_discovery.rb:82:in `service_discovery_select_service'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward.rb:375:in `try_write'
	from /Users/runner/work/fluentd/fluentd/test/plugin/test_out_forward.rb:56:in `try_write'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:1216:in `try_flush'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward.rb:353:in `try_flush'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
===============================================================================
Error: test: a node supporting responses(ForwardOutputTest): Fluent::Plugin::ForwardOutput::NoNodesAvailable: no nodes are available
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/load_balancer.rb:55:in `select_healthy_node'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/service_discovery/manager.rb:108:in `select_service'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/service_discovery.rb:82:in `service_discovery_select_service'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward.rb:375:in `try_write'
/Users/runner/work/fluentd/fluentd/test/plugin/test_out_forward.rb:56:in `try_write'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:1216:in `try_flush'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward.rb:353:in `try_flush'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initialized
/Users/runner/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:260: warning: instance variable @upgrade not initia

unwanched files must be removed (5 occurences)

  • non-deterministic
  • seen twice on macos with ruby 3.1
  • seen on macos with ruby 3.0
  • seen twice on macos with ruby 2.7
Failure: test_unwatched_files_should_be_removed(TailInputTest::path)
/Users/runner/work/fluentd/fluentd/test/plugin/test_in_tail.rb:1616:in `test_unwatched_files_should_be_removed'
     1613:       waiting(20) { sleep 0.1 until Dir.glob("#{@tmp_dir}/*.txt").size == 0 } # Ensure file is deleted on Windows
     1614:       waiting(5) { sleep 0.1 until d.instance.instance_variable_get(:@tails).keys.size <= 0 }
     1615: 
  => 1616:       assert_equal(
     1617:         {
     1618:           files: [],
     1619:           tails: []
<{:files=>[], :tails=>[]}> expected but was
<{:files=>[],
 :tails=>
  ["/Users/runner/work/fluentd/fluentd/test/plugin/../tmp/tail/6a193554ceaebd802237/tail.txt"]}>

diff:
? {:files=>[], :tails=>[]}
+  :tails=>
+   ["/Users/runner/work/fluentd/fluentd/test/plugin/../tmp/tail/6a193554ceaebd802237/tail.txt"]}

IOError: closed stream

  • non-deterministic
  • seen on macos with ruby 3.1
Error: test: call receive_lines once when short line(less than 8192)(IntailIOHandlerTest::when limit is 5): IOError: closed stream
/Users/runner/work/fluentd/fluentd/test/plugin/in_tail/test_io_handler.rb:98:in `write'
/Users/runner/work/fluentd/fluentd/test/plugin/in_tail/test_io_handler.rb:98:in `block (2 levels) in <class:IntailIOHandlerTest>'

recv not defined

  • non-deterministic
  • seen on macos with ruby 3.1
Failure: test: raises no error when another thread closes a socket(AckHandlerTest):
Error: 09-29 08:35:38 +0000 [error]: unexpected error while receiving ack message error_class=NameError error="method `recv' not defined in #<Class:#<IO:0x000000010a067c18>>"
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/gems/3.1.0/gems/rr-3.1.0/lib/rr/injections/double_injection.rb:143:in `remove_method'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/gems/3.1.0/gems/rr-3.1.0/lib/rr/injections/double_injection.rb:143:in `block in recv'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/gems/3.1.0/gems/rr-3.1.0/lib/rr/injections/double_injection.rb:142:in `class_eval'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/gems/3.1.0/gems/rr-3.1.0/lib/rr/injections/double_injection.rb:142:in `recv'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/ack_handler.rb:115:in `read_ack_from_sock'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/ack_handler.rb:71:in `block in collect_response'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/ack_handler.rb:70:in `each'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/work/fluentd/fluentd/lib/fluent/plugin/out_forward/ack_handler.rb:70:in `collect_response'
Error: 023-09-29 08:35:38 +0000 [error]: /Users/runner/work/fluentd/fluentd/test/plugin/out_forward/test_ack_handler.rb:124:in `block (3 levels) in <class:AckHandlerTest>'
  .
  <false> expected but was
  <true>
/Users/runner/work/fluentd/fluentd/test/plugin/out_forward/test_ack_handler.rb:131:in `block in <class:AckHandlerTest>'
     128:       end
     129: 
     130:       assert_true threads.map{ |t| t.join(10) }.all?
  => 131:       assert_false(
Error: 2:         $log.out.logs.any? { |log| log.include?('[error]') },
Error: 3:         $log.out.logs.select{ |log| log.include?('[error]') }.join('\n')
     134:       )

supervisor signal not handled

  • non-deterministic
  • seen on ruby 3.0
Failure: test_cont_in_main_process_signal_handlers(SupervisorTest):
  class()
  Called 0 times.
  Expected 1 times.
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:76:in `block in dump_object_count'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:75:in `each_object'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:75:in `dump_object_count'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:18:in `block in dump'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:138:in `open'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:138:in `_open_dump_path'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:14:in `dump'
/Users/runner/work/fluentd/fluentd/lib/fluent/supervisor.rb:923:in `dump_non_windows'
/Users/runner/work/fluentd/fluentd/lib/fluent/supervisor.rb:839:in `block in install_main_process_signal_handlers'
/Users/runner/work/fluentd/fluentd/test/test_supervisor.rb:220:in `kill'
/Users/runner/work/fluentd/fluentd/test/test_supervisor.rb:220:in `test_cont_in_main_process_signal_handlers'
     217:     sv = Fluent::Supervisor.new({})
     218:     sv.send(:install_main_process_signal_handlers)
     219: 
  => 220:     Process.kill :CONT, Process.pid
     221: 
     222:     sleep 1
     223: 
===============================================================================
===============================================================================
Failure: test_cont_in_supervisor_signal_handler(SupervisorTest):
  class()
  Called 0 times.
  Expected 1 times.
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:76:in `block in dump_object_count'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:75:in `each_object'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:75:in `dump_object_count'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:18:in `block in dump'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:138:in `open'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:138:in `_open_dump_path'
/Users/runner/hostedtoolcache/Ruby/3.0.6/x64/lib/ruby/gems/3.0.0/gems/sigdump-0.2.5/lib/sigdump.rb:14:in `dump'
/Users/runner/work/fluentd/fluentd/lib/fluent/supervisor.rb:923:in `dump_non_windows'
/Users/runner/work/fluentd/fluentd/lib/fluent/supervisor.rb:839:in `block in install_main_process_signal_handlers'
/Users/runner/work/fluentd/fluentd/test/test_supervisor.rb:302:in `kill'
/Users/runner/work/fluentd/fluentd/test/test_supervisor.rb:302:in `test_cont_in_supervisor_signal_handler'
     299:     server = DummyServer.new
     300:     server.install_supervisor_signal_handlers
     301: 
  => 302:     Process.kill :CONT, Process.pid
     303: 
     304:     sleep 1
     305: 

can execute external command many times, which finishes immediately

  • non-deterministic
  • seen on macos with ruby 3.1
 Failure: test: can execute external command many times, which finishes immediately(ChildProcessTest):
  actual elapsed: 2.677816310999958.
  <true> expected but was
  <false>
/Users/runner/work/fluentd/fluentd/test/plugin_helper/test_child_process.rb:332:in `block (3 levels) in <class:ChildProcessTest>'
     329:         sleep 0.1 while ary.size < i
     330:         elapsed = Fluent::Clock.now - start_time
     331:         assert_equal(i, ary.size)
  => 332:         assert_true(elapsed > i && elapsed < i + 0.5,
     333:                     "actual elapsed: #{elapsed}")
     334:       end
     335:       assert_equal [], @d.log.out.logs
/Users/runner/work/fluentd/fluentd/test/plugin_helper/test_child_process.rb:328:in `upto'
/Users/runner/work/fluentd/fluentd/test/plugin_helper/test_child_process.rb:328:in `block (2 levels) in <class:ChildProcessTest>'
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/3.1.0/timeout.rb:107:in `block in timeout'
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/3.1.0/timeout.rb:36:in `block in catch'
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/3.1.0/timeout.rb:36:in `catch'
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/3.1.0/timeout.rb:36:in `catch'
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/3.1.0/timeout.rb:123:in `timeout'
/Users/runner/work/fluentd/fluentd/test/plugin_helper/test_child_process.rb:323:in `block in <class:ChildProcessTest>'

@daipom
Copy link
Contributor

daipom commented Oct 4, 2023

Thanks for checking CI.
I am concerned about an error like this.
This seems to happen occasionally, regardless of Ruby version.
I'm trying to find out if this error can happen before or whether it is related to this PR.

https://github.com/fluent/fluentd/actions/runs/6315629556/job/17148563719?pr=4302

#<Thread:0x0000000111d5c020@child_process_callback /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:70 run> terminated with exception (report_on_exception is true):
/Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/child_process.rb:337:in `block in child_process_execute_once': undefined method `exit_status=' for nil:NilClass (NoMethodError)

            @_child_process_processes[pid].exit_status = wait_thread.value # with join
                                          ^^^^^^^^^^^^^^
	from /Users/runner/work/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
[BUG] invalid keeping_mutexes: Attempt to unlock a mutex which is not locked
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-darwin20]

-- Crash Report log information --------------------------------------------
   See Crash Report log file in one of the following locations:
     * ~/Library/Logs/DiagnosticReports
     * /Library/Logs/DiagnosticReports
   for more details.
Don't forget to include the above Crash Report log file in bug reports.

-- Control frame information -----------------------------------------------
c:0001 p:---- s:0003 e:000002 (none) [FINISH]


-- C level backtrace information -------------------------------------------
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/libruby.3.1.dylib(rb_vm_bugreport+0x73d) [0x10c9fec0d]
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/libruby.3.1.dylib(rb_bug_without_die+0x178) [0x10c812fd8]
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/libruby.3.1.dylib(rb_bug+0x6f) [0x10ca3060a]
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/libruby.3.1.dylib(thread_start_func_2+0x6a8) [0x10c99fa78]
/Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/libruby.3.1.dylib(thread_start_func_1+0x11b) [0x10c99f21b]
/usr/lib/system/libsystem_pthread.dylib(_pthread_start+0x7d) [0x7ff809c824e1]

-- Other runtime information -----------------------------------------------

* Loaded script: /Users/runner/hostedtoolcache/Ruby/3.1.4/x64/lib/ruby/gems/3.1.0/gems/rake-13.0.6/lib/rake/rake_test_loader.rb

@@ -87,7 +87,7 @@ def run(timeout: nil, start: true, shutdown: true, &block)

sleep_with_watching_threads = ->(){
if @instance.respond_to?(:_threads)
@instance._threads.values.each{|t| t.join(0) }
@instance._threads.each_value{|t| t.join(0) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this error did not happen before.

I'm checking if this change has any impact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it has the impact. I am trying to understand how this relates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it is caused by the fact that the code

              @instance._threads.each_value{|t| t.join(0) }

is accessing @_threads instance varible from PluginHelperThread mixin, that is otherwise guarded by @_threads_mutex, but not in this case.

I will study a code a bit more, but in the mean time, I will remove this affected chunk from the PR.

Thank You for your help, @daipom

Copy link
Contributor

@daipom daipom Oct 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
Looks like this error no longer occurs.

I don't understand why this change causes this error and why this error is specific to macOS...
If you find out anything, please let us know. Thanks!
In the meantime, how about adding some comments to this line? such as

# Can not use each_value here. Somehow, it makes tests on macOS unstable.
# https://github.com/fluent/fluentd/pull/4302#issuecomment-1746490368

For now, I think the current modifications are fine.
I will merge this later.

@daipom daipom merged commit fcbcb56 into fluent:master Oct 10, 2023
13 of 16 checks passed
@daipom
Copy link
Contributor

daipom commented Oct 10, 2023

Thanks for the improvements!

@isimluk isimluk deleted the minor-perf branch October 10, 2023 08:43
@daipom daipom added this to the v1.17.0 milestone Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants