-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of SIGCONT after SIGTERM #4034
Improve handling of SIGCONT after SIGTERM #4034
Conversation
Signed-off-by: abetomo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with SIGNAL but I think this fix looks nice!
It looks like this works well with ServerEngine.
lib/fluent/supervisor.rb
Outdated
term_handler = trap :TERM do | ||
# After SIGTERM, disable the sigdump file by ignoring SIGCONT. | ||
trap(:CONT, nil) | ||
term_handler.call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This term_handler.call()
will call Server.stop()
in ServerEngine.
I think this is a nice way.
Thanks for the review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, the followings are what I have noticed.
lib/fluent/supervisor.rb
Outdated
term_handler = trap :TERM do | ||
# After SIGTERM, disable the sigdump file by ignoring SIGCONT. | ||
trap(:CONT, nil) | ||
term_handler.call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safer to consider the possibility that term_hanler
is nil?
(It seems to be usually impossible though. So I understand the idea of not needing a nil check here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, term_handler
may be a string.
I will add a check to see if it is a Proc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! This can be String
or Proc
or nil
according to the following document.
https://docs.ruby-lang.org/ja/latest/method/Signal/m/trap.html
I will add a check to see if it is a Proc.
So this looks good!
test/test_supervisor.rb
Outdated
@@ -213,6 +214,52 @@ def test_main_process_signal_handlers | |||
$log.out.reset if $log && $log.out && $log.out.respond_to?(:reset) | |||
end | |||
|
|||
def test_cont_in_main_processsignal_handlers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_cont_in_main_processsignal_handlers | |
def test_cont_in_main_process_signal_handlers |
test/test_supervisor.rb
Outdated
File.delete(@sigdump_path) if File.exist?(@sigdump_path) | ||
end | ||
|
||
# TODO Sending SIGTERM terminates the test itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it seems to depend on the order of the test execution.
I haven't been able to check it clearly, but it looks like the proc obj of term_handler
is set to that of the previous test.
This may cause unexpected test results.
We may need some measures for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I was mistaken.
Signed-off-by: abetomo <[email protected]>
..._main_processsignal_handlers -> ..._main_process_signal_handlers Signed-off-by: abetomo <[email protected]>
How about just disabling dumping after stop instead of changing signal handlers? diff --git a/lib/fluent/supervisor.rb b/lib/fluent/supervisor.rb
index b372f922..7dd5fc80 100644
--- a/lib/fluent/supervisor.rb
+++ b/lib/fluent/supervisor.rb
@@ -355,6 +355,10 @@ module Fluent
{ conf: @fluentd_conf }
end
+ def dump
+ super unless @stop
+ end
+
private
def reopen_log
@@ -413,6 +417,10 @@ module Fluent
def after_start
(config[:worker_pid] ||= {})[@worker_id] = @pm.pid
end
+
+ def dump
+ super unless @stop
+ end
end
class Supervisor
@@ -754,12 +762,6 @@ module Fluent
end
def run_worker
- begin
- require 'sigdump/setup'
- rescue Exception
- # ignore LoadError and others (related with signals): it may raise these errors in Windows
- end
-
Process.setproctitle("worker:#{@system_config.process_name}") if @process_name
if @standalone_worker && @system_config.workers != 1
@@ -940,6 +942,10 @@ module Fluent
trap :USR2 do
reload_config
end
+
+ trap :CONT do
+ dump_non_windows
+ end
end
end
@@ -969,7 +975,7 @@ module Fluent
reload_config
when "DUMP"
$log.debug "fluentd main process get #{cmd} command"
- dump
+ dump_windows
else
$log.warn "fluentd main process get unknown command [#{cmd}]"
end
@@ -1020,7 +1026,15 @@ module Fluent
end
end
- def dump
+ def dump_non_windows
+ begin
+ Sigdump.dump unless @finished
+ rescue => e
+ $log.error("failed to dump: #{e}")
+ end
+ end
+
+ def dump_windows
Thread.new do
begin
FluentSigdump.dump_windows |
Signed-off-by: abetomo <[email protected]>
@kou Thanks. I will try it. |
@daipom What do you think about #4034 (comment)? |
I prefer @kou's idea, too. |
I think so too! |
sv.send(:install_main_process_signal_handlers) | ||
|
||
begin | ||
Process.kill :CONT, $$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Process.pid
is preferred than $$
for readability.
Process.kill :CONT, $$ | |
Process.kill :CONT, Process.pid |
begin | ||
Process.kill :CONT, $$ | ||
rescue | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this begin ... rescue ... end
?
|
||
sleep 1 | ||
|
||
assert_true File.exist?(@sigdump_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Power assert style is preferred for better failure message:
assert_true File.exist?(@sigdump_path) | |
assert do | |
File.exist?(@sigdump_path) | |
end |
assert_true ...
shows just File.exist?(@sigdump_path)
is not true
on failure.
But assert {...}
shows not only File.exist?(@sigdump_path)
value but also @sigdump_path
and so on.
|
||
sleep 1 | ||
|
||
debug_msg = '[debug]: fluentd main process get SIGTERM' + "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug_msg = '[debug]: fluentd main process get SIGTERM' + "\n" | |
debug_msg = "[debug]: fluentd main process get SIGTERM\n" |
Process.kill :TERM, $$ | ||
Process.kill :CONT, $$ | ||
rescue | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this begin ... rescue ... end
?
|
||
debug_msg = '[debug]: fluentd main process get SIGTERM' + "\n" | ||
logs = $log.out.logs | ||
assert{ logs.any?{|log| log.include?(debug_msg) } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert{ logs.any?{|log| log.include?(debug_msg) } } | |
assert{ logs.any?{|log| log.include?(debug_msg) } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be the same code.
I would like to know if there is a better way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry. This is garbage.
logs = $log.out.logs | ||
assert{ logs.any?{|log| log.include?(debug_msg) } } | ||
|
||
assert_false File.exist?(@sigdump_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_false File.exist?(@sigdump_path) | |
assert do | |
not File.exist?(@sigdump_path) | |
end |
|
||
assert_false File.exist?(@sigdump_path) | ||
ensure | ||
$log.out.reset if $log && $log.out && $log.out.respond_to?(:reset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$log.out.reset if $log && $log.out && $log.out.respond_to?(:reset) | |
$log.out.reset if $log&.out&.respond_to?(:reset) |
Which issue(s) this PR fixes:
Ideas for #3932 fixes.
What this PR does / why we need it:
After SIGTERM, disable SIGCONT handler.
This will prevent sigdump files from being generated.
Docs Changes:
Release Note: