test(controller): race condition in handleSigterm()#5821
test(controller): race condition in handleSigterm()#5821vflaux wants to merge 1 commit intokubernetes-sigs:masterfrom
Conversation
|
Hi @vflaux. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
09a2b7c to
ad9a3b1
Compare
ad9a3b1 to
b1c4916
Compare
This comment was marked as outdated.
This comment was marked as outdated.
b1c4916 to
04b6ada
Compare
04b6ada to
5f58b71
Compare
Pull Request Test Coverage Report for Build 23676867886Details
💛 - Coveralls |
|
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
5f58b71 to
781df55
Compare
Fix a race condition between the signal handler setup and the signal send in the test for handleSigterm()
781df55 to
60ab953
Compare
|
Need to better understand the benefits. /ok-to-test |
ivankatliarchuk
left a comment
There was a problem hiding this comment.
Worth to execute binary and terminate it. To validate behaviour. This sigterms, I'm not sure
| case sig := <-sigChan: | ||
| assert.Equal(t, syscall.SIGTERM, sig) | ||
| case <-time.After(5 * time.Second): | ||
| t.Fatal("signal was not recieved") |
There was a problem hiding this comment.
| t.Fatal("signal was not recieved") | |
| t.Fatal("signal was not received") |
| select { | ||
| case sig := <-sigChan: | ||
| assert.Equal(t, syscall.SIGTERM, sig) | ||
| case <-time.After(5 * time.Second): | ||
| t.Fatal("signal was not recieved") | ||
| } |
There was a problem hiding this comment.
case sig := <-sigChan:
assert.Equal(t, syscall.SIGTERM, sig) // Always true if sigChan fires
| defer signal.Reset(syscall.SIGTERM) | ||
| setupSigtermHandler(cancel) |
There was a problem hiding this comment.
Goroutine leak in tests: The goroutine spawned by setupSigtermHandler is blocked on <-signals. After defer signal.Reset(syscall.SIGTERM) runs, nothing will ever send on that channel, so the goroutine leaks for the duration of the test binary
| } | ||
| } | ||
|
|
||
| func TestSetupSigtermHandler(t *testing.T) { |
There was a problem hiding this comment.
maybe similar, hard to say
func TestSetupSigtermHandler(t *testing.T) {
cancelCalled := make(chan bool, 1)
cancel := func() { cancelCalled <- true }
hook := logtest.LogsUnderTestWithLogLevel(log.InfoLevel, t)
defer signal.Reset(syscall.SIGTERM)
setupSigtermHandler(cancel)
err := syscall.Kill(syscall.Getpid(), syscall.SIGTERM)
assert.NoError(t, err)
select {
case <-cancelCalled:
logtest.TestHelperLogContainsWithLogLevel("Received SIGTERM. Terminating...", log.InfoLevel, hook, t)
case <-time.After(5 * time.Second):
t.Fatal("cancel function was not called")
}
}
| <-signals | ||
| log.Info("Received SIGTERM. Terminating...") | ||
| cancel() | ||
| go func() { |
There was a problem hiding this comment.
The goroutine leaks a registration with the signal package. After receiving, we should deregister:
go func() {
<-signals
signal.Stop(signals)
log.Info("Received SIGTERM. Terminating...")
cancel()
}()
| // setupSigtermHandler start a routine that listens for a SIGTERM signal and triggers the provided cancel function | ||
| // to gracefully terminate the application. It logs a message when the signal is received. | ||
| func handleSigterm(cancel func()) { | ||
| func setupSigtermHandler(cancel func()) { |
There was a problem hiding this comment.
Can we not simply replace the whole logic with signal.NotifyContext?
in Execute()
// Before
ctx, cancel := context.WithCancel(context.Background())
setupSigtermHandler(cancel)
// After
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
defer stop()
There was a problem hiding this comment.
+
context.AfterFunc(ctx, func() {
log.Info("Received SIGTERM. Terminating...")
})context.AfterFunc runs the callback in its own goroutine once the context is cancelled (for any reason — signal or otherwise), so the log line is preserved without a custom handler.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
What does it do ?
Fix a race condition between the signal handler setup and the signal send in the test for
handleSigterm().In the test, in rare case the SIGTERM signal could be send before the
handleSigterm()callssignal.Notify().In this case, the cancel function is never called.
The test always pass because it silently ignore this with the select case
sig := <-sigChan.I think we need to test that the signal received is a SIGTERM and that the cancel func has been called. Not one of the two conditions.
Motivation
This primarily concerns testing and ensuring correctness. There is most likely no issue when the controller run.
I could reproduce this with
go test -timeout 30s -count 1000 -run ^TestHandleSigterm$ sigs.k8s.io/external-dns/controllerafter removing thesig := <-sigChanselect case in the current code.(edit: test was removed in #5816)
Details
After this patch:
More