test: Fix flaky TestCloseThroughContext test #1265
Codecov Report
@@ Coverage Diff @@
## develop #1265 +/- ##
===========================================
- Coverage 70.17% 70.10% -0.08%
===========================================
Files 184 184
Lines 17394 17392 -2
===========================================
- Hits 12206 12192 -14
- Misses 4251 4260 +9
- Partials 937 940 +3
TBH I am not sure how to test this, but I am assuming you probably have.
The failure is pretty common without the (removed) chan call and without the sleep. It is rare with the chan call, and should be really hard to hit with a 10ms sleep (quite massive compared to the chan).
For some reason I highly doubt that that is what's happening. Were you able to track the execution flow to match what you're saying?
With an interactive debugger? No, but you can see how it is possible in the code. If the context close is handled after the database close, no error will be reported. It is also very easy to reproduce if you remove the strange chan call (2nd commit) from the test without adding the explicit sleep.
Indeed, and that's why the channel receive is there: to make sure that the context close call closes the datastore before the next close call.
I'm pretty sure that changing from

    func (d *Datastore) Close() error {
        d.closeOnce.Do(func() {
            close(d.closing)
        })
        d.closeLk.Lock()
        defer d.closeLk.Unlock()
        if d.closed {
            return ErrClosed
        }
        d.closed = true
        close(d.commit)
        return nil
    }

to

    func (d *Datastore) Close() error {
        d.closeLk.Lock()
        defer d.closeLk.Unlock()
        if d.closed {
            return ErrClosed
        }
        d.closeOnce.Do(func() {
            close(d.closing)
        })
        d.closed = true
        close(d.commit)
        return nil
    }

would solve the issue without using a sleep call.
That leaves a race between the if and the Do you flagged. Moving the Do to after the if would also solve it, but a receive on an internal chan is an odd mechanic to use in the test (especially when undocumented). The Sleep will work regardless of datastore internals, and doesn't add additional references to internal stuff (which would hinder implementation changes).
The only impact it had on the test was to slightly delay things before calling s.Close making the test a little less flaky.
Also removed the Do as it becomes pointless on this side of the lock.
Relevant issue(s)
Resolves #1253
Description
Fixes flaky TestCloseThroughContext test.
The test was failing because, within the memory datastore, ctx.Done is handled by the same select-loop that reads d.commit. There was no guarantee that ctx.Done would be handled before d.commit, which could result in an exit from the control loop on d.commit before the handling of ctx.Done has had a chance to close the datastore.
Things could be improved by breaking up that select-loop, but even then there is no guarantee that the context close would be handled before s.Close, and so the sleep would still be required. It would happen very infrequently, but I have also only ever heard of this test failing once, so that kind of makes sense.
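The lack of ordering described above can be seen in isolation with a small sketch. It is not the datastore's control loop; `firstHandled` and `countOutcomes` are hypothetical helpers, and the `commit` channel only mimics d.commit. When both the context and the commit channel are already ready, Go's select picks one of the ready cases pseudo-randomly, so either may be handled first.

```go
package main

import (
	"context"
	"fmt"
)

// firstHandled reports which case a single select picks when both the
// context and the commit channel may be ready at the same time.
func firstHandled(ctx context.Context, commit <-chan struct{}) string {
	select {
	case <-ctx.Done():
		return "ctx"
	case <-commit:
		return "commit"
	}
}

// countOutcomes runs n trials in which the context is cancelled and the
// commit channel is closed before the select executes.
func countOutcomes(n int) map[string]int {
	counts := map[string]int{}
	for i := 0; i < n; i++ {
		ctx, cancel := context.WithCancel(context.Background())
		commit := make(chan struct{})
		cancel()      // ctx.Done() is now ready
		close(commit) // commit is now ready too
		counts[firstHandled(ctx, commit)]++
	}
	return counts
}

func main() {
	// Both keys end up with non-zero counts; the exact split varies per run.
	fmt.Println(countOutcomes(1000))
}
```

Over many trials both branches are taken, which is why a loop that selects on both channels cannot promise that the context close is processed first.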