Race condition or deadlock in Close() #72
Comments
This seems to happen when a file event comes in after |
There is at least one deadlock going on here. I was able to work around it by dumping the 'extra' events around the call to w.Close():

case <-time.After(200 * time.Millisecond):
	if haveChange {
		logger.Println("we have enough changes")
		quit := make(chan bool)
		// Drain late events in a separate goroutine so that Close() is not
		// blocked by an event nobody is receiving.
		go func() {
			for {
				select {
				case event := <-w.Event:
					logger.Println("extra event:", event.Name())
				case <-quit:
					return
				}
			}
		}()
		w.Close()
		quit <- true
		return
	}

It feels like this is something that watcher should handle. If I have time I'll see if I can fix it in a more elegant way inside of watcher and submit a PR.
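The workaround above can be factored into a small reusable helper. What follows is a minimal, self-contained sketch, not code from the watcher package: `closeWhileDraining`, `fakeWatcher`, and `demo` are hypothetical names, and the fake watcher assumes a producer that sends on an unbuffered channel and only notices a close request between sends.

```go
package main

import "fmt"

// closeWhileDraining (hypothetical helper) discards everything that arrives on
// events while closeFn runs, so a producer blocked on a send cannot keep
// closeFn from returning. It reports how many events were discarded.
func closeWhileDraining(events <-chan string, closeFn func()) int {
	quit := make(chan struct{})
	counted := make(chan int)
	go func() {
		n := 0
		for {
			select {
			case <-events:
				n++ // discard the late event
			case <-quit:
				counted <- n
				return
			}
		}
	}()
	closeFn()
	close(quit)
	return <-counted
}

// fakeWatcher stands in for a real watcher (assumption: unbuffered Event
// channel, close request only seen between sends).
type fakeWatcher struct {
	Event chan string
	stop  chan struct{}
}

func newFakeWatcher() *fakeWatcher {
	return &fakeWatcher{Event: make(chan string), stop: make(chan struct{})}
}

// run delivers the pending events and only then looks at the stop channel.
func (f *fakeWatcher) run(pending []string) {
	for _, name := range pending {
		f.Event <- name // blocks until someone receives
	}
	<-f.stop
}

// Close blocks until run receives the stop request.
func (f *fakeWatcher) Close() { f.stop <- struct{}{} }

// demo closes a fake watcher that still has two events to deliver.
func demo() int {
	f := newFakeWatcher()
	go f.run([]string{"a.txt", "b.txt"})
	return closeWhileDraining(f.Event, f.Close)
}

func main() {
	fmt.Println("discarded events:", demo())
}
```

Without the draining goroutine, `Close` in this sketch would block forever, because `run` could never finish its pending sends.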
I am facing the same issue with my application. The application hangs and is unable to process new events. I believe this should be handled inside the watcher package. |
It's not documented, but essentially in the current implementation it's not supported to call |
Regarding allowing |
@jpatters @ykumar-rb @tul - Thanks for your research and ideas on this issue; they helped me arrive at a solution for my use case. In short, I need to be able to gracefully "leave" the watcher, making sure all goroutines complete, in the following situations:
Here is the solution I came up with for anyone who might stumble upon this in the future. Feedback welcome! Note: in my case, I have the maximum number of events set to one (1).

// Requires fmt, os, os/signal, syscall, time, and the watcher package.
// logger is the application's own logger.
func listenForEvents(w *watcher.Watcher, abort <-chan struct{}) <-chan error {
	term := make(chan error, 1)
	done := make(chan error, 1)
	cancel := make(chan os.Signal, 1)
	quit := make(chan struct{}, 1)
	signal.Notify(cancel, os.Interrupt, syscall.SIGTERM, syscall.SIGTSTP)
	go func() {
		defer func() {
			// Note that if a panic happened there is nothing we can do at this point
			close(term)
			close(done)
			// Stop signal delivery first; a signal sent to a closed channel would panic.
			signal.Stop(cancel)
			close(cancel)
			close(quit)
		}()
		// must drain the queue in a different goroutine than the one calling Close()
		// we drain here instead of in the goroutine below because in a panic situation
		// our listening loop will have exited unexpectedly so handle both expected
		// and unexpected scenarios in one place
		drainQueue := func() {
			go func() {
				for {
					select {
					case wErr := <-w.Error:
						logger.Debugf("Skipping processing of watcher error because in the process of closing: %v", wErr)
					case event := <-w.Event:
						logger.Debugf("Skipping processing of watcher %v event for file %v because in the process of closing", event.Op, event.Path)
					case <-w.Closed:
						logger.Debug("Skipping processing of watcher closed because in the process of closing")
					case <-quit:
						return
					}
				}
			}()
		}
		wErr := <-term
		drainQueue()
		w.Close()
		quit <- struct{}{}
		done <- wErr
	}()
	go func() {
		term <- func() (err error) {
			// Convert a panic in the listening loop into an ordinary error.
			defer func() {
				if recovered := recover(); recovered != nil {
					err = fmt.Errorf("panic occurred: %v", recovered)
				}
			}()
			for {
				select {
				case event := <-w.Event:
					logger.Infof("event received: %v", event.Op)
				case wErr := <-w.Error:
					return wErr
				case <-w.Closed:
					return nil
				case <-cancel:
					return nil
				case <-abort:
					return nil
				}
			}
		}()
	}()
	return done
}

func watchFiles() error {
	w := watcher.New()
	// configure watcher....
	abort := make(chan struct{}, 1)
	defer close(abort)
	done := listenForEvents(w, abort)
	if err := w.Start(time.Millisecond * 100); err != nil {
		return err
	}
	if err := <-done; err != nil {
		return err
	}
	return nil
}
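The listening loop above relies on a named return value plus a deferred `recover()` to turn a panic into an error that can be sent on a channel. Here is a minimal, self-contained sketch of just that mechanism; `runGuarded` is a hypothetical name used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// runGuarded (hypothetical helper) runs loop and converts a panic inside it
// into an ordinary error. The named result err lets the deferred function
// overwrite the return value after the panic has been recovered.
func runGuarded(loop func() error) (err error) {
	defer func() {
		if recovered := recover(); recovered != nil {
			err = fmt.Errorf("panic occurred: %v", recovered)
		}
	}()
	return loop()
}

func main() {
	// A clean exit, an ordinary error, and a panic all come back as values.
	fmt.Println(runGuarded(func() error { return nil }))
	fmt.Println(runGuarded(func() error { return errors.New("watcher error") }))
	fmt.Println(runGuarded(func() error { panic("boom") }))
}
```

Because every outcome is reported as a plain `error`, the goroutine that owns the shutdown sequence only has to wait on a single channel before it drains and closes.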
I am having a pretty terrible time trying to debug this. Things seem to get hung up in multiple places inside of Close(). There are times when it fails to obtain a lock and just gets stuck there forever. And there are times when it can't seem to send on w.close. Any help would be appreciated.

I've added some logging to Close() like so.

In some instances, the last message I get is "=== starting close". And in some instances it is "=== sending close". And then other times it works just fine.

Here is the code I am using. Let me know if you need to see more.

It is probably worth noting that the directory I am watching contains about 6000 files.

Thanks in advance.
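One plausible way to get stuck sending on a close channel can be reproduced with a toy model. This sketch is not the watcher package's real code: `toyWatcher` and `closeReturns` are hypothetical, and the model assumes a loop that sends events on an unbuffered channel and only receives the close request between sends.

```go
package main

import (
	"fmt"
	"time"
)

// toyWatcher is a deliberately simplified stand-in (assumption, not the
// library's implementation).
type toyWatcher struct {
	Event chan string
	close chan struct{}
}

// loop delivers pending events and only afterwards listens for close.
func (w *toyWatcher) loop(pending []string) {
	for _, name := range pending {
		w.Event <- name // blocks while nobody receives
	}
	<-w.close
}

// Close blocks until loop is ready to receive the close request.
func (w *toyWatcher) Close() { w.close <- struct{}{} }

// closeReturns reports whether Close returns within 200ms. When drain is
// true, a separate goroutine keeps receiving events while Close runs.
// Goroutines left blocked here are leaked on purpose; this is only a demo.
func closeReturns(drain bool) bool {
	w := &toyWatcher{Event: make(chan string), close: make(chan struct{})}
	go w.loop([]string{"late.txt"})
	if drain {
		go func() {
			for range w.Event {
				// discard late events
			}
		}()
	}
	returned := make(chan struct{})
	go func() {
		w.Close()
		close(returned)
	}()
	select {
	case <-returned:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("Close returns without a reader:", closeReturns(false))
	fmt.Println("Close returns with a draining reader:", closeReturns(true))
}
```

In this model the hang only appears when an event is pending at the moment `Close` is called, which would match behaviour that fails sometimes and works at other times.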