time, runtime: scheduled timer may never fire if GOMAXPROCS is reduced #45716
Comments
If I'm reading this issue correctly, it is already fixed. Should we just close it?
I think it's fixed, but it would be great if someone who knows the timer code more closely could verify that https://go-review.googlesource.com/c/go/+/300610/ fixes this. I haven't been able to reproduce this with 1.16.3, but I don't know the code well enough to confirm that it's definitely fixed. I'm not sure if this is possible or worth doing, but I would have liked to see something like this mentioned in the 1.16.3 release notes, since it's not just that a timer doesn't fire for 2 minutes; a timer may never fire. It also reproduces on 1.15.11. Can the fix be backported to 1.15.x?
Yes, it seems plausible that CL 300610 will fix this problem. I think it's quite rare that people reduce GOMAXPROCS while a program is running; I don't think this is the kind of thing that we would normally put into the release notes. Certainly we didn't think of this case.
@gopherbot Please open backport to 1.15 branch.
Backport issue(s) opened: #45731 (for 1.15). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Closing this issue as it is fixed on tip and 1.16 branch.
This issue is related to #44868 (and seems to be resolved with the same fix) but I wanted to file a separate issue as the impact is different, and may help others running into the same issue.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Not so far: go1.16.3 seems fine (only tested with the simple repro, not in production). This does reproduce with 1.15.11.
This is likely fixed by https://go-review.googlesource.com/c/go/+/300610/.
What operating system and processor architecture are you using (go env)?
linux, amd64
What did you do?
Created a background goroutine to do some periodic work every 500ms. The background goroutine created a ticker, and used a for loop that waited on the ticker.
Concurrently, the main goroutine reduced GOMAXPROCS (based on some system config).
Repro:
https://gist.github.com/prashantv/fdd710e270244efa23ba7a3fcb45b088
This needs to be run in a loop, as the freeze happens quite infrequently. It should eventually freeze with a log like:
The repro happens in a few seconds on go1.16.2.
The process then stays blocked waiting on the ticker's channel, which allows it to be debugged.
What did you expect to see?
Ticker channel to receive ticks periodically, ensuring the periodic work runs roughly twice a second.
What did you see instead?
Ticker did not tick at all, and periodic work did not happen.
More Context
In an internal production service, we noticed some periodic work was not happening as expected. After ruling out other possibilities, the only explanation left was that the ticker was not ticking. Using /debug/pprof/goroutine, I was able to find "stuck" select calls with the appropriate stack trace; the ticker was supposed to fire every few hundred ms, but the select was stuck for many minutes (in some cases thousands of minutes).
Using delve with a core, I discovered that the timer was added to a P that was dead. The same can be reproduced with the repro, and an example delve session showing the state is below:
delve debugging