reaper: Add a periodic reaper function #767
openshift-merge-robot merged 1 commit into openshift:master
@smarterclayton ptal. If we can confirm that this works well with the router, I can further optimize the code that parses /proc. We can also either consolidate both types of reaping behind ReaperOptions and a single function, or keep them separate as in the current approach.
/hold
Can you vendor this into a router PR as a test and look at the log output? If this fixes it, we should see none of the "can't wait for process anymore" messages.
@smarterclayton sure, I can do that.
de978fb to 8682157
Opened openshift/router#111
8682157 to e10c451
Updated.
I have a couple of logs remaining that I will take out after one more round of testing on the router PR.
e10c451 to dda569a
Updated the PR. This is ready.
pkg/proc/proc_linux.go
Outdated
    for {
        zs, err = parseProcForZombies()
        if err != nil {
            klog.V(4).Infof(err.Error())
Give this an error prefix so we know it was from reaping
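A minimal sketch of what such a prefix could look like. The helper name `wrapReapError`, the prefix text, and the sample error are hypothetical, and the standard `log` package stands in for klog so the example stays self-contained; in the PR itself the wrapped error would presumably be passed to the existing `klog.V(4).Infof` call with an explicit format string.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errExample is a made-up error used only to demonstrate the prefix.
var errExample = errors.New("open /proc: permission denied")

// wrapReapError (hypothetical helper) prefixes an error so that anyone
// reading the logs can tell it came from the reaper.
func wrapReapError(err error) error {
	return fmt.Errorf("reaper: failed to parse proc for zombies: %w", err)
}

func main() {
	// Passing the error as an argument, rather than as the format string,
	// also avoids misinterpreting any '%' characters in the message.
	log.Printf("%v", wrapReapError(errExample))
}
```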
// StartReaper starts a goroutine to reap processes periodically if called
// from a pid 1 process.
func StartReaper(period time.Duration) {
If period is less than or equal to zero set it to five seconds and add a comment above. Also describe why you might change the default period (if you create tens of thousands of defunct processes every five seconds).
eb5a6cb to 9989b9e
@smarterclayton could you take another look?
pkg/proc/proc_linux.go
Outdated
// StartReaper starts a goroutine to reap processes periodically if called
// from a pid 1 process.
// If period is less than 5 seconds, then it is set to 5 seconds.
I would apply the default only when the period is 0, not whenever it is less than five seconds. If you have an app that creates subprocesses very quickly, you might need to set a shorter period.
pkg/proc/proc_linux.go
Outdated
func StartReaper(period time.Duration) {
    if os.Getpid() == 1 {
        const minReaperPeriodSeconds = 5
        if period.Seconds() < minReaperPeriodSeconds {
`if period == 0 { period = 5 * time.Second }`
sounds good! Default to 5.
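The agreed defaulting rule can be sketched as below. The names `defaultReaperPeriod` and `effectivePeriod` are hypothetical and only illustrate the review suggestion; the merged code may be structured differently.

```go
package main

import (
	"fmt"
	"time"
)

// defaultReaperPeriod is the fallback discussed in the review: five seconds.
const defaultReaperPeriod = 5 * time.Second

// effectivePeriod (hypothetical helper) applies the default only when the
// caller passes zero. Any non-zero value, including one shorter than five
// seconds, is returned unchanged so that an application creating defunct
// processes very quickly can reap more often.
func effectivePeriod(period time.Duration) time.Duration {
	if period == 0 {
		return defaultReaperPeriod
	}
	return period
}

func main() {
	fmt.Println(effectivePeriod(0))                      // 5s
	fmt.Println(effectivePeriod(500 * time.Millisecond)) // 500ms
}
```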
This commit modifies the reaper behavior to periodically scan the procfs for zombies and then reap them in the next cycle. This should work better for container processes that spawn child processes through os/exec.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
9989b9e to fe841e6
/lgtm
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mrunalp, smarterclayton
reaper: Add a periodic reaper function
Switch the router process reaper to use the new periodic reaper brought in by openshift/library-go#767
Bump openshift/library-go dep to include the new periodic reaper function merged in openshift/library-go#767
This commit adds a periodic reaper function that is intended
to be used when the Go process uses the os/exec package to
launch child processes. The period argument could be adjusted
according to how long the os/exec calling code waits to
gather exit status from child processes.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
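The scan-then-reap-next-cycle idea described in the commit message can be sketched roughly as follows. This is not the code from pkg/proc/proc_linux.go: the function names `parseStat`, `scanZombies`, and `dueForReaping` are hypothetical, and the sketch assumes the Linux /proc/<pid>/stat layout, in which the state character follows the parenthesized command name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
)

// parseStat extracts the pid and state from the contents of /proc/<pid>/stat.
// The command name is wrapped in parentheses and may itself contain spaces or
// parentheses, so the state is located relative to the last ')'.
func parseStat(stat string) (pid int, state byte, err error) {
	open := strings.IndexByte(stat, '(')
	end := strings.LastIndexByte(stat, ')')
	if open < 1 || end < open || end+2 >= len(stat) {
		return 0, 0, fmt.Errorf("malformed stat line: %q", stat)
	}
	pid, err = strconv.Atoi(strings.TrimSpace(stat[:open]))
	if err != nil {
		return 0, 0, fmt.Errorf("malformed pid in stat line: %w", err)
	}
	return pid, stat[end+2], nil
}

// scanZombies returns the set of pids whose state is 'Z' (zombie).
func scanZombies(procRoot string) (map[int]bool, error) {
	paths, err := filepath.Glob(filepath.Join(procRoot, "[0-9]*", "stat"))
	if err != nil {
		return nil, err
	}
	zombies := map[int]bool{}
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process may have gone away since the listing
		}
		if pid, state, err := parseStat(string(data)); err == nil && state == 'Z' {
			zombies[pid] = true
		}
	}
	return zombies, nil
}

// dueForReaping returns, in ascending order, the pids that were zombies in
// the previous scan and still are in the current one. Waiting one cycle
// gives os/exec callers time to collect the exit status of their own children.
func dueForReaping(prev, cur map[int]bool) []int {
	var due []int
	for pid := range cur {
		if prev[pid] {
			due = append(due, pid)
		}
	}
	sort.Ints(due)
	return due
}

func main() {
	zs, err := scanZombies("/proc")
	if err != nil {
		fmt.Println("reaper: scan failed:", err)
		return
	}
	fmt.Printf("found %d zombie(s)\n", len(zs))
}
```

In a real reaper running as pid 1, each pid returned by `dueForReaping` would then be collected with a non-blocking wait such as `syscall.Wait4(pid, &status, syscall.WNOHANG, nil)`, and the loop would repeat once per period.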