@Redent0r Redent0r commented Apr 23, 2024

Merge Checklist
  • Followed patch format from upstream recommendation: https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#patch-format
    • Included a single commit in a given PR - at least unless there are related commits and each makes sense as a change on its own.
  • Aware that the PR is to be merged using "create a merge commit" rather than "squash and merge" (or similar)
  • genPolicy only: Ensured the tool still builds on Windows
  • genPolicy only: Updated sample YAMLs' policy annotations, if applicable
  • The upstream-missing label (or upstream-not-needed) has been set on the PR.
Summary

Cherry picked from upstream kata-containers@5492316

isClhRunning uses signal 0 to test whether the process is still alive. This doesn't work because the process is a direct child of the shim: once it dies, it becomes a zombie.
Since no one waits for it, the process lingers until its parent dies and init reaps it. Hence sending signal 0 in isClhRunning always succeeds, whether the process is dead or not.
This patch calls wait to reap the process. If the wait succeeds, the process is our child; if not, we fall back to sending the signal.

Fixes: kata-containers#9431

Test Methodology

https://dev.azure.com/mariner-org/mariner/_build/results?buildId=557064&view=results

@Redent0r Redent0r added the upstream/merged label (PRs that have been merged upstream) Apr 23, 2024
@Redent0r Redent0r force-pushed the saulparedes/wait_for_clh branch from c48e1bc to b8983ba Compare April 25, 2024 21:01
The PID needs to be initialized before calling isClhRunning.
waitVMM() uses isClhRunning and is called by launchClh() just
before it returns.

Fixes: kata-containers#9230

Signed-off-by: Alexandru Matei <[email protected]>
isClhRunning uses signal 0 to test whether the process is
still alive. This doesn't work because the process is a
direct child of the shim: once it dies, it becomes a zombie.
Since no one waits for it, the process lingers until
its parent dies and init reaps it. Hence sending signal 0 in
isClhRunning always succeeds, whether the process is
dead or not.
This patch calls wait to reap the process. If the wait succeeds,
the process is our child; if not, we fall back to sending the
signal.

Fixes: kata-containers#9431

Signed-off-by: Alexandru Matei <[email protected]>
@Redent0r Redent0r force-pushed the saulparedes/wait_for_clh branch from b8983ba to 5e0ec90 Compare April 30, 2024 21:16
Redent0r commented Apr 30, 2024

Signed-off-by: Saul Paredes <[email protected]>
@Redent0r Redent0r marked this pull request as ready for review May 1, 2024 18:14
@Redent0r Redent0r requested review from a team as code owners May 1, 2024 18:14
@Redent0r Redent0r merged commit 597200d into msft-main May 2, 2024

Labels

upstream/merged (PRs that have been merged upstream)

Development

Successfully merging this pull request may close these issues.

clh: isClhRunning always waits 10 seconds when clh exits
