Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/fs/freezer: fix freezing race #2774

Merged
merged 1 commit into from
Feb 1, 2021

Conversation

kolyshkin
Copy link
Contributor

Before this PR, Set() used GetState() to check the freezer state
and retry the operation if the actual state still differs from requested.
This should help with the situation when a new process (such as one
added by runc exec) is added to the container's cgroup while it's being
freezed by the kernel, but it's not working as it should.

The problem is, GetState() never returns FREEZING state, looping until
the state is either FROZEN or THAWED, so Set() does not have a chance
to repeat the freeze attempt.

As a result, the container might end up stuck in a FREEZING state,
with GetState() never returning (which in turn blocks some other
operations).

One way to fix this would be to have GetState returning FREEZING state
instead of retrying ad infinitum. It would result in changing the public
API, and no callers of GetState expects it to return this.

To fix, let's not use GetState() from Set(). Instead, read the
freezer.state file directly and act accordingly -- return success
on FROZEN, retry on FREEZING, and error out on any other (unexpected)
value.

While at it, further improve the code:

  • limit the number of retries;
  • if retries are exceeded, thaw and return an error;
  • don't retry (or read the state back) on THAW.

I played a lot with various reproducers for this bug, including

  • parallel runc execs and runc pause/resumes
  • parallel runc execs and runc --systemd-cgroup update
    (the latter performs freeze/unfreeze);
  • continuously running /bin/printf inside container
    in parallel with runc pause/resume;
  • running pthread bomb (from criu test suite) in parallel
    with runc pause/resume;

and I was not able to make freeze work 100%, meaning sometimes
runc pause fails, or runc --systemd-cgroup update produces a warning.

With that said, it's still a big improvement over the previous
state of affairs where container is stuck in FREEZING state,
and GetState() (and all its users) are also stuck.

For more info, please see #2753

This is a minimal fix that I think is ready and should be included into rc93.

Fixes: #2753

@kolyshkin
Copy link
Contributor Author

I have a number of tests / reproducers written but since this PR does not fix the issue 100% the test case will be flaky, so I am proposing to include this without a test.

cyphar
cyphar previously approved these changes Feb 1, 2021
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some clarifying comments.

@kolyshkin kolyshkin force-pushed the freeze-race branch 3 times, most recently from 2ff76db to 2c34040 Compare February 1, 2021 19:33
@kolyshkin
Copy link
Contributor Author

Rebased, updated the comments as per #2774 (comment) and #2774 (comment)

mrunalp
mrunalp previously approved these changes Feb 1, 2021
Before this commit, Set() used GetState() to check the freezer state
and retry the operation if the actual state still differs from requested.
This should help with the situation when a new process (such as one
added by runc exec) is added to the container's cgroup while it's being
freezed by the kernel, but it's not working as it should.

The problem is, GetState() never returns FREEZING state, looping until
the state is either FROZEN or THAWED, so Set() does not have a chance
to repeate the freeze attempt.

As a result, the container might end up stuck in a FREEZING state,
with GetState() never returning (which in turn blocks some other
operations).

One way to fix this would be to have GetState returning FREEZING state
instead of retrying ad infinitum. It would result in changing the public
API, and no callers of GetState expects it to return this.

To fix, let's not use GetState() from Set(). Instead, read the
freezer.state file directly and act accordingly -- return success
on FROZEN, retry on FREEZING, and error out on any other (unexpected)
value.

While at it, further improve the code:
 - limit the number of retries;
 - if retries are exceeded, thaw and return an error;
 - don't retry (or read the state back) on THAW.

I played a lot with various reproducers for this bug, including

 - parallel runc execs and runc pause/resumes
 - parallel runc execs and runc --systemd-cgroup update
   (the latter performs freeze/unfreeze);
 - continuously running /bin/printf inside container
   in parallel with runc pause/resume;
 - running pthread bomb (from criu test suite) in parallel
   with runc pause/resume;

and I was not able to make freeze work 100%, meaning sometimes
runc pause fails, or runc --systemd-cgroup update produces a warning.

With that said, it's still a big improvement over the previous
state of affairs where container is stuck in FREEZING state,
and GetState() (and all its users) are also stuck.

Signed-off-by: Kir Kolyshkin <[email protected]>
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, waiting for CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runc init process step on DISK Sleep Status When freezen cgroup
3 participants