Kubelet: Always restart container in unknown state. #22607

Random-Liu · 2016-03-06T02:03:09Z

Now if container is in some state which kubelet doesn't handle, it will be set as ContainerStateUnknown.

For docker, there are 6 states: "paused", "restarting", "running", "dead", "created", "exited". (See https://github.com/docker/docker/blob/master/container/state.go#L73)

In current implementation:

"paused", "restarting" and "running" container => ContainerStateRunning => Kubelet never restarts it.
"dead", "exited", "created" (docker start failed, container ExitCode become non-zero) => ContainerStateExited => Kubelet restarts it with respect to RestartPolicy
"created" (docker start hasn't been called) => ContainerStateUnknown => Undefined.

Before #21349, container in unknown state:

Will always be restarted if there is no exited historical container.
Will be restarted with respect to RestartPolicy if there is exited historical container.

This behavior is a little wired and unclear.

After #21349, container in unknown state will always be restarted with respect to RestartPolicy. However, @ncdc reports that it will cause container created but failed to be started never be restated again.

This PR makes sure that we always restart container in unknown state.

RFC:

How should we define the map from docker state to kubelet container state?
How should we deal with container in unknown state?
Does "always restart container in unknown state" work for other container runtimes?

@kubernetes/sig-node

k8s-github-robot · 2016-03-06T02:08:07Z

Labelling this PR as size/S

k8s-bot · 2016-03-06T02:34:59Z

GCE e2e build/test passed for commit 45064d7.

ncdc · 2016-03-07T15:48:13Z

cc @kubernetes/rh-cluster-infra @kubernetes/rh-platform-management @smarterclayton

aveshagarwal · 2016-03-07T16:12:15Z

pkg/kubelet/container/helpers.go

+	// Always restart container in unknown state now
+	if status.State == ContainerStateUnknown {
+		return true
+	}


shouldn't this check be after restart policy checks?

No. If you do it after the restart policy checks, then the restart policy is honored, which isn't what we want here.

ncdc · 2016-03-07T16:13:23Z

Assuming it's possible to distinguish "created but failed to start" from "created and we're about to try to start", I would like to treat "created but failed to start" as something that is always retried.

aveshagarwal · 2016-03-07T16:20:15Z

Trying to understand, how "created but failed to start" is different from dead?

ncdc · 2016-03-07T16:39:33Z

@aveshagarwal "created but failed to start" means the state is still created. In my test that I'm trying to reproduce, there is some issue in Docker where starting the container fails because it can't find a cgroup file.

"dead" is created, started without issue, ran something, then exited. "Ran something" could return an error (executable not found, something else, whatever). But the container runtime successfully "started" the container prior to it exiting.

aveshagarwal · 2016-03-07T16:40:59Z

It seems to me that what you are suggesting could be possible:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/manager.go#L373

At this point, a new state (createdbutfailedtostart) could be assigned instead of ContainerStateExited and then it could be retried. So that seems possible.
Why we have to worry about created but about to try to start, as there does not seem anything wrong yet?

ncdc · 2016-03-07T16:42:50Z

I'm not sure we need a new state necessarily, although it probably would make things clearer. However, this PR as-is does fix the problem I ran into.

ncdc · 2016-03-07T16:43:39Z

We need to be able to distinguish between "created/failed to start" and "created/haven't started yet" so we don't inadvertently try to restart the latter.

ncdc · 2016-03-07T16:49:01Z

The following gist contains the e2e log and the portion of the kubelet log spanning the lifecycle of the pod in the e2e test. It did take just under 20 seconds for the replacement container to start for some reason.

ncdc · 2016-03-07T17:04:55Z

I just realized that the failure in my e2e test loop wasn't 100% identical to before. This time, it was a failure to start the infra container (System error: The minimum allowed cpu-shares is 1024). It looks like the actual value set was 2. I guess this means I need to keep running my test loop to make sure I get the error condition I was expecting, and that this PR + @pmorie's PR to fix the downward API pod IP flake together fix the flake.

aveshagarwal · 2016-03-07T17:31:50Z

@ncdc, If I understand correctly, you are saying, "created/haven't started yet" should not be restarted? Right? But this seems to be doing exactly this: As per comment in the code in ./pkg/kubelet/dockertools/manager.go

        } else {
                // Non-running containers that are not terminatd could be pasued, or created (but not yet
                // started), etc. Kubelet doesn't handle these scenarios yet.
                status.State = kubecontainer.ContainerStateUnknown
        }

"created/haven't started yet" is an ContainerStateUnknown and this PR is restarting in this scenario.

ncdc · 2016-03-07T17:32:09Z

@aveshagarwal I want it to restart

Random-Liu · 2016-03-07T18:33:41Z

We need to be able to distinguish between "created/failed to start" and "created/haven't started yet" so we don't inadvertently try to restart the latter.

I am also a little confused with this one.

To clarify a bit, there are 2 kinds of created state:

Container created, but the docker start call failed. Kubelet sets the container to ContainerStateExited, because the ExitCode is non-zero (I tried this, and found that the ExitCode is always? -1). https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/manager.go#L375
Container created, but the docker start call hasn't been called yet. Kubelet sets the container to ContainerStateUnknown, because the ExitCode is zero. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/manager.go#L399

For 1), the restart policy is always honored, before and now. The behavior is never changed.
For 2), the behavior is not well defined, and is changed in #21349 and this PR.

@ncdc It looks like what you met should be 2), right?
In fact I don't even understand how 2) would happen, because the docker create and docker start are called one after the other https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/manager.go#L623. Could it be said that the ExitCode may still be zero after docker start failed?

ncdc · 2016-03-07T18:38:50Z

@Random-Liu I'm talking about 1. Here's what I saw in the past:

Container is created (docker container state = created)
Try to start container
Container fails to start (cgroup race)
1. docker container state = created
2. exit code = -1

Here is sample docker inspect output for this scenario: https://gist.github.com/ncdc/5fd0a96dff76b0513186#file-infra-inspect-log-L9-L16

In the past, when I encountered this sequence of events, the Kubelet created another container to replace the first container that was created but failed to start. I am 100% certain of this.

Random-Liu · 2016-03-07T18:59:13Z

@ncdc Thanks, I'll look into it.

yujuhong · 2016-03-07T21:10:59Z

In the past, when I encountered this sequence of events, the Kubelet created another container to replace the first container that was created but failed to start. I am 100% certain of this.

This makes sense and aligns with what I remember about the past behavior.

@ncdc, has your issue resolved with this PR?

ncdc · 2016-03-07T21:12:45Z

@yujuhong I've yet to reproduce the exact same failure scenario as before. I will keep trying. Hopefully it will show up soon!

yujuhong · 2016-03-07T21:22:32Z

For docker, there are 6 states: "paused", "restarting", "running", "dead", "created", "exited". (See https://github.com/docker/docker/blob/master/container/state.go#L73)

kubelet doesn't handle paused and restarting containers since we don't restart containers. We can potentially add support to recognize these containers and delete/recreate them. However, it's a bit of stretch to assume kubelet can handle them correctly.

AFAIK, we used to check if the container is running, and lump all non-running containers to one category and treat them as "exited" container.

In current implementation:

"paused", "restarting" and "running" container => ContainerStateRunning => Kubelet never restarts it.
"dead", "exited", "created" (docker start failed, container ExitCode become non-zero) => ContainerStateExited => Kubelet restarts it with respect to RestartPolicy
"created" (docker start hasn't been called) => ContainerStateUnknown => Undefined.
Before #21349, container in unknown state:

Will always be restarted if there is no exited historical container.
Will be restarted with respect to RestartPolicy if there is exited historical container.
This behavior is a little wired and unclear.

After #21349, container in unknown state will always be restarted with respect to RestartPolicy. However, @ncdc reports that it will cause container created but failed to be started never be restated again.

This PR makes sure that we always restart container in unknown state.

RFC:

How should we define the map from docker state to kubelet container state?

Ideally we should be able to handle all states. In reality, our SyncPod does a million of changes in one iteration, and observing intermediate state changes caused by one iteration don't help (yet).

How should we deal with container in unknown state?

We can treat them as exited containers like before, with the caveat of compatibility with rkt (below).

Does "always restart container in unknown state" work for other container runtimes?

I am a bit concerned that rkt may report a non-running intermediate state after syncing the pod for the first time. @yifan-gu, is that possible?

k8s-github-robot · 2016-03-08T02:36:48Z

@k8s-bot test this

Tests are more than 48 hours old. Re-running tests.

k8s-bot · 2016-03-08T03:09:20Z

GCE e2e build/test passed for commit 45064d7.

k8s-github-robot · 2016-03-08T11:43:36Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-bot · 2016-03-08T12:15:14Z

GCE e2e build/test passed for commit 45064d7.

k8s-github-robot · 2016-03-08T12:28:19Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-bot · 2016-03-08T12:59:41Z

GCE e2e build/test passed for commit 45064d7.

k8s-github-robot · 2016-03-08T13:02:35Z

Automatic merge from submit-queue

…iner Auto commit by PR queue bot

Always restart container in unknown state.

45064d7

Random-Liu added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 6, 2016

Random-Liu assigned yujuhong Mar 6, 2016

googlebot added the cla: yes label Mar 6, 2016

k8s-github-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 6, 2016

This was referenced Mar 7, 2016

"cpu.shares: no such file or directory" error in e2e test - rc, dc, database pod, etc fail to start openshift/origin#7616

Closed

Finalize Kube items 3.2 openshift/origin#6766

Closed

ncdc mentioned this pull request Mar 7, 2016

e2e flake: Downward API should provide pod IP as an env var #20403

Closed

aveshagarwal reviewed Mar 7, 2016
View reviewed changes

pmorie mentioned this pull request Mar 7, 2016

Fix flake in pod IP as env var e2e #22666

Merged

yujuhong added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 7, 2016

bgrant0607 changed the title ~~[RFC]Kubelet: Always restart container in unknown state.~~ Kubelet: Always restart container in unknown state. Mar 7, 2016

dchen1107 added kind/bug Categorizes issue or PR as related to a bug. cherrypick-candidate labels Mar 8, 2016

dchen1107 added this to the v1.2 milestone Mar 8, 2016

k8s-github-robot pushed a commit that referenced this pull request Mar 8, 2016

Merge pull request #22607 from Random-Liu/restart-unknown-state-conta…

8b18699

…iner Auto commit by PR queue bot

k8s-github-robot merged commit 8b18699 into kubernetes:master Mar 8, 2016

Random-Liu deleted the restart-unknown-state-container branch March 8, 2016 15:55

eparis removed the cherrypick-candidate label Mar 9, 2016

Random-Liu mentioned this pull request Mar 30, 2016

Kubelet sometimes creates a second container when it shouldn't #23636

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kubelet: Always restart container in unknown state. #22607

Kubelet: Always restart container in unknown state. #22607

Random-Liu commented Mar 6, 2016

k8s-github-robot commented Mar 6, 2016

k8s-bot commented Mar 6, 2016

ncdc commented Mar 7, 2016

aveshagarwal Mar 7, 2016

ncdc Mar 7, 2016

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

Random-Liu commented Mar 7, 2016

ncdc commented Mar 7, 2016

Random-Liu commented Mar 7, 2016

yujuhong commented Mar 7, 2016

ncdc commented Mar 7, 2016

yujuhong commented Mar 7, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016

Kubelet: Always restart container in unknown state. #22607

Kubelet: Always restart container in unknown state. #22607

Conversation

Random-Liu commented Mar 6, 2016

k8s-github-robot commented Mar 6, 2016

k8s-bot commented Mar 6, 2016

ncdc commented Mar 7, 2016

aveshagarwal Mar 7, 2016

Choose a reason for hiding this comment

ncdc Mar 7, 2016

Choose a reason for hiding this comment

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

ncdc commented Mar 7, 2016

aveshagarwal commented Mar 7, 2016

ncdc commented Mar 7, 2016

Random-Liu commented Mar 7, 2016

ncdc commented Mar 7, 2016

Random-Liu commented Mar 7, 2016

yujuhong commented Mar 7, 2016

ncdc commented Mar 7, 2016

yujuhong commented Mar 7, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016

k8s-bot commented Mar 8, 2016

k8s-github-robot commented Mar 8, 2016