TTL controller for cleaning up finished resources
janetkuo committed Sep 7, 2018
1 parent 1ea8f96 commit c5f7c7a
Showing 3 changed files with 127 additions and 0 deletions.
@@ -247,6 +247,51 @@ spec:
Note that both the Job Spec and the [Pod Template Spec](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#detailed-behavior) within the Job have an `activeDeadlineSeconds` field. Ensure that you set this field at the proper level.

## Clean Up Finished Jobs Automatically

Finished Jobs are usually no longer needed in the system. Keeping them around
puts pressure on the API server. If the Jobs are managed directly by
a higher level controller, such as
[CronJobs](/docs/concepts/workloads/controllers/cron-jobs/), the Jobs can be
cleaned up by CronJobs based on the specified cleanup policy.

Another way to clean up finished Jobs (either `Complete` or `Failed`)
automatically is to use the TTL mechanism provided by the
[TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) for
finished resources: specify the `.spec.ttlSecondsAfterFinished` field of
the Job.

For example:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  # The Job becomes eligible for deletion 100 seconds after it finishes.
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

The Job `pi-with-ttl` becomes eligible for automatic deletion `100`
seconds after it finishes. Note that when the Job is deleted, its lifecycle
guarantees, such as finalizers, are honored.

If the field is set to `0`, the Job becomes eligible for automatic deletion
immediately after it finishes. If the field is unset, the Job won't be cleaned
up by the TTL controller after it finishes.
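
The three cases above (a positive TTL, `0`, and an unset field) can be sketched as a small eligibility check. This is only an illustration of the documented semantics, not the controller's actual code; the function names `ttl_expiry` and `is_eligible_for_cleanup` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def ttl_expiry(finished_at: Optional[datetime],
               ttl_seconds: Optional[int]) -> Optional[datetime]:
    """Return when a Job becomes eligible for deletion, or None if it never does.

    finished_at is None while the Job is still running; ttl_seconds is None
    when .spec.ttlSecondsAfterFinished is unset.
    """
    if finished_at is None or ttl_seconds is None:
        return None
    return finished_at + timedelta(seconds=ttl_seconds)


def is_eligible_for_cleanup(finished_at: Optional[datetime],
                            ttl_seconds: Optional[int],
                            now: datetime) -> bool:
    """True once the TTL has expired (assumed here to include the exact expiry instant)."""
    expiry = ttl_expiry(finished_at, ttl_seconds)
    return expiry is not None and now >= expiry


finished = datetime(2018, 9, 7, 12, 0, 0, tzinfo=timezone.utc)

# TTL of 100 seconds: not eligible at 99s, eligible at 100s.
print(is_eligible_for_cleanup(finished, 100, finished + timedelta(seconds=99)))
print(is_eligible_for_cleanup(finished, 100, finished + timedelta(seconds=100)))
# TTL of 0: eligible as soon as the Job finishes.
print(is_eligible_for_cleanup(finished, 0, finished))
# Unset TTL: never eligible, no matter how much time passes.
print(is_eligible_for_cleanup(finished, None, finished + timedelta(days=365)))
```

Eligibility only means the TTL controller may delete the Job; the deletion itself still goes through the normal lifecycle, including finalizers.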

Note that this TTL mechanism is alpha, with feature gate `TTLAfterFinished`. For
more information, see the documentation for
[TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) for
finished resources.

## Job Patterns

80 changes: 80 additions & 0 deletions content/en/docs/concepts/workloads/controllers/ttlafterfinished.md
@@ -0,0 +1,80 @@
---
reviewers:
- janetkuo
title: TTL Controller for Finished Resources
content_template: templates/concept
weight: ??
---

{{% capture overview %}}

The TTL controller provides a TTL mechanism to limit the lifetime of resource
objects that have finished execution. The TTL controller currently handles only
[Jobs](/docs/concepts/workloads/controllers/jobs-run-to-completion/),
and may be expanded to handle other resources that will finish execution,
such as Pods and custom resources.

Alpha Disclaimer: this feature is currently alpha, and can be enabled with
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`TTLAfterFinished`.


{{% /capture %}}


{{< toc >}}


{{% capture body %}}

## TTL Controller

The TTL controller only supports Jobs for now. You can use this feature to clean
up finished Jobs (either `Complete` or `Failed`) automatically by specifying the
`.spec.ttlSecondsAfterFinished` field of a Job, as in this
[example](/docs/concepts/workloads/controllers/jobs-run-to-completion/#clean-up-finished-jobs-automatically).
The TTL controller assumes that a resource is eligible to be cleaned up
TTL seconds after the resource has finished, that is, once the TTL has expired.
When the resource is deleted, its lifecycle guarantees, such as finalizers, are
honored.

The TTL seconds can be set at any time -- for example, you can specify it in the
resource manifest, set it at resource creation time, or set it after the
resource has finished. You can also use
[mutating admission webhooks](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
to set this field dynamically.
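
As a rough sketch of the webhook option, the function below builds an admission response that adds a default TTL to Jobs that do not set one. The function name `build_ttl_admission_response` and the default of 3600 seconds are hypothetical, and the sketch assumes the incoming `AdmissionReview` carries the Job under `request.object` and that the response may return a base64-encoded JSON Patch; check the admission webhook documentation for the exact schema of your cluster version.

```python
import base64
import json


def build_ttl_admission_response(review: dict, default_ttl: int = 3600) -> dict:
    """Return an AdmissionReview response that defaults ttlSecondsAfterFinished.

    Jobs that already set the field (including an explicit 0) are left untouched.
    """
    request = review["request"]
    job_spec = request["object"].get("spec", {})
    response = {"uid": request["uid"], "allowed": True}

    if "ttlSecondsAfterFinished" not in job_spec:
        # JSON Patch (RFC 6902) operation that adds the missing field.
        patch = [{
            "op": "add",
            "path": "/spec/ttlSecondsAfterFinished",
            "value": default_ttl,
        }]
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(
            json.dumps(patch).encode("utf-8")).decode("ascii")

    # Echo back whatever API version the request used (an assumption of this sketch).
    return {
        "apiVersion": review.get("apiVersion"),
        "kind": "AdmissionReview",
        "response": response,
    }
```

Checking for the key rather than for a truthy value matters here: a Job that explicitly sets the field to `0` asks for immediate cleanup and should not be overridden.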

In the future, we plan to expand the TTL controller to other resources that will
finish execution, such as Pods and custom resources.

### Caveat

#### Updating TTL Seconds

Note that the TTL period, e.g. the `.spec.ttlSecondsAfterFinished` field of Jobs,
can be modified after the resource is created or has finished. However, once the
Job becomes eligible to be deleted (i.e. the TTL has expired), the system won't
guarantee that the Job will be kept, even if an update to extend the TTL
returns a successful API response.
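
One way to read this caveat is that an extension is only dependable while the current TTL has not yet expired. The helper below is a hypothetical client-side guard expressing that rule; it is not part of any Kubernetes API, and it relies on the caller's own clock, so it is subject to the same time skew discussed in the next section.

```python
from datetime import datetime, timedelta, timezone


def can_safely_extend_ttl(finished_at: datetime,
                          current_ttl_seconds: int,
                          now: datetime) -> bool:
    """Return True only while the current TTL has not yet expired.

    After expiry the Job may already be queued for deletion, so a later
    update to extend the TTL might not keep it around.
    """
    return now < finished_at + timedelta(seconds=current_ttl_seconds)


finished = datetime(2018, 9, 7, 12, 0, 0, tzinfo=timezone.utc)

# 40 seconds into a 100-second TTL: extending is still dependable.
print(can_safely_extend_ttl(finished, 100, finished + timedelta(seconds=40)))
# 100 seconds in: the TTL has expired, so an extension may come too late.
print(can_safely_extend_ttl(finished, 100, finished + timedelta(seconds=100)))
```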

#### Time Skew

Because the TTL controller uses timestamps stored in the Kubernetes resources to
determine whether the TTL has expired or not, this feature is sensitive to time
skew in the cluster, which may cause the TTL controller to clean up resource
objects at the wrong time.

In Kubernetes, it's required to run NTP on all nodes
(see [#6159](https://github.com/kubernetes/kubernetes/issues/6159#issuecomment-93844058))
to avoid time skew. Clocks aren't always correct, but the difference should be
very small. Please be aware of this risk when setting a non-zero TTL.
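
To make the risk concrete, the sketch below models a controller whose clock runs ahead of true time. It is a simplified, hypothetical model: it assumes the finish timestamp stored in the resource is accurate and that only the controller's clock is skewed.

```python
from datetime import datetime, timedelta, timezone


def controller_sees_expired(finished_at: datetime,
                            ttl_seconds: int,
                            true_now: datetime,
                            controller_skew: timedelta) -> bool:
    """Compare the controller's (possibly skewed) clock against the stored timestamp."""
    controller_now = true_now + controller_skew
    return controller_now >= finished_at + timedelta(seconds=ttl_seconds)


finished = datetime(2018, 9, 7, 12, 0, 0, tzinfo=timezone.utc)
# Only 70 seconds of a 100-second TTL have actually elapsed.
true_now = finished + timedelta(seconds=70)

# Accurate clock: the TTL has not expired yet.
print(controller_sees_expired(finished, 100, true_now, timedelta(0)))
# Clock 30 seconds ahead: the controller already treats the TTL as expired.
print(controller_sees_expired(finished, 100, true_now, timedelta(seconds=30)))
```

In this model, a clock that is ahead by some amount shortens the effective retention by that amount, and a clock that is behind lengthens it.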

{{% /capture %}}

{{% capture whatsnext %}}

[Clean up Jobs automatically](/docs/concepts/workloads/controllers/jobs-run-to-completion/#clean-up-finished-jobs-automatically)

[Design doc](https://github.com/kubernetes/community/blob/master/keps/sig-apps/0026-ttl-after-finish.md)

{{% /capture %}}
@@ -102,6 +102,7 @@ different Kubernetes components.
| `TaintNodesByCondition` | `false` | Alpha | 1.8 | |
| `TokenRequest` | `false` | Alpha | 1.10 | |
| `TokenRequestProjection` | `false` | Alpha | 1.11 | |
| `TTLAfterFinished` | `false` | Alpha | 1.12 | |
| `VolumeScheduling` | `false` | Alpha | 1.9 | 1.9 |
| `VolumeScheduling` | `true` | Beta | 1.10 | |
| `VolumeSubpathEnvExpansion` | `false` | Alpha | 1.11 | |
@@ -246,6 +247,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `TokenRequest`: Enable the `TokenRequest` endpoint on service account resources.
- `TokenRequestProjection`: Enable the injection of service account tokens into
a Pod through the [`projected` volume](/docs/concepts/storage/volumes/#projected).
- `TTLAfterFinished`: Allow a [TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) to clean up resources after they finish execution.
- `VolumeScheduling`: Enable volume topology aware scheduling and make the
PersistentVolumeClaim (PVC) binding aware of scheduling decisions. It also
enables the usage of [`local`](/docs/concepts/storage/volumes/#local) volume