add Restart=on-failure for inner docker systemd service #9775

Merged Nov 26, 2020 (3 commits)

Conversation

medyagh
Member

@medyagh medyagh commented Nov 24, 2020

closes #9691

Fixes the case where Docker dies and is never restarted. When users then ran minikube docker-env, minikube saw that Docker was stopped and tried to restart it, which caused the API server container to restart as well.

This was partially improved by a previous PR (which changed restart to reload), but that did not fix the actual problem: Docker was dying and nothing was bringing it back to life.

With this PR, systemd automatically starts the Docker service again after it fails, without sending it into an endless restart loop.

In addition to the Restart= setting, I added StartLimitIntervalSec=interval and StartLimitBurst=burst to the systemd unit file to avoid infinite restart loops.

systemd docs https://www.freedesktop.org/software/systemd/man/systemd.unit.html#StartLimitIntervalSec=

StartLimitIntervalSec=interval, StartLimitBurst=burst
Configure unit start rate limiting. Units which are started more than burst times within an interval time interval are not permitted to start any more. Use StartLimitIntervalSec= to configure the checking interval (defaults to DefaultStartLimitIntervalSec= in manager configuration file, set it to 0 to disable any kind of rate limiting). Use StartLimitBurst= to configure how many starts per interval are allowed (defaults to DefaultStartLimitBurst= in manager configuration file). These configuration options are particularly useful in conjunction with the service setting Restart= (see systemd.service(5)); however, they apply to all kinds of starts (including manual), not just those triggered by the Restart= logic. Note that units which are configured for Restart= and which reach the start limit are not attempted to be restarted anymore; however, they may still be restarted manually at a later point, after the interval has passed. From this point on, the restart logic is activated again. Note that systemctl reset-failed will cause the restart rate counter for a service to be flushed, which is useful if the administrator wants to manually start a unit and the start limit interferes with that. Note that this rate-limiting is enforced after any unit condition checks are executed, and hence unit activations with failing conditions do not count towards this rate limit. This setting does not apply to slice, target, device, and scope units, since they are unit types whose activation may either never fail, or may succeed only a single time.

When a unit is unloaded due to the garbage collection logic (see above) its rate limit counters are flushed out too. This means that configuring start rate limiting for a unit that is not referenced continuously has no effect.
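
For readers who want to see how these settings fit together, here is a minimal sketch of a unit fragment. It is not copied from this PR's diff: the numeric values (60 seconds, 3 starts) and the Description text are assumptions chosen for illustration, and the real unit file contains many more directives.

```ini
# Illustrative fragment of a docker.service unit (values are assumptions).
[Unit]
Description=Docker Application Container Engine
# Start rate limiting: once the unit has been started StartLimitBurst times
# within StartLimitIntervalSec, further starts are refused until the interval
# has passed. This applies to manual starts as well as automatic restarts.
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
# Restart the daemon only after an unclean exit (non-zero exit code,
# termination by a signal, or a timeout), not after a clean stop.
Restart=on-failure
```

With a fragment like this, a daemon that keeps crashing is restarted a few times and then left in the failed state instead of looping forever. As the quoted documentation notes, systemctl reset-failed flushes the counter if an administrator needs to start the unit manually after the limit has been reached.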


Link to Docker's recommended settings:

https://github.com/moby/moby/blob/e1b15e1e5bf3a512ac7298ed29fcd4126c45c1ae/contrib/init/systemd/docker.service#L29-L31

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 24, 2020
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 24, 2020
@medyagh
Member Author

medyagh commented Nov 24, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Nov 24, 2020
@kubernetes kubernetes deleted a comment from minikube-pr-bot Nov 24, 2020
@minikube-pr-bot

kvm2 Driver
Times for minikube: 57.9s 56.6s 58.6s
Average time for minikube: 57.7s

Times for Minikube (PR 9775): 59.6s 59.1s 54.2s
Average time for Minikube (PR 9775): 57.6s

Averages Time Per Log

+--------------------------------+----------+--------------------+
|              LOG               | MINIKUBE | MINIKUBE (PR 9775) |
+--------------------------------+----------+--------------------+
| * minikube v1.15.1 on Debian   | 0.1s     | 0.0s               |
|                           9.11 |          |                    |
| * Using the kvm2 driver based  | 0.0s     | 0.0s               |
| on user configuration          |          |                    |
| * Starting control plane node  | 0.0s     | 0.0s               |
| minikube in cluster minikube   |          |                    |
| * Creating kvm2 VM (CPUs=2,    | 34.3s    | 33.8s              |
| Memory=3700MB, Disk=20000MB)   |          |                    |
| ...                            |          |                    |
| * Preparing Kubernetes v1.19.4 | 21.5s    | 21.3s              |
| on Docker 19.03.13 ...         |          |                    |
| * Verifying Kubernetes         | 1.5s     | 1.6s               |
| components...                  |          |                    |
| * Enabled addons:              | 0.2s     | 0.9s               |
| storage-provisioner,           |          |                    |
| default-storageclass           |          |                    |
|                                | 0.0s     | 0.8s               |
|   - Want kubectl v1.19.4? Try  |          |                    |
| 'minikube kubectl -- get pods  |          |                    |
| -A'                            |          |                    |
| * Done! kubectl is now         |          |                    |
| configured to use "minikube"   |          |                    |
| cluster and "default"          |          |                    |
| namespace by default           |          |                    |
+--------------------------------+----------+--------------------+

docker Driver
Times for minikube: 28.9s 29.2s 29.0s
Average time for minikube: 29.0s

Times for Minikube (PR 9775): 28.6s 28.5s 28.5s
Average time for Minikube (PR 9775): 28.5s

Averages Time Per Log

+--------------------------------+----------+--------------------+
|              LOG               | MINIKUBE | MINIKUBE (PR 9775) |
+--------------------------------+----------+--------------------+
| * minikube v1.15.1 on Debian   | 0.2s     | 0.2s               |
|                           9.11 |          |                    |
| * Using the docker driver      | 0.1s     | 0.1s               |
| based on user configuration    |          |                    |
| * Starting control plane node  | 0.1s     | 0.1s               |
| minikube in cluster minikube   |          |                    |
| * Creating docker container    | 8.9s     | 8.9s               |
| (CPUs=2, Memory=3700MB) ...    |          |                    |
| * Preparing Kubernetes v1.19.4 | 18.5s    | 17.9s              |
| on Docker 19.03.13 ...         |          |                    |
| * Verifying Kubernetes         | 1.2s     | 1.2s               |
| components...                  |          |                    |
| * Enabled addons:              | 0.1s     | 0.1s               |
| storage-provisioner,           |          |                    |
| default-storageclass           |          |                    |
|                                | 0.0s     | 0.0s               |
|   - Want kubectl v1.19.4? Try  |          |                    |
| 'minikube kubectl -- get pods  |          |                    |
| -A'                            |          |                    |
| * Done! kubectl is now         |          |                    |
| configured to use "minikube"   |          |                    |
| cluster and "default"          |          |                    |
| namespace by default           |          |                    |
+--------------------------------+----------+--------------------+

@minikube-pr-bot

kvm2 Driver
Times for minikube: 57.1s 55.3s 61.1s
Average time for minikube: 57.9s

Times for Minikube (PR 9775): 56.1s 60.9s 60.1s
Average time for Minikube (PR 9775): 59.0s

Averages Time Per Log

+--------------------------------+----------+--------------------+
|              LOG               | MINIKUBE | MINIKUBE (PR 9775) |
+--------------------------------+----------+--------------------+
| * minikube v1.15.1 on Debian   | 0.1s     | 0.1s               |
|                           9.11 |          |                    |
| * Using the kvm2 driver based  | 0.0s     | 0.0s               |
| on user configuration          |          |                    |
| * Starting control plane node  | 0.0s     | 0.0s               |
| minikube in cluster minikube   |          |                    |
| * Creating kvm2 VM (CPUs=2,    | 33.3s    | 34.2s              |
| Memory=3700MB, Disk=20000MB)   |          |                    |
| ...                            |          |                    |
| * Preparing Kubernetes v1.19.4 | 22.6s    | 22.8s              |
| on Docker 19.03.13 ...         |          |                    |
| * Verifying Kubernetes         | 1.5s     | 1.5s               |
| components...                  |          |                    |
| * Enabled addons:              | 0.3s     | 0.4s               |
| storage-provisioner,           |          |                    |
| default-storageclass           |          |                    |
|                                | 0.0s     | 0.5s               |
|   - Want kubectl v1.19.4? Try  |          |                    |
| 'minikube kubectl -- get pods  |          |                    |
| -A'                            |          |                    |
| * Done! kubectl is now         |          |                    |
| configured to use "minikube"   |          |                    |
| cluster and "default"          |          |                    |
| namespace by default           |          |                    |
+--------------------------------+----------+--------------------+

docker Driver
Times for minikube: 28.8s 28.9s 29.3s
Average time for minikube: 29.0s

Times for Minikube (PR 9775): 28.3s 29.8s 28.5s
Average time for Minikube (PR 9775): 28.9s

Averages Time Per Log

+--------------------------------+----------+--------------------+
|              LOG               | MINIKUBE | MINIKUBE (PR 9775) |
+--------------------------------+----------+--------------------+
| * minikube v1.15.1 on Debian   | 0.2s     | 0.2s               |
|                           9.11 |          |                    |
| * Using the docker driver      | 0.1s     | 0.1s               |
| based on user configuration    |          |                    |
| * Starting control plane node  | 0.1s     | 0.1s               |
| minikube in cluster minikube   |          |                    |
| * Creating docker container    | 9.2s     | 9.2s               |
| (CPUs=2, Memory=3700MB) ...    |          |                    |
| * Preparing Kubernetes v1.19.4 | 18.2s    | 18.0s              |
| on Docker 19.03.13 ...         |          |                    |
| * Verifying Kubernetes         | 1.2s     | 1.2s               |
| components...                  |          |                    |
| * Enabled addons:              | 0.1s     | 0.1s               |
| storage-provisioner,           |          |                    |
| default-storageclass           |          |                    |
|                                | 0.2s     | 0.0s               |
|   - Want kubectl v1.19.4? Try  |          |                    |
| 'minikube kubectl -- get pods  |          |                    |
| -A'                            |          |                    |
| * Done! kubectl is now         |          |                    |
| configured to use "minikube"   |          |                    |
| cluster and "default"          |          |                    |
| namespace by default           |          |                    |
+--------------------------------+----------+--------------------+

@medyagh
Member Author

medyagh commented Nov 25, 2020

/retest-this-please

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, sharifelgamal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [medyagh,sharifelgamal]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@medyagh medyagh merged commit 322e138 into kubernetes:master Nov 26, 2020
@medyagh medyagh changed the title from "restart inner docker daemon using systemd" to "add Restart=on-failure for inner docker systemd service" Nov 26, 2020
@medyagh medyagh deleted the restart_docker_systemd branch March 2, 2021 21:37
Development

Successfully merging this pull request may close these issues.

Flake Fail: TestFunctional/parallel/DockerEnv
4 participants