Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubelet: add image GC threshold settings #2219

Merged

Conversation

mchaker
Copy link
Contributor

@mchaker mchaker commented Jun 15, 2022

Issue number:
Closes #2216
Closes #2217
Closes #2065

Description of changes:
Adds two kubelet settings: imageGCHighThresholdPercent and imageGCLowThresholdPercent.
Also adds validation for those two settings.

From the kubernetes repo (https://pkg.go.dev/k8s.io/kubernetes/pkg/kubelet/apis/config#KubeletConfiguration):

// imageGCHighThresholdPercent is the percent of disk usage after which
// image garbage collection is always run. The percent is calculated as
// this field value out of 100.

// imageGCLowThresholdPercent is the percent of disk usage before which
// image garbage collection is never run. Lowest disk usage to garbage
// collect to. The percent is calculated as this field value out of 100.

Some Q&A:

1. Why validate twice: at the apiserver level, and the template rendering level?
1. There are two kinds of invalidity for the settings this PR adds: independent invalidity and interdependent invalidity. Independent invalidity would be a value outside of the defined 0-100 threshold. Interdependent invalidity is when the High threshold percent is lower than the Low threshold percent, and is only calculable by comparing (or knowing) both settings at the same time.
1. The independent invalidity can and should be checked as soon as possible and stop the user from entering invalid values. That kind of validation can and is done at the apiserver level.
1. The interdependent invalidity is only able to be checked when the template is about to be rendered -- when both setting values are known (since the user can provide only one of the two new settings in an apiserver request, and these settings also have default kubelet values to initially compare against).
1. Why does your template code contain ~ characters at each end of the helper function call?
1. That is a handlebars feature that removes whitespace (including newlines) before or after that ~ character.
1. In the case of this PR, the helper function image_gc_threshold_percent actually handles the rendering of necessary whitespace after writing out the setting values (specifically: the newlines after each setting), so the template rendering engine/handlebars should remove the existing newline it sees at the end of the helper function call (after the }} at the end of the line). Therefore, the ~ feature of handlebars was used.

Testing done:

  • Upgrade/downgrade for variant: aws-k8s-1.22
  • [ ] Upgrade/downgrade for variant: metal 1.22
  • Rendering and setting updates for variant: vmware 1.22
  • Manual testing of valid and invalid values, in varying order, on a bottlerocket aws-k8s-1.22 node in a k8s 1.22 cluster

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 962b267 to 6afa956 Compare June 24, 2022 19:50
@mchaker
Copy link
Contributor Author

mchaker commented Jun 24, 2022

Latest force push: consolidate branch history. Summary of changes below.

apiserver and schnauzer: Add validation for new kubelet settings imageGCHighThresholdPercent and imageGCLowThresholdPercent.

Description

  • apiserver: add new setting types to the model with basic apiserver-level validation
  • schnauzer: add template rules and more in-depth validation for new settings

What settings?

in the kubelet-config template:

  • imageGCHighThresholdPercent and imageGCLowThresholdPercent

How and why

Add value validation to both the apiserver and template (via schnauzer), since these settings are interdependently validated. Meaning, since the High Image GC Threshold Percent must be greater than the Low Image GC Threshold Percent, both settings must be checked against each other. That is not possible at the apiserver level, or at least is much easier to do in the template rendering level (once all the values from the datastore are ready to be written).

@mchaker mchaker marked this pull request as ready for review June 24, 2022 20:27
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from d7286b0 to ec2699a Compare June 27, 2022 22:55
@mchaker mchaker requested review from webern and etungsten and removed request for webern June 28, 2022 16:49
packages/kubernetes-1.19/kubelet-config Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from ec2699a to 671cb43 Compare June 30, 2022 21:41
@mchaker
Copy link
Contributor Author

mchaker commented Jun 30, 2022

Latest force push: remove setting validation from template rendering layer. Remove helper function in template rendering layer. Remove interdependent validation entirely, but keep independent validation. Clarify kubelet config template (leave setting names and defaults in the template, instead of in Rust code).

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 671cb43 to ad277a7 Compare June 30, 2022 21:55
@mchaker
Copy link
Contributor Author

mchaker commented Jun 30, 2022

Latest force push: remove remaining unused constants.

sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/kubernetes.rs Outdated Show resolved Hide resolved
@bcressey bcressey self-requested a review July 6, 2022 20:55
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from ad277a7 to fce4d67 Compare July 6, 2022 21:28
@mchaker
Copy link
Contributor Author

mchaker commented Jul 6, 2022

Latest force push: clean up overkill tests (remove unnecessary for loop), remove unnecessary comments, squashed commits to 1 commit, and chose to keep the {{#if}} path of settings validation instead of also using the default helper function

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from fce4d67 to 112dc75 Compare July 6, 2022 22:18
@mchaker
Copy link
Contributor Author

mchaker commented Jul 6, 2022

Latest force push: fix commit message

@bcressey
Copy link
Contributor

bcressey commented Jul 7, 2022

Code change LGTM.

For writing commit messages, I tend to roughly follow this guide with some minor changes:
https://cbea.ms/git-commit/

Minor changes:

  • I don't capitalize the subject
  • I start with a "subsystem: blah" tag unless the change is very cross-cutting

So for this commit I would say:

kubelet: add image GC threshold settings

And combine your current subject and body into a new body wrapped at 70 characters.

@mchaker mchaker changed the title Add kubelet config options: imageGCHighThresholdPercent, imageGCLowThresholdPercent kubelet: add image GC threshold settings Jul 7, 2022
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 112dc75 to f9a31af Compare July 7, 2022 15:19
@mchaker
Copy link
Contributor Author

mchaker commented Jul 7, 2022

Code change LGTM.

For writing commit messages, I tend to roughly follow this guide with some minor changes: https://cbea.ms/git-commit/

Minor changes:

  • I don't capitalize the subject
  • I start with a "subsystem: blah" tag unless the change is very cross-cutting

So for this commit I would say:

kubelet: add image GC threshold settings

And combine your current subject and body into a new body wrapped at 70 characters.

Thank you @bcressey ! I've clarified the commit message and PR title.

Copy link
Contributor

@etungsten etungsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please update the PR description to reflect the latest state of the PR? Some parts of it is out of date.

sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
sources/api/schnauzer/src/helpers.rs Outdated Show resolved Hide resolved
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch 3 times, most recently from 6525d31 to d398b10 Compare July 11, 2022 05:51
@mchaker
Copy link
Contributor Author

mchaker commented Jul 11, 2022

Last 3 git force-pushes: added README documentation for the new kubernetes settings and updated Release.toml to include the latest migration.

README.md Outdated Show resolved Hide resolved
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 28b839a to 0af4fa5 Compare July 11, 2022 17:35
@mchaker
Copy link
Contributor Author

mchaker commented Jul 11, 2022

Latest force push: remove linter formatting-on-save unrelated changes

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch 2 times, most recently from 0787a72 to 1c5f9d5 Compare July 11, 2022 21:31
@mchaker
Copy link
Contributor Author

mchaker commented Jul 11, 2022

Latest force push: Fixed README.md getting over-tuned by my linter. Also recovered my PR from having too many commits squashed into it accidentally.

Copy link
Contributor

@etungsten etungsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍨

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 1c5f9d5 to 985adcb Compare July 12, 2022 17:58
@mchaker
Copy link
Contributor Author

mchaker commented Jul 12, 2022

Latest force push: had to rebase my branch/PR due to merge conflicts with develop/HEAD

README.md Outdated
@@ -411,6 +411,8 @@ The following settings are optional and allow you to further configure your clus
* `settings.kubernetes.topology-manager-policy`: Specifies the topology manager policy. Possible values are `none`, `restricted`, `best-effort`, and `single-numa-node`. Defaults to `none`.
* `settings.kubernetes.topology-manager-scope`: Specifies the topology manager scope. Possible values are `container` and `pod`. Defaults to `container`. If you want to group all containers in a pod to a common set of NUMA nodes, you can set this setting to `pod`.
* `settings.kubernetes.pod-pids-limit`: The maximum number of processes per pod.
* `settings.kubernetes.image-gc-high-threshold-percent`: The disk utilization percentage where `kubelet` begins garbage collection over the stored container images. In short, this is the "disk is too full" mark.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't sound correct; my reading of the documentation is that the garbage collector will run less frequently - not always, but not never - if disk usage percentage falls between high and low.

I would also avoid phrases like "In short". Maybe just strike the second sentence from each since you're introducing an abstract concept - whether or not the disk is too full or empty enough - to what would otherwise be a precise description of how the setting controls the kubelet garbage collector.

Copy link
Contributor Author

@mchaker mchaker Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct.

EDIT: Will use the following, as per the kubelet config documentation:

settings.kubernetes.image-gc-high-threshold-percent: the percent of disk usage after which image garbage collection is always run.

settings.kubernetes.image-gc-low-threshold-percent: the percent of disk usage before which image garbage collection is never run.

@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 985adcb to 37ee773 Compare July 13, 2022 00:05
@mchaker
Copy link
Contributor Author

mchaker commented Jul 13, 2022

Latest force push: make README documentation accurate to kubelet behavior as per its documentation

Add kubelet config options: `imageGCHighThresholdPercent`,
`imageGCLowThresholdPercent` and validation for both
settings to both the apiserver and template rendering code.
@mchaker mchaker force-pushed the mchaker/2022/06/eks-gc-thresholds branch from 37ee773 to 30e6f6c Compare July 14, 2022 18:34
@mchaker
Copy link
Contributor Author

mchaker commented Jul 14, 2022

Latest force push: merge in the latest from develop

@mchaker mchaker merged commit 4fb1e04 into bottlerocket-os:develop Jul 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants