
KEP-24: Promote AppArmor to GA #4417

Merged (2 commits) on Feb 8, 2024

Conversation

tallclair (Member):

Add a retroactive KEP for AppArmor, and GA promotion criteria

Supersedes #3298

/sig node
/milestone v1.30
/cc @dchen1107

@k8s-ci-robot added this to the v1.30 milestone Jan 19, 2024
@k8s-ci-robot added the sig/node, cncf-cla: yes, kind/kep, and size/XL labels Jan 19, 2024

#### Pod Creation

If no AppArmor annotations or fields are specified, no action is necessary.
Member:

Do we keep the annotation support virtually forever?

Contributor:

I like the idea of migrating the annotations and then removing them.

💭 This could even cover a Pod that has been happily running since v1.8 or something and the node is still there accumulating uptime. Maybe all of the containers have been restarted over the years and none of the original code is running, but it's still the same Pod and the AppArmor profile still constrains it.

Member Author:

Here is the discussion we had for seccomp annotations: kubernetes/kubernetes#109819 (comment). I think we should follow the same path here. I'll add a section for it.

> This could even cover a Pod that has been happily running since v1.8 or something and the node is still there accumulating uptime. Maybe all of the containers have been restarted over the years and none of the original code is running, but it's still the same Pod and the AppArmor profile still constrains it.

I thought we don't support in-place kubelet upgrades, so we don't need to worry about this case?
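
A minimal sketch of the annotation-to-field migration discussed above, assuming the legacy annotation key `container.apparmor.security.beta.kubernetes.io/<container>` and its value formats ("runtime/default", "localhost/<name>", "unconfined"); the types here are simplified stand-ins, not the real core/v1 API:

```go
package main

import (
	"fmt"
	"strings"
)

// AppArmorProfile is a simplified stand-in for the proposed field type,
// assumed to mirror the seccomp API shape as the KEP proposes.
type AppArmorProfile struct {
	Type             string  // "RuntimeDefault", "Localhost", or "Unconfined"
	LocalhostProfile *string // profile name, only for type "Localhost"
}

const annotationPrefix = "container.apparmor.security.beta.kubernetes.io/"

// profileFromAnnotation converts a legacy annotation value into a field value.
func profileFromAnnotation(v string) *AppArmorProfile {
	switch {
	case v == "runtime/default":
		return &AppArmorProfile{Type: "RuntimeDefault"}
	case v == "unconfined":
		return &AppArmorProfile{Type: "Unconfined"}
	case strings.HasPrefix(v, "localhost/"):
		name := strings.TrimPrefix(v, "localhost/")
		return &AppArmorProfile{Type: "Localhost", LocalhostProfile: &name}
	}
	return nil // unknown value: leave the field unset
}

// syncAnnotationToField fills a container's field from its annotation on pod
// create, when no field was specified.
func syncAnnotationToField(annotations map[string]string, containerName string, field *AppArmorProfile) *AppArmorProfile {
	if field != nil {
		return field // an explicit field always wins
	}
	if v, ok := annotations[annotationPrefix+containerName]; ok {
		return profileFromAnnotation(v)
	}
	return nil
}

func main() {
	ann := map[string]string{annotationPrefix + "app": "localhost/my-profile"}
	p := syncAnnotationToField(ann, "app", nil)
	fmt.Println(p.Type, *p.LocalhostProfile) // Localhost my-profile
}
```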

https://github.com/kubernetes/kubernetes/blob/5c664599a9a80f01c30d924702f7ad799f7dd956/pkg/kubelet/nodestatus/setters.go#L530-L536

The ready condition is certainly not the right place for this message, but more generally the
Kubelet does not broadcast the status of every optional feature. For GA, this message will simply be
Member:

Need to make sure to update the release notes in case there are any dependencies on this.

If built-in support for multiple runtime profiles is needed in the future, a new
KEP will be created to cover its details.

#### Validation
Member:

The KEP doesn't clearly explain what it means for an AppArmor profile to be applied. The check for whether AppArmor is enabled is quite simple and doesn't account for the AppArmor service actually running. I actually don't know what happens if someone disables the service. Will the profile still take effect? I wonder what guarantees a Kubernetes node can give about whether AppArmor is working or not.

Member Author:

I'm not sure exactly what you mean by this? The apparmor.IsEnabled() check checks whether the kernel module is loaded. We used to try to do this with the pod status, but I think if a user really wants to guarantee that AppArmor is being enforced, they're better off using a third-party monitoring service that will actually confirm the process status (e.g. via /proc/$PID/attr/current).
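
For reference, a minimal sketch of what such a third-party check could look like, reading a process's AppArmor confinement from procfs (illustrative only; the pid is a demo input, and the file only exists on AppArmor-enabled Linux kernels):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// appArmorContextOf reads the AppArmor confinement of a process from procfs.
// The file contains e.g. "my-profile (enforce)" or "unconfined".
func appArmorContextOf(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/attr/current", pid))
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(data)), nil
}

func main() {
	// Demo: inspect this process itself; a real monitor would walk container PIDs.
	ctxt, err := appArmorContextOf(os.Getpid())
	if err != nil {
		fmt.Println("could not read AppArmor status:", err)
		return
	}
	fmt.Println("AppArmor context:", ctxt)
}
```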

Member:

It may be a good idea to add this note, so there is no expectation that Kubernetes will somehow guarantee that AppArmor was not turned off and is still being applied.

// LocalhostProfile indicates a loaded profile on the node that should be used.
// The profile must be preconfigured on the node to work.
// Must match the loaded name of the profile.
// Must only be set if type is "Localhost".
Member:

I wonder if kubernetes/kubernetes#117020 applies here equally? Sorry, I didn't check; maybe you can answer off the top of your head.

Also, the wording may be improved to say "must be set when type is Localhost and must not be set in any other case".

Member Author:

Seccomp already applies validation that LocalhostProfile is set IFF the type is Localhost: https://github.com/kubernetes/kubernetes/blob/0f817dcd6577b663f6cb22cd5080f14138e8b6ac/pkg/apis/core/validation/validation.go#L4612-L4622

I'm not sure the validation tightening was handled correctly there, but it's been merged since 1.19 at this point.
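
A minimal sketch of the equivalent IFF rule for AppArmor, modeled on the linked seccomp validation (illustrative only; the types are simplified stand-ins, not the real validation code):

```go
package main

import "fmt"

// Simplified stand-ins for the real API types.
type AppArmorProfileType string

const (
	AppArmorProfileTypeLocalhost      AppArmorProfileType = "Localhost"
	AppArmorProfileTypeRuntimeDefault AppArmorProfileType = "RuntimeDefault"
	AppArmorProfileTypeUnconfined     AppArmorProfileType = "Unconfined"
)

type AppArmorProfile struct {
	Type             AppArmorProfileType
	LocalhostProfile *string
}

// validateAppArmorProfile enforces that LocalhostProfile is set if and only
// if the type is Localhost, mirroring the seccomp validation linked above.
func validateAppArmorProfile(p *AppArmorProfile) []error {
	var errs []error
	if p.Type == AppArmorProfileTypeLocalhost {
		if p.LocalhostProfile == nil || *p.LocalhostProfile == "" {
			errs = append(errs, fmt.Errorf("localhostProfile: required when type is Localhost"))
		}
	} else if p.LocalhostProfile != nil {
		errs = append(errs, fmt.Errorf("localhostProfile: must not be set when type is %q", p.Type))
	}
	return errs
}

func main() {
	name := "my-profile"
	fmt.Println(validateAppArmorProfile(&AppArmorProfile{Type: AppArmorProfileTypeLocalhost, LocalhostProfile: &name}))  // [] (valid)
	fmt.Println(validateAppArmorProfile(&AppArmorProfile{Type: AppArmorProfileTypeUnconfined, LocalhostProfile: &name})) // error
}
```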

Member:

I believe in the seccomp case it was an actual (minor) security problem, which is why the IFF was added.

@SergeyKanzhelev mentioned this pull request Jan 23, 2024
@sftim (Contributor) left a comment:

This might seem weird, but I recommend splitting the PR into 2 commits:

  • add the actual KEP as beta
  • a small second commit to mark it as ready to graduate to GA

Comment on lines 119 to 121
as compile-time feature which may be disabled as well. With the GA API the error handling will be
unchanged, and behave exactly the same as the current error propagation paths (see
[Failure and Fallback Strategy](#failure-and-fallback-strategy)).
Contributor:

The KEP should describe the difference between the world before the enhancement and the GA outcome we intend to deliver.


## Alternatives

N/A
Contributor:

My opinion: we could have a beta release that switches AppArmor to use API fields for Pod, and do the graduation to GA in the minor release after that.

Jumping in one go gives people no option to opt out of a field that wasn't there at all in the previous release.

I suggest we document this more careful approach so that it's clear what we ruled out and why.

Member Author:

Is the only difference with your suggestion that we'd make all the changes proposed here, but the feature gate would still be labeled as Beta? Then next release the only change is to update the feature gate status? Why does that separation matter?

Contributor:

You can disable beta feature gates, and you can't disable GA feature gates. It'd give people a chance to discover whether the new field is problematic and opt out, without forcing them to downgrade if it is.

@tallclair added the tide/merge-method-squash label Jan 27, 2024
@tallclair (Member Author) commented Jan 27, 2024:

@sftim thanks for the review. There actually is a design doc that predates the KEP process: https://github.com/kubernetes/design-proposals-archive/blob/main/auth/apparmor.md. Given that, do you think we still need to separate out a retrospective KEP commit at this point?

@sftim (Contributor) commented Jan 28, 2024:

> @sftim thanks for the review. There actually is a design doc that predates the KEP process: https://github.com/kubernetes/design-proposals-archive/blob/main/auth/apparmor.md. Given that, do you think we still need to separate out a retrospective KEP commit at this point?

I think so - if we drop the graduation to stable for the next release, it makes a revert of just that part much easier.

@tallclair (Member Author):

OK, I added a retroactive version as the first commit.

@tallclair removed the tide/merge-method-squash label Jan 30, 2024
@dchen1107 (Member):

/approve to trigger PRR approval

@dchen1107 added the lead-opted-in label Feb 6, 2024
@BenTheElder (Member):

/assign
[PRR Shadow]

@BenTheElder (Member) left a comment:

[PRR Shadow]
PRR LGTM

planned at this moment. The way to determine usage is by checking whether the
pods/containers have an AppArmorProfile set.

###### How can someone using this feature know that it is working for their instance?
Member:

I guess maybe also not AppArmor itself, but the field <-> annotation bit?

Member Author:

Sorry, I don't quite understand this. Do you mean how can they be sure that the pod had the correct API bits set? As in, get the pod yaml and confirm the annotation & fields match?


The annotation validations will be carried over to the field API, and the following additional
validations are proposed:
1. Fields must match the corresponding annotations when both are present, except for ephemeral containers.
Contributor:

Just checking my understanding: when annotations are set, the fields of non-ephemeral containers must match the annotations, but the fields of ephemeral containers may be set to any desired values?

Member Author:

Yes. This is due to the fact that the annotations are currently immutable, and that restriction wasn't relaxed when ephemeral containers were introduced. The fields will still be immutable, but when an ephemeral container is added, it may set an AppArmor profile on its security context. In this case, the annotations won't be synced.
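
A minimal sketch of validation rule 1 under those constraints (illustrative only; it assumes the legacy annotation value formats and uses a simplified stand-in for the field type):

```go
package main

import "fmt"

// Simplified stand-in for the real field type.
type AppArmorProfile struct {
	Type             string // "RuntimeDefault", "Localhost", or "Unconfined"
	LocalhostProfile *string
}

// annotationValue renders a field as the legacy annotation value
// ("runtime/default", "localhost/<name>", "unconfined").
func annotationValue(p *AppArmorProfile) string {
	switch p.Type {
	case "RuntimeDefault":
		return "runtime/default"
	case "Unconfined":
		return "unconfined"
	case "Localhost":
		return "localhost/" + *p.LocalhostProfile
	}
	return ""
}

// validateFieldMatchesAnnotation enforces proposed rule 1: when both the
// field and the annotation are present they must match, except for
// ephemeral containers, whose immutable annotations are never synced.
func validateFieldMatchesAnnotation(annotations map[string]string, containerName string, field *AppArmorProfile, ephemeral bool) error {
	if ephemeral || field == nil {
		return nil
	}
	key := "container.apparmor.security.beta.kubernetes.io/" + containerName
	want, ok := annotations[key]
	if !ok {
		return nil
	}
	if got := annotationValue(field); got != want {
		return fmt.Errorf("container %q: field %q does not match annotation %q", containerName, got, want)
	}
	return nil
}

func main() {
	ann := map[string]string{"container.apparmor.security.beta.kubernetes.io/app": "runtime/default"}
	field := &AppArmorProfile{Type: "Unconfined"}
	fmt.Println(validateFieldMatchesAnnotation(ann, "app", field, false)) // mismatch error
	fmt.Println(validateFieldMatchesAnnotation(ann, "app", field, true))  // nil: ephemeral containers are exempt
}
```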


The feature is built into the kubelet and api server components. No metric is
planned at this moment. The way to determine usage is by checking whether the
pods/containers have an AppArmorProfile set.
Contributor:

I think it's okay not to have a metric. I don't see a way to check this with kube-state-metrics either (but maybe I missed it?)

Is there anything logged by the kubelet an admin could check?

Member Author:

I'm not aware of anything. I could add it, but I don't think we make a habit of logging every pod security setting, so it seems sort of out-of-place to do so.

I think the right way to handle this as an admin is: if you just care about configuration, you can read that from the pod configuration. If you want to check the actual runtime configuration, use a third-party monitoring program that looks at the process state.
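
As a sketch of the configuration-side check (illustrative only; it assumes the GA securityContext.appArmorProfile fields available in k8s.io/api v0.30+ and standard out-of-cluster client-go setup), an admin could list which pods declare an AppArmor profile:

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Standard client setup from the default kubeconfig.
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := clientset.CoreV1().Pods(metav1.NamespaceAll).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, pod := range pods.Items {
		// Pod-level profile (applies to all containers unless overridden).
		if sc := pod.Spec.SecurityContext; sc != nil && sc.AppArmorProfile != nil {
			fmt.Printf("%s/%s: pod-level AppArmor profile %q\n", pod.Namespace, pod.Name, sc.AppArmorProfile.Type)
		}
		// Container-level overrides.
		for _, c := range pod.Spec.Containers {
			if sc := c.SecurityContext; sc != nil && sc.AppArmorProfile != nil {
				fmt.Printf("%s/%s/%s: AppArmor profile %q\n", pod.Namespace, pod.Name, c.Name, sc.AppArmorProfile.Type)
			}
		}
	}
}
```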

Contributor:

> you can read that from the pod configuration. If you want to check the actual runtime configuration, use a third-party monitoring program that looks at the process state.

SGTM. Could we include this in the PRR section for posterity?

@SergeyKanzhelev (Member) left a comment:

/lgtm

@k8s-ci-robot added the lgtm label Feb 7, 2024
@k8s-ci-robot removed the lgtm label Feb 7, 2024
@jpbetz (Contributor) commented Feb 8, 2024:

/approve PRR
One minor comment left that I'd like to see incorporated into the KEP, but it's non-blocking.

@k8s-ci-robot added the approved label Feb 8, 2024
@SergeyKanzhelev (Member) left a comment:

/lgtm

@k8s-ci-robot added the lgtm label Feb 8, 2024
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, jpbetz, SergeyKanzhelev, tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit f37d950 into kubernetes:master Feb 8, 2024
4 checks passed
Labels: approved, cncf-cla: yes, kind/kep, lead-opted-in, lgtm, sig/node, size/XL