Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add initial selinux implementation #683

Merged
merged 11 commits into from
Feb 21, 2020
Merged

add initial selinux implementation #683

merged 11 commits into from
Feb 21, 2020

Conversation

bcressey
Copy link
Contributor

@bcressey bcressey commented Jan 24, 2020

Issue #, if available:
N/A

Description of changes:
This implements a policy for SELinux aimed at closing vectors for persistence of malicious code across container restarts and reboots.

The initial goals are that containerized processes:

  • cannot modify API settings in /var/lib/thar
  • cannot modify layer archives for containers

Testing done:
Processes on a running system are labeled as indicated in subject.cil, and files and directories have the labels indicated in object.cil.

Launched a new node and ran sonobuoy conformance tests. No AVC denials were logged.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

tools/rpm2img Show resolved Hide resolved
@iliana iliana modified the milestones: Public Preview, v0.3.0 Jan 27, 2020
packages/kernel/config-thar Outdated Show resolved Hide resolved
@bcressey bcressey force-pushed the selinux branch 7 times, most recently from 2176ab7 to e8999bd Compare February 17, 2020 18:22
@bcressey
Copy link
Contributor Author

Changes since initial draft:

Dropped SELinux enablement for Docker. This was done for consistency with containerd, which does not yet support assigning random MCS labels to pods.

Widened access to /local, leaving it open to containers except for certain blacklisted paths. The whitelist approach was hard to match up with the ecosystem of plugins for kubelet. It may return in a different form later.

Refocused policy to deny writes to critical files: containerd's pristine layer archives and the settings for our API.

Expanded the policy to cover all file-like classes and permissions.

@bcressey bcressey marked this pull request as ready for review February 17, 2020 18:38
@bcressey bcressey changed the title RFC: add initial selinux implementation add initial selinux implementation Feb 17, 2020
@bcressey bcressey force-pushed the selinux branch 2 times, most recently from 077ea9f to 530120f Compare February 17, 2020 20:27
Various applications expect this path to exist in order to detect the
active SELinux policy and to discover contexts for labeling processes
and files.

Signed-off-by: Ben Cressey <[email protected]>
We enable the audit subsystem in order to log AVC denials.

The SELinux options are mandated by the kernel config, but including
them on the kernel command line makes the behavior visible to simple
tools.

Signed-off-by: Ben Cressey <[email protected]>
These directories are created for us by tmpfiles.d.

Signed-off-by: Ben Cressey <[email protected]>
@bcressey
Copy link
Contributor Author

Hit a couple of denials when testing the EFS CSI driver:

[ 1984.521571] audit: type=1400 audit(1582000601.532:3): avc: denied { mount } for pid=23455 comm="mount.nfs4" name="/" dev="0:323" ino=3728010751851903760 scontext=system_u:system_r:container_t:s0-s0:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s0 tclass=filesystem permissive=1
...
[ 2174.430136] audit: type=1400 audit(1582000791.440:4): avc: denied { mounton } for pid=27667 comm="mount.nfs4" path="/var/lib/kubelet/pods/106d0a5c-5207-11ea-bac1-0646452c901a/volumes/kubernetes.io~csi/efs-pv/mount" dev="0:323" ino=3728010751851903760 scontext=system_u:system_r:container_t:s0-s0:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=1

Copy link
Contributor

@iliana iliana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine to merge. I'd like to see issues opened to track what we're going to remove permissive bits around and at what point we feel confident removing those.

Also sets noatime, nodev, and nosuid for our squashfs mounts, for
consistency with the others.

Signed-off-by: Ben Cressey <[email protected]>
"spc_t" has elevated privileges relative to "container_t", and fits
the intended use case of "breaking glass" to debug the host system
with minimal friction.

Signed-off-by: Ben Cressey <[email protected]>
Disabling them removes surface area from containerd, and lets us
defer the work of figuring out how they should interact with our
SELinux policy.

Signed-off-by: Ben Cressey <[email protected]>
Copy link

@abby-fuller abby-fuller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this, assuming we add the docs/comments we just talked about/you have notes on :)

@bcressey
Copy link
Contributor Author

Notes:

  • Create issue with next steps for SELinux implementation
  • Document what's in scope for the current implementation - SELinux section in SECURITY.md equivalent
  • Document assumptions such as shared responsibility model
  • Provide pod security policy excerpts e.g. block running as super_t

@bcressey
Copy link
Contributor Author

Added issues for related items: #764, #765, #766.

@bcressey bcressey merged commit d2a0d66 into develop Feb 21, 2020
@tjkirch tjkirch deleted the selinux branch February 21, 2020 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants