-
Notifications
You must be signed in to change notification settings - Fork 651
snap-confine, snap-seccomp: utilize new seccomp logging features #3998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The seccomp policy has historically used SECCOMP_RET_KILL to forcefully kill a snap process that bumps into the walls of the sandbox. However, killing the snap is not very user friendly. Changing the policy to use SECCOMP_RET_ERRNO to return -1 with errno set to EPERM has been desired but the kernel would not log those denials which could leave users and developers confused about why their applications were experiencing errors. The 4.14 Linux kernel contains new seccomp logging controls which allows snapd to request SECCOMP_RET_ERRNO to be logged. This patch makes use of the new logging controls and switches the default action of the seccomp policy to SECCOMP_RET_ERRNO so that snaps aren't killed when the perform an illegal system call. Signed-off-by: Tyler Hicks <[email protected]>
No seccomp policy was loaded for snaps installed in devmode since seccomp didn't have the concept of "allow but log". The 4.14 Linux kernel now has that functionality with the SECCOMP_RET_LOG action. This patch changes the seccomp policy for devmode to log and allow any system calls that would not be allowed if the snap was installed in jailmode. Fixes: https://launchpad.net/bugs/1567597 Signed-off-by: Tyler Hicks <[email protected]>
|
squee! |
zyga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good. Note that we have a convention to use curly braces around if / else statements to ensure there's no chance of mistaking the nesting.
While not strictly could we also take the change to kill the entire process rather than just the offending thread of execution? This would help with "mysterious" problems in multi-threaded applications. AFAIK the kernel has this feature now.
jdstrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is following the agreements between the snapd and security teams and looks good on the whole. Thanks! Please see comments inline though.
I'm going to leave this as 'Request changes' for now so I can review the final PR.
| debug("kernel doesn't support the seccomp(2) syscall"); | ||
| else if (errno == EINVAL) | ||
| debug("kernel may not support the SECCOMP_FILTER_FLAG_LOG flag"); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be where the intended future check could be better (ie, using 'may'). What I was hoping for was that if SECCOMP_FILTER_FLAG_LOG is not available, then snapd would log that seccomp is mediating with EPERM but without logging so people understand the behavior change. Before, commands run under strict mode confinement with access violations would be killed with logging, but now on systems without SECCOMP_FILTER_FLAG_LOG support, they will be returned EPERM and continue to run with no logging, which may be confusing since the new denied behavior looks more like the previous allowed behavior than the previous denied behavior.
All that said, logging the fallback behavior IMO should happen at the snapd level once per snapd invocation since logging in snap-seccomp or in snap-confine would be way too chatty. @mvo5 - I suspect this would be more for whoever picks this up than for Tyler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point that logging the fallback shouldn't happen in snap-seccomp or snap-confine as it would be spammy in the logs.
libseccomp upstream was against exporting a function specifically for checking the validity of a filter flag or action. Applications linking against libseccomp will have the option to check the validity of the SECCOMP_FILTER_FLAG_LOG flag indirectly but using an "API level" based check that is currently being implemented. However, that check lumps together multiple features and, in this case, a single API level will be used to represent SECCOMP_FILTER_FLAG_LOG, SECCOMP_RET_LOG, and the unrelated SECCOMP_RET_KILL_PROCESS. That means that the kernel, libseccomp, and libseccomp-golang will all have to be in sync for libseccomp to report that the API level is supported.
Therefore, I think snapd would want to directly use the seccomp(2) syscall for checking if a filter flag is supported. Snapd could just use the technique that libseccomp is going to be using behind the scenes. I added a kernel selftest for the technique so that we can be sure that the kernel doesn't violate the agreement with userspace about how to check if a filter flag is supported. You can see the kernel selftest change here. Calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG, NULL) should return -1 with errno set to EFAULT if the flag is supported by the kernel. Other errno values mean other things such as EINVAL meaning that the flag is not supported and ENOSYS meaning that the seccomp(2) syscall isn't even supported (so there's no way to even specify a filter flag).
cmd/snap-seccomp/main.go
Outdated
| return err | ||
| } | ||
|
|
||
| if fallback { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here stating that this fallback is resorting to the old pre-logging behavior of just allowing everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea.
cmd/snap-seccomp/main.go
Outdated
| } else if line == "@complain" { | ||
| var fallback bool = false | ||
|
|
||
| secFilter, err = seccomp.NewFilter(seccomp.ActLog) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it would be a bug in the policy to have '@complain' in it more than once, the previous code stopped processing rules as soon as it found '@complain', but this code will add a NewFilter for every '@complain'. I think we should simply skip (perhaps logging with debug) additional '@complain' lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is no different than if one or more white listed system calls are placed above the first @complain directive. In that case, a new filter would be allocated when @complain was encountered and all the previously white listed system calls would be forgotten.
If we're going to handle multiple @complain directives, we should handle the case that I described, too. Unfortunately, libseccomp doesn't allow us to change the default action of an existing filter via seccomp_attr_set(..., SCMP_FLTATR_ACT_DEFAULT, SCMP_ACT_LOG) so an entirely new filter must be allocated.
What is your preferred solution here?
- Don't worry about the case that I mentioned. Address your concern by skipping
NewFilter()when libseccomp-golang'sGetDefaultAction()returnsActLog. - Treat both situations as an error. (I don't know what it means to error out of the
compile()function.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What snap-confine would do before the move to snap-seccomp is it would preprocess the file, looking for @unrestricted or @complain. The rewrite took a shortcut and simply says "forget anything I did before if I find either of these". That was fine before when they were equivalent. Profile authors will sometimes put these directives anywhere in the file, so I think we need to account for that.
Instead of the two listed options, I suggest the previous behavior: preprocess the file, if @unrestricted found 1 or more times, store that off, if @complain found 1 or more times, store that off. Eg something along the lines of:
// Preprocess to allow @unrestricted and @complain to appear anywhere in the profile
unrestricted, complain := preprocess()
if unrestricted {
seccomp.NewFilter(seccomp.ActAllow)
} else if complain {
seccomp.NewFilter(seccomp.ActLog)
} else {
seccomp.NewFilter(seccomp.ActErrno.SetReturnCode(C.EPERM))
}
addSecondaryArches(secFilter)
if not unrestricted {
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
parseLine(line, secFilter)
...
}
}
...
This has the added benefit of adding the secondary arch in one spot.
@mvo5 - what do you think about preprocessing?
UPDATE: I adjusted the logic to be only 'if not unrestricted' after noticing the error
@zyga I agree that the new |
|
\o/ thanks @tyhicks, very pleased to see this! |
Signed-off-by: Tyler Hicks <[email protected]>
It isn't immediately clear why seccomp rule compilation must end immediately after falling back to ActAllow as the default action for devmode. Leave a comment for easier code reading. Signed-off-by: Tyler Hicks <[email protected]>
|
I started looking into updating the tests. This will require some serious work because right now the tests are a bit naive. They rely on the fact that the kernel kills a process that violates the confinement. With the change to errno the syscallRunner and the tests itself need to be smater at identifying why a syscall has an error result (i.e. was it bogus input data which we give right now in most tests or was it a permission denied). |
The pervious tests simply checked if the process got killed when doing an invalid syscall. The new seccomp behaviour is to use EPERM as the errno for a denied syscall. This is now reflected in the tests as well.
We no longer use main.SeccompRetKill in the seccomp code and the
name in the tests is misleading. Instead use {Allow,Deny}.
|
This is now failing only because fedora uses the official libseccomp-golang branch - so once the ActLog bits are in the upstream libseccomp-golang fedora should be unblocked. |
|
@tyhicks This is lingering for quite a long time. Should we close it until the work is ready to be reviewed and merged? |
|
@mvo What do you think? (just now noticed you've pushed changes too) |
|
@niemeyer @mvo5 I was told that that the Snappy team would take this PR over from here so I don't have plans to do any further work on it. I would hate for the PR to be closed since the final dependency piece (libseccomp SRU in Ubuntu 16.04) should land next week (it is currently in xenial-proposed if anyone would like to test with it). |
|
@mvo5 is there anything else that you need me to do for this pull request? From an Ubuntu distro standpoint, the libseccomp SRU in Xenial still needs to go through. There's been renewed progress in verifying that SRU and it should happen very soon. I don't believe that there's anything else that I need to do in this pull request but I'd like to confirm that. |
|
libseccomp |
|
Sorry for the delay. We indeed waited for the sru update initially and missed when it happend. I updated the PR again and resolved conflicts, fixed build failures. One challange is how to make fedora work with this PR. Fedora uses the upstream seccomp-golang (not our version which contains your patches already). So this means that upstream needs to merge your patches and the package in fedora needs updating to fully support this. Or we need to write the code in seccomp.go in such a way that it deals with a missing "seccomp.ActLog". |
|
@mvo5 - we have a fallback mechanism to use KILL if the kernel doesn't support the new features. I wonder if we can detect at build time if we have the necessary api or if we can pass a build time flag and conditionally build this, falling back to the old method if the seccomp-golang doesn't support it. |
|
Upstream doesn't seemed to have moved forward. I continue to see questions about setpriority and chown'ing to oneself in the forum. Did anything came from the idea of runtime detection of golang feature support as asked in #3998? |
|
We could potentially split up this PR into two separate PRs. The split would happen in correspondence to the two features described in the description for this PR:
The first feature can be landed without any dependencies on libseccomp or libseccomp-golang since we only need to set the The second feature is what has a dependency on libseccomp and libseccomp-golang. Of course, it may be much easier to conditionalize the use of seccomp.ActLog based on whether or not libseccomp-golang defines that constant but I'm not sure how to do that in Go. |
@mvo5 do you have ideas or examples you could link to regarding how this could be done? |
Allow some distros to make use of the ActLog action for devmode, despite the upstream libseccomp-golang project not yet cutting a release, by guessing at the const value. The guess is validated by looking at the string representation of the action to make sure that it is the expected string for ActLog. This is possible because the mvo5 github fork of libseccomp-golang contains the minimal changes necessary to support ActLog. The system must also have a new enough libseccomp and kernel to support the log action. Signed-off-by: Tyler Hicks <[email protected]>
|
@mvo5 I've pushed a new commit that conditionally uses ActLog for devmode if libseccomp-golang is new enough to know about ActLog. It is a somewhat hacky way of detection but it should be very reliable. Important: You need to merge mvo5/libseccomp-golang#2 into your libseccomp-golang fork before pulling in this PR. In fact, the snapd tests should be restarted once that mvo5/libseccomp-golang PR is merged in order to accurately test this change on Ubuntu. |
|
@Conan-Kudo this change lets consumers of @mvo5's fork of libseccomp-golang to make use of the new features. Those using older/unpatched libseccomp-golang, libseccomp, and/or kernels will continue to get the old behavior of not logging seccomp denials that would occur if the snap was running under strict confinement. This patch makes it so that snapd works in either environment. |
Codecov Report
@@ Coverage Diff @@
## master #3998 +/- ##
=========================================
+ Coverage 78.78% 78.8% +0.02%
=========================================
Files 471 472 +1
Lines 34156 34198 +42
=========================================
+ Hits 26911 26951 +40
- Misses 5068 5070 +2
Partials 2177 2177
Continue to review full report at Codecov.
|
jdstrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes look fine with some nitpicks from me regarding comments. I'll push those up.
We are also lacking unit tests for ActLog. It would be nice to have spread tests for complain mode and fallback, but that is going to be tricky since it isn't easily mocked and is dependent on the host.
I did blackbox test the changes and found that this is working as per design of the feature for strict mode. Ie,
- on 16.04 with the updated kernel, libseccomp and snapd (with updated golang-seccomp) strict mode snap receives EPERM instead of kill and the denial is logged
- on 16.04 with non-updated kernel but updated libseccomp/snapd, strict snap receives EPERM instead of kill with no logging
- on 16.04 with updated kernel/snap but non-updated libseccomp, strict snap receives EPERM instead of kill with logging (this is expected due to the ifndef in snap-confine)
- on 16.04 with updated snapd but non-updated libseccomp/kernel, strict snap receives EPERM instead of kill with no logging
This also works correctly with complain mode:
- on a 16.04 with updated kernel, libseccomp and snapd/golang-seccomp, devmode snap policy violation correctly logs
- on 16.04 with non-updated kernel but updated libseccomp/snapd, devmode snap policy violation receives seccomp KILL until seccomp policy is recompiled (since the kernel doesn't ActLog. This is Tyler's second bulletpoint that needs addressing). After recompile, policy violation is allowed with no logging
- on 16.04 with updated kernel/snap but non-updated libseccomp, devmode snap policy violation correctly logs (this is expected due to the ifndef in snap-confine)
- on 16.04 with updated snapd but non-updated libseccomp/kernel, devmode snap policy violation receives seccomp KILL until seccomp policy is recompiled (since the kernel doesn't ActLog. This is Tyler's second bulletpoint that needs addressing). After recompile, policy violation is allowed with no logging
At this point, we need to incorporate a check for ActLog in system-key so devmode snaps aren't receiving kill rebooting into kernels without ActLog from kernels with ActLog
cmd/snap-confine/seccomp-support.c
Outdated
| }; | ||
| if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0) { | ||
| die("cannot apply seccomp profile"); | ||
| if (seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG, &prog)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically like to use '!= 0' as part of the coding style (see prctl call below).
| return actLog | ||
| } | ||
|
|
||
| return seccomp.ActAllow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment here for why returning seccomp.AcctAllow would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| if complainAct == seccomp.ActAllow { | ||
| unrestricted = true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If complainAct evaluated to seccomp.ActAllow and seccomp.NewFIlter() completed without error, then we set unrestricted to true here, but didn't write '@unrestricted' to the buffer like we do in the previous case statement. This is done this way to fallback to the old behavior. A comment here would be nice. Eg:
// Set unrestricted to 'true' to fallback to the pre-ActLog behavior of
// simply setting the allow filter without adding any rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| syscall(l[0], l[1], l[2], l[3], l[4], l[5], l[6]); | ||
| syscall(SYS_exit, 0, 0, 0, 0, 0, 0); | ||
| syscall_ret = syscall(l[0], l[1], l[2], l[3], l[4], l[5], l[6]); | ||
| if (syscall_ret < 0 && errno == 911) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment that 911 is the mocked value would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
| l := strings.Split(string(output), "\n") | ||
| return fmt.Sprintf("Showing last 10 lines of dmesg:\n%s", strings.Join(l[len(l)-10:], "\n")) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't clear why you dropped this since it is meant to only assist with debug output in the event of unexpected failure or success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
FYI, I implemented the system-key bits so that devmode snaps have their policy recompiled when booting into a kernel without logging support. |
|
@tyhicks - can you review my commits to this PR which adds some complain mode testing and the rebooting into devmode? I retested all scenarios and this is working as expected. |
| } | ||
|
|
||
| // SecCompActions returns a sorted list of seccomp actions like | ||
| // []string{"allow", "errno", "kill", "log", "trace", "trap"}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is worth pointing out that the following kernel commit changes kill to kill_thread (in 4.14 and newer):
This means that seccomp policy will be recompiled when booting between a kernel older than 4.14 and a kernel equal to or newer than 4.14. However, I think that's generally a good process to follow when such a low level kernel seccomp change is detected.
Your additional commits all look correct to me. Thanks for those! |
|
FYI, I looked at sqlite sources and this PR should obviate the need for using the snapcraft-preload for sqlite users. I also found the glib file IO (chown) works too: https://forum.snapcraft.io/t/chown-to-current-user/2831/9 as well as 0ad (setpriority): https://forum.snapcraft.io/t/auto-connecting-the-process-control-interface-for-the-0ad-snap/507/20. |
79e4d96 to
3ff75cd
Compare
|
FYI, the failing test is weird. It is in interfaces/seccomp/backend_test.go, but in an unrelated test and the failures are only on debian-9 and linode ubuntu 16.04 32bit, but after 3 retries, those two fail consistently. I found this puzzling since there shouldn't be anything in this PR's changes that is specific to those releases that would make them fail AFAICS. Since @zyga wrote the initial tests, I've asked him to take a look too. |
|
Also, I tested this with an electron app that had trouble with chown, and this PR makes that snap work too. |
Signed-off-by: Zygmunt Krynicki <[email protected]>
zyga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 with minor comments
|
|
||
| func SecCompSupportsAction(action string) bool { | ||
| i := sort.SearchStrings(SecCompActions, action) | ||
| if i < len(SecCompActions) && SecCompActions[i] == action { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: we have strutil.SortedListContains for this
| #define SECCOMP_FILTER_FLAG_LOG 2 | ||
| #endif | ||
|
|
||
| #ifndef seccomp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is seccomp defined in the pre-processor usually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the seccomp() syscall may not have a glibc wrapper:
https://lwn.net/Articles/655028/
It is important that we wrap it in that situation.
| //#endif | ||
| // | ||
| //#ifndef SECCOMP_RET_LOG | ||
| //#define SECCOMP_RET_LOG 0x7ffc0000U |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked up the value to this patch so +1
https://www.redhat.com/archives/linux-audit/2017-February/msg00005.html
…onical#3998) * snap-confine, snap-seccomp: Default to SECCOMP_RET_ERRNO The seccomp policy has historically used SECCOMP_RET_KILL to forcefully kill a snap process that bumps into the walls of the sandbox. However, killing the snap is not very user friendly. Changing the policy to use SECCOMP_RET_ERRNO to return -1 with errno set to EPERM has been desired but the kernel would not log those denials which could leave users and developers confused about why their applications were experiencing errors. The 4.14 Linux kernel contains new seccomp logging controls which allows snapd to request SECCOMP_RET_ERRNO to be logged. This patch makes use of the new logging controls and switches the default action of the seccomp policy to SECCOMP_RET_ERRNO so that snaps aren't killed when the perform an illegal system call. Signed-off-by: Tyler Hicks <[email protected]>
Important: This PR is a work in progress. The snapd tests need updating since this patch changes the default seccomp confinement from "kill" to "return an errno". @mvo5 asked me to go ahead and propose the merge and that the Snap folks could help see it to completion.
Two new seccomp logging features, introduced in Linux 4.14, allow for better snap confinement:
SECCOMP_FILTER_FLAG_LOGallows for all actions taken by a filter, except forSECCOMP_RET_ALLOW, to be logged. Previously, onlySECCOMP_RET_KILLactions were logged. This kept snap confinement from usingSECCOMP_RET_ERRNObecause applications would experience errors returned from illegal system calls but users and application developers would not have visibility into the decision making of when system calls should be denied. This new filter flag allows us to use theSECCOMP_RET_ERRNOaction to return -1 with errno set toEPERMrather than killing snap processes that make illegal system calls.SECCOMP_RET_LOGis a new action that logs the syscall and then allows it. This was the missing piece that prevented would-be seccomp denials from being logged in devmode. Previously, enabling devmode simply meant that a seccomp filter would not be loaded due to the lack of this feature. Now we can load a filter that defaults toSECCOMP_RET_LOGfor any system calls that aren't white listed.The corresponding upstream libseccomp pull request has been merged. The corresponding upstream libseccomp-golang pull request is still pending final review. Upstream libseccomp and libseccomp-golang requested more complex, ABI friendly features but we can move forward with minimal changes to libseccomp and libseccomp-golang and then take advantage of the more robust "API level" checks in the future once the new versions of libseccomp and libseccomp-golang are released.
I've SRUed a minimal libseccomp change in Ubuntu Artful (
2.3.1-2.1ubuntu2) and Ubuntu Zesty (2.3.1-2.1ubuntu2~17.04.1). The Ubuntu Xenial libseccomp SRU is still pending in xenial-proposed (2.3.1-2.1ubuntu2~16.04.1). I've proposed a minimal libseccomp-golang change to @mvo5's fork of libseccomp-golang, as that's what snapd is using.To be clear, you can apply the patches in this PR today and test out the new seccomp features in Ubuntu Artful or Zesty.
Known issues that need to be addressed before this can be merged:
SECCOMP_RET_ERRNOSECCOMP_RET_LOGaction. The kernel treats unknown actions asSECCOMP_RET_KILLwhich is appropriate from a security standpoint but is far from ideal considering that many devmode snaps would no longer work. Thesnap-seccompchange falls back toSECCOMP_RET_ALLOWwhenSECCOMP_RET_LOGis not supported by the currently running kernel.2.3.1-2.1ubuntu2~16.04.1needs to migrate from xenial-proposed to xenial-updates