systemd-confinement: Fix with services using restrictive UMasks #147639
aszlig wants to merge 2 commits into NixOS:master from
Conversation
This is just in preparation for fixing NixOS#147599: when using confinement with a non-root user and a restrictive UMask, the leading directories for all of the bind mounts we need in order to execute the program in question get that UMask as well.

In that issue, I also suggested a workaround, which is to set the following option:

    TemporaryFileSystem = [ "/" "/nix" "/nix/store" ];

While this works, @martinetd has already opened an upstream issue at systemd/systemd#21548, so before we jump to introduce such ugly things as the workaround above, let's first see what upstream has to say. Nevertheless, this test case is there so that once we have a definitive answer, we already have a regression test.

Signed-off-by: aszlig <aszlig@nix.build>
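For context, a minimal NixOS module sketch showing where such a service would hit the bug and where the suggested workaround goes. The service name, user, UMask value and ExecStart are hypothetical illustrations; only the `TemporaryFileSystem` line comes from the workaround quoted above:

```nix
{ pkgs, ... }:
{
  # Hypothetical confined service; "foo" and the UMask value are
  # illustrative, not taken from the PR.
  systemd.services.foo = {
    confinement.enable = true;   # sets up the chroot/bind-mount jail
    serviceConfig = {
      User = "foo";              # non-root user is what triggers the bug
      UMask = "0077";            # restrictive UMask also ends up on the
                                 # leading directories of the bind mounts
                                 # (e.g. /nix, /nix/store)
      # Workaround from NixOS#147599: mount a tmpfs over the paths whose
      # leading directories would otherwise inherit the UMask.
      TemporaryFileSystem = [ "/" "/nix" "/nix/store" ];
      ExecStart = "${pkgs.hello}/bin/hello";
    };
  };
}
```

With the `TemporaryFileSystem` line removed, the confined service should fail to start for the non-root user, which is what the regression test in this PR checks for.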
This is only a code style change: with the port to Python in 1320f23, the test became less readable, because the description attribute got inlined into the individual testScript snippets as a series of "with subtest(...):". While I do welcome the port to Python, I think this also made the code harder to follow, because every testScript now contains additional boilerplate.

The reason the descriptions got inlined was probably that at the time we were using "Black" (a Python code formatter), which would barf if a description exceeded a certain number of characters. However, since Black is no longer used and we switched to a linter a long time ago (c362a28), we can now re-introduce the description attribute.

Signed-off-by: aszlig <aszlig@nix.build>
Thanks for this! It looks like I had been beaten to it ( systemd/systemd#19899 ) and something got merged just a couple of weeks ago to fix it, for what will be v250. Unfortunately I don't think that's something they'd backport, so we're still stuck on the stable release for a while -- I'm thinking the workaround you suggested, while ugly, might be something we can do while waiting for the proper fix? Not sure honestly...
Well, or we backport/apply the upstream fix to our version of systemd, which we can then drop when upgrading to either 249.X or 250. When using a restrictive
@martinetd: Just opened a backport pull request in systemd/systemd-stable#148 in the hope that they accept the backport. If it doesn't work out, we can still apply the commits downstream. Addendum: Since I wasn't sure whether you tested systemd/systemd#21320 specifically or just systemd
Thanks! I've glanced at the PR and most of it is just adding a new macro and using it; the actual code change only added a umask()+cleanup call in two places, so it's actually a much smaller change than I thought. Let's see how it goes.
Aaand the backport PR has just been merged...! systemd-stable v249.8 has been released with it. @aszlig, since you added a test, I'll let you bump the systemd version?
@martinetd: Since there is #156096, I'm not sure whether we should bump to 249.8 or wait for 250. @jonringer: Do you have a rough estimate when you expect this to land in
I was going to include it after the next staging cycle began. So about a month. |
I would also be okay with you adding the commits to the other PR
I think it depends a bit on what we want for 21.11 -- for that branch we'd want 249.8 and not 250 if we want the fixes there?
21.11 should not be receiving the 250 bump. So 249.8 makes sense for stable. |
Thanks for confirming! The question was more about whether we want it or not, but given the other fixes, I'd guess we do.
Yea, the only thing that complicates this is that systemd 250 is close to ready for staging. cc @vcunat for thoughts.
FWIW I've tested locally and have been running 249.8 on my machines without problems so far; I've already been recompiling systemd for an older bug, so I might as well get both fixes. The included patches applied without any change (not sure how to test the musl patches though); here's the rebased version with these 2 patches + the systemd bump, even if that's the easy part:
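Carrying the backport downstream (as suggested earlier in the thread) could look roughly like the following overlay sketch; the patch file names are hypothetical placeholders, not the actual rebased branch or the real upstream patch names:

```nix
# Hypothetical overlay: carry the two backported umask fixes as local
# patch files on top of nixpkgs' systemd until a release containing
# them (249.8 / 250) lands. Patch file names are placeholders.
final: prev: {
  systemd = prev.systemd.overrideAttrs (old: {
    patches = (old.patches or [ ]) ++ [
      ./0001-umask-fix.patch    # placeholder for the first backport commit
      ./0002-umask-cleanup.patch # placeholder for the second backport commit
    ];
  });
}
```

The upside of this approach is that it can be dropped wholesale once the version bump arrives; the downside is a local systemd rebuild on every nixpkgs update.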
Are you sure this will be worth trying to fix in 21.11? On a quick glance I don't expect it would be easy to be certain that you don't introduce some minor breakage.
I've hijacked this PR, but the main reason I'm building my own systemd is that my pid1 sometimes crashes when upgraded on one machine due to #147103, and that seems worth the trouble even if the conditions to hit the crash are probably not that common (for me it happens because some udev thing wants to umount DVDs on upgrade for some reason, and systemd gets lost in the dependency tree). It's double the pain when services like hostapd get stopped before the crash and physical access is awful; I'm not taking the risk that it happens again. But once again, I'm on unstable and can keep rebuilding systemd until 250 gets in, so I don't really care.

Anyway, yes, there's no way to be certain it won't break something else instead. I'd say it's the systemd maintainers' job not to backport risky patches, but I've also had a look at the git log, and while I don't see anything suspicious in the 109 commits there, it's impossible to say it won't break a thing. (mandatory xkcd here: https://xkcd.com/1172/ )
NOTE: This is currently just a failing test in preparation for the fix; do not merge.