rootless: single user namespace #2706

giuseppe · 2019-03-19T15:34:34Z

simplify the rootless implementation to use a single user namespace for all the running containers.

This makes the rootless implementation behave more like root Podman, where each container is created in the host environment.

There are multiple advantages to it:

much simpler implementation as there is only one namespace to join.
we can join namespaces owned by different containers.
commands like ps won't be limited to what container they can access as previously we either had access to the storage from a new namespace or access to /proc when running from the host.
rootless varlink works.
there are only two ways to enter in a namespace, either by creating a new one if no containers are running or joining the existing one from any container.

Containers created by older Podman versions must be restarted. Should we set the rpm update to need a reboot?

openshift-ci-robot · 2019-03-19T15:34:39Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [giuseppe]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rhatdan · 2019-03-19T15:42:17Z

I don't want to require a reboot. With this change can I run multiple different User Namepaces?

giuseppe · 2019-03-19T15:47:22Z

With this change can I run multiple different User Namepaces?

we can still run "sub" user namespaces for each pod/container. It will work more similar to rootful Podman, where all the containers are created from the same user namespace (the host user namespace).

And it gets us closer to the sub-split we were discussing. We won't need special handling for rootless containers

giuseppe · 2019-03-19T15:51:11Z

I don't want to require a reboot.

is there a way to restart all containers for all users on an update?

rhatdan · 2019-03-19T15:58:34Z

No. Will running containers break? Or you just won't be able to exec into them?

giuseppe · 2019-03-19T16:02:09Z

you will just not be able to exec into them

TomSweeneyRedHat · 2019-03-19T16:33:54Z

FWIW, if this gets in before the next podman release, we ought to be sure to bump the release by a number, i.e. 1.2->1.3 rather than 1.2-1 or some such.

giuseppe · 2019-03-19T18:06:26Z

tests are passing, I'll think more what we can do to detect and restart old containers

rhatdan · 2019-03-19T19:43:02Z

I think we should hold this PR Off untile after 1.2 release

giuseppe · 2019-03-19T20:31:23Z

I am fine with that, this is a quite significant change. Although we will keep dragging further the complexity and the issues we currently have with rootless containers

rh-atomic-bot · 2019-03-27T15:39:16Z

☔ The latest upstream changes (presumably #2762) made this pull request unmergeable. Please resolve the merge conflicts.

AkihiroSuda · 2019-03-27T15:45:44Z

slirp4netns can be also shared?

giuseppe · 2019-03-27T15:51:21Z

slirp4netns can be also shared?

the network namespace will still be created per container or per pod.

But it will be simpler to join an existing one, as for root containers we can join the namespaces of multiple containers not be limited to one

giuseppe · 2019-03-27T15:52:56Z

dropped the RFC tag. I am quite convinced this is the right approach to drop most of differences we currently have with root containers.

rh-atomic-bot · 2019-03-28T14:21:14Z

☔ The latest upstream changes (presumably #2789) made this pull request unmergeable. Please resolve the merge conflicts.

in the few places where we care about skipping the storage initialization, we can simply use the process effective UID, instead of relying on a global boolean flag. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

giuseppe · 2019-04-01T11:38:49Z

@haraldh with this PR there will be only one user namespace, and rootless varlink will work in the same way as root

simplify the rootless implementation to use a single user namespace for all the running containers. This makes the rootless implementation behave more like root Podman, where each container is created in the host environment. There are multiple advantages to it: 1) much simpler implementation as there is only one namespace to join. 2) we can join namespaces owned by different containers. 3) commands like ps won't be limited to what container they can access as previously we either had access to the storage from a new namespace or access to /proc when running from the host. 4) rootless varlink works. 5) there are only two ways to enter in a namespace, either by creating a new one if no containers are running or joining the existing one from any container. Containers created by older Podman versions must be restarted. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

mheon · 2019-04-01T19:55:09Z

Alright, 1.2 is landed. I'm going to review this so we can think about merging it.

mheon · 2019-04-01T20:03:28Z

cmd/podman/main.go

+		}
+		defer runtime.Shutdown(false)
+
+		ctrs, err := runtime.GetRunningContainers()


Oh dear. The performance implications here are unpleasant - make a runtime, tear it down, retrieve all containers, make a new runtime for the command itself...

this is what we already do in most cases :( we create a runtime from the podman instance running with uid != 0 and then evaluate if we need a new userns or to join an existing one. The difference is that now we do this before evaluating each command.

Hopefully it leaves some margin for improvements, as there is not much logic now to find out what userns must be joined.

if might be interesting to brain storm on this, but longer term, maybe we should keep a db table for container namespaces only so we don't have to discover them? I have some other ideas too.

giuseppe · 2019-04-04T13:22:56Z

given that the current cost is not different than what we have now (in some cases it is even less, as we do the re-exec much earlier), is there anything more blocking this PR?

rhatdan · 2019-04-04T14:17:07Z

LGTM

rhatdan · 2019-04-04T14:17:27Z

What does this do on a Podman Remote machine?

mheon · 2019-04-04T14:26:24Z

If we want to cut a 1.2.1, I'd want to wait until we do to merge this, but I'm less convinced we should now.

rhatdan · 2019-04-04T16:02:23Z

I think we should just get this in, and move forward. I would want this tested well before it ends up in RHEL8.

mheon · 2019-04-04T16:08:09Z

@rhatdan I'm fine with this, but I vote that we promote Podman 1.2 out of Koji before we cut a release with this in it, so we have a stable 1.2 out there if this proves problematic.

mheon · 2019-04-04T16:08:26Z

/lgtm

giuseppe · 2019-04-04T19:02:55Z

What does this do on a Podman Remote machine?

podman remote will run in the same user namespace as other podman instances. That means we can finally use varlink without any difference with root containers. I've already tried locally and it works fine

openshift-ci-robot requested review from mrunalp and umohnani8 March 19, 2019 15:34

openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL labels Mar 19, 2019

giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 379eaf0 to 4df44d0 Compare March 19, 2019 16:28

giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 0ec0c6d to cdf1041 Compare March 19, 2019 17:03

giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 9b11379 to 5ad8651 Compare March 20, 2019 09:02

openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL labels Mar 20, 2019

giuseppe changed the title ~~[RFC] rootless: single user namespace~~ [RFC][WIP] rootless: single user namespace Mar 20, 2019

giuseppe force-pushed the rootless-single-usernamespace branch from 5ad8651 to 3e6a214 Compare March 27, 2019 15:48

giuseppe changed the title ~~[RFC][WIP] rootless: single user namespace~~ [WIP] rootless: single user namespace Mar 27, 2019

openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 27, 2019

mheon added the Release Notes 1.3.0 label Mar 29, 2019

rootless: remove SkipStorageSetup()

ed32620

in the few places where we care about skipping the storage initialization, we can simply use the process effective UID, instead of relying on a global boolean flag. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

giuseppe force-pushed the rootless-single-usernamespace branch from 3e6a214 to af2cd92 Compare April 1, 2019 11:30

giuseppe changed the title ~~[WIP] rootless: single user namespace~~ rootless: single user namespace Apr 1, 2019

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 1, 2019

giuseppe force-pushed the rootless-single-usernamespace branch from af2cd92 to 72382a1 Compare April 1, 2019 13:33

mheon reviewed Apr 1, 2019

View reviewed changes

This was referenced Apr 3, 2019

WIP: fix podman pod rm -a race #2737

Closed

test-race: podman pod rm -a doesn't remove a running container #2733

Closed

openshift-ci-robot assigned mheon Apr 4, 2019

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 4, 2019

openshift-merge-robot merged commit 1759eb0 into containers:master Apr 4, 2019

This was referenced Apr 4, 2019

fix bug podman cp directory #2845

Merged

Rework named volumes in DB #2774

Merged

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 26, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023

rootless: single user namespace #2706

rootless: single user namespace #2706

Uh oh!

Conversation

giuseppe commented Mar 19, 2019

Uh oh!

openshift-ci-robot commented Mar 19, 2019

Uh oh!

rhatdan commented Mar 19, 2019

Uh oh!

giuseppe commented Mar 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

giuseppe commented Mar 19, 2019

Uh oh!

rhatdan commented Mar 19, 2019

Uh oh!

giuseppe commented Mar 19, 2019

Uh oh!

TomSweeneyRedHat commented Mar 19, 2019

Uh oh!

giuseppe commented Mar 19, 2019

Uh oh!

rhatdan commented Mar 19, 2019

Uh oh!

giuseppe commented Mar 19, 2019

Uh oh!

rh-atomic-bot commented Mar 27, 2019

Uh oh!

AkihiroSuda commented Mar 27, 2019

Uh oh!

giuseppe commented Mar 27, 2019

Uh oh!

giuseppe commented Mar 27, 2019

Uh oh!

rh-atomic-bot commented Mar 28, 2019

Uh oh!

giuseppe commented Apr 1, 2019

Uh oh!

mheon commented Apr 1, 2019

Uh oh!

mheon Apr 1, 2019

Choose a reason for hiding this comment

Uh oh!

giuseppe Apr 1, 2019

Choose a reason for hiding this comment

Uh oh!

baude Apr 2, 2019

Choose a reason for hiding this comment

Uh oh!

giuseppe commented Apr 4, 2019

Uh oh!

rhatdan commented Apr 4, 2019

Uh oh!

rhatdan commented Apr 4, 2019

Uh oh!

mheon commented Apr 4, 2019

Uh oh!

rhatdan commented Apr 4, 2019

Uh oh!

mheon commented Apr 4, 2019

Uh oh!

mheon commented Apr 4, 2019

Uh oh!

giuseppe commented Apr 4, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

giuseppe commented Mar 19, 2019 •

edited

Loading