Skip to content

Conversation

@giuseppe
Copy link
Member

simplify the rootless implementation to use a single user namespace for all the running containers.

This makes the rootless implementation behave more like root Podman, where each container is created in the host environment.

There are multiple advantages to it:

  • much simpler implementation as there is only one namespace to join.
  • we can join namespaces owned by different containers.
  • commands like ps won't be limited to what container they can access as previously we either had access to the storage from a new namespace or access to /proc when running from the host.
  • rootless varlink works.
  • there are only two ways to enter in a namespace, either by creating a new one if no containers are running or joining the existing one from any container.

Containers created by older Podman versions must be restarted. Should we set the rpm update to need a reboot?

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL labels Mar 19, 2019
@rhatdan
Copy link
Member

rhatdan commented Mar 19, 2019

I don't want to require a reboot. With this change can I run multiple different User Namepaces?

@giuseppe
Copy link
Member Author

giuseppe commented Mar 19, 2019

With this change can I run multiple different User Namepaces?

we can still run "sub" user namespaces for each pod/container. It will work more similar to rootful Podman, where all the containers are created from the same user namespace (the host user namespace).

And it gets us closer to the sub-split we were discussing. We won't need special handling for rootless containers

@giuseppe
Copy link
Member Author

I don't want to require a reboot.

is there a way to restart all containers for all users on an update?

@rhatdan
Copy link
Member

rhatdan commented Mar 19, 2019

No. Will running containers break? Or you just won't be able to exec into them?

@giuseppe
Copy link
Member Author

you will just not be able to exec into them

@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 379eaf0 to 4df44d0 Compare March 19, 2019 16:28
@TomSweeneyRedHat
Copy link
Member

FWIW, if this gets in before the next podman release, we ought to be sure to bump the release by a number, i.e. 1.2->1.3 rather than 1.2-1 or some such.

@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 0ec0c6d to cdf1041 Compare March 19, 2019 17:03
@giuseppe
Copy link
Member Author

tests are passing, I'll think more what we can do to detect and restart old containers

@rhatdan
Copy link
Member

rhatdan commented Mar 19, 2019

I think we should hold this PR Off untile after 1.2 release

@giuseppe
Copy link
Member Author

I am fine with that, this is a quite significant change. Although we will keep dragging further the complexity and the issues we currently have with rootless containers

@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch 2 times, most recently from 9b11379 to 5ad8651 Compare March 20, 2019 09:02
@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL labels Mar 20, 2019
@giuseppe giuseppe changed the title [RFC] rootless: single user namespace [RFC][WIP] rootless: single user namespace Mar 20, 2019
@rh-atomic-bot
Copy link
Collaborator

☔ The latest upstream changes (presumably #2762) made this pull request unmergeable. Please resolve the merge conflicts.

@AkihiroSuda
Copy link
Collaborator

slirp4netns can be also shared?

@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch from 5ad8651 to 3e6a214 Compare March 27, 2019 15:48
@giuseppe
Copy link
Member Author

slirp4netns can be also shared?

the network namespace will still be created per container or per pod.

But it will be simpler to join an existing one, as for root containers we can join the namespaces of multiple containers not be limited to one

@giuseppe giuseppe changed the title [RFC][WIP] rootless: single user namespace [WIP] rootless: single user namespace Mar 27, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 27, 2019
@giuseppe
Copy link
Member Author

dropped the RFC tag. I am quite convinced this is the right approach to drop most of differences we currently have with root containers.

@rh-atomic-bot
Copy link
Collaborator

☔ The latest upstream changes (presumably #2789) made this pull request unmergeable. Please resolve the merge conflicts.

in the few places where we care about skipping the storage
initialization, we can simply use the process effective UID, instead
of relying on a global boolean flag.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch from 3e6a214 to af2cd92 Compare April 1, 2019 11:30
@giuseppe giuseppe changed the title [WIP] rootless: single user namespace rootless: single user namespace Apr 1, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 1, 2019
@giuseppe
Copy link
Member Author

giuseppe commented Apr 1, 2019

@haraldh with this PR there will be only one user namespace, and rootless varlink will work in the same way as root

simplify the rootless implementation to use a single user namespace
for all the running containers.

This makes the rootless implementation behave more like root Podman,
where each container is created in the host environment.

There are multiple advantages to it: 1) much simpler implementation as
there is only one namespace to join.  2) we can join namespaces owned
by different containers.  3) commands like ps won't be limited to what
container they can access as previously we either had access to the
storage from a new namespace or access to /proc when running from the
host.  4) rootless varlink works.  5) there are only two ways to enter
in a namespace, either by creating a new one if no containers are
running or joining the existing one from any container.

Containers created by older Podman versions must be restarted.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the rootless-single-usernamespace branch from af2cd92 to 72382a1 Compare April 1, 2019 13:33
@mheon
Copy link
Member

mheon commented Apr 1, 2019

Alright, 1.2 is landed. I'm going to review this so we can think about merging it.

}
defer runtime.Shutdown(false)

ctrs, err := runtime.GetRunningContainers()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear. The performance implications here are unpleasant - make a runtime, tear it down, retrieve all containers, make a new runtime for the command itself...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what we already do in most cases :( we create a runtime from the podman instance running with uid != 0 and then evaluate if we need a new userns or to join an existing one. The difference is that now we do this before evaluating each command.

Hopefully it leaves some margin for improvements, as there is not much logic now to find out what userns must be joined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if might be interesting to brain storm on this, but longer term, maybe we should keep a db table for container namespaces only so we don't have to discover them? I have some other ideas too.

@giuseppe
Copy link
Member Author

giuseppe commented Apr 4, 2019

given that the current cost is not different than what we have now (in some cases it is even less, as we do the re-exec much earlier), is there anything more blocking this PR?

@rhatdan
Copy link
Member

rhatdan commented Apr 4, 2019

LGTM

@rhatdan
Copy link
Member

rhatdan commented Apr 4, 2019

What does this do on a Podman Remote machine?

@mheon
Copy link
Member

mheon commented Apr 4, 2019

If we want to cut a 1.2.1, I'd want to wait until we do to merge this, but I'm less convinced we should now.

@rhatdan
Copy link
Member

rhatdan commented Apr 4, 2019

I think we should just get this in, and move forward. I would want this tested well before it ends up in RHEL8.

@mheon
Copy link
Member

mheon commented Apr 4, 2019

@rhatdan I'm fine with this, but I vote that we promote Podman 1.2 out of Koji before we cut a release with this in it, so we have a stable 1.2 out there if this proves problematic.

@mheon
Copy link
Member

mheon commented Apr 4, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 4, 2019
@openshift-merge-robot openshift-merge-robot merged commit 1759eb0 into containers:master Apr 4, 2019
@giuseppe
Copy link
Member Author

giuseppe commented Apr 4, 2019

What does this do on a Podman Remote machine?

podman remote will run in the same user namespace as other podman instances. That means we can finally use varlink without any difference with root containers. I've already tried locally and it works fine

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants