Jail capabilities for docker are dangerous for host networking #119

Closed
templehasfallen opened this issue Apr 17, 2024 · 8 comments · Fixed by #121

Comments

@templehasfallen
Contributor

I noticed a severe issue with the current "docker_compatible" flag.

From what I see, --capability=all is passed to systemd-nspawn, which is reckless on many levels and poses various problems and security risks.

This does not happen when docker_compatible=0 as the jail does not have CAP_NET_ADMIN and cannot access host firewall.
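
One way to confirm whether a jail really holds CAP_NET_ADMIN is to decode the effective capability mask reported in /proc/<pid>/status. The following is a minimal sketch, not part of jailmaker: the helper names has_cap_net_admin and cap_eff_of are hypothetical, and it relies on CAP_NET_ADMIN being capability number 12 in linux/capability.h.

```shell
#!/bin/sh
# Hypothetical helper (not part of jailmaker): succeed if the given
# hexadecimal capability mask has the CAP_NET_ADMIN bit set.
# CAP_NET_ADMIN is capability number 12 in <linux/capability.h>.
has_cap_net_admin() {
    mask=$((0x$1))
    [ $(( (mask >> 12) & 1 )) -eq 1 ]
}

# Print the effective capability mask (the CapEff line) of a process.
# Without an argument it reports on the calling process.
cap_eff_of() {
    awk '/^CapEff:/ { print $2 }' "/proc/${1:-self}/status"
}
```

Run inside the jail, `has_cap_net_admin "$(cap_eff_of)" && echo "CAP_NET_ADMIN is effective"` shows whether the capability is present.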

As an example, my whole host iptables configuration was wrecked from inside the jail. The combination of CAP_NET_ADMIN and host networking is very dangerous and should at least trigger a warning, if not be disallowed outright. The result was complete loss of connectivity from all clients to the TrueNAS server, which can happen in many scenarios, such as:

  • A user installs a firewall in the jail and enables it
  • A program installs firewall rules

To be completely clear, I ended up with a bunch of iptables rules which were added inside a jail, on the host.

Steps to reproduce:

  1. Create jail
  2. Make jail docker compatible
  3. Apply any kind of firewall rule inside the jail, e.g. iptables -A INPUT -s 1.2.3.4 -j DROP
  4. View the rule on the host: iptables -L | grep 1.2.3.4
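
The host-side check in the last step can be wrapped in a small helper so that it is easy to repeat. This is only a sketch: the function name rule_mentions is hypothetical, and it does a plain fixed-string match, so it will also match longer addresses that contain the same text.

```shell
#!/bin/sh
# Hypothetical helper: read a rule listing on stdin and succeed if any
# line mentions the given address as a fixed string.
rule_mentions() {
    grep -qF -- "$1"
}

# On the host (as root), after adding the test rule inside the jail:
#   iptables -S INPUT | rule_mentions 1.2.3.4 && echo "jail rule is visible on the host"
# Remove the test rule afterwards with the matching delete:
#   iptables -D INPUT -s 1.2.3.4 -j DROP
```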

Resolution Proposals:

  1. Only add the capabilities for docker to the jail and warn the user if using host networking that it can be destructive to host networking
  2. Do not allow host networking when using docker_compatible, insist on macvlan or bridge networking
  3. Option 2, and additionally drop all capabilities for the jail, as they should not be required

Docker is able to be installed and run without any capabilities if using macvlan or bridge interfaces and setting --setenv=SYSTEMD_SECCOMP=0

@Jip-Hop
Owner

Jip-Hop commented Apr 17, 2024

Sorry to hear you ran into issues while using jailmaker. Thanks for reporting though!

You're right, this area needs improvement. I plan to remove the docker_compatible option in the future. The new docker config template already shows how to set up a jail for docker usage without --capability=all.

To provide some context, jailmaker evolved from my workaround to run docker on the TrueNAS host directly (when the docker binaries were still included on the base system). This is equivalent to running docker inside a jail with host networking and --capability=all.

I never ran into the issues you describe, even when I was running docker inside a jail with docker_compatible=0 and using host networking. Nevertheless there's the potential to wreck the host from inside the jail. That's why I've added the security statement and I suppose that's why iX Systems put this warning in the Sandboxes docs:

There is significant risk that using Jailmaker causes conflicts with the built-in Apps framework within SCALE. Do not mix the two features unless you are capable of self-supporting and resolving any issues caused by using this solution.

By the way, instead of disabling seccomp completely:

Docker is able to be installed and run without any capabilities if using macvlan or bridge interfaces and setting --setenv=SYSTEMD_SECCOMP=0

You could also add: --system-call-filter='add_key keyctl bpf'
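
Put together, a jail that avoids both host networking and a fully disabled seccomp filter could be started with arguments along these lines. This is a minimal sketch under stated assumptions: the helper name docker_jail_args and the interface name br1 are hypothetical, and the output is not copied from the docker config template.

```shell
#!/bin/sh
# Hypothetical helper: print systemd-nspawn arguments, one per line, for a
# jail with its own network interface and a widened system call filter.
# $1 = "bridge" or "macvlan", $2 = host bridge or interface name (assumed).
docker_jail_args() {
    case "$1" in
        bridge)  printf '%s\n' "--network-bridge=$2" ;;
        macvlan) printf '%s\n' "--network-macvlan=$2" ;;
        *)       echo "unknown network mode: $1" >&2; return 1 ;;
    esac
    # Allow only the extra system calls mentioned above instead of
    # switching seccomp filtering off altogether.
    printf '%s\n' '--system-call-filter=add_key keyctl bpf'
}
```

For example, `docker_jail_args bridge br1` prints the bridge flag followed by the system call filter flag.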

Out of curiosity, what exactly did you do which caused:

my whole host iptables was wrecked by the hands of the jail.

This may be a good example scenario to put in a warning.

Were you not aware of the fact that the jail is using host networking by default?

@templehasfallen
Contributor Author

Sorry to hear you ran into issues while using jailmaker. Thanks for reporting though!

You're right, this area needs improvement. I plan to remove the docker_compatible option in the future. The new docker config template already shows how to set up a jail for docker usage without --capability=all.

To provide some context, jailmaker evolved from my workaround to run docker on the TrueNAS host directly (when the docker binaries were still included on the base system). This is equivalent to running docker inside a jail with host networking and --capability=all.

Thanks, I'm aware of all this. I have actually read through virtually every single line in this repo already.

I never ran into the issues you describe, even when I was running docker inside a jail with docker_compatible=0 and using host networking.

I assume you mean docker_compatible=1 here. When using that in combination with host networking, the host network interfaces are widely exposed and anything can interfere with bridges, apps, libvirt vm routing etc.

Nevertheless there's the potential to wreck the host from inside the jail. That's why I've added the security statement and I suppose that's why iX Systems put this warning in the Sandboxes docs:

There is significant risk that using Jailmaker causes conflicts with the built-in Apps framework within SCALE. Do not mix the two features unless you are capable of self-supporting and resolving any issues caused by using this solution.

This sadly affects everything, including bridges on the host, VMs etc, not only apps.

By the way, instead of disabling seccomp completely:

Docker is able to be installed and run without any capabilities if using macvlan or bridge interfaces and setting --setenv=SYSTEMD_SECCOMP=0

You could also add: --system-call-filter='add_key keyctl bpf'

Thank you, I will test it out.

Out of curiosity, what exactly did you do which caused:

my whole host iptables was wrecked by the hands of the jail.

Basically I ran a jail with host networking and docker_compatible=1 and installed a couple programs that enable and add firewall rules. Those rules were added directly to the iptables of the host, basically denying everything but ssh connections and the app itself. Suddenly not even the TrueNAS WebUI worked.

This may be a good example scenario to put in a warning.

Were you not aware of the fact that the jail is using host networking by default?

I was aware that it was using host networking, but I was unaware that literally all capabilities were enabled in the jail. I didn't expect it and I'm sure others won't expect it either. My point is that there should be a fair warning when combining host networking with CAP_NET_ADMIN or --capability=all. Realistically, --capability=all is not required at all; if you want to use host networking and docker, add only the specific capabilities required (CAP_NET_ADMIN, CAP_NET_BIND_SERVICE, CAP_NET_RAW, etc.) rather than all of them.
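
Granting a short list instead of everything could look like the sketch below. systemd-nspawn's --capability= option takes a comma-separated list of capability names; the helper name capability_arg is hypothetical, and whether these three capabilities are sufficient for docker with host networking is an assumption that the thread does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: join capability names into a single
# --capability= argument (systemd-nspawn takes a comma-separated list).
capability_arg() {
    # Refuse to build an empty argument.
    [ "$#" -gt 0 ] || return 1
    list=""
    for cap in "$@"; do
        list="${list:+$list,}$cap"
    done
    printf '%s\n' "--capability=$list"
}
```

Calling `capability_arg CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_RAW` yields one argument that names only those capabilities.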

@Jip-Hop
Owner

Jip-Hop commented Apr 18, 2024

I assume you mean docker_compatible=1

Yes that's what I meant.

When creating a jail with host networking for the purpose of running docker, then docker needs to be able to create firewall rules in the host networking namespace. So the jail does need CAP_NET_ADMIN in this case.
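
Whether a jail shares the host's network namespace can be checked by comparing the namespace links under /proc. A minimal sketch, assuming a Linux host with /proc mounted; the helper name same_netns is hypothetical, and reading the links of another user's process usually requires root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two processes share a network namespace.
# Returns 2 if either namespace link cannot be read.
same_netns() {
    a=$(readlink "/proc/$1/ns/net") || return 2
    b=$(readlink "/proc/$2/ns/net") || return 2
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

On the host, `same_netns 1 <pid>` with the PID of any process running in the jail succeeds when that jail uses host networking.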

It's no longer the recommended way of running docker in a jail though, so as a first step it would be a good idea to remove the docker_compatible setup question from the interactive create process and refer users to the docker config template instead. What do you think?

@templehasfallen
Contributor Author

I assume you mean docker_compatible=1

Yes that's what I meant.

When creating a jail with host networking for the purpose of running docker, then docker needs to be able to create firewall rules in the host networking namespace. So the jail does need CAP_NET_ADMIN in this case.

It's no longer the recommended way of running docker in a jail though, so as a first step it would be a good idea to remove the docker_compatible setup question from the interactive create process and refer users to the docker config template instead. What do you think?

Yeah, exactly, I absolutely agree.

Also, if you disallow the combination of host networking and CAP_NET_ADMIN, you could go as far as considering jailmaker safe.
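
A check of that kind might be sketched as follows. This is an illustration, not jailmaker's actual behavior: the function name check_jail_args is hypothetical, it only inspects long-form options, and it treats any --network-* option or --private-network as meaning the jail does not use host networking.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not jailmaker's actual behavior: given the
# systemd-nspawn arguments for a jail, fail when the jail would get
# CAP_NET_ADMIN while still sharing the host's network namespace.
check_jail_args() {
    private_net=0
    net_admin=0
    for arg in "$@"; do
        case "$arg" in
            --private-network|--network-*) private_net=1 ;;
            --capability=*all*|--capability=*CAP_NET_ADMIN*) net_admin=1 ;;
        esac
    done
    if [ "$net_admin" -eq 1 ] && [ "$private_net" -eq 0 ]; then
        echo "refusing: CAP_NET_ADMIN with host networking" >&2
        return 1
    fi
}
```

With this sketch, `check_jail_args --capability=all` fails, while `check_jail_args --network-bridge=br1 --capability=all` passes because the jail has its own network namespace.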

Jip-Hop added a commit that referenced this issue Apr 20, 2024
Remove --property=DeviceAllow= so it won't interfere with DevicePolicy=auto
Added seccomp config option
Deprecated docker_compatible config option
Deprecated gpu_passthrough config option
Removed the docker_compatible question during interactive create
Updated readme and config templates
Closes #119

@Jip-Hop
Owner

Jip-Hop commented Apr 20, 2024

@templehasfallen could you review/test #121?

@templehasfallen
Contributor Author

@templehasfallen could you review/test #121?

Hey, I just tested on both 23.10.2 and 24.04-RC1 without any issues. All of the functionality seems to work for a docker-compatible jail, whether set up from a template or manually, including running containers and exposing them.

Thanks a lot and great work :)

@Jip-Hop
Owner

Jip-Hop commented Apr 22, 2024

Thank you!

@mrstux
Contributor

mrstux commented Apr 22, 2024

I never hit this, as I immediately used the docker template with bridge networking to simplify replacing my docker VM.

Jip-Hop added a commit that referenced this issue Apr 22, 2024