
check before trying to load br_netfilter? #1210

Closed
runcom opened this issue Jun 3, 2016 · 9 comments

Comments

@runcom
Member

runcom commented Jun 3, 2016

On a RHEL 7.2 system running kernel 3.10.0-327.18.2.el7.x86_64 I always get:

Jun 02 17:33:56 rhel72-00.localdomain forward-journal[2134]: time="2016-06-02T17:33:56.431503709-04:00" level=warning msg="Running modprobe bridge br_netfilter failed with message: modprobe: WARNING: Module br_netfilter not found.\ninsmod /lib/modules/3.10.0-327.13.1.el7.x86_64/kernel/net/llc/llc.ko \ninsmod /lib/modules/3.10.0-327.13.1.el7.x86_64/kernel/net/802/stp.ko \ninsmod /lib/modules/3.10.0-327.13.1.el7.x86_64/kernel/net/bridge/bridge.ko \n, error: exit status 1"

Steps to reproduce (it's 100% reproducible):

  1. remove all docker packages.
  2. reboot the system, so the docker0 bridge goes away.
  3. install all docker packages and start the daemon.

I don't really know what's happening: Docker runs perfectly fine, but I always get that annoying warning. Is there any way to mute it, e.g. by not trying to load br_netfilter under some circumstances?

Running the modprobe manually gives me this:

[root@rhel01 ~]# modprobe -va bridge br_netfilter
insmod /lib/modules/3.10.0-327.18.2.el7.x86_64/kernel/net/llc/llc.ko 
insmod /lib/modules/3.10.0-327.18.2.el7.x86_64/kernel/net/802/stp.ko 
insmod /lib/modules/3.10.0-327.18.2.el7.x86_64/kernel/net/bridge/bridge.ko 
[  157.473588] Bridge firewalling registered
modprobe: WARNING: Module br_netfilter not found.
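A diagnostic sketch of what the output above suggests (a bash sketch, not Docker's code): on 3.10-era RHEL kernels the bridge-netfilter code ships inside bridge.ko, so `modprobe br_netfilter` fails even though the functionality is present, as the "Bridge firewalling registered" line shows. The sysctl that code registers is a more truthful probe; the path argument in this hypothetical helper exists only so the check can be exercised against a fake file.

```shell
# bridge_nf_call_available: hypothetical helper, not Docker's actual code.
# Reports whether bridge firewalling is usable by checking for the sysctl
# it registers; pass an alternate path to test the logic offline.
bridge_nf_call_available() {
    [ -e "${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}" ]
}

# Typical use on the affected host:
#   modprobe -q bridge
#   bridge_nf_call_available && echo "bridge firewalling available"
```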

ping @mavenugo @aboch @mrjana

@runcom
Member Author

runcom commented Jun 3, 2016

I guess the question is: if br_netfilter isn't strictly required to run the Docker daemon, why do we log at warning level rather than debug? People in production keep reporting this warning even though Docker is fine and runs normally.

@mrjana
Contributor

mrjana commented Jun 3, 2016

@runcom I guess it depends on what "Docker is fine and runs normally" means. The reason we do this is that on newer kernels we can't apply iptables rules to bridged traffic (used for linked containers and icc=false) if this module is not loaded, which breaks functionality for containers that are linked while the daemon is started with --icc=false. If we could find out in a reliable way that the br_netfilter module is loaded or built into the kernel, AND that the modprobe failed for that reason, then we wouldn't need to generate a warning. Without that, this warning is the only way to find out why linked containers are not working on someone's host. If we move it to debug level we will never find out.
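One way to sketch the "loaded or builtin" check described above (a bash sketch under stated assumptions, not libnetwork's actual code; the file paths are parameters only so the logic can be tested offline):

```shell
# br_netfilter_present: hypothetical helper. Succeeds if br_netfilter is
# either a loaded module (listed in /proc/modules) or compiled into the
# kernel (listed in the modules.builtin manifest for the running kernel).
br_netfilter_present() {
    local loaded="${1:-/proc/modules}"
    local builtin="${2:-/lib/modules/$(uname -r)/modules.builtin}"
    grep -q '^br_netfilter ' "$loaded" 2>/dev/null && return 0
    grep -q '/br_netfilter\.ko$' "$builtin" 2>/dev/null
}
```

Note that on kernels like the 3.10 one above this check would report "absent" even though bridge firewalling works, because the code lives inside bridge.ko there; the sysctl probe is the complementary test for that case.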

@runcom
Member Author

runcom commented Jun 3, 2016

@mrjana sorry I don't agree

Without that this is the only way to find out why linked containers are not working in some one's host. If we move it to Debug level we will never find out.

If you are in this situation you'll likely want to restart the daemon with -D to understand more (and probably to reproduce); logging at warning level is not helping here. What you highlighted is "debugging", not a normal run, and by restarting the daemon in debug mode you'll find out instantly.

@mrjana
Contributor

mrjana commented Jun 3, 2016

@runcom There is a big difference between doing this in dev and in production.
Scenario:

1. Somebody is running Docker in production, they decide to link containers, and it doesn't work.
2. They report the issue in docker/docker.
3. We ask them to restart the daemon with debug level set and send the output.
4. They can't, because they are running Docker in production and cannot bring down services.

So if I have to choose between a minor annoyance (a warning message that appears once, early in boot) and being unable to address a user's concern as a maintainer, I will always choose the annoyance.

Now, I am not saying that the current logic can't be improved. If you can push a PR which attempts this modprobe only on kernels where it is required, and generates a warning only when that attempt fails, I will gladly accept it.
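Such a PR could gate the modprobe on the kernel version: br_netfilter was split out of the bridge module in kernel 3.18 (a version-number assumption based on upstream kernel history, not something stated in this thread), so a bash sketch of the gate might look like:

```shell
# needs_br_netfilter_modprobe: hypothetical gate, not the eventual PR.
# Takes a kernel release string (e.g. "3.10.0-327.18.2.el7.x86_64") and
# succeeds only for kernels >= 3.18, where br_netfilter is assumed to
# exist as a separate module worth modprobing.
needs_br_netfilter_modprobe() {
    local major rest minor
    major=${1%%.*}            # text before the first dot
    rest=${1#*.}              # text after the first dot
    minor=${rest%%.*}         # second version component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 18 ]; }
}
```

On the reporter's 3.10 kernel this gate would skip the modprobe entirely, silencing the warning without losing it on kernels where the module genuinely matters.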

@runcom
Member Author

runcom commented Jun 3, 2016

Your scenario isn't right to me. People usually run tests and integration before going to production, and they can easily find out that their app isn't working because container connectivity is broken, so restarting with -D in non-production is a thing. I don't think people go straight to production with a multi-container Docker app without testing it.

It's far more important in production not to get false warnings. Otherwise you'll deal with customer support all day, because people get worried about a warning.

@runcom
Member Author

runcom commented Jun 3, 2016

It should instead be clearly documented that inter-container communication may not work properly if those modules aren't enabled. I think what you describe is more typical of individuals who try to run Docker without even bothering to check the docs and the check scripts, rather than of companies, which will check everything before running something in production (and will restart with debug in a staging environment right away).

@runcom
Member Author

runcom commented Jun 3, 2016

However, ack on creating a PR to try to check whether the kernel supports it before modprobing (I honestly have no clue how to do this).

@mrjana
Contributor

mrjana commented Jun 3, 2016

your scenario isn't right to me

Oh, I didn't make up this scenario; it has actually happened many times for real.

However, ack on creating a PR to try and check if kernels support it before modprobing

Thanks. That is what we need in order to improve the situation overall.

@GordonTheTurtle

@runcom It has been detected that this issue has not received any activity in over 6 months. Can you please let us know if it is still relevant:

  • For a bug: do you still experience the issue with the latest version?
  • For a feature request: was your request appropriately answered in a later version?

Thank you!
This issue will be automatically closed in 1 week unless it is commented on.
For more information please refer to #1926
