Support for com.docker.network.host_ipv4 driver label #2454

Merged
merged 1 commit into from
Feb 15, 2020

Contributor

@arkodg arkodg commented Sep 25, 2019

This commit allows a user to specify a Host IP via the
com.docker.network.host_ipv4 label which is used as the
Source IP during SNAT for bridge networks.

The use case is for hosts with multiple interfaces and
this label can dictate which IP will be used as Source IP
for North-South traffic

In the absence of this label, MASQUERADE is used which picks the Source IP
based on Next Hop from the Route Table

Addresses: moby/moby#30053

Signed-off-by: Arko Dasgupta [email protected]
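A minimal sketch of the rule selection described above. The function name `nat_rule` and its arguments are hypothetical and not part of libnetwork; the function only prints rule text in `iptables-save` notation and never touches iptables. The MASQUERADE form is assumed by analogy with the SNAT form.

```shell
#!/bin/sh
# nat_rule SUBNET BRIDGE [HOST_IPV4]
# Prints the POSTROUTING rule one would expect for a bridge network.
# With a host IP: SNAT to that address. Without: MASQUERADE, where the
# kernel picks the source IP from the outgoing route.
nat_rule() {
  subnet=$1
  bridge=$2
  host_ip=${3:-}
  if [ -n "$host_ip" ]; then
    echo "-A POSTROUTING -s $subnet ! -o $bridge -j SNAT --to-source $host_ip"
  else
    echo "-A POSTROUTING -s $subnet ! -o $bridge -j MASQUERADE"
  fi
}

# Illustrative values only.
nat_rule 172.20.0.0/16 br-example 172.17.0.4
nat_rule 172.20.0.0/16 br-example
```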

@arkodg
Contributor Author

arkodg commented Sep 25, 2019

root@0f408886a28c:/go/src/github.com/docker/docker# docker network create -d bridge --opt com.docker.network.host_ipv4=172.17.0.4 my-bridge
e2aa0cae96f5252d414446a8f1a55a7bbc651ab5d0d3f2ec394347074239d18a
root@0f408886a28c:/go/src/github.com/docker/docker# 
root@0f408886a28c:/go/src/github.com/docker/docker# 
root@0f408886a28c:/go/src/github.com/docker/docker# docker network inspect my-bridge
[
    {
        "Name": "my-bridge",
        "Id": "e2aa0cae96f5252d414446a8f1a55a7bbc651ab5d0d3f2ec394347074239d18a",
        "Created": "2019-09-25T04:20:51.1623661Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.20.0.0/16",
                    "Gateway": "172.20.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.host_ipv4": "172.17.0.4"
        },
        "Labels": {}
    }
]
root@0f408886a28c:/go/src/github.com/docker/docker# iptables-save | grep SNAT
-A POSTROUTING -s 172.20.0.0/16 ! -o br-e2aa0cae96f5 -j SNAT --to-source 172.17.0.4
root@0f408886a28c:/go/src/github.com/docker/docker# ifconfig br-e2aa0cae96f5
br-e2aa0cae96f5: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.20.0.1  netmask 255.255.0.0  broadcast 172.20.255.255
        ether 02:42:64:9e:c5:e7  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Thoughts @mavenugo @euanh ?
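To repeat the check from the transcript above in a script, something like the following could work. It is a sketch: `snat_source_for` is a hypothetical helper name, and it assumes rules are formatted the way `iptables-save` printed them above.

```shell
#!/bin/sh
# snat_source_for SUBNET
# Reads iptables-save text on stdin and prints the --to-source address of
# any POSTROUTING SNAT rule whose -s match equals SUBNET.
snat_source_for() {
  awk -v subnet="$1" '
    $1 == "-A" && $2 == "POSTROUTING" {
      src = ""; to = ""
      for (i = 3; i < NF; i++) {
        if ($i == "-s") src = $(i + 1)
        if ($i == "--to-source") to = $(i + 1)
      }
      if (src == subnet && to != "") print to
    }'
}

# Demo on the rule line from the transcript; on a real host one might run:
#   iptables-save -t nat | snat_source_for 172.20.0.0/16
snat_source_for 172.20.0.0/16 <<'EOF'
-A POSTROUTING -s 172.20.0.0/16 ! -o br-e2aa0cae96f5 -j SNAT --to-source 172.17.0.4
EOF
```

No output for a subnet would suggest the label was not applied and the network is still using MASQUERADE.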

@P4sca1

P4sca1 commented Oct 5, 2019

This feature would be great!

@itouch5000

Any updates on this? :)

@Mattzi

Mattzi commented Nov 27, 2019

Why is other stuff getting merged and not this?

@arkodg
Contributor Author

arkodg commented Dec 3, 2019

ping @suwang48404 @euanh

@arkodg
Contributor Author

arkodg commented Dec 3, 2019

@Mattzi @itouch5000 @P4sca1 any thoughts for a better label name ?
TOL - does it make more sense to specify an interface in the label (com.docker.network.host_nat_iface=eth0)

@P4sca1

P4sca1 commented Dec 3, 2019

@arkodg I'm not an expert when it comes to networking and interfaces, but afaik one interface can have multiple IP addresses assigned. In that case we would have the same issue of not being able to define the outgoing IP.

@arkodg
Contributor Author

arkodg commented Dec 3, 2019

good catch @P4sca1

@P4sca1

P4sca1 commented Dec 17, 2019

@arkodg Anything else that needs to be considered before this one can get merged?

@Mattzi

Mattzi commented Jan 16, 2020

Sorry for ping but
PING @mavenugo @euanh @selansen @fcrisciani

@P4sca1

P4sca1 commented Feb 11, 2020

Now that the changes are approved, is there any chance of merging the changes? @arkodg

@arkodg
Contributor Author

arkodg commented Feb 11, 2020

@P4sca1 we usually look for two LGTMs before we merge, a review from you would be appreciated :)

@P4sca1

P4sca1 commented Feb 11, 2020

Alright, good to know.
I am not fluent in Go and don't have a very deep understanding of how libnetwork works internally, but I will try my best to understand and review the changes asap.

@arkodg arkodg merged commit 4e15af8 into moby:master Feb 15, 2020
@P4sca1

P4sca1 commented Feb 15, 2020

Great to see this merged! Thank you @arkodg 😁
What are the next steps to get this functionality into docker-ce? Kinda confused how releases are done, because of all the different repos (moby, engine, docker-ce etc).

@arkodg
Contributor Author

arkodg commented Feb 15, 2020

This should get vendored into moby/moby soon and is a likely candidate for the 20.03/20.04 major docker-ce release

thaJeztah added a commit to thaJeztah/docker that referenced this pull request Feb 17, 2020
full diff: moby/libnetwork@feeff4f...6659f7f

includes:

- moby/libnetwork#2317 Allow bridge net driver to skip IPv4 configuration of bridge interface
    - adds support for a `com.docker.network.bridge.inhibit_ipv4` label/configuration
    - addresses moby#37430 Prevent bridge network driver from setting IPv4 address on bridge interface
- moby/libnetwork#2454 Support for com.docker.network.host_ipv4 driver label
    - addresses moby#30053 Unable to choose outbound (external) IP for containers
- moby/libnetwork#2491 Improving load balancer performance
    - addresses moby#35082 [SWARM] Very poor performance for ingress network with lots of parallel requests

Signed-off-by: Sebastiaan van Stijn <[email protected]>
thaJeztah pushed a commit to arkodg/moby that referenced this pull request Mar 9, 2020
This PR adds a testcase for the com.docker.network.host_ipv4
label committed via moby/libnetwork#2454

Signed-off-by: Arko Dasgupta <[email protected]>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Mar 9, 2020
This PR adds a testcase for the com.docker.network.host_ipv4
label committed via moby/libnetwork#2454

Signed-off-by: Arko Dasgupta <[email protected]>
Upstream-commit: 2e0762ae44ba631c6943297413728f4daac89563
Component: engine
@devsnek

devsnek commented Mar 22, 2020

is this functionality not possible for ipv6?

@SharkWipf

Is this PR already in the current release? I've been trying to get it working, but it seems to be ignored completely, and I'm having a hard time finding what libnetwork features are in what Docker releases, as docker version does not mention Libnetwork.

Client:
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.15
 Git commit:        48a66213fe
 Built:             Mon Aug 31 02:19:58 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.15
  Git commit:       48a66213fe
  Built:            Mon Aug 31 02:19:27 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        35bd7a5f69c13e1563af8a93431411cd9ecf5021
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683b971d9c3ef73f284f176672c44b448662

@sergejostir

Any plan on making it possible to set this per container too? Like we have the com.docker.network.bridge.host_binding_ipv4 option that can be easily overridden per container.

@iambenmitchell

> Any plan on making it possible to set this per container too? Like we have the com.docker.network.bridge.host_binding_ipv4 option that can be easily overridden per container.

Please, I need to give each container an external IP, the only way I can see this happening is by creating a new network for each container. Considering I have a /28 subnet that means 14 different networks

@SharkWipf

> Please, I need to give each container an external IP, the only way I can see this happening is by creating a new network for each container. Considering I have a /28 subnet that means 14 different networks

Ha, cute, I'm doing 254 separate docker networks on my /24.
I don't know if it's a problem though, or rather, if setting it per container results in even more firewall rules. With our current setup we're already up to 3.5k firewall rules and it's having a noticeable impact on performance.

@iambenmitchell

> > Please, I need to give each container an external IP, the only way I can see this happening is by creating a new network for each container. Considering I have a /28 subnet that means 14 different networks
>
> Ha, cute, I'm doing 254 separate docker networks on my /24.
> I don't know if it's a problem though, or rather, if setting it per container results in even more firewall rules. With our current setup we're already up to 3.5k firewall rules and it's having a noticeable impact on performance.

Damn, and I was thinking of writing a script to automate it lol

I think something like this would be better

docker network create \
  --driver=bridge \
  --subnet=x.x.x.0/28 \
  --gateway=x.x.x.1 \
  --isPublic=true \
  bignet

and then

docker run -it --net=bignet --ip=x.x.x.2 ubuntu bin/bash

when --ip is specified docker should check the network to see if --isPublic = true and if so, it assumes the container should have its IP publicly accessible incoming and outgoing, rather than having to create a new network for each ip

@iambenmitchell

I can't get this solution to work anyway. My host requires that the IPs be statically routed through the main IP of the server, so the gateway must be the first IP in the subnet. But if I have to create a new network for each IP in a subnet, I don't see how I can do so without each one being in a /32 subnet, and in that case the gateway IP is not reachable because it is outside the subnet.

What do I do?

@iambenmitchell

I forgot to change the name of the network, the actual error is
Error response from daemon: Pool overlaps with other one on this address space

@iambenmitchell

iambenmitchell commented Jan 22, 2021

I forgot about ip ranges, that could be my solution, except that I cannot use the same gateway over again

and I also can't just not specify a gateway because
Error response from daemon: cannot create network afe8207c3a649f14f8173500c62a237b6033e812c585cc0f45329ef51ccfe077 (br-afe8207c3a64): conflicts with network fac274f74d818cbfe760b5a7394591d20eba1959b7cd11b44b2787810f0619fc (br-fac274f74d81): networks have overlapping IPv4

I also cannot use the network either

docker run -it --net=mail --ip=x.x.242.2 ubuntu bin/bash

docker: Error response from daemon: Address already in use. ERRO[0005] error waiting for container: context canceled
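One plausible reading of the "overlaps" errors above is that each new network's subnet intersected one that already existed. The sketch below only illustrates the arithmetic of IPv4 CIDR overlap; the helper names are hypothetical, it is not Docker's actual check, and the addresses come from the documentation range 192.0.2.0/24.

```shell
#!/bin/sh
# ip_to_int A.B.C.D -> prints the address as a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# cidr_bounds A.B.C.D/LEN -> prints "first last" as integers.
cidr_bounds() (
  addr=${1%/*}
  len=${1#*/}
  ip=$(ip_to_int "$addr")
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  start=$(( ip & mask ))
  end=$(( start | (~mask & 0xFFFFFFFF) ))
  echo "$start $end"
)

# cidrs_overlap CIDR1 CIDR2 -> exit status 0 if the ranges intersect.
cidrs_overlap() (
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# A /32 carved out of an existing /28 intersects it.
if cidrs_overlap 192.0.2.0/28 192.0.2.2/32; then
  echo overlap
else
  echo disjoint
fi
# prints: overlap
```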

@iambenmitchell

@SharkWipf how do you set up your networks? What command do you use? I still haven't figured out how to create one for each IP yet.

@SharkWipf

@MrBenFTW while this is getting out of scope for comments on a pull request, I'm currently using a fairly hacky script that I run whenever I need to build a container, that checks if a network for the IP already exists, and if not, creates it with a manually specified subnet per network (necessary for my use case, might not be necessary for yours). Then I just assign said network to the container(s) afterwards.

if ! (( $(docker network ls -qf "name=network.${ip_new}$" | wc -l) )); then
  docker network create --driver bridge \
    --subnet "10.10.${ip_new##*.}.1/24" \
    --gateway "10.10.${ip_new##*.}.1" \
    --opt "com.docker.network.host_ipv4=${ip_new}" \
    "network.${ip_new}"
fi
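As a dry-run illustration of the per-IP approach in the script above, the sketch below prints the `docker network create` command for a given host IP instead of running it. The function name `network_create_cmd` and the address 203.0.113.7 are hypothetical. Unlike the script above, it writes the subnet as the network address `10.10.N.0/24`, which is an assumption on my part.

```shell
#!/bin/sh
# network_create_cmd HOST_IP
# Prints (does not run) a network-create command that pins outbound
# traffic for the network to HOST_IP via com.docker.network.host_ipv4.
network_create_cmd() {
  ip=$1
  octet=${ip##*.}   # last octet of the host IP selects the private /24
  printf 'docker network create --driver bridge --subnet 10.10.%s.0/24 --gateway 10.10.%s.1 --opt com.docker.network.host_ipv4=%s network.%s\n' \
    "$octet" "$octet" "$ip" "$ip"
}

network_create_cmd 203.0.113.7
```

Printing the command first makes it easy to review before piping the output to `sh`.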

@AlexGrs

AlexGrs commented Jan 26, 2021

Is it deployed in docker-ce 20.10? There seems to be no mention of it in the release note?

@iambenmitchell

iambenmitchell commented Jan 26, 2021

@SharkWipf Actually there's a much better way of doing it...

Create network

docker network create \
  --driver=bridge \
  -o "com.docker.network.bridge.enable_ip_masquerade"="false" \
  --subnet=abc.abc.242.0/28 \
  --gateway=abc.abc.242.1 \
  bignet

Create containers
docker run -it --net=bignet --ip=abc.abc.242.2 ubuntu bin/bash
docker run -it --net=bignet --ip=abc.abc.242.3 ubuntu bin/bash
docker run -it --net=bignet --ip=abc.abc.242.4 ubuntu bin/bash
docker run -it --net=bignet --ip=abc.abc.242.5 ubuntu bin/bash
and so on
this does exactly what I need it to do. I don't know why you would use the other method, this lets you use a single network.
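The "and so on" above could be spelled out with a small loop. This is a sketch that only prints the commands; `run_cmds` is a hypothetical name, the prefix 192.0.2 is a documentation range standing in for the real one, and it assumes the /28 layout described above with the gateway on .1 and containers on .2 through .14.

```shell
#!/bin/sh
# run_cmds PREFIX NETWORK
# Prints one `docker run` command per usable container address in a /28
# whose gateway is PREFIX.1 (so containers get PREFIX.2 .. PREFIX.14).
run_cmds() {
  prefix=$1
  net=$2
  i=2
  while [ "$i" -le 14 ]; do
    echo "docker run -it --net=$net --ip=$prefix.$i ubuntu bin/bash"
    i=$((i + 1))
  done
}

run_cmds 192.0.2 bignet
```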

@SharkWipf

@MrBenFTW Your method, I believe, only works if you don't need outbound connections (or at least I couldn't get it working with outbound connections in my quick testing), whereas this PR specifically applies to outbound connections.
If all you need is for your containers to be reachable on specific IPs, you can just pass an IP in -p, no need for any special networking options for that. You can just publish individual ports like -p abc.abc.242.2:yourport:yourport.

@iambenmitchell

iambenmitchell commented Jan 29, 2021

> @MrBenFTW Your method, I believe, only works if you don't need outbound connections (or at least I couldn't get it working with outbound connections in my quick testing), whereas this PR specifically applies to outbound connections.
> If all you need is for your containers to be reachable on specific IPs, you can just pass an IP in -p, no need for any special networking options for that. You can just publish individual ports like -p abc.abc.242.2:yourport:yourport.

Nope.

This specifically applies to inbound and outbound connections.

If I do curl ifconfig.me I will receive the ip of the container on the subnet. Not the host ip. Therefore outbound connections are working :)

Everything is working perfectly now. My mail server resolves to the correct RDNS

@zhuimeng528

Hello everyone, my environment is: a server with multiple public IP addresses. I want containers to use host mode, with each container specifying a different exit IP address. Is it possible? Thank you!

@sergejostir

If you use host mode, Docker does nothing with your networking. Your process has to bind to the proper interface itself.

@mendorf

mendorf commented Mar 22, 2021

@thaJeztah @arkodg did this make into any release of docker-ce / engine? It is not mentioned in any of the release notes. If not, when is this planned? Thanks for clarifying

@thaJeztah
Member

@mendorf-ebf yes, looks like it was included in docker 20.10 through moby/moby#40579, but missed in the changelog (at a quick glance); https://docs.docker.com/engine/release-notes/#20100
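Given the answer above that the label shipped in 20.10, a script could guard on the Engine version before relying on it. This is a sketch: `supports_host_ipv4` is a hypothetical helper, and it assumes version strings of the form MAJOR.MINOR.PATCH with an optional suffix.

```shell
#!/bin/sh
# supports_host_ipv4 VERSION
# Exit status 0 if VERSION is 20.10 or newer, per the comment above that
# com.docker.network.host_ipv4 was included in docker 20.10.
supports_host_ipv4() (
  ver=${1%%[-+]*}    # drop any suffix after - or +
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 10 ]; }
)

# On a real host the version could come from:
#   docker version --format '{{.Server.Version}}'
for v in 19.03.12 20.10.0; do
  if supports_host_ipv4 "$v"; then
    echo "$v: host_ipv4 expected to work"
  else
    echo "$v: too old"
  fi
done
```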
