Cannot access containers by hostname with Docker overlay driver in Swarm Mode #25236

outcoldman · 2016-07-29T16:42:46Z

TL;DR: on an overlay network, containers located on a different swarm node cannot be reached by hostname.

Create a swarm cluster with 2 nodes

docker-machine create --driver=virtualbox swarm1
docker-machine create --driver=virtualbox swarm2
(eval $(docker-machine env swarm1) && docker swarm init --advertise-addr $(docker-machine ip swarm1))
(eval $(docker-machine env swarm2) && docker swarm join --listen-addr $(docker-machine ip swarm2) $(docker-machine ip swarm1):2377 --token $(eval $(docker-machine env swarm1) && docker swarm join-token -q worker))
eval $(docker-machine env swarm1)

Create an overlay network

docker network create --driver=overlay mynetwork

Start a new service with outcoldman/splunk (nothing specific about this image, it is just used as an example)

docker service create --mode replicated --replicas 2 --name myservice --network mynetwork --env SPLUNK_START_ARGS="--accept-license" outcoldman/splunk

Wait until the service has been deployed to both nodes

docker service ps myservice

Run docker inspect on both containers

(eval $(docker-machine env swarm1) && docker inspect $(docker ps -aq))
(eval $(docker-machine env swarm2) && docker inspect $(docker ps -aq))

On the swarm1 container, for example, I see

"Networks": {
    "mynetwork": {
        "IPAMConfig": {
            "IPv4Address": "10.0.0.4"
        },
        "Links": null,
        "Aliases": [
            "041182c00923"
        ],
        "NetworkID": "56ec47mblgdviyireaurrlih3",
        "EndpointID": "0b9b674b6295d01ef7d155d4170caa810cac52e2e6acfb4ac9a87a89bc487fb0",
        "Gateway": "",
        "IPAddress": "10.0.0.4",
        "IPPrefixLen": 24,
        "IPv6Gateway": "",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "MacAddress": "02:42:0a:00:00:04"
    }
}

So, as you can see, the alias on network mynetwork should be 041182c00923

If we try to ping this host by hostname from the other container

(eval $(docker-machine env swarm2) && docker exec -it $(docker ps -q) entrypoint.sh ping 041182c00923)
ping: unknown host
zsh: exit 1     ( eval $(docker-machine env swarm2) && docker exec $(docker ps -q)  ping ; )

By IP address everything works

(eval $(docker-machine env swarm2) && docker exec -it $(docker ps -q) entrypoint.sh ping 10.0.0.4)
PING 10.0.0.4 (10.0.0.4): 56 data bytes
64 bytes from 10.0.0.4: icmp_seq=0 ttl=64 time=0.461 ms
64 bytes from 10.0.0.4: icmp_seq=1 ttl=64 time=0.708 ms
64 bytes from 10.0.0.4: icmp_seq=2 ttl=64 time=0.738 ms
64 bytes from 10.0.0.4: icmp_seq=3 ttl=64 time=0.715 ms
64 bytes from 10.0.0.4: icmp_seq=4 ttl=64 time=0.723 ms

Let's find the hostname by IP address with Python

(eval $(docker-machine env swarm2) && docker exec -it $(docker ps -q) entrypoint.sh splunk cmd python -c "import socket;print socket.gethostbyaddr('10.0.0.4')")
('myservice.1.4nvx54kvgxzpjy15sepurr478.mynetwork', [], ['10.0.0.4'])

So myservice.1.4nvx54kvgxzpjy15sepurr478.mynetwork is what is actually used as the name for network discovery, and that is the container name, not the hostname!

sanimej · 2016-07-29T17:41:35Z

@outcoldman Docker Service Discovery has always been based on the container name and not the hostname of the container. The behavior you are seeing is expected. In 1.12 with the introduction of services, the discovery works for service name as well.

outcoldman · 2016-07-29T17:56:19Z

@sanimej ok, so this is what I was afraid of. Seems like a design bug then. What is the recommended way to set up replication between containers created by the same service?

Let's say that I want to set up replication between 3 DB instances and I use docker service create --mode replicated --replicas 3 --name my_db. I can connect a client to my 3 DB instances by using the name my_db, but how will I be able to set up replication between these 3 instances?

By container name? As I understand it, the name might change after an upgrade. And even if it does not, I don't believe that a container can see itself by the container name, only by hostname.
By IP? Same issue: if you reconnect the network you might get a different IP, so this is not a bulletproof solution.
By hostname? Not supported by Swarm. Or I would need to set up my own DNS server, but I do not see an option in create service which would allow me to do that.

sanimej · 2016-07-29T18:22:55Z

@outcoldman Looks like your requirement is to get the IPs of the backing containers of a service. In the service discovery we add a special entry, tasks.<service_name>, which returns A records for all the container IPs. Will this work for you?

There is also an endpoint-mode option in service create which, if set to dnsrr, will return the IPs of all the backing containers instead of a VIP for the service. Currently this doesn't work with --publish though.
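
As a rough illustration of the tasks.<service_name> lookup described above, here is a minimal Python sketch. The function name task_ips and the injectable resolve parameter are hypothetical, added only so the logic can be exercised without a swarm. It assumes the code runs inside a container attached to the same overlay network as the service.

```python
import socket


def task_ips(service, resolve=socket.gethostbyname_ex):
    """Return the IPs behind the swarm DNS entry tasks.<service>.

    `resolve` is a hypothetical injection point (not part of Docker);
    by default it is the standard-library resolver.
    """
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
    _name, _aliases, ips = resolve("tasks." + service)
    # Sort (lexically) so repeated calls give a stable ordering.
    return sorted(ips)
```

From inside a task of the service, calling task_ips("myservice") would be expected to list one address per running task. Outside the overlay network the lookup raises an OSError subclass, so callers may want to catch that.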

outcoldman · 2016-07-29T19:24:27Z

@sanimej IP addresses are not very helpful. What I really want is a way for the container itself to know which hostname/address it can advertise. And of course I want to do that without talking to the docker daemon.

I am actually a little bit confused. Docker says that it has a built-in orchestration solution (see https://blog.docker.com/2016/06/docker-1-12-built-in-orchestration/), but what you are saying is that it does not support the pretty simple case of autoscaling replication with docker service scale SERVICE=X, because I cannot set it up with the built-in tools alone. Of course I can use something else for orchestration, like Ansible, but in that case I see zero reasons to use the built-in orchestration. So it seems that the only supported scenario is autoscaling a web tier whose instances do not depend on each other. As soon as you need to make connections between the nodes in a tier, there is no way to do that.

mavenugo · 2016-07-29T20:04:14Z

@outcoldman as you pointed out, the default hostname of the container is the container short-id, which is also the network alias, and hence communication between the containers using this short-id alias should work. If it doesn't, then it is a bug and must be fixed. (I just tried it: ping between containers using the short-id works within a host but fails across hosts. I think that is the bug you referred to above.)

But I would like to understand your use-case better. If the above bug is fixed and the containers can talk to each other using their hostname (which happens to also be the net-alias), will that satisfy your use-case? Or are you looking for static ways to set up hostnames for each container even when it is actually front-ended by a service concept?
Also, how does your replication and self-discovery of these containers work?

outcoldman · 2016-07-29T21:41:12Z

@mavenugo yes, you are right, that is the bug I posted. I cannot access containers in my network by their short auto-generated hostnames when the containers are not located on the same swarm node.

If the above bug is fixed ... will that satisfy your use-case ?

Yes!

you are looking for static ways to setup hostnames for each container even when it is actually front-ended by a service concept ?

That would actually be useful. I mean, I do not want hosts with the same hostname inside my network. But if, for example, I create a service named mydb, it would be awesome to be able to access it by the endpoint name mydb with load balancing (this already works) and also to have the auto-generated hostname on each host be more human-readable, prefixed with the service name: mydb-X.

how does your replication and self-discovery of these containers work ?

With Consul. I have a Consul cluster running in containers. Each member has an init script which registers itself with Consul. I also have an auto-bootstrap factor (3 by default): when a node sees that it is the third registered member, it bootstraps the replication cluster; if it sees that the number is greater than 3, it adds itself to the existing cluster.
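
The bootstrap rule described above can be sketched as a small decision function. This is only an illustration in Python, not the actual init script: the names next_action and bootstrap_factor are hypothetical, the Consul registration and member count are left out, and returning "wait" while fewer members than the factor are registered is an assumption that the description does not spell out.

```python
def next_action(registered_members, bootstrap_factor=3):
    """Decide what a freshly registered node should do.

    registered_members: how many members (including this node) are
    currently registered; obtaining this count is out of scope here.
    """
    if registered_members < bootstrap_factor:
        # Assumption: not enough peers yet, so keep waiting.
        return "wait"
    if registered_members == bootstrap_factor:
        # This node completes the initial set and bootstraps the cluster.
        return "bootstrap"
    # A cluster should already exist, so join it.
    return "join"
```

With the default factor, the third node to register would bootstrap the replication cluster and every later node would join it.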

I guess another option can be to use zero-configuration services, like Avahi.

mavenugo · 2016-07-30T00:01:55Z

@outcoldman we will get the short-id net-alias resolved. Also, based on your response to my question, I guess #24973 will interest you.

Regarding the self-discovery use-case, I am still not clear on why you would need any other external tool when the built-in service discovery has features like tasks.<service-name>. Essentially, this special DNS query returns the IPs of all the running tasks of a service. A reverse DNS lookup of each IP will give you the container name (if you choose to use it). Since all of these are DNS queries, you are not required to call the docker APIs. I am not certain whether that satisfies your requirement, but at least I would like to point you to this existing functionality in docker 1.12, which you could leverage if you like. (You can also choose to replace the VIP-based LB with the multiple A-record DNS mechanism, which will behave the same way as I explained here.)
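
A minimal Python sketch of the reverse-lookup step mentioned above, assuming the task IPs have already been obtained from the tasks.<service-name> query. The helper name peer_names and the injectable reverse parameter are hypothetical and exist only for illustration. IPs with no reverse record are mapped to None rather than raising.

```python
import socket


def peer_names(ips, reverse=socket.gethostbyaddr):
    """Map each task IP to the name returned by a reverse DNS lookup.

    `reverse` is a hypothetical injection point; by default it is the
    standard-library reverse resolver.
    """
    names = {}
    for ip in ips:
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
            names[ip] = reverse(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are OSError subclasses.
            names[ip] = None
    return names
```

Because only DNS is involved, nothing here talks to the docker daemon. Which name comes back for a given IP depends on what the embedded DNS server answers.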

outcoldman · 2016-07-31T00:15:35Z

@mavenugo could you point me to any documentation on which DNS queries are supported?

LK4D4 · 2016-08-05T17:08:33Z

@mavenugo @sanimej do you think this is material for 1.12.1?

sanimej · 2016-08-05T17:17:12Z

@LK4D4 The fix for this issue has already been merged in libnetwork. We will have it in 1.12.1.

LK4D4 · 2016-08-05T17:17:55Z

@sanimej nice, thank you!

sanimej · 2016-08-05T17:37:02Z

@outcoldman #25420 is the doc PR for 1.12 swarm mode networks. It explains the service discovery options as well.

* Fixes moby#25236
* Fixes moby#24789
* Fixes moby#25340
* Fixes moby#25130
* Fixes moby/libnetwork#1387
* Fix external DNS responses > 512 bytes getting dropped
* Fix crash when remote plugin returns empty address string
* Make service LB work from self
* Fixed a few race-conditions

Signed-off-by: Madhu Venugopal <[email protected]>

* Fixes moby#25236
* Fixes moby#24789
* Fixes moby#25340
* Fixes moby#25130
* Fixes moby/libnetwork#1387
* Fix external DNS responses > 512 bytes getting dropped
* Fix crash when remote plugin returns empty address string
* Make service LB work from self
* Fixed a few race-conditions

Signed-off-by: Madhu Venugopal <[email protected]>
(cherry picked from commit 6645ff8)
Signed-off-by: Tibor Vass <[email protected]>

manixx · 2017-08-09T12:29:54Z

I have the issue that the reverse lookup of the IP addresses (obtained from the tasks.* DNS request) returns two different names, seemingly at random.

I wrote a small node script which does the DNS lookup for tasks.* and then the reverse lookup:

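const dns = require('dns');

// Resolve the A records behind tasks.<service>, then reverse-resolve
// each IP every 500 ms to show how the answer changes over time.
dns.resolve('tasks.rabbitmq', 'A', (err, records) => {
  if (err) throw err;
  records.forEach(ip => {
    setInterval(() => {
      dns.reverse(ip, (err, addr) => console.log(`reversed lookup ${ip}`, err, addr));
    }, 500);
  });
});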

I have two nodes (one master, one slave) and 3 tasks (one service) running. They run in their own network (otherwise the tasks.* request fails). I get two different responses for the same IP: one is the service-based container name and the other is the network-alias/short-id/hostname. Which one I get changes randomly. Why does this happen, and is there a way to make it consistent?

The output of the log is:

reversed lookup 10.0.0.3 null [ 'e340a4aa4b96.rabbitmq' ]
reversed lookup 10.0.0.5 null [ 'rabbitmq.3.gtrft0tt3ub9mz0hsvewh5vel.rabbitmq' ]
reversed lookup 10.0.0.4 null []

reversed lookup 10.0.0.3 null [ 'rabbitmq.1.vgdky17ykdriqqtbtzyt5zbcx.rabbitmq' ]
reversed lookup 10.0.0.5 null [ 'rabbitmq.3.gtrft0tt3ub9mz0hsvewh5vel.rabbitmq' ]
reversed lookup 10.0.0.4 null []

reversed lookup 10.0.0.4 null []
reversed lookup 10.0.0.3 null [ 'rabbitmq.1.vgdky17ykdriqqtbtzyt5zbcx.rabbitmq' ]
reversed lookup 10.0.0.5 null [ 'f0964f1b2638.rabbitmq' ]

outcoldman mentioned this issue Jul 29, 2016: [1.12 rc4]service discovery failed #24679 (Closed)

icecrime added area/networking area/swarm labels Jul 29, 2016

sanimej mentioned this issue Aug 3, 2016: Add container short-id as an alias for swarm mode tasks moby/libnetwork#1372 (Merged)

outcoldman mentioned this issue Aug 5, 2016: Docker Swarm Mode: keeping volumes for updated tasks #25446 (Open)

LK4D4 added this to the 1.12.1 milestone Aug 5, 2016

mavenugo mentioned this issue Aug 11, 2016: Vendoring libnetwork for 1.12.1-rc1 #25603 (Merged)

vdemeester closed this as completed in #25603 Aug 11, 2016
