This repository was archived by the owner on Aug 23, 2023. It is now read-only.

PR #1089 breaks docker/docker-cluster using current settings in metrictank.ini #1096

Closed
robert-milan opened this issue Oct 14, 2018 · 0 comments
@robert-milan
Contributor

The changes #1089 made to the swim AdvertiseAddr configuration, combined with the current metrictank.ini settings, break clustering in docker-cluster and probably docker-chaos.

Before PR:

metrictank2_1   | 2018-10-13 22:52:05.358 [INFO] CLU Start: Starting cluster on 0.0.0.0:7946
metrictank2_1   | 2018-10-13 22:52:05.363 [INFO] CLU manager: HTTPNode metrictank2 with address 172.19.0.14 has joined the cluster
metrictank2_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Initiating push/pull sync with: 172.19.0.13:7946
metrictank0_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Stream connection from=172.19.0.14:47238
**metrictank0_1   | 2018-10-13 22:52:05.375 [INFO] CLU manager: HTTPNode metrictank2 with address 172.19.0.14 has joined the cluster
**metrictank2_1   | 2018-10-13 22:52:05.375 [INFO] CLU manager: HTTPNode metrictank1 with address 172.19.0.16 has joined the cluster
**metrictank2_1   | 2018-10-13 22:52:05.375 [INFO] CLU manager: HTTPNode metrictank3 with address 172.19.0.15 has joined the cluster
**metrictank2_1   | 2018-10-13 22:52:05.375 [INFO] CLU manager: HTTPNode metrictank0 with address 172.19.0.13 has joined the cluster
metrictank1_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Stream connection from=172.19.0.14:57308
metrictank2_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Initiating push/pull sync with: 172.19.0.16:7946
metrictank1_1   | 2018-10-13 22:52:05.376 [INFO] CLU manager: HTTPNode metrictank2 with address 172.19.0.14 has joined the cluster
metrictank3_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Stream connection from=172.19.0.14:56700
metrictank3_1   | 2018-10-13 22:52:05.377 [INFO] CLU manager: HTTPNode metrictank2 with address 172.19.0.14 has joined the cluster
metrictank2_1   | 2018/10/13 22:52:05 [DEBUG] memberlist: Initiating push/pull sync with: 172.19.0.15:7946
metrictank2_1   | 2018-10-13 22:52:05.377 [INFO] CLU Start: joined to 3 nodes in cluster

After PR:

metrictank1_1   | 2018-10-13 22:27:41.023 [WARNING] It is not recommended to run a multi-node cluster with more than 1 input plugin.
metrictank1_1   | 2018-10-13 22:27:41.023 [INFO] CLU Start: Starting cluster on 0.0.0.0:7946
metrictank1_1   | 2018-10-13 22:27:41.025 [INFO] CLU manager: HTTPNode metrictank1 with address 0.0.0.0 has joined the cluster
metrictank1_1   | 2018/10/13 22:27:41 [DEBUG] memberlist: Initiating push/pull sync with: 172.19.0.13:7946
metrictank0_1   | 2018/10/13 22:27:41 [DEBUG] memberlist: Stream connection from=172.19.0.15:48100
**metrictank0_1   | 2018-10-13 22:27:41.029 [INFO] CLU manager: HTTPNode metrictank1 with address 0.0.0.0 has joined the cluster
**metrictank1_1   | 2018-10-13 22:27:41.029 [INFO] CLU manager: HTTPNode metrictank3 with address 0.0.0.0 has joined the cluster
**metrictank1_1   | 2018-10-13 22:27:41.029 [INFO] CLU manager: HTTPNode metrictank0 with address 0.0.0.0 has joined the cluster
metrictank1_1   | 2018/10/13 22:27:41 [DEBUG] memberlist: Failed to join 172.19.0.16: dial tcp 172.19.0.16:7946: connect: connection refused
metrictank3_1   | 2018/10/13 22:27:41 [DEBUG] memberlist: Stream connection from=172.19.0.15:42728
metrictank1_1   | 2018/10/13 22:27:41 [DEBUG] memberlist: Initiating push/pull sync with: 172.19.0.14:7946
metrictank3_1   | 2018-10-13 22:27:41.032 [INFO] CLU manager: HTTPNode metrictank1 with address 0.0.0.0 has joined the cluster
metrictank1_1   | 2018-10-13 22:27:41.032 [INFO] CLU Start: joined to 2 nodes in cluster

This then leads to an endless stream of errors like these:

metrictank3_1   | 2018/10/13 22:27:42 [ERR] memberlist: Failed to send gossip to 0.0.0.0:7946: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank2_1   | 2018/10/13 22:27:42 [ERR] memberlist: Failed to send gossip to 0.0.0.0:7946: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank2_1   | 2018/10/13 22:27:42 [ERR] memberlist: Failed to send gossip to 0.0.0.0:7946: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank2_1   | 2018/10/13 22:27:42 [ERR] memberlist: Failed to send gossip to 0.0.0.0:7946: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank3_1   | 2018/10/13 22:27:42 [ERR] memberlist: Failed to send ping: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank1_1   | 2018/10/13 22:27:43 [ERR] memberlist: Failed to send ping: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank2_1   | 2018/10/13 22:27:43 [ERR] memberlist: Failed to send ping: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank0_1   | 2018/10/13 22:27:43 [ERR] memberlist: Failed to send ping: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address
metrictank3_1   | 2018/10/13 22:27:43 [ERR] memberlist: Failed to send ping: write udp [::]:7946->0.0.0.0:7946: sendto: cannot assign requested address

After following the calls through memberlist, here is what I have learned:

  • If we supply a valid IP address, memberlist uses it as-is (this includes 0.0.0.0).
  • If we supply an empty string, memberlist attempts to find the IP address of the machine it is running on and uses that instead, which is why it has worked in the past.

When advertise-addr is empty in metrictank.ini, we should leave swimAdvertiseAddr as a nil pointer (https://github.com/grafana/metrictank/blob/master/cluster/config.go#L130). The manager can then check for nil when it sets up the config (https://github.com/grafana/metrictank/blob/master/cluster/manager.go#L87) and act accordingly.
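
A rough sketch of that idea follows. The names `gossipConfig`, `parseAdvertiseAddr`, and `applyAdvertise` are hypothetical stand-ins, not the actual metrictank code, and the sketch assumes the setting is given as host:port.

```go
package main

import (
	"fmt"
	"net"
)

// gossipConfig is a hypothetical stand-in for the settings the manager
// hands to memberlist. An empty AdvertiseAddr means "let the library decide".
type gossipConfig struct {
	AdvertiseAddr string
	AdvertisePort int
}

// parseAdvertiseAddr returns a nil pointer when the setting is empty,
// so callers can tell "not configured" apart from a concrete address.
// The host:port format is an assumption made for this sketch.
func parseAdvertiseAddr(raw string) (*net.TCPAddr, error) {
	if raw == "" {
		return nil, nil
	}
	addr, err := net.ResolveTCPAddr("tcp", raw)
	if err != nil {
		return nil, fmt.Errorf("invalid advertise-addr %q: %w", raw, err)
	}
	return addr, nil
}

// applyAdvertise copies the address into the config only when one was set.
// With a nil pointer the fields stay at their zero values.
func applyAdvertise(cfg *gossipConfig, addr *net.TCPAddr) {
	if addr == nil {
		return
	}
	cfg.AdvertiseAddr = addr.IP.String()
	cfg.AdvertisePort = addr.Port
}
```

With an empty setting, `applyAdvertise` leaves `AdvertiseAddr` as the empty string, so address detection would behave as it did before #1089.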

Open to other ideas.

robert-milan added a commit that referenced this issue Oct 14, 2018
Fixes bug introduced in #1089

Resolves: #1096
See also: #1089