[BUG][Docker Swarm Multi Node Cluster connectivity issue] #224

hamzaismaeel15 · 2025-02-13T10:42:43Z

Description:

My OpenSearch cluster will not form. I am running three virtual machines joined into a Docker Swarm cluster, and I want to run one OpenSearch container on each virtual machine and have the three containers form a single cluster.

To Reproduce:

Steps to reproduce the behavior:

Create a Docker Swarm cluster of three VMs and add the node labels that the placement constraints in the file expect (db=ubuntu, db=node1, db=node2).

Copy the Compose file below into a file named docker-compose.yml.
Deploy it with docker stack deploy -c 'file-name' 'stack-name', e.g. docker stack deploy -c docker-compose.yml opensearch

docker-compose.yml

version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:2.18.0
    #container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=Hamza@31017
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - /opt/os/data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9600:9600
      - 9300:9300
    deploy:
      placement:
        constraints:
          - "node.labels.db == ubuntu"
    networks:
      - opensearch-net

  opensearch-node2:
    image: opensearchproject/opensearch:2.18.0
    #container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=Hamza@31017
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - /opt/os/data2:/usr/share/opensearch/data
    deploy:
      placement:
        constraints:
          - "node.labels.db == node1"
    networks:
      - opensearch-net

  opensearch-node3:
    image: opensearchproject/opensearch:2.18.0
    #container_name: opensearch-node3
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=Hamza@31017
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - /opt/os/data3:/usr/share/opensearch/data
    deploy:
      placement:
        constraints:
          - "node.labels.db == node2"
    networks:
      - opensearch-net

  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:2.18.0
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      - 'OPENSEARCH_HOSTS=["http://opensearch-node1:9200","http://opensearch-node2:9200","http://opensearch-node3:9200"]'
      - "DISABLE_SECURITY_DASHBOARDS_PLUGIN=true"
    networks:
      - opensearch-net

networks:
  opensearch-net:

ISSUE:

[WARN ][o.o.c.c.ClusterFormationFailureHelper] [opensearch-node1] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [hTfIxK_mST-qJgwiGS0w6w, N8R2kscGT1GAbK-L-mj_qQ, Yesi_vQZQyC-iCwDRkrjuw], have discovered [{opensearch-node1}{hTfIxK_mST-qJgwiGS0w6w}{SCJ76MhQQT6Q2ql-KMPpBg}{10.0.0.80}{10.0.0.80:9300}{dimr}{shard_indexing_pressure_enabled=true}, {opensearch-node2}{4J085Ma_T3qvuPUshCdLBA}{TodgiY4lQ6KY5RF8pZblNQ}{10.0.1.17}{10.0.1.17:9300}{dimr}{shard_indexing_pressure_enabled=true}, {opensearch-node2}{MrJ4dBqdQO2QlHp0RDulPA}{qCoQee04QuaSsO-J8Wc8mQ}{10.0.1.18}{10.0.1.18:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [10.0.1.5:9300, 10.0.1.7:9300, 10.0.1.10:9300] from hosts providers and [{opensearch-node1}{hTfIxK_mST-qJgwiGS0w6w}{SCJ76MhQQT6Q2ql-KMPpBg}{10.0.0.80}{10.0.0.80:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 3, last-accepted version 89 in term 3
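The warning above packs several facts into one line, so a small parser can make it easier to read. The following is only a sketch, not an OpenSearch tool: it assumes each discovered node is printed as {name}{node id}{ephemeral id}{host}{address}{roles}{attributes}, and the function names (parse_warning, duplicate_names, voting_ids_seen) are made up for illustration. SAMPLE is the warning from this report, cut off after "which is not a quorum".

```python
import re
from collections import Counter

# The warning from this report, truncated after "which is not a quorum".
SAMPLE = (
    "cluster-manager not discovered or elected yet, an election requires at least 2 nodes "
    "with ids from [hTfIxK_mST-qJgwiGS0w6w, N8R2kscGT1GAbK-L-mj_qQ, Yesi_vQZQyC-iCwDRkrjuw], "
    "have discovered ["
    "{opensearch-node1}{hTfIxK_mST-qJgwiGS0w6w}{SCJ76MhQQT6Q2ql-KMPpBg}{10.0.0.80}"
    "{10.0.0.80:9300}{dimr}{shard_indexing_pressure_enabled=true}, "
    "{opensearch-node2}{4J085Ma_T3qvuPUshCdLBA}{TodgiY4lQ6KY5RF8pZblNQ}{10.0.1.17}"
    "{10.0.1.17:9300}{dimr}{shard_indexing_pressure_enabled=true}, "
    "{opensearch-node2}{MrJ4dBqdQO2QlHp0RDulPA}{qCoQee04QuaSsO-J8Wc8mQ}{10.0.1.18}"
    "{10.0.1.18:9300}{dimr}{shard_indexing_pressure_enabled=true}] "
    "which is not a quorum"
)


def parse_warning(line):
    """Return (ids the election requires, [(node name, node id), ...] discovered)."""
    required = re.search(r"with ids from \[([^\]]*)\]", line)
    required_ids = [p.strip() for p in required.group(1).split(",")] if required else []
    discovered = re.search(r"have discovered \[(.*?)\] which is not a quorum", line)
    # Assumption: the first two brace groups of each entry are the name and the node id.
    nodes = re.findall(r"(?:^|, )\{([^}]*)\}\{([^}]*)\}", discovered.group(1)) if discovered else []
    return required_ids, nodes


def duplicate_names(nodes):
    """Node names that appear more than once among the discovered nodes."""
    counts = Counter(name for name, _ in nodes)
    return sorted(name for name, seen in counts.items() if seen > 1)


def voting_ids_seen(required_ids, nodes):
    """Discovered node ids that are also in the list the election requires."""
    return [node_id for _, node_id in nodes if node_id in required_ids]


required_ids, nodes = parse_warning(SAMPLE)
print("duplicate node names:", duplicate_names(nodes))
print("required ids discovered:", len(voting_ids_seen(required_ids, nodes)), "of", len(required_ids))
```

On this log line the sketch reports that opensearch-node2 appears twice, which matches the node.name value set on the opensearch-node3 service in the Compose file, and that only one of the three required ids was discovered. Whether either observation is the root cause is not established in this thread.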

Expected behavior:

The cluster should form the same way it does with docker-compose on a single VM. If I remove the placement constraints from docker-compose.yml and run everything on one VM, it works fine. I need this to work on Docker Swarm, with one container on each node and all three forming one cluster.

Host/Environment:

OS: Linux - Ubuntu
Version 20.04

DandyDeveloper · 2025-02-17T10:54:22Z

@hamzaismaeel15 That looks more like you only have a single node running in the swarm? Are you sure all the containers are running correctly?

hamzaismaeel15 · 2025-02-17T11:04:12Z

@DandyDeveloper All containers are deployed on different virtual machines. As you can see from the placement constraints in the file, each container is pinned to its own virtual machine. If possible, you could join me on a call to see the setup. Thank you.

DandyDeveloper · 2025-02-17T13:43:38Z

@hamzaismaeel15 Sorry, I'm unable to join a call, but if possible, can you go into the container and verify that your CRI / CNI has appropriately configured the hostnames and that the other nodes are resolvable in the network?

The implication is certainly that either the swarm side of things or the hosts are misconfigured, but we'll need a lot more info to dive into it.

Exec into a container and try polling the other nodes to establish whether the networking is set up correctly.
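One way to do that polling is sketched below. It is a rough check under stated assumptions, not an official diagnostic: it assumes Python 3 is available in whatever container you run it from (it may not be in the OpenSearch image, in which case any container attached to opensearch-net would do), it uses the service names from the Compose file and port 9300 as seen in the log, and check_peer and report are hypothetical helper names.

```python
import socket


def check_peer(host, port=9300, timeout=3.0):
    """Return (sorted resolved addresses, True if a TCP connection to port succeeded)."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name does not resolve from inside this container.
        return [], False
    addresses = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return addresses, True
    except OSError:
        # Resolved, but the port is closed, filtered, or timed out.
        return addresses, False


def report(peers, port=9300):
    """Print one line per peer: what it resolves to and whether the port answers."""
    for peer in peers:
        addresses, reachable = check_peer(peer, port)
        state = "reachable" if reachable else "unreachable"
        print(f"{peer}: resolves to {addresses or 'nothing'}, port {port} {state}")
```

Calling report(["opensearch-node1", "opensearch-node2", "opensearch-node3"]) from inside each container shows whether every peer name resolves and accepts connections on the transport port. Comparing the resolved addresses with the addresses in the warning may also help narrow down where discovery is pointing.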

hamzaismaeel15 added the bug and untriaged labels Feb 13, 2025

github-project-automation bot added this to Engineering Effectiveness Board Feb 13, 2025

github-project-automation bot moved this to 🆕 New in Engineering Effectiveness Board Feb 13, 2025

hamzaismaeel15 changed the title ~~[BUG][Docker Swarm Multi Node Cluster]~~ [BUG][Docker Swarm Multi Node Cluster connectivity issue] Feb 13, 2025

DandyDeveloper removed the untriaged label Feb 17, 2025
