@ansd ansd commented May 11, 2021

Relates #662.

Dynamic peer discovery is not needed for RabbitMQ Clusters deployed by the Cluster Operator. All nodes are known at deploy time. The Cluster Operator knows the number of replicas and their host names.

This PR uses classic peer discovery with a static list of nodes instead of the dynamic rabbitmq_peer_discovery_k8s plugin.
In the case of a scale out (i.e. more RabbitMQ nodes added to the RabbitMQ cluster), existing nodes do not get restarted.
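
As a rough illustration (not the code from this PR), the static node list could be rendered from the replica count along these lines. The function name and the host name pattern `<name>-server-<i>.<name>-nodes.<namespace>` are assumptions made for this sketch; the `cluster_formation.*` keys are the standard `rabbitmq.conf` settings for classic config peer discovery.

```python
def classic_peer_discovery_conf(name: str, namespace: str, replicas: int) -> str:
    """Render rabbitmq.conf lines for classic (static list) peer discovery.

    Hypothetical helper: the operator's real implementation may differ.
    """
    lines = ["cluster_formation.peer_discovery_backend = classic_config"]
    for i in range(replicas):
        # Assumed host name pattern; adjust to the actual StatefulSet and
        # headless Service names used in the deployment.
        node = f"rabbit@{name}-server-{i}.{name}-nodes.{namespace}"
        # Config keys are numbered from 1, pod ordinals from 0.
        lines.append(f"cluster_formation.classic_config.nodes.{i + 1} = {node}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(classic_peer_discovery_conf("my-rabbit", "default", 3), end="")
```

Because the list depends only on the cluster name, namespace, and replica count, it can be computed at deploy time without querying the Kubernetes API for ready endpoints.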

Pros:

  • Nodes can join other nodes that are running but not yet ready, which increases the likelihood of discovering peers
  • No sophisticated locking mechanism is needed

Cons:

  • There might still be cases where clusters do not get formed correctly since this approach still relies on randomised startup delays. @mkuratczyk and I are doing some more testing with different parameters.

Alternatives:

@ansd ansd changed the title Use classic peer discovery Explore: classic peer discovery with randomised startup delay May 12, 2021
ansd added 2 commits May 19, 2021 16:03
instead of rabbit_peer_discovery_k8s plugin.

For RabbitMQ clusters deployed by the RabbitMQ cluster operator, there
is no need for dynamic service discovery since cluster members are known
at deploy time.

By using the classic config, we increase the likelihood of nodes
discovering peers.
In contrast, K8S peer discovery only considers peers that are ready.
Becoming ready might take a long time, which can result in more than
one node starting a cluster.
If the RabbitMQ cluster is under heavy load and is being scaled out
(i.e. more RabbitMQ nodes added to the RabbitMQ cluster), existing nodes
shouldn't be restarted.

Before this commit, existing nodes were restarted because the ConfigMap
was updated to include the new peers for peer discovery.
However, existing nodes do not need this new peer discovery
configuration since the cluster is already formed.
@ansd ansd force-pushed the peer-discovery-classic branch from 54a2b66 to ff33f41 Compare May 19, 2021 14:07
ansd commented Jun 7, 2021

Closing this PR in favor of rabbitmq/rabbitmq-server#3075.

@ansd ansd closed this Jun 7, 2021
@ansd ansd deleted the peer-discovery-classic branch June 8, 2021 07:59