Quorum calculation issues

**Context**
As of right now, quorum is calculated by evaluating each registered replica within the cluster and asking it who it thinks the current primary is.  

We achieve two things by doing this:
1. Confirm that the target members are reachable.
2. We are building quorum by confirming that the majority of the cluster agree's on who the primary is.


The reason we evaluate _every_ member within the cluster is because the primary member is unable to know how the `PRIMARY_REGION` value is configured per Machine.  The `PRIMARY_REGION` environment variable represents the region holding the primary and dictates which members are eligible for promotion.  This value should only be changed when a user needs to issue a regional failover, which requires the end-user to perform an update against every Machine in their cluster with the new `PRIMARY_REGION` value.  There are a number of ways this process could fail, which presents opportunities for inconsistency. 

Example case:

This issue can be expressed in a 6 node cluster with members split evenly across two different regions.

Nodes A,B,C residing in **ord** with `PRIMARY_REGION` set to **ord**
Nodes D,E,F residing in **iad** with `PRIMARY_REGION` set to **iad**

If quorum only considered "in-region" nodes, then it would be possible for a split-brain to occur across regions with the presence of a misconfigured `PRIMARY_REGION` environment variable. 


**The problem with considering _every_ registered member**
The problem with this approach is that we start to see an increase in connection timeouts when replicas are placed in regions considerably far away from the primary region. Connection timeouts are currently set to 5 seconds for each registered member.  This issue is amplified if the replica is under load.  When quorum cannot be reached, then the cluster will go read-only until quorum is restored. This can lead to a bad experience for many users who want to push many of their read-replicas into distant regions.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Quorum calculation issues #191

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Quorum calculation issues #191

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions