davissp14
Contributor

This work reduces the time it takes to identify and fence a primary in the event of a network partition.

When a network partition occurs, a few things need to happen:

  1. Repmgr will attempt to connect to a registered standby with a 5s connect_timeout.
  2. Repmgr will wait up to 30 seconds for the standby to reconnect before issuing a child_node_disconnect event.
  3. The child_node_disconnect event is then processed and triggers a cluster state evaluation.
  4. The time it takes to evaluate the cluster will depend on the number of nodes registered with the cluster that are no longer reachable. Worst case, we should expect 5s per registered standby.

The split-brain detection window can be calculated using the following formula:

connect_timeout + standby_reconnect_timeout + (registered standbys * 5)

For a typical 3-node cluster (one primary, two registered standbys) we are looking at:

Connect timeout: 5s
Standby reconnect timeout: 30s
Registered standbys: (2 * 5s) = 10s

Total time: 45 seconds.
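
As a quick sanity check, the formula above can be written as a small helper. This is only an illustrative sketch: the function name and parameter names are made up for this example and are not part of repmgr or of this PR; the 5s per-standby evaluation cost is the worst-case figure quoted above.

```python
def detection_window(connect_timeout_s: int,
                     standby_reconnect_timeout_s: int,
                     registered_standbys: int,
                     per_standby_eval_s: int = 5) -> int:
    """Worst-case split-brain detection window, in seconds.

    Mirrors the formula in the description:
    connect_timeout + standby_reconnect_timeout + (registered standbys * 5)
    """
    return (connect_timeout_s
            + standby_reconnect_timeout_s
            + registered_standbys * per_standby_eval_s)


# Typical 3-node cluster: 5s connect timeout, 30s reconnect timeout, 2 standbys.
print(detection_window(5, 30, 2))  # 45
```

Since the last term grows linearly with the number of registered standbys, larger clusters pay proportionally more in the worst case.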

There are some optimizations we can make here to cut down on time. For example, we could get away with evaluating only a subset of the registered members and bail as soon as we know quorum can't be met. I'll have to think more about this.
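
A rough sketch of what that early bail might look like is below. It is not what this PR implements. The function name, the `is_reachable` callback, and the quorum rule (a simple majority of all members, with the primary counting itself as reachable) are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def quorum_met(standbys: Sequence[str],
               is_reachable: Callable[[str], bool]) -> bool:
    """Check standbys one at a time and stop as soon as the outcome is known.

    Assumptions (hypothetical, not taken from the PR):
      - quorum is a simple majority of all members (standbys + primary)
      - the primary counts itself as reachable
      - is_reachable(node) is the expensive call (up to 5s each in the worst case)
    """
    total_members = len(standbys) + 1
    quorum = total_members // 2 + 1
    reachable = 1                 # the primary itself
    remaining = len(standbys)     # standbys not yet checked

    for node in standbys:
        if reachable >= quorum:
            return True           # quorum already met, skip the rest
        if reachable + remaining < quorum:
            return False          # quorum can no longer be met, bail early
        if is_reachable(node):
            reachable += 1
        remaining -= 1

    return reachable >= quorum
```

Under these assumptions, a 5-member cluster with every standby unreachable would need three checks instead of four before bailing. A 3-member cluster would see no saving, since both standbys have to be checked before quorum can be ruled out.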

@davissp14 davissp14 merged commit 5d21394 into master Mar 8, 2023
@davissp14 davissp14 deleted the reduce-dataloss-window branch March 20, 2023 22:44