This repository was archived by the owner on Feb 18, 2025. It is now read-only.

cluster-osc-slaves API may not return all required slaves #1423

@sjmudd


The following code snippet shows what happens when returning slaves of a cluster which could be used as OSC control replicas:

The problem was observed on a cluster spanning 3 datacentres, where replication delay built up in one of them. The current code picks the 2 intermediate masters with the most replicas and then looks for leaf nodes below those servers. With more than one datacentre or AZ this ignores the other locations, so latency and load on the cluster may be enough to delay replicas that are not being monitored while the OSC is running.

The proposed fix would be:

  • find at least 2 intermediate masters if possible, 1 per AZ/DC
  • use intermediate masters if they are present, choosing in each AZ/DC the intermediate master with the most lower-level replicas
  • this almost matches the existing logic, but replaces the hard limit of 2 intermediate masters with at least 1 per DC/AZ, with a minimum of 2 if possible
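
The selection rule in the list above could be sketched roughly as follows. This is only an illustration of the idea, not orchestrator's actual code: the `Replica` type, its fields and the `pickControlIntermediates` function are hypothetical names made up for this example, and it assumes host names are unique and that the number of lower-level replicas per intermediate master is already known.

```go
package main

import "sort"

// Replica is a hypothetical, simplified stand-in for an intermediate master.
// It is NOT orchestrator's instance type.
type Replica struct {
	Host       string
	DataCenter string // DC or AZ the server lives in
	Children   int    // number of lower-level replicas below this server
}

// pickControlIntermediates returns the busiest intermediate master of each
// DC/AZ. If that yields fewer than minCount servers (for example, a
// single-DC cluster), it tops up with the next busiest ones.
func pickControlIntermediates(ims []Replica, minCount int) []Replica {
	// Work on a copy so the caller's slice is not reordered.
	sorted := make([]Replica, len(ims))
	copy(sorted, ims)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Children > sorted[j].Children
	})

	seenDC := map[string]bool{}
	chosen := map[string]bool{}
	var picked []Replica

	// First pass: one server per DC/AZ, busiest first.
	for _, r := range sorted {
		if !seenDC[r.DataCenter] {
			seenDC[r.DataCenter] = true
			chosen[r.Host] = true
			picked = append(picked, r)
		}
	}

	// Second pass: top up to minCount with the remaining busiest servers.
	for _, r := range sorted {
		if len(picked) >= minCount {
			break
		}
		if !chosen[r.Host] {
			chosen[r.Host] = true
			picked = append(picked, r)
		}
	}
	return picked
}
```

Calling `pickControlIntermediates(ims, 2)` on a 3-DC topology returns one intermediate master per DC, while on a single-DC topology it falls back to the two busiest, which is what the existing logic does today.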

The change should be simple, and it gives better coverage of more complex topologies that span multiple locations, helping to prevent unwanted replication delay across the whole cluster.

And please understand that this issue may not be addressed immediately or in a timeframe you were expecting.

Yes. I'm aware that orchestrator is not being maintained at the moment, but I think it's good to record issues so that someone can fix them later if they have time, and to share them with other users who may not be aware of the problem.

Labels: bug, osc (Related to online schema changes)