Make cluster admin a two-phase (plan/commit) process #177

@jtuple

Description

Riak currently provides cluster administration through riak-admin and Riak Control using a fire-and-pray model. For example, when you issue riak-admin join to join a node to a cluster, the join is scheduled immediately: Riak calculates the new partition ownership right away and begins transferring data to the new node. As a result, it is impossible to atomically add multiple nodes at once, or to perform joins and leaves at the same time. Furthermore, there is no way to see how a change will affect the cluster until after issuing the join/leave, and by that point there is no way to cancel or stop the change if it is undesired (e.g., you join a node and suddenly 128+ partition transfers are scheduled during peak traffic).

Let's move away from this model and towards a two-phase approach to cluster changes. Rather than taking effect immediately, joins, leaves, and similar commands should instead stage pending changes to the cluster. A user should then be able to issue a command such as riak-admin plan to print out the staged changes: the list of changes, the cluster membership and ring ownership that would result from them, the number of transfers that would be required, and so on. If the user is satisfied, they could then run riak-admin commit, at which point the entire plan would be applied and the transfers scheduled. Otherwise, the user could continue to stage additional joins and leaves, or run riak-admin clear to discard the entire set of staged changes.
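To make the proposed workflow concrete, here is a minimal Python sketch of the stage/plan/commit/clear cycle. It is an illustration only, not Riak's implementation: the `StagedCluster` and `Plan` names are hypothetical, and the round-robin `assign` function is a toy stand-in for Riak's real ownership calculation. The point it shows is that staging and planning never touch the live membership, and only commit applies the changes.

```python
from dataclasses import dataclass, field


def assign(members, num_partitions):
    """Toy ownership: round-robin over sorted members (NOT Riak's claim algorithm)."""
    ordered = sorted(members)
    return [ordered[i % len(ordered)] for i in range(num_partitions)]


@dataclass
class Plan:
    changes: list    # staged (action, node) pairs, in the order they were issued
    members: set     # membership that would result from the changes
    ownership: list  # partition index -> owning node after the changes
    transfers: int   # number of partitions whose owner would change


@dataclass
class StagedCluster:
    """Hypothetical model of a cluster whose changes are staged, then committed."""
    members: set
    num_partitions: int = 64
    staged: list = field(default_factory=list)

    def __post_init__(self):
        self.ownership = assign(self.members, self.num_partitions)

    def stage(self, action, node):
        """Record a pending join or leave; nothing is applied yet."""
        if action not in ("join", "leave"):
            raise ValueError(f"unknown action: {action}")
        self.staged.append((action, node))

    def plan(self):
        """Compute the effect of all staged changes without applying them."""
        members = set(self.members)
        for action, node in self.staged:
            if action == "join":
                members.add(node)
            else:
                members.discard(node)
        if not members:
            raise ValueError("plan would leave the cluster empty")
        ownership = assign(members, self.num_partitions)
        transfers = sum(old != new for old, new in zip(self.ownership, ownership))
        return Plan(list(self.staged), members, ownership, transfers)

    def commit(self):
        """Apply every staged change at once and reset the staging area."""
        plan = self.plan()
        self.members, self.ownership = plan.members, plan.ownership
        self.staged.clear()
        return plan

    def clear(self):
        """Discard all staged changes; the live cluster is left untouched."""
        self.staged.clear()
```

In this sketch, a join and a leave staged together are evaluated as a single plan, so the transfer count reflects the combined change rather than two separate ownership recalculations.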
