Description
The way Riak currently provides cluster administration through riak-admin and Riak Control is a fire-and-pray model. For example, you issue riak-admin join to join a node to a cluster, and the join is immediately scheduled: Riak calculates the new partition ownership right away and begins transferring data to the new node. It is impossible to atomically add multiple nodes at once, or to perform joins and leaves at the same time. Furthermore, it is impossible to see how a change will affect the cluster until after issuing the join/leave, and by that point there is no way to cancel or stop it if the result is undesired (e.g. you join a node and suddenly 128+ partition transfers are scheduled during peak traffic).
Let's move away from this approach and towards a two-phase approach to cluster changes. Rather than having joins, leaves, etc. happen immediately, issuing such commands should instead stage pending changes to the cluster. A user should then be able to issue a command such as riak-admin plan to print out the staged changes: the list of changes, the cluster membership and ring ownership resulting from them, the number of transfers that would be required, etc. If the user is satisfied, they could then run riak-admin commit, and the entire plan would be applied at once and its transfers scheduled. Otherwise, the user could continue to stage additional joins or leaves, or run riak-admin clear to discard the entire set of staged changes.
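To make the proposed stage/plan/commit/clear flow concrete, here is a minimal Python sketch of the idea. It is a toy model only, not Riak code: the `StagedCluster` class, its method names, and the round-robin ownership rule are all assumptions made for illustration, and Riak's real claim algorithm and transfer accounting are more involved.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StagedCluster:
    """Toy model of two-phase cluster changes (hypothetical, not Riak code)."""

    members: list[str]
    ring_size: int = 8
    staged: list[tuple[str, str]] = field(default_factory=list)

    def stage(self, action: str, node: str) -> None:
        """Record a pending join or leave; nothing is applied yet."""
        if action not in ("join", "leave"):
            raise ValueError(f"unknown action: {action}")
        self.staged.append((action, node))

    def _ownership(self, members: list[str]) -> list[str]:
        # Assumption: simple round-robin assignment of partitions to members,
        # standing in for Riak's actual claim algorithm.
        if not members:
            raise ValueError("plan would leave the cluster with no members")
        return [members[i % len(members)] for i in range(self.ring_size)]

    def _planned_members(self) -> list[str]:
        # Apply the staged changes to a copy of the current membership.
        members = list(self.members)
        for action, node in self.staged:
            if action == "join" and node not in members:
                members.append(node)
            elif action == "leave" and node in members:
                members.remove(node)
        return members

    def plan(self) -> dict:
        """Preview the staged changes without modifying the cluster."""
        after = self._planned_members()
        old = self._ownership(self.members)
        new = self._ownership(after)
        # A partition needs a transfer if its owner changes.
        transfers = sum(1 for a, b in zip(old, new) if a != b)
        return {
            "changes": list(self.staged),
            "members": after,
            "ownership": new,
            "transfers": transfers,
        }

    def commit(self) -> None:
        """Apply every staged change in one step."""
        self.members = self._planned_members()
        self.staged.clear()

    def clear(self) -> None:
        """Discard all staged changes."""
        self.staged.clear()
```

In this sketch, staging a join and calling `plan()` reports the resulting membership and transfer count while `members` stays unchanged; only `commit()` applies the changes, and `clear()` drops them.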