Skip to content

Concepts: Shards, Quorums

mtrencseni edited this page Aug 22, 2011 · 1 revision

Sharding

The key => value pairs making up tables are split into shards.

By default, each table consists of exactly one shard, and this one shard contains all the key => value pairs. As the number of pairs grow, the shard is split into two parts (in the middle), and so on. A long table can consist of hundreds of shards:

Controllers and Shard servers

A ScalienDB cluster consists of:

  • controllers and
  • shard servers

Controllers

Controllers are the brain of the ScalienDB cluster. They contain all the meta-data about ScalienDB (identity of shard servers, shards, databases, tables, settings). However, this meta-data is very small (few MBs) and rarely changes, so the controllers see very little load (CPU, memory, disk, network). Controllers can be run in single mode or in replicated mode with 3, 5 or 7 replicas. When running in replicated mode, a majority of the controllers is required for ScalienDB to be functional. However, a majority is not required inside quorums, see below!

Shard servers

Shard servers store the actual data, ie. the shards. They are the work horses of a ScalienDB cluster.

Quorums

Quorums are the unit of data replication in ScalienDB:

  • shard servers are grouped into quorums
  • shards are placed into quorums
  • the key => values inside quorums are replicated onto all shard servers in the quorum

Quorums typically consists of 1, 2 or 3 shard servers (1 means data in that quorum is not replicated).

The shard server membership of quorums is controlled by the controllers. Majority is not required inside quorums, as the controllers deactivate disconnected shard servers and the quorum can make progress!

Best of both worlds

This means that ScalienDB offers the best of both worlds. Due to our use of Paxos during replication, the replicas are always in-sync, no need for eventual consistency and conflict resolution. However, since the controllers act as the brain, they can deactivate shard servers, and as a result no majority is required inside quorums storing the actual user data!