ScalienDB Sizing and Tuning

mtrencseni edited this page Sep 29, 2011 · 9 revisions

For a smoothly running database installation, please review this document to tune the configuration parameters below on all shard servers. It is highly recommended that all shard servers in the same quorum have the same physical configuration (disk, RAM, CPU) and identical settings in their scaliendb.conf file:

PARAMETER                      WHERE           DEFAULT VALUE       TUNE?        NOTE
-----------------------------------------------------------------------------------------------------
shardSplitSize                 Controller      500M                no           (should be shardSize)
database.chunkSize             Shard Server    64M                 no
database.logSegmentSize        Shard Server    64M                 no
database.fileChunkCacheSize    Shard Server    256M                yes
database.memoChunkCacheSize    Shard Server    1G                  yes
database.logSize               Shard Server    20G                 yes
database.replicatedLogSize     Shard Server    10G                 yes

In the paragraphs below some other variables will appear. These cannot be set in the scaliendb.conf file; they are just aids for the calculations (e.g. chunksPerShard).

To properly tune the parameters, first decide how much data you want to store per shard server (quorum); call this dataSize. Using the calculations below you will determine how much RAM you need; call this memorySize. ScalienDB stores data in separate chunk files, and ideally you want to keep this fragmentation low: the lower the fragmentation, the fewer disk seeks ScalienDB has to perform to serve GET and LIST requests. Fragmentation is determined by chunksPerShard, which you want to keep at about 4-8 (lower is better).

chunksPerShard = shardSplitSize / chunkSize  (should be 4-8)
dataSize = numShards * shardSplitSize
memoChunkCacheSize = numShards * chunkSize
fileChunkCacheSize > numShards * chunksPerShard * 64K
memorySize = memoChunkCacheSize + fileChunkCacheSize + (Operating System) + (Safety Margin)
logSize > memoChunkCacheSize
diskSize > (dataSize * 1.5) + logSize + replicatedLogSize + (Operating System) + (Safety Margin)
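The formulas above can be collected into a back-of-the-envelope sizing calculator. This is only a planning aid, not anything ScalienDB itself runs; it assumes the default shardSplitSize = 500M and chunkSize = 64M, and the OS/safety-margin reserves are illustrative figures:

```python
# Sizing calculator for one shard server (quorum), following the
# formulas above. All sizes are in bytes.

K, M, G = 1024, 1024 ** 2, 1024 ** 3

SHARD_SPLIT_SIZE = 500 * M   # shardSplitSize default
CHUNK_SIZE = 64 * M          # database.chunkSize default

def sizing(num_shards, os_reserve=2 * G, safety_margin=2 * G):
    chunks_per_shard = SHARD_SPLIT_SIZE / CHUNK_SIZE        # aim for 4-8
    data_size = num_shards * SHARD_SPLIT_SIZE
    memo_chunk_cache_size = num_shards * CHUNK_SIZE
    # fileChunkCacheSize must exceed this lower bound:
    file_chunk_cache_min = num_shards * chunks_per_shard * 64 * K
    memory_size = (memo_chunk_cache_size + file_chunk_cache_min
                   + os_reserve + safety_margin)
    log_size = memo_chunk_cache_size                        # logSize > memoChunkCacheSize
    disk_size = (data_size * 1.5 + log_size + 10 * G        # 10G = replicatedLogSize default
                 + os_reserve + safety_margin)
    return {
        "chunksPerShard": chunks_per_shard,
        "dataSize": data_size,
        "memoChunkCacheSize": memo_chunk_cache_size,
        "fileChunkCacheMin": file_chunk_cache_min,
        "memorySize": memory_size,
        "logSize": log_size,
        "diskSize": disk_size,
    }
```

With num_shards = 960 this reproduces the 60G memoChunkCacheSize figure used in the worked example on this page.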

At the end of the day:

 chunksPerShard = dataSize / memoChunkCacheSize

So if you want low fragmentation, that is chunksPerShard = 8, and want to store dataSize = 480G of data, you should set memoChunkCacheSize = 60G, so you'll probably need to buy 64G of RAM. If you only have 32G, then no worries: you'll end up with double the fragmentation, which results in a linear slowdown (that's the good kind).
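This fragmentation trade-off can be sanity-checked in a couple of lines, using the 480G data size and the two memoChunkCacheSize settings from this example:

```python
G = 1024 ** 3
data_size = 480 * G

# chunksPerShard = dataSize / memoChunkCacheSize
frag_64g_ram = data_size / (60 * G)   # memoChunkCacheSize with 64G of RAM
frag_32g_ram = data_size / (28 * G)   # memoChunkCacheSize with 32G of RAM

print(frag_64g_ram)  # 8.0
print(frag_32g_ram)  # ~17: roughly double the fragmentation, hence roughly double the seeks
```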

It's best to start by leaving database.chunkSize = 64M and database.logSegmentSize = 64M; since these are the defaults, you can skip them altogether. Continuing the previous example, if you have 64G of RAM, set database.memoChunkCacheSize = 60G. Leaving shardSplitSize at 500M, you'll end up with numShards = 960, so you'll need fileChunkCacheSize > 960 * 8 * 64K ~ 500M; we set database.fileChunkCacheSize = 1G. This leaves ample room for the OS and a safety margin. Note that database.memoChunkCacheSize is not pre-allocated, and will only be fully used if you get close to your total database size.

The database.replicatedLogSize is used for log based catchup if a shard server falls behind in the quorum. If you're writing at 1MB/s into this quorum, the default of 10G should give about 10,000 seconds, or close to 3 hours, of breathing room. If you have the disk space, you can set this to a higher number.
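The headroom figure is easy to check; the 1MB/s write rate below is an assumed workload figure, not a ScalienDB setting:

```python
# How long a lagging shard server can still use log based catchup
# before a full catchup is needed.
replicated_log_mb = 10 * 1024       # database.replicatedLogSize default (10G)
write_rate_mb_per_s = 1.0           # assumed sustained write rate into the quorum

headroom_s = replicated_log_mb / write_rate_mb_per_s
headroom_h = headroom_s / 3600
print(round(headroom_h, 1))  # 2.8 (hours)
```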

The database.logSize is the maximum total size of the log files (in the logs folder). In our example, since memoChunkCacheSize = 60G we also set database.logSize = 60G. Note that logs are replayed when you start (restart) ScalienDB, which may take several minutes for long logs.

So, we set on all the shard servers:

database.fileChunkCacheSize = 1G
database.memoChunkCacheSize = 60G
database.logSize = 60G

Everything else can be left at the defaults.

If we want to store 480G but only have 32G of RAM, we'd set:

database.fileChunkCacheSize = 1G
database.memoChunkCacheSize = 28G
database.logSize = 28G

In this case we expect a 2x slowdown in GET and LISTs due to twice as many seeks compared to the 64G configuration.
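Both configurations above follow the same rule of thumb: give almost all RAM to memoChunkCacheSize and set logSize to match. A small helper can express this; the function name and the 3G reserve for the OS plus safety margin are assumptions for illustration, not part of ScalienDB:

```python
def suggest_conf(ram_gb, file_cache_gb=1, reserve_gb=3):
    """Suggest scaliendb.conf cache/log settings for a given amount of RAM.

    Gives the remaining RAM to memoChunkCacheSize and sets logSize to
    match it, per the sizing rules on this page. reserve_gb covers the
    OS plus a safety margin (an assumed figure).
    """
    memo_gb = ram_gb - file_cache_gb - reserve_gb
    return "\n".join([
        f"database.fileChunkCacheSize = {file_cache_gb}G",
        f"database.memoChunkCacheSize = {memo_gb}G",
        f"database.logSize = {memo_gb}G",
    ])

print(suggest_conf(64))  # reproduces the 64G example above
print(suggest_conf(32))  # reproduces the 32G example above
```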