7 changes: 7 additions & 0 deletions doc/en/mooncake-store.md
@@ -525,6 +525,13 @@ When the user specifies `--root_fs_dir=/path/to/dir` when starting the master, a

Note: When enabling this feature, ensure that the DFS-mounted directory (`root_fs_dir=/path/to/dir`) is valid and identical on all client hosts. If some clients have invalid or incorrect mount paths, Mooncake Store may behave abnormally.

#### Persistent Storage Space Configuration
Mooncake lets users configure the available space on DFS. Specify `--global_file_segment_size=1048576` when starting the master to set the maximum usable space on DFS to 1 MB (the value is given in bytes).
Collaborator

Let users know this is only a metric for now; we don't evict anything on DFS yet.

Contributor Author

OK, I will add more descriptive information.

The default is the maximum value of int64 (DFS storage usage is generally not restricted), which is displayed as `infinite` in `mooncake_master`'s console logs.

**Notice** The DFS cache space configuration must be used together with the `--root_fs_dir` parameter. Otherwise, the `SSD Storage` usage will always show `0 B / 0 B`.
**Notice** File eviction on DFS is not provided yet, so the configured size currently serves only as a capacity metric.

#### Data Access Mechanism

The persistence feature also follows Mooncake Store's design principle of separating control flow from data flow. The read/write operations of kvcache objects are completed on the client side, while the query and management functions of kvcache objects are handled on the master side. In the file system, the key -> kvcache object index information is maintained by a fixed indexing mechanism, with each file corresponding to one kvcache object (the filename serves as the associated key name).
7 changes: 7 additions & 0 deletions doc/zh/mooncake-store.md
@@ -529,6 +529,13 @@ struct ReplicateConfig {

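Note: When enabling this feature, ensure that the DFS mount directory (`root_fs_dir=/path/to/dir`) is valid and identical on every client host. If some clients have an invalid or incorrect mount directory, Mooncake Store may behave abnormally.

#### Persistent Storage Space Configuration
Mooncake lets users configure the available space on DFS. Specify `--global_file_segment_size=107374182400` when starting the master to set the maximum usable space on DFS to 100 GB (the flag is an int64 byte count; 107374182400 = 100 × 1024³).
The default is the maximum value of int64 (DFS usage is generally not restricted), which is shown as `infinite` in `mooncake_master`'s console logs.

**Notice** The DFS cache space configuration must be used together with the `--root_fs_dir` parameter. Otherwise, the `SSD Storage` usage will always show `0 B / 0 B`.
**Notice** File eviction on DFS is not provided yet.

#### Data Access Mechanism
The persistence feature also follows Mooncake Store's design of separating control flow from data flow. Reads and writes of kvcache objects are performed on the client side, while querying and managing kvcache objects is handled on the master side. In the file system, the key -> kvcache object index is maintained by a fixed indexing mechanism, with each file corresponding to one kvcache object (the filename is the corresponding key name).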

4 changes: 4 additions & 0 deletions docs/source/deployment/mooncake-store-deployment-guide.md
@@ -33,6 +33,10 @@ This page summarizes useful flags, environment variables, and HTTP endpoints to
- `--client_ttl` (int64, default `10` s): Client alive TTL after last ping (HA mode).
- `--cluster_id` (str, default `mooncake_cluster`): Cluster ID for persistence in HA mode.

- DFS Storage (optional)
- `--root_fs_dir` (str, default empty): DFS mount directory for storage backend, used in Multi-layer Storage Support.
- `--global_file_segment_size` (int64, default `int64_max`): Maximum available space for DFS segments, in bytes. Only takes effect together with `--root_fs_dir`; DFS file eviction is not provided yet.

Example (enable embedded HTTP metadata and metrics):

```bash
16 changes: 16 additions & 0 deletions mooncake-store/include/master_config.h
@@ -30,6 +30,7 @@ struct MasterConfig {

std::string cluster_id;
std::string root_fs_dir;
int64_t global_file_segment_size;
std::string memory_allocator;

// HTTP metadata server configuration
@@ -63,6 +64,7 @@ class MasterServiceSupervisorConfig {
std::string local_hostname = "0.0.0.0:50051";
std::string cluster_id = DEFAULT_CLUSTER_ID;
std::string root_fs_dir = DEFAULT_ROOT_FS_DIR;
int64_t global_file_segment_size = DEFAULT_GLOBAL_FILE_SEGMENT_SIZE;
BufferAllocatorType memory_allocator = BufferAllocatorType::OFFSET;

MasterServiceSupervisorConfig() = default;
@@ -91,6 +93,7 @@ class MasterServiceSupervisorConfig {
local_hostname = rpc_address + ":" + std::to_string(rpc_port);
cluster_id = config.cluster_id;
root_fs_dir = config.root_fs_dir;
global_file_segment_size = config.global_file_segment_size;

// Convert string memory_allocator to BufferAllocatorType enum
if (config.memory_allocator == "cachelib") {
@@ -161,6 +164,7 @@ class WrappedMasterServiceConfig {
bool enable_ha = false;
std::string cluster_id = DEFAULT_CLUSTER_ID;
std::string root_fs_dir = DEFAULT_ROOT_FS_DIR;
int64_t global_file_segment_size = DEFAULT_GLOBAL_FILE_SEGMENT_SIZE;
BufferAllocatorType memory_allocator = BufferAllocatorType::OFFSET;

WrappedMasterServiceConfig() = default;
@@ -184,6 +188,7 @@ class WrappedMasterServiceConfig {
enable_ha = config.enable_ha;
cluster_id = config.cluster_id;
root_fs_dir = config.root_fs_dir;
global_file_segment_size = config.global_file_segment_size;

// Convert string memory_allocator to BufferAllocatorType enum
if (config.memory_allocator == "cachelib") {
@@ -214,6 +219,7 @@ class WrappedMasterServiceConfig {
true; // This is used in HA mode, so enable_ha should be true
cluster_id = config.cluster_id;
root_fs_dir = config.root_fs_dir;
global_file_segment_size = config.global_file_segment_size;
memory_allocator = config.memory_allocator;
}
};
@@ -236,6 +242,7 @@ class MasterServiceConfigBuilder {
bool enable_ha_ = false;
std::string cluster_id_ = DEFAULT_CLUSTER_ID;
std::string root_fs_dir_ = DEFAULT_ROOT_FS_DIR;
int64_t global_file_segment_size_ = DEFAULT_GLOBAL_FILE_SEGMENT_SIZE;
BufferAllocatorType memory_allocator_ = BufferAllocatorType::OFFSET;

public:
@@ -293,6 +300,12 @@ class MasterServiceConfigBuilder {
return *this;
}

MasterServiceConfigBuilder& set_global_file_segment_size(
int64_t segment_size) {
global_file_segment_size_ = segment_size;
return *this;
}

MasterServiceConfigBuilder& set_memory_allocator(
BufferAllocatorType allocator) {
memory_allocator_ = allocator;
@@ -316,6 +329,7 @@ class MasterServiceConfig {
bool enable_ha = false;
std::string cluster_id = DEFAULT_CLUSTER_ID;
std::string root_fs_dir = DEFAULT_ROOT_FS_DIR;
int64_t global_file_segment_size = DEFAULT_GLOBAL_FILE_SEGMENT_SIZE;
BufferAllocatorType memory_allocator = BufferAllocatorType::OFFSET;

MasterServiceConfig() = default;
@@ -333,6 +347,7 @@ class MasterServiceConfig {
enable_ha = config.enable_ha;
cluster_id = config.cluster_id;
root_fs_dir = config.root_fs_dir;
global_file_segment_size = config.global_file_segment_size;
memory_allocator = config.memory_allocator;
}

@@ -353,6 +368,7 @@ inline MasterServiceConfig MasterServiceConfigBuilder::build() const {
config.enable_ha = enable_ha_;
config.cluster_id = cluster_id_;
config.root_fs_dir = root_fs_dir_;
config.global_file_segment_size = global_file_segment_size_;
config.memory_allocator = memory_allocator_;
return config;
}
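
The builder change above can be exercised roughly as follows. This is a minimal sketch, not the project's code: `DemoMasterServiceConfig`, `DemoConfigBuilder`, `set_root_fs_dir`, `make_demo_config`, and `kUnlimitedFileSegmentSize` are hypothetical stand-ins that model only two fields, and only `set_global_file_segment_size` mirrors a setter shown in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-in for DEFAULT_GLOBAL_FILE_SEGMENT_SIZE: no DFS limit.
constexpr int64_t kUnlimitedFileSegmentSize =
    std::numeric_limits<int64_t>::max();

// Hypothetical, reduced config: only the two DFS-related fields are modeled.
struct DemoMasterServiceConfig {
    std::string root_fs_dir;
    int64_t global_file_segment_size = kUnlimitedFileSegmentSize;
};

// Hypothetical, reduced builder following the same fluent-setter pattern.
class DemoConfigBuilder {
   public:
    DemoConfigBuilder& set_root_fs_dir(const std::string& dir) {
        root_fs_dir_ = dir;
        return *this;
    }

    DemoConfigBuilder& set_global_file_segment_size(int64_t segment_size) {
        global_file_segment_size_ = segment_size;
        return *this;
    }

    DemoMasterServiceConfig build() const {
        DemoMasterServiceConfig config;
        config.root_fs_dir = root_fs_dir_;
        config.global_file_segment_size = global_file_segment_size_;
        return config;
    }

   private:
    std::string root_fs_dir_;
    int64_t global_file_segment_size_ = kUnlimitedFileSegmentSize;
};

// Example: report at most 1 GiB (1024^3 bytes) of DFS capacity.
inline DemoMasterServiceConfig make_demo_config() {
    return DemoConfigBuilder()
        .set_root_fs_dir("/path/to/dir")
        .set_global_file_segment_size(int64_t{1} << 30)
        .build();
}
```

Leaving the setter uncalled keeps the int64 maximum, which matches the "do not limit DFS usage" default described in the docs.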
35 changes: 24 additions & 11 deletions mooncake-store/include/master_metric_manager.h
@@ -19,14 +19,23 @@ class MasterMetricManager {
MasterMetricManager(MasterMetricManager&&) = delete;
MasterMetricManager& operator=(MasterMetricManager&&) = delete;

-// Storage Metrics
-void inc_allocated_size(int64_t val = 1);
-void dec_allocated_size(int64_t val = 1);
-void inc_total_capacity(int64_t val = 1);
-void dec_total_capacity(int64_t val = 1);
-int64_t get_allocated_size();
-int64_t get_total_capacity();
-double get_global_used_ratio(void);
+// Memory Storage Metrics
+void inc_allocated_mem_size(int64_t val = 1);
+void dec_allocated_mem_size(int64_t val = 1);
+void inc_total_mem_capacity(int64_t val = 1);
+void dec_total_mem_capacity(int64_t val = 1);
+int64_t get_allocated_mem_size();
+int64_t get_total_mem_capacity();
+double get_global_mem_used_ratio(void);
+
+// File Storage Metrics
+void inc_allocated_file_size(int64_t val = 1);
+void dec_allocated_file_size(int64_t val = 1);
+void inc_total_file_capacity(int64_t val = 1);
+void dec_total_file_capacity(int64_t val = 1);
+int64_t get_allocated_file_size();
+int64_t get_total_file_capacity();
+double get_global_file_used_ratio(void);

// Key/Value Metrics
void inc_key_count(int64_t val = 1);
@@ -175,9 +184,13 @@ class MasterMetricManager {

// --- Metric Members ---

-// Storage Metrics
-ylt::metric::gauge_t allocated_size_; // Use update for gauge
-ylt::metric::gauge_t total_capacity_; // Use update for gauge
+// Memory Storage Metrics
+ylt::metric::gauge_t mem_allocated_size_; // Use update for gauge
+ylt::metric::gauge_t mem_total_capacity_; // Use update for gauge
+
+// File Storage Metrics
+ylt::metric::gauge_t file_allocated_size_;
+ylt::metric::gauge_t file_total_capacity_;

// Key/Value Metrics
ylt::metric::gauge_t key_count_;
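
The diff declares `get_global_file_used_ratio` but does not show its body. The sketch below is one plausible reading, under the assumption that the ratio is allocated file size divided by total file capacity and that a zero capacity yields `0.0`. `DemoFileMetrics` is a hypothetical class that uses `std::atomic` in place of the `ylt::metric::gauge_t` members.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the file storage part of MasterMetricManager.
class DemoFileMetrics {
   public:
    void inc_allocated_file_size(int64_t val = 1) { allocated_ += val; }
    void dec_allocated_file_size(int64_t val = 1) { allocated_ -= val; }
    void inc_total_file_capacity(int64_t val = 1) { capacity_ += val; }
    void dec_total_file_capacity(int64_t val = 1) { capacity_ -= val; }

    int64_t get_allocated_file_size() const { return allocated_.load(); }
    int64_t get_total_file_capacity() const { return capacity_.load(); }

    // Assumption: used ratio = allocated / capacity; 0.0 when no capacity
    // has been registered, which avoids a division by zero.
    double get_global_file_used_ratio() const {
        const int64_t capacity = capacity_.load();
        if (capacity <= 0) {
            return 0.0;
        }
        return static_cast<double>(allocated_.load()) /
               static_cast<double>(capacity);
    }

   private:
    std::atomic<int64_t> allocated_{0};
    std::atomic<int64_t> capacity_{0};
};
```

With the int64 maximum as capacity (the default segment size), this ratio stays close to zero for any realistic allocation, which is consistent with treating the default as "unlimited".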
15 changes: 15 additions & 0 deletions mooncake-store/include/master_service.h
@@ -245,6 +245,8 @@ class MasterService {
if (soft_pin_timeout) {
MasterMetricManager::instance().dec_soft_pin_key_count(1);
}
MasterMetricManager::instance().dec_allocated_file_size(
disk_replica_size);
}

ObjectMetadata() = delete;
@@ -261,6 +263,16 @@ class MasterService {
MasterMetricManager::instance().inc_soft_pin_key_count(1);
}
MasterMetricManager::instance().observe_value_size(value_length);
// Automatically update allocated_file_size via RAII
for (const auto& replica : replicas) {
if (replica.is_disk_replica()) {
disk_replica_size += replica.get_descriptor()
.get_disk_descriptor()
.object_size;
}
}
MasterMetricManager::instance().inc_allocated_file_size(
disk_replica_size);
}

ObjectMetadata(const ObjectMetadata&) = delete;
@@ -275,6 +287,7 @@ class MasterService {
std::chrono::steady_clock::time_point lease_timeout; // hard lease
std::optional<std::chrono::steady_clock::time_point>
soft_pin_timeout; // optional soft pin, only set for vip objects
uint64_t disk_replica_size = 0;

// Check if there are some replicas with a different status than the
// given value. If there are, return the status of the first replica
@@ -471,6 +484,8 @@ class MasterService {
const std::string cluster_id_;
// root filesystem directory for persistent storage
const std::string root_fs_dir_;
// global 3FS/NFS segment size, in bytes
int64_t global_file_segment_size_;

bool use_disk_replica_{false};
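
The RAII accounting added to `ObjectMetadata` in this file can be illustrated in isolation. This is a simplified sketch: `DemoReplica`, `DemoObjectMetadata`, and the global counter `g_allocated_file_size` are hypothetical stand-ins for the replica type, the metadata class, and the `MasterMetricManager` gauge.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the allocated-file-size gauge.
static int64_t g_allocated_file_size = 0;

// Hypothetical replica: only the fields needed for the accounting.
struct DemoReplica {
    bool on_disk;
    uint64_t object_size;
};

class DemoObjectMetadata {
   public:
    // Constructor sums the sizes of disk replicas and adds them to the gauge.
    explicit DemoObjectMetadata(std::vector<DemoReplica> replicas)
        : replicas_(std::move(replicas)) {
        for (const auto& replica : replicas_) {
            if (replica.on_disk) {
                disk_replica_size_ += replica.object_size;
            }
        }
        g_allocated_file_size += static_cast<int64_t>(disk_replica_size_);
    }

    // Destructor subtracts exactly what the constructor added.
    ~DemoObjectMetadata() {
        g_allocated_file_size -= static_cast<int64_t>(disk_replica_size_);
    }

    // Non-copyable, so the same bytes are never subtracted twice.
    DemoObjectMetadata(const DemoObjectMetadata&) = delete;
    DemoObjectMetadata& operator=(const DemoObjectMetadata&) = delete;

   private:
    std::vector<DemoReplica> replicas_;
    uint64_t disk_replica_size_ = 0;
};
```

Caching the summed size in a member, as the diff does with `disk_replica_size`, means the destructor does not need to walk the replica list again and always undoes the same amount it added.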

1 change: 1 addition & 0 deletions mooncake-store/include/replica.h
@@ -11,6 +11,7 @@

#include "types.h"
#include "allocator.h"
#include "master_metric_manager.h"

namespace mooncake {

5 changes: 5 additions & 0 deletions mooncake-store/include/types.h
@@ -5,6 +5,7 @@
#include <memory>
#include <optional>
#include <string>
#include <limits>
#include <unordered_map>
#include <vector>

@@ -33,6 +34,10 @@ static constexpr int64_t ETCD_MASTER_VIEW_LEASE_TTL = 5; // in seconds
static constexpr int64_t DEFAULT_CLIENT_LIVE_TTL_SEC = 10; // in seconds
static const std::string DEFAULT_CLUSTER_ID = "mooncake_cluster";
static const std::string DEFAULT_ROOT_FS_DIR = "";
// By default, do not limit DFS usage. int64_t is used to stay
// compatible with the file metrics monitor.
static const int64_t DEFAULT_GLOBAL_FILE_SEGMENT_SIZE =
std::numeric_limits<int64_t>::max();
static const std::string PUT_NO_SPACE_HELPER_STR = // A helpful string
" due to insufficient space. Consider lowering "
"eviction_high_watermark_ratio or mounting more segments.";
6 changes: 4 additions & 2 deletions mooncake-store/include/utils.h
@@ -3,6 +3,7 @@
#include <cstddef>
#include <cstdlib>
#include <string>
#include <limits>
#include <ylt/util/tl/expected.hpp>

#include "types.h"
@@ -111,8 +112,9 @@ void free_memory(const std::string& protocol, void* ptr);

std::ostringstream oss;
oss << std::fixed << std::setprecision(2);

-if (bytes >= static_cast<uint64_t>(TB)) {
+if (static_cast<int64_t>(bytes) == std::numeric_limits<int64_t>::max()) {
+oss << "infinite";
+} else if (bytes >= static_cast<uint64_t>(TB)) {
oss << bytes / TB << " TB";
} else if (bytes >= static_cast<uint64_t>(GB)) {
oss << bytes / GB << " GB";
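
The hunk above shows only part of the formatting helper. A self-contained sketch of the same idea follows, assuming 1024-based units and a plain `B` suffix for small values (the docs mention `0 B / 0 B`). The name `demo_byte_size_to_string` is hypothetical and the real function may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: formats a byte count, printing "infinite" for the
// int64 maximum that is used as the "no DFS limit" sentinel.
inline std::string demo_byte_size_to_string(uint64_t bytes) {
    // Assumption: binary (1024-based) units.
    constexpr double KB = 1024.0;
    constexpr double MB = KB * 1024.0;
    constexpr double GB = MB * 1024.0;
    constexpr double TB = GB * 1024.0;

    std::ostringstream oss;
    oss << std::fixed << std::setprecision(2);

    if (bytes ==
        static_cast<uint64_t>(std::numeric_limits<int64_t>::max())) {
        oss << "infinite";
    } else if (bytes >= static_cast<uint64_t>(TB)) {
        oss << bytes / TB << " TB";
    } else if (bytes >= static_cast<uint64_t>(GB)) {
        oss << bytes / GB << " GB";
    } else if (bytes >= static_cast<uint64_t>(MB)) {
        oss << bytes / MB << " MB";
    } else if (bytes >= static_cast<uint64_t>(KB)) {
        oss << bytes / KB << " KB";
    } else {
        oss << bytes << " B";  // integers ignore the fixed/precision flags
    }
    return oss.str();
}
```

The sentinel check has to come first: otherwise the int64 maximum would fall into the TB branch and print a very large number instead of `infinite`.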
10 changes: 5 additions & 5 deletions mooncake-store/src/allocator.cpp
@@ -23,7 +23,7 @@ AllocatedBuffer::~AllocatedBuffer() {
alloc->deallocate(this);
VLOG(1) << "buf_handle_deallocated size=" << size_;
} else {
-MasterMetricManager::instance().dec_allocated_size(size_);
+MasterMetricManager::instance().dec_allocated_mem_size(size_);
VLOG(1) << "allocator=expired_or_null in buf_handle_destructor";
}
}
@@ -117,7 +117,7 @@ std::unique_ptr<AllocatedBuffer> CachelibBufferAllocator::allocate(
VLOG(1) << "allocation_succeeded size=" << size
<< " segment=" << segment_name_ << " address=" << buffer;
cur_size_.fetch_add(size);
-MasterMetricManager::instance().inc_allocated_size(size);
+MasterMetricManager::instance().inc_allocated_mem_size(size);
return std::make_unique<AllocatedBuffer>(shared_from_this(), buffer, size);
}

@@ -128,7 +128,7 @@ void CachelibBufferAllocator::deallocate(AllocatedBuffer* handle) {
size_t freed_size =
handle->size_; // Store size before handle might become invalid
cur_size_.fetch_sub(freed_size);
-MasterMetricManager::instance().dec_allocated_size(freed_size);
+MasterMetricManager::instance().dec_allocated_mem_size(freed_size);
VLOG(1) << "deallocation_succeeded address=" << handle->buffer_ptr_
<< " size=" << freed_size << " segment=" << segment_name_;
} catch (const std::exception& e) {
@@ -217,7 +217,7 @@ std::unique_ptr<AllocatedBuffer> OffsetBufferAllocator::allocate(size_t size) {
}

cur_size_.fetch_add(size);
-MasterMetricManager::instance().inc_allocated_size(size);
+MasterMetricManager::instance().inc_allocated_mem_size(size);
return allocated_buffer;
}

@@ -228,7 +228,7 @@ void OffsetBufferAllocator::deallocate(AllocatedBuffer* handle) {
size_t freed_size = handle->size();
handle->offset_handle_.reset();
cur_size_.fetch_sub(freed_size);
-MasterMetricManager::instance().dec_allocated_size(freed_size);
+MasterMetricManager::instance().dec_allocated_mem_size(freed_size);
VLOG(1) << "deallocation_succeeded address=" << handle->data()
<< " size=" << freed_size << " segment=" << segment_name_;
} catch (const std::exception& e) {
13 changes: 13 additions & 0 deletions mooncake-store/src/master.cpp
@@ -70,6 +70,9 @@ DEFINE_int64(client_ttl, mooncake::DEFAULT_CLIENT_LIVE_TTL_SEC,

DEFINE_string(root_fs_dir, mooncake::DEFAULT_ROOT_FS_DIR,
"Root directory for storage backend, used in HA mode");
DEFINE_int64(global_file_segment_size,
mooncake::DEFAULT_GLOBAL_FILE_SEGMENT_SIZE,
"Size of global NFS/3FS segment in bytes");
DEFINE_string(cluster_id, mooncake::DEFAULT_CLUSTER_ID,
"Cluster ID for the master service, used for kvcache persistence "
"in HA mode");
@@ -129,6 +132,9 @@ void InitMasterConf(const mooncake::DefaultConfig& default_config,
FLAGS_cluster_id);
default_config.GetString("root_fs_dir", &master_config.root_fs_dir,
FLAGS_root_fs_dir);
default_config.GetInt64("global_file_segment_size",
&master_config.global_file_segment_size,
FLAGS_global_file_segment_size);
default_config.GetString("memory_allocator",
&master_config.memory_allocator,
FLAGS_memory_allocator);
@@ -269,6 +275,11 @@ void LoadConfigFromCmdline(mooncake::MasterConfig& master_config,
!conf_set) {
master_config.root_fs_dir = FLAGS_root_fs_dir;
}
if ((google::GetCommandLineFlagInfo("global_file_segment_size", &info) &&
!info.is_default) ||
!conf_set) {
master_config.global_file_segment_size = FLAGS_global_file_segment_size;
}
if ((google::GetCommandLineFlagInfo("memory_allocator", &info) &&
!info.is_default) ||
!conf_set) {
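
The override condition used in `LoadConfigFromCmdline` can be reduced to a small pure function. This is a sketch under the assumption that `conf_set` means a configuration file was loaded. `resolve_setting` is a hypothetical name, and `flag_is_default` mirrors the `info.is_default` field read in the hunk above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the condition in the diff:
//   (flag explicitly set on the command line) || (no config file loaded)
// In either case the flag value wins; otherwise the value already loaded
// from the config file is kept.
inline int64_t resolve_setting(int64_t current_value, int64_t flag_value,
                               bool flag_is_default, bool conf_set) {
    const bool flag_set_explicitly = !flag_is_default;
    if (flag_set_explicitly || !conf_set) {
        return flag_value;
    }
    return current_value;
}
```

Under this reading, an explicit `--global_file_segment_size` always overrides the config file, and the flag's default (the int64 maximum) applies only when no config file is in use.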
@@ -385,6 +396,8 @@ int main(int argc, char* argv[]) {
<< ", rpc protocol=" << protocol
<< ", cluster_id=" << master_config.cluster_id
<< ", root_fs_dir=" << master_config.root_fs_dir
<< ", global_file_segment_size="
<< master_config.global_file_segment_size
<< ", memory_allocator=" << master_config.memory_allocator
<< ", enable_http_metadata_server="
<< master_config.enable_http_metadata_server