Crash when using large storage mode with 100 GB capacity and 10 GB single cache file size #808
The foyer version we use is "0.12.2".
Hi, @hopkings2008. Thanks for reporting. The stack trace indicates that there was an OOM. Would you like to share the configuration of your node instance and of foyer? Let me check if there is something wrong on the foyer side. 🙏
Also, would you please share the largest entry size in your workload? 10 GiB per cache file looks too large: each eviction op would invalidate 10% of the total cache capacity. If there are no entries that large in your workload, a cache file size of around 64 MiB would be enough.
Hi MrCroxx, thank you for your quick response. Below is our detailed config:
```rust
let cache_result = exec.get_runtime().block_on(
    HybridCacheBuilder::new()
        .memory(1)
        .with_shards(16)
        .with_eviction_config(LruConfig::default())
        .with_object_pool_capacity(1024)
        .with_hash_builder(ahash::RandomState::default())
        .storage(Engine::Mixed(0.1))
        .with_device_options(
            DirectFsDeviceOptions::new("/tmp/cache_server")
                .with_capacity(102400 * 1024 * 1024) // 100 GiB total capacity
                .with_file_size(10240 * 1024 * 1024), // 10 GiB per cache file
        )
        .with_flush(true)
        .with_recover_mode(RecoverMode::None)
        .with_admission_picker(Arc::new(RateLimitPicker::new(100 * 1024 * 1024)))
        .with_compression(None)
        .with_runtime_options(RuntimeOptions::Separated {
            read_runtime_options: TokioRuntimeOptions {
                worker_threads: 8,
                max_blocking_threads: 16,
            },
            write_runtime_options: TokioRuntimeOptions {
                worker_threads: 8,
                max_blocking_threads: 16,
            },
        })
        .with_large_object_disk_cache_options(
            LargeEngineOptions::new()
                .with_indexer_shards(64)
                .with_recover_concurrency(8)
                .with_flushers(2)
                .with_reclaimers(2)
                .with_buffer_pool_size(256 * 1024 * 1024)
                .with_clean_region_threshold(4)
                .with_eviction_pickers(vec![Box::<FifoPicker>::default()])
                .with_reinsertion_picker(Arc::new(RateLimitPicker::new(10 * 1024 * 1024))),
        )
        .with_small_object_disk_cache_options(
            SmallEngineOptions::new()
                .with_set_size(16 * 1024)
                .with_set_cache_capacity(64)
                .with_flushers(2),
        )
        .build(),
);
```
The largest entry size in our workload is about 4 MiB, and we previously tested 64 MiB as the single file size; that works fine.
Hi, @hopkings2008 .
For your workload, I think 64 MiB per file is enough and would work better than the 10 GiB setup, because foyer evicts the disk cache by region (a file, with the fs device). 10 GiB out of 100 GiB means foyer will evict 10% of the data each time.
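A minimal sketch of that sizing, reusing the `DirectFsDeviceOptions` calls and values from the config above with only the file size changed (an illustration of the suggestion, not a verified patch):
```rust
// Keep the 100 GiB capacity, but shrink each cache file (the eviction
// region) to 64 MiB, so a single region eviction reclaims roughly 0.06%
// of the capacity instead of 10%.
let device_options = DirectFsDeviceOptions::new("/tmp/cache_server")
    .with_capacity(102400 * 1024 * 1024) // 100 GiB, as before
    .with_file_size(64 * 1024 * 1024);   // 64 MiB per file instead of 10 GiB
```
This value would then be passed to `.with_device_options(device_options)` in the builder shown earlier.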
Does that mean the crash will happen whenever the single file size is too large? And what is the root cause of the crash I uploaded before?
The OOM is unexpected. I'm investigating it. 🙌
Hi all,
We hit a crash when using the large storage mode with 100 GB capacity and a 10 GB size for each cache file. The crash stack is as below:
After investigating this problem, we found that the leading bytes of the deserialization reader buffer are removed: the buffer is 64 bytes, but after `deserialize_from_custom_seed` is called, the 8 bytes at the beginning of the reader buffer are cut off, which results in a huge reader buffer size being seen inside `deserialize_from_custom_seed`.
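A minimal, self-contained sketch of this kind of failure mode (not foyer's actual code; the 8-byte little-endian length prefix and the byte values are assumptions for illustration):
```rust
use std::io::{Cursor, Read};

// Read an 8-byte little-endian length prefix and return it as usize.
fn read_len_prefix(reader: &mut impl Read) -> std::io::Result<usize> {
    let mut len_buf = [0u8; 8];
    reader.read_exact(&mut len_buf)?;
    Ok(u64::from_le_bytes(len_buf) as usize)
}

fn main() -> std::io::Result<()> {
    // A 64-byte buffer whose first 8 bytes hold the real length prefix (56).
    let mut buf = vec![0xABu8; 64];
    buf[..8].copy_from_slice(&56u64.to_le_bytes());

    // Correctly aligned reader: reads the intended length, 56.
    let ok_len = read_len_prefix(&mut Cursor::new(&buf[..]))?;

    // Reader whose leading 8 bytes were already consumed: payload bytes
    // (0xAB...) are misinterpreted as the length, yielding a huge value.
    // An unchecked allocation of `bad_len` bytes at this point would OOM.
    let bad_len = read_len_prefix(&mut Cursor::new(&buf[8..]))?;

    println!("aligned len = {ok_len}, misaligned len = {bad_len}");
    Ok(())
}
```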
Is there a limit on the single cache file size in foyer?