RFC-6297: Cache Layer #6297
Conversation
My current understanding of the cache layer is that it is a very thin component that simply wraps two (or more) OpenDAL operators. The cache policy and strategy should be defined by a separate trait. I haven't explored this idea in depth yet, but it might look something like the following:

```rust
let cache_layer = CacheLayer::builder()
    .route("*.hint", WholeCache::new(1024), memory_cache)
    .route("*.json", FixedSizeCache::new(64 << 20), memory_cache)
    .route("*.parquet", ParquetCache::new(), foyer_cache)
    .build()?;
let op = op.layer(cache_layer);
```
I agree. I think it's best to simplify the cache layer design first, without considering a unified cache policy and strategy design. Instead, let the corresponding OpenDAL service manage it itself. After collecting enough use cases and feedback, we can consider the design of policy/strategy and provide some common policy/strategy implementations.
Pull request overview
This PR proposes RFC-6297, which introduces a Cache Layer to OpenDAL for transparent read-through and write-through caching capabilities. The RFC aims to improve performance by allowing users to cache data from slower storage services (like S3) to faster ones (like Memory or Redis).
Key changes:
- Introduces a new `CacheLayer` that wraps existing storage with caching functionality
- Defines a `CacheStorage` trait for flexible cache backend implementations
- Proposes `CacheReader` and `CacheWriter` components for handling cached and uncached data flows
Thank you for working on this! I will find some time to review this RFC again.
```rust
async fn delete(&self) -> Result<(RpDelete, Self::Deleter)> {
    self.inner.delete().await
}
```
Should we also delete the file in the cache service? We shouldn't depend on the eviction policy of the underlying cache service here, since we may keep reading the deleted file if we use memory as a cache service.
It would be nice to invalidate the cache when `delete()` is called, imo; in this manner, we could claim "best effort" cache consistency.
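A best-effort invalidation on delete could be sketched as follows. This is illustrative only: `CachedStore` and its fields are hypothetical stand-ins (plain `HashMap`s in place of the real operators), not the RFC's API.

```rust
use std::collections::HashMap;

// Hypothetical sketch: delete from the underlying service, then
// best-effort invalidate the cache entry. The HashMaps stand in for
// the real underlying service and cache service.
struct CachedStore {
    inner: HashMap<String, Vec<u8>>,
    cache: HashMap<String, Vec<u8>>,
}

impl CachedStore {
    fn delete(&mut self, path: &str) -> Result<(), String> {
        // 1. Delete from the underlying service first.
        self.inner.remove(path);
        // 2. Best-effort cache invalidation: the result is ignored so a
        //    failing cache backend cannot fail the user's delete.
        let _ = self.cache.remove(path);
        Ok(())
    }
}
```

With a real async cache service, step 2 would be a fallible call whose error is logged and dropped, which is exactly the "best effort" consistency described above.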
1. **Buffer Accumulation**: All written data is accumulated in an internal `buffer`
2. **Primary Write**: Data is always written to the underlying service first via `inner.write()`
3. **Cache Write**: If `cache_write` is enabled and the write to the underlying service succeeds, the complete data is written to cache
Should we invalidate the cached data after a successful write, even if `cache_write` is disabled?
1. **Buffer Accumulation**: All written data is accumulated in an internal `buffer`
2. **Primary Write**: Data is always written to the underlying service first via `inner.write()`
3. **Cache Write**: If `cache_write` is enabled and the write to the underlying service succeeds, the complete data is written to cache
4. **Atomic Caching**: Cache operations happen only after successful completion to ensure cache consistency
This is not atomic, as the cache operation could still fail.
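To make the "cache only after a successful write" semantics concrete, here is a minimal synchronous sketch of the flow. `HashMap`s stand in for the services; with a real cache backend the cache insert would be a fallible call whose error is swallowed, so "atomic" really means "never cache before the primary write succeeds", not that both writes succeed together.

```rust
use std::collections::HashMap;

// Hypothetical write-through sketch (not the RFC's CacheWriter API):
// the primary write must succeed, the cache write is best-effort.
fn write_through(
    inner: &mut HashMap<String, Vec<u8>>,
    cache: &mut HashMap<String, Vec<u8>>,
    cache_write: bool,
    path: &str,
    buffer: Vec<u8>,
) -> Result<(), String> {
    // Steps 1+2: the accumulated buffer goes to the underlying service first.
    inner.insert(path.to_string(), buffer.clone());
    // Step 3: cache write only after success; a cache failure would leave
    // the entry cold rather than failing the user's write.
    if cache_write {
        let _ = cache.insert(path.to_string(), buffer);
    }
    Ok(())
}
```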
# Future possibilities

- Customizable Cache Key Generation:
We may need keys like `{path}-{version}`, `{path}-{range}`, etc.
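A composite key scheme along those lines might look like this sketch; `cache_key` is a hypothetical helper, not part of the RFC:

```rust
// Hypothetical key scheme: compose path, version and byte range into a
// cache key so distinct versions/ranges never collide in the cache.
fn cache_key(path: &str, version: Option<&str>, range: Option<(u64, u64)>) -> String {
    let mut key = path.to_string();
    if let Some(v) = version {
        key.push_str(&format!("-{v}"));
    }
    if let Some((start, end)) = range {
        // Half-open byte range [start, end).
        key.push_str(&format!("-{start}..{end}"));
    }
    key
}
```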
```rust
// Try cache first if read caching is enabled
if self.cache_options.read {
    match self.cache_service.read(&cache_key).await {
```
When dealing with object storages, it's considered common practice to merge smaller writes into a bigger object, which often makes each object exceed 512MB or even gigabytes.
Also, when objects are accessed, they're often scanned within a range, and that range is often maintained by an outside index (like Iceberg's manifest). It would cause read amplification if we had to load the entire object from cache.
It would be nice to have an abstraction here, likely called CacheMapper or CacheChunker, which handles the shaping of cache accesses, letting it be 1:M for object:cache entries. This allows us not to cache the entire object, but a subset of the object by range.
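The chunking side of such a CacheChunker can be sketched as a pure range-to-chunks mapping. This is a hypothetical helper under a fixed-size-chunk assumption, just to illustrate the 1:M object-to-cache-entry idea:

```rust
// Map one requested byte range [start, end) onto the indices of the
// fixed-size chunks that cover it. Each index would key a separate
// cache entry, so only touched chunks are fetched and cached.
fn chunks_for_range(chunk_size: u64, start: u64, end: u64) -> Vec<u64> {
    if start >= end {
        return vec![];
    }
    let first = start / chunk_size;
    let last = (end - 1) / chunk_size; // end is exclusive
    (first..=last).collect()
}
```

For example, a 2000-byte read starting at offset 1000 with 1024-byte chunks touches only chunks 0, 1 and 2, regardless of how large the whole object is.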
1. **Cache Key Strategy**: Should we provide options for custom cache key generation (e.g., hashing, prefixing)?
2. **Invalidation API**: Should we provide explicit cache invalidation methods, or rely entirely on the underlying cache storage?
+1 for having no invalidation API; it should depend solely on the underlying eviction strategy.
```rust
    }
}

async fn write(&self, key: &str, value: Vec<u8>) -> Result<()> {
```
How about using a `Buffer` instead of `Vec<u8>` here?
```rust
fn stat(&self, key: &str) -> impl Future<Output = Result<Metadata>> + MaybeSend;

/// Check whether `key` exists in the cache.
fn exists(&self, key: &str) -> impl Future<Output = Result<bool>> + MaybeSend;
```
Is this `exists` method considered a must? IMO we could already check existence with the `stat` method.
Or, IIUC, we could provide a default impl for `exists` that calls the `stat` method.
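A default implementation of `exists` in terms of `stat` could look like this synchronous simplification (the real trait is async and uses `impl Future`; `Metadata` here is just a placeholder):

```rust
#[derive(Debug)]
struct Metadata; // placeholder for the real metadata type

// Sync simplification of the CacheStorage trait, to show the idea of a
// default `exists` derived from `stat`.
trait CacheStorage {
    fn stat(&self, key: &str) -> Result<Metadata, String>;

    // Default impl: existence is "stat succeeded". Backends with a
    // cheaper existence check can override this. Note it treats every
    // stat error as "absent", which a real impl would refine to
    // distinguish NotFound from transport errors.
    fn exists(&self, key: &str) -> Result<bool, String> {
        match self.stat(key) {
            Ok(_) => Ok(true),
            Err(_) => Ok(false),
        }
    }
}
```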
Storage access performance varies greatly across different storage services.
Remote object stores like S3 or GCS have much higher latency than local storage or in-memory caches.
In many applications, particularly those with read-heavy workloads or repeated access to the same data, caching can significantly improve performance.
Shall we assume the content of objects at arbitrary paths is immutable when using CacheLayer?
IMO keeping content immutable is a common practice for object store caching; it removes the hardest part of cache consistency. However, it requires the user to keep this constraint in mind.
```rust
// Enable cache promotion during read operations (default: true)
read_promotion: true,
// Enable write-through caching (default: true)
write: true,
```
Is it possible to use an enum for the write option to separate write-through from write-back?
Sometimes write-back could be useful: if the upstream S3 service becomes unavailable, a write-back cache could help keep our service available by buffering the writes.
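Such an option might be modelled as an enum rather than a bool. A hypothetical sketch (names are illustrative, not proposed API):

```rust
// Hypothetical write-mode enum separating the two caching strategies.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WriteMode {
    /// Write to the underlying service first, then populate the cache.
    WriteThrough,
    /// Acknowledge once the cache holds the data; flush to the
    /// underlying service asynchronously later.
    WriteBack,
}

impl WriteMode {
    /// Whether a write is acknowledged before it reaches the
    /// underlying service (i.e., durability is deferred).
    fn acks_before_persist(self) -> bool {
        matches!(self, WriteMode::WriteBack)
    }
}
```

The trade-off the enum makes explicit: write-back improves availability and latency but weakens durability until the flush completes, so it likely needs its own failure-handling story.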
Hello everyone, before we continue discussing the Cache Layer, I invite everyone to reach a consensus on the Route Layer's design. I believe this approach can greatly simplify our Cache Layer design and allow it to focus solely on caching. RFC: #7130 cc @koushiro @flaneur2020 @meteorgan @killme2008, also @MrCroxx for the work on the foyer layer.
```rust
let op = s3.layer(
    CacheLayer::new(memory)
        .with_options(CacheOptions {
```
Instead of providing CacheOptions, I prefer CacheLayer to do less work and delegate the different options to a new trait called CachePolicy. In this way, every cache layer will be composed of an Operator and a CachePolicy.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CacheDirective {
    Bypass,
    Use { chunk_size: Option<u32>, fill: bool },
}

pub trait CachePolicy: Send + Sync + 'static {
    fn evaluate(&self, req: &CacheRequest<'_>) -> CacheDirective;
}
```

CachePolicy can decide everything about the cache itself, but we can start with simple questions:
- bypass or use the cache
- if the cache missed, do we need to fill it
- how to cache it, as a whole or in chunks

We can ship some widely used policies too: `WholeCachePolicy`, `ChunkedCachePolicy`, `MetadataOnlyCachePolicy`.

In this way, users can extend cache behavior based on their own needs.
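To illustrate, a concrete policy under this design could look like the following self-contained sketch, with `CacheRequest` simplified to just a path and `ParquetOnlyPolicy` a hypothetical example, not a proposed built-in:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CacheDirective {
    Bypass,
    Use { chunk_size: Option<u32>, fill: bool },
}

// Simplified: the real CacheRequest would also carry the operation kind.
pub struct CacheRequest<'a> {
    pub path: &'a str,
}

pub trait CachePolicy: Send + Sync + 'static {
    fn evaluate(&self, req: &CacheRequest<'_>) -> CacheDirective;
}

/// Hypothetical policy: cache only parquet files, in 8 MiB chunks,
/// filling the cache on miss; bypass everything else.
pub struct ParquetOnlyPolicy;

impl CachePolicy for ParquetOnlyPolicy {
    fn evaluate(&self, req: &CacheRequest<'_>) -> CacheDirective {
        if req.path.ends_with(".parquet") {
            CacheDirective::Use { chunk_size: Some(8 << 20), fill: true }
        } else {
            CacheDirective::Bypass
        }
    }
}
```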
The CacheDirective you mentioned appears to only cover read operations. Does it lack provisions for write or delete operations? Should we also include an Invalidate directive?
For CacheRequest, I currently think it is composed as follows, but I'm not sure if I should directly use the existing ops types (OpStat/OpRead/OpWrite/OpDelete) in CacheOperation. What do you think?

```rust
pub struct CacheRequest<'a> {
    /// Path as seen by OpenDAL.
    pub path: &'a str,
    /// Operation kind (stat/read/write/delete)
    pub op: CacheOperation,
    // Additional fields can be added in the future.
}
```
> Does it lack provisions for write or delete operations? Should we also include an Invalidate directive?

Yes, I do think we lack invalidate support here. However, I believe it's not a blocker and we can add it later. We can test CacheLayer with users that ensure their files are immutable.

> I'm not sure if I should directly use the existing ops types (OpStat/OpRead/OpWrite/OpDelete) in CacheOperation

I think it's fine. I also think using a String for the path is fine unless we are sure this is a bottleneck.
## Architecture

The Cache Layer implements the `Layer` trait and wraps an underlying `Access` implementation with caching capabilities.
It introduces a `CacheService` trait that defines the interface for cache operations.
I suggest we just accept Operator as the public API and convert it into an Accessor inside. Also, we should not add a new API surface like CacheService.
By simply reusing the Access trait, we can avoid a lot of repetitive work.
```rust
#[derive(Clone, Copy, Debug)]
pub struct CacheOptions {
```
As discussed before, I think we should expose a policy trait instead.
```rust
    cache_options: CacheOptions,
}

impl<A: Access, S: CacheService> LayeredAccess for CacheAccessor<A, S> {
```
The proposal should not include detailed code, as it could divert our focus.
Which issue does this PR close?
Closes #.
Rationale for this change
Propose a Cache Layer design
What changes are included in this PR?
A new RFC
Are there any user-facing changes?