-
Notifications
You must be signed in to change notification settings - Fork 566
feat: add Redis cache plugin for Bifrost requests #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
akshaydeo
merged 1 commit into
main
from
06-11-_plugins_feat_redis_caching_plugin_added
Aug 11, 2025
+1,570
−0
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,364 @@ | ||
| # Redis Cache Plugin for Bifrost | ||
|
|
||
| This plugin provides Redis-based caching functionality for Bifrost requests. It caches responses based on request body hashes and returns cached responses for identical requests, significantly improving performance and reducing API costs. | ||
|
|
||
| ## Features | ||
|
|
||
| - **High-Performance Hashing**: Uses xxhash for ultra-fast request body hashing | ||
| - **Asynchronous Caching**: Non-blocking cache writes for optimal response times | ||
| - **Response Caching**: Stores complete responses in Redis with configurable TTL | ||
| - **Streaming Cache Support**: Caches and retrieves streaming responses chunk by chunk | ||
| - **Cache Hit Detection**: Returns cached responses for identical requests | ||
| - **Intelligent Cache Recovery**: Automatically reconstructs streaming responses from cached chunks | ||
| - **Simple Setup**: Only requires Redis address and cache key - sensible defaults for everything else | ||
| - **Self-Contained**: Creates and manages its own Redis client | ||
|
|
||
| ## Installation | ||
|
|
||
| ```bash | ||
| go get github.com/maximhq/bifrost/core | ||
| go get github.com/maximhq/bifrost/plugins/redis | ||
| ``` | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### Basic Setup | ||
|
|
||
| ```go | ||
| import ( | ||
| "github.com/maximhq/bifrost/plugins/redis" | ||
| bifrost "github.com/maximhq/bifrost/core" | ||
| ) | ||
|
|
||
| // Simple configuration - only Redis address and cache key are required! | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", // Your Redis server address | ||
| CacheKey: "x-my-cache-key", // Context key for cache identification | ||
| } | ||
|
|
||
| // Create the plugin | ||
| plugin, err := redis.NewRedisPlugin(config, logger) | ||
| if err != nil { | ||
| log.Fatal("Failed to create Redis plugin:", err) | ||
| } | ||
|
Pratham-Mishra04 marked this conversation as resolved.
|
||
|
|
||
| // Use with Bifrost | ||
| bifrostConfig := schemas.BifrostConfig{ | ||
| Account: yourAccount, | ||
| Plugins: []schemas.Plugin{plugin}, | ||
| // ... other config | ||
| } | ||
| ``` | ||
|
|
||
| That's it! The plugin uses Redis client defaults for connection handling and these defaults for caching: | ||
|
|
||
| - **TTL**: 5 minutes | ||
| - **CacheByModel**: true (include model in cache key) | ||
| - **CacheByProvider**: true (include provider in cache key) | ||
|
|
||
| **Important**: You must provide the cache key in your request context for caching to work: | ||
|
|
||
| ```go | ||
| ctx := context.WithValue(ctx, redis.ContextKey("x-my-cache-key"), "cache-value") | ||
| response, err := client.ChatCompletionRequest(ctx, request) | ||
| ``` | ||
|
|
||
| ### With Password Authentication | ||
|
|
||
| ```go | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| Password: "your-redis-password", | ||
| } | ||
| ``` | ||
|
|
||
| ### With Custom TTL and Prefix | ||
|
|
||
| ```go | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| TTL: time.Hour, // Cache for 1 hour | ||
| Prefix: "myapp:cache:", // Custom prefix | ||
| } | ||
| ``` | ||
|
|
||
| ### Per-Request TTL Override (via Context) | ||
|
|
||
| You can override the cache TTL for individual requests by providing a TTL in the request context. Configure a `CacheTTLKey` on the plugin, then set a `time.Duration` value at that context key before making the request. | ||
|
|
||
| ```go | ||
| // Configure plugin with a context key used to read per-request TTLs | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| CacheTTLKey: "x-my-cache-ttl", // The context key for reading TTL | ||
| TTL: 5 * time.Minute, // Fallback/default TTL | ||
| } | ||
|
|
||
| plugin, err := redis.NewRedisPlugin(config, logger) | ||
| // ... init Bifrost client with plugin | ||
|
|
||
| // Before making a request, set a per-request TTL | ||
| ctx := context.WithValue(ctx, redis.ContextKey("x-my-cache-ttl"), 30*time.Second) | ||
| resp, err := client.ChatCompletionRequest(ctx, request) | ||
| ``` | ||
|
|
||
| Notes: | ||
|
|
||
| - The context value must be of type `time.Duration`. If it is missing or of the wrong type, the plugin falls back to `config.TTL`. | ||
| - This applies to both regular and streaming requests. For streaming, the same per-request TTL applies to all chunks. | ||
|
|
||
| ### With Different Database | ||
|
|
||
| ```go | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| DB: 1, // Use Redis database 1 | ||
| } | ||
| ``` | ||
|
|
||
| ### Streaming Cache Example | ||
|
|
||
| ```go | ||
| // Configure plugin for streaming cache | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-stream-cache-key", | ||
| TTL: 30 * time.Minute, // Cache streaming responses for 30 minutes | ||
| } | ||
|
|
||
| // Use with streaming requests | ||
| ctx := context.WithValue(ctx, redis.ContextKey("x-stream-cache-key"), "stream-session-1") | ||
| stream, err := client.ChatCompletionStreamRequest(ctx, request) | ||
| // Subsequent identical requests will be served from cache as a reconstructed stream | ||
| ``` | ||
|
|
||
| ### Custom Cache Key Configuration | ||
|
|
||
| ```go | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| CacheByModel: bifrost.Ptr(false), // Don't include model in cache key | ||
| CacheByProvider: bifrost.Ptr(true), // Include provider in cache key | ||
| } | ||
| ``` | ||
|
|
||
| ### Custom Redis Client Configuration | ||
|
|
||
| ```go | ||
| config := redis.RedisPluginConfig{ | ||
| Addr: "localhost:6379", | ||
| CacheKey: "x-my-cache-key", | ||
| PoolSize: 20, // Custom connection pool size | ||
| DialTimeout: 5 * time.Second, // Custom connection timeout | ||
| ReadTimeout: 3 * time.Second, // Custom read timeout | ||
| ConnMaxLifetime: time.Hour, // Custom connection lifetime | ||
| } | ||
| ``` | ||
|
|
||
| ## Configuration Options | ||
|
|
||
| | Option | Type | Required | Default | Description | | ||
| | ----------------- | --------------- | -------- | ----------------- | ----------------------------------- | | ||
| | `Addr` | `string` | ✅ | - | Redis server address (host:port) | | ||
| | `CacheKey` | `string` | ✅ | - | Context key for cache identification| | ||
| | `Username` | `string` | ❌ | `""` | Username for Redis AUTH (Redis 6+) | | ||
| | `Password` | `string` | ❌ | `""` | Password for Redis AUTH | | ||
| | `DB` | `int` | ❌ | `0` | Redis database number | | ||
| | `TTL` | `time.Duration` | ❌ | `5 * time.Minute` | Time-to-live for cached responses | | ||
| | `Prefix` | `string` | ❌ | `""` | Prefix for cache keys | | ||
| | `CacheByModel` | `*bool` | ❌ | `true` | Include model in cache key | | ||
| | `CacheByProvider` | `*bool` | ❌ | `true` | Include provider in cache key | | ||
|
|
||
| **Redis Connection Options** (all optional, Redis client uses its own defaults for zero values): | ||
|
|
||
| - `PoolSize`, `MinIdleConns`, `MaxIdleConns` - Connection pool settings | ||
| - `ConnMaxLifetime`, `ConnMaxIdleTime` - Connection lifetime settings | ||
| - `DialTimeout`, `ReadTimeout`, `WriteTimeout` - Timeout settings | ||
|
|
||
| All Redis configuration values are passed directly to the Redis client, which handles its own zero-value defaults. You only need to specify values you want to override from Redis client defaults. | ||
|
|
||
| ## How It Works | ||
|
|
||
| The plugin generates an xxhash of the normalized request including: | ||
|
|
||
| - Provider (if CacheByProvider is true) | ||
| - Model (if CacheByModel is true) | ||
| - Input (chat completion or text completion) | ||
| - Parameters (includes tool calls) | ||
|
|
||
| Identical requests will always produce the same hash, enabling effective caching. | ||
|
|
||
| ### Caching Flow | ||
|
|
||
| #### Regular Requests | ||
|
|
||
| 1. **PreHook**: Checks Redis for cached response, returns immediately if found | ||
| 2. **PostHook**: Stores the response in Redis asynchronously (non-blocking) | ||
| 3. **Cleanup**: Clears all cached entries and closes connection on shutdown | ||
|
|
||
| #### Streaming Requests | ||
|
|
||
| 1. **PreHook**: Checks Redis for cached chunks using pattern `{cache_key}_chunk_*` | ||
| 2. **Cache Hit**: Reconstructs stream from cached chunks in correct order | ||
| 3. **PostHook**: Stores each stream chunk with index: `{cache_key}_chunk_{index}` | ||
| 4. **Stream Reconstruction**: Subsequent requests get sorted chunks as a new stream | ||
|
|
||
| **Asynchronous Caching**: Cache writes happen in background goroutines with a 30-second timeout, ensuring responses are never delayed by Redis operations. This provides optimal performance while maintaining cache functionality. | ||
|
|
||
| **Streaming Intelligence**: The plugin automatically detects streaming requests and handles chunk-based caching. Each chunk is stored with its index, allowing perfect reconstruction of the original stream order. | ||
|
|
||
| ### Cache Keys | ||
|
|
||
| #### Regular Responses | ||
|
|
||
| Cache keys follow the pattern: `{prefix}{cache_value}_{xxhash}` | ||
|
|
||
| Example: `bifrost:cache:my-session_a1b2c3d4e5f6...` | ||
|
|
||
| #### Streaming Responses | ||
|
|
||
| Chunk keys follow the pattern: `{prefix}{cache_value}_{xxhash}_chunk_{index}` | ||
|
|
||
| Examples: | ||
|
|
||
| - `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_0` | ||
| - `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_1` | ||
| - `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_2` | ||
|
|
||
| ## Manual Cache Invalidation | ||
|
|
||
| You can invalidate specific cached entries at runtime using the method `ClearCacheForKey(key string)` on the concrete `redis.Plugin` type. This deletes the provided key and, if it corresponds to a streaming response, all of its chunk entries (`<key>_chunk_*`). | ||
|
|
||
| ### Getting the cache key from responses | ||
|
|
||
| - **Regular responses**: When a response is served from cache, the plugin adds metadata to `response.ExtraFields.RawResponse`: | ||
| - `bifrost_cached: true` | ||
| - `bifrost_cache_key: "<prefix><cache_value>_<xxhash>"` | ||
| Use this `bifrost_cache_key` as the argument to `ClearCacheForKey`. | ||
|
|
||
| - **Streaming responses**: Cached stream chunks include `bifrost_cache_key` for the specific chunk, in the form `"<base>_chunk_{index}"`. To invalidate the entire stream cache, strip the `"_chunk_{index}"` suffix to obtain the base key and pass that base key to `ClearCacheForKey`. | ||
|
|
||
| ### Examples | ||
|
|
||
| ```go | ||
| // Non-streaming: clear the cached response you just used | ||
| resp, err := client.ChatCompletionRequest(ctx, req) | ||
| if err != nil { | ||
| // handle error | ||
| } | ||
|
|
||
| if resp != nil && resp.ExtraFields.RawResponse != nil { | ||
| if raw, ok := resp.ExtraFields.RawResponse.(map[string]interface{}); ok { | ||
| if keyAny, ok := raw["bifrost_cache_key"]; ok { | ||
| if pluginImpl, ok := plugin.(*redis.Plugin); ok { | ||
| _ = pluginImpl.ClearCacheForKey(keyAny.(string)) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ```go | ||
| // Streaming: clear all chunks for a cached stream | ||
| for msg := range stream { | ||
| if msg.BifrostResponse == nil { | ||
| continue | ||
| } | ||
| raw := msg.BifrostResponse.ExtraFields.RawResponse | ||
| rawMap, ok := raw.(map[string]interface{}) | ||
| if !ok { | ||
| continue | ||
| } | ||
| keyAny, ok := rawMap["bifrost_cache_key"] | ||
| if !ok { | ||
| continue | ||
| } | ||
| chunkKey := keyAny.(string) // e.g., "<base>_chunk_3" | ||
|
|
||
| // Derive base key by removing the trailing "_chunk_{index}" part | ||
| baseKey := chunkKey | ||
| if idx := strings.LastIndex(chunkKey, "_chunk_"); idx != -1 { | ||
| baseKey = chunkKey[:idx] | ||
| } | ||
|
|
||
| if pluginImpl, ok := plugin.(*redis.Plugin); ok { | ||
| _ = pluginImpl.ClearCacheForKey(baseKey) | ||
| } | ||
| break // we only need one chunk to compute the base key | ||
| } | ||
| ``` | ||
|
|
||
| To clear all entries managed by this plugin (by prefix), call `Cleanup()` during shutdown: | ||
|
|
||
| ```go | ||
| _ = plugin.(*redis.Plugin).Cleanup() | ||
| ``` | ||
|
|
||
| ## Testing | ||
|
|
||
| The plugin includes comprehensive tests for both regular and streaming cache functionality. | ||
|
|
||
| Run the tests with a Redis instance running: | ||
|
|
||
| ```bash | ||
| # Start Redis (using Docker) | ||
| docker run -d -p 6379:6379 redis:latest | ||
|
|
||
| # Run all tests | ||
| go test -v | ||
|
|
||
| # Run specific tests | ||
| go test -run TestRedisPlugin -v # Test regular caching | ||
| go test -run TestRedisPluginStreaming -v # Test streaming cache | ||
| ``` | ||
|
|
||
| Tests will be skipped if Redis is not available. The tests validate: | ||
|
|
||
| - Cache hit/miss behavior | ||
| - Performance improvements (cache should be significantly faster) | ||
| - Content integrity (cached responses match originals) | ||
| - Streaming chunk ordering and reconstruction | ||
| - Provider information preservation | ||
|
|
||
| ## Performance Benefits | ||
|
|
||
| - **Reduced API Calls**: Identical requests are served from cache | ||
| - **Ultra-Low Latency**: Cache hits return immediately, cache writes are non-blocking | ||
| - **Streaming Efficiency**: Cached streams are reconstructed and delivered faster than original API calls | ||
| - **Cost Savings**: Fewer API calls to expensive LLM providers | ||
| - **Improved Reliability**: Cached responses available even if provider is down | ||
| - **High Throughput**: Asynchronous caching doesn't impact response times | ||
| - **Perfect Stream Fidelity**: Cached streams maintain exact chunk ordering and content | ||
|
|
||
| ## Error Handling | ||
|
|
||
| The plugin is designed to fail gracefully: | ||
|
|
||
| - If Redis is unavailable during startup, plugin creation fails with clear error | ||
| - If Redis becomes unavailable during operation, requests continue without caching | ||
| - If cache retrieval fails, requests proceed normally | ||
| - If cache storage fails asynchronously, responses are unaffected (already returned) | ||
| - Malformed cached data is ignored and requests proceed normally | ||
| - Cache operations have timeouts to prevent resource leaks | ||
|
|
||
| ## Best Practices | ||
|
|
||
| 1. **Start Simple**: Use only `Addr` and `CacheKey` - let defaults handle the rest | ||
| 2. **Choose meaningful cache keys**: Use descriptive context keys that identify cache sessions | ||
| 3. **Set appropriate TTL**: Balance between cache efficiency and data freshness | ||
| 4. **Use meaningful prefixes**: Helps organize cache keys in shared Redis instances | ||
| 5. **Monitor Redis memory**: Track cache usage, especially for streaming responses (more chunks = more storage) | ||
| 6. **Context management**: Always provide cache key in request context for caching to work | ||
| 7. **Use `bifrost.Ptr()`**: For boolean pointer configuration options | ||
| 8. **Streaming considerations**: Longer streams create more cache entries, adjust TTL accordingly | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| - **Sensitive Data**: Be cautious about caching responses containing sensitive information | ||
| - **Redis Security**: Use authentication and network security for Redis | ||
| - **Data Isolation**: Use different Redis databases or prefixes for different environments | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.