[BOUNTY - $200] Share kv cache between nodes for redundancy #52
Comments
Hi Alex. I was curious, as I am currently learning about CRDT strategies, and wanted to understand the challenge of broadcasting the cache changes with either:
So is that a good strategy for thinking about the KVCache updates? |
I think eventual consistency is preferred here. Just broadcast them out optimistically, with a monotonically increasing version number and pick |
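A minimal sketch of the versioned, optimistic broadcast described above. The comment is cut off after "pick", so the rule used here (keep the entry with the highest version and ignore anything older) is an assumption, as are the names `VersionedCacheStore`, `local_update`, and `on_broadcast`; none of them come from exo. It also assumes a single writer per request, so two nodes never issue the same version number.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class VersionedCacheStore:
    """Newest known KV-cache snapshot per request (hypothetical type, not exo's)."""

    # request_id -> (version, serialized cache bytes)
    entries: Dict[str, Tuple[int, bytes]] = field(default_factory=dict)

    def local_update(self, request_id: str, payload: bytes) -> Tuple[int, bytes]:
        """Bump the version for a local change and return the message to broadcast."""
        version = self.entries.get(request_id, (0, b""))[0] + 1
        self.entries[request_id] = (version, payload)
        return version, payload

    def on_broadcast(self, request_id: str, version: int, payload: bytes) -> bool:
        """Apply a remote update only if it is strictly newer; return True if applied."""
        current_version = self.entries.get(request_id, (0, b""))[0]
        if version <= current_version:
            return False  # stale or duplicate message, safe to ignore
        self.entries[request_id] = (version, payload)
        return True
```

Because stale messages are dropped on arrival, updates can be delivered late, twice, or out of order and every replica still ends up holding the same newest entry. With more than one writer per request, a tie-break (for example on node ID) would also be needed.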
Sounds good. I am happy to be assigned to this and create a PR if it's a public issue. |
Great! I will actually assign a $200 bounty to this as it’s an important upgrade and highly valued contribution if you can get it working reliably! |
Thanks. I will try to set up a PR by Wednesday, as I have limited bandwidth tomorrow. Just wanted to ask whether the |
More context info for people who want to understand why, how etc. provided by Claude :)
Synchronizing the full KV cache between all nodes after each inference could indeed provide some unique benefits, particularly in terms of fault tolerance and maintaining context across the cluster. Let's explore this idea and consider its implications:
Benefits:
Implementation Considerations:
Challenges and Considerations:
Potential Optimizations:
Implementation Steps:
While this approach could offer some unique advantages, it's important to carefully consider the trade-offs, particularly in terms of performance and complexity. It might be worth implementing this as an optional feature that can be enabled for specific use cases where maintaining context across the cluster is particularly valuable. |
Thanks @stephanj for adding context and sharing the checklists on the subject. It's currently a work in progress here. I am going to implement a modified version of the gossip protocol to provide strong consistency. I do have some general feedback on the viewpoints that you've shared:
|
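For readers unfamiliar with gossip, here is a minimal sketch of plain push gossip over versioned cache entries. It is not the modified protocol mentioned above, whose details are not given in this thread. Plain gossip as shown only converges eventually. The names `merge`, `gossip_round`, and `rounds_until_converged`, the fanout of 2, and the in-memory "nodes" are all assumptions for illustration.

```python
import random
from typing import Dict, List, Tuple

# request_id -> (version, serialized cache bytes)
State = Dict[str, Tuple[int, bytes]]


def merge(local: State, remote: State) -> None:
    """Keep, per key, whichever entry carries the higher version."""
    for key, (version, payload) in remote.items():
        if key not in local or version > local[key][0]:
            local[key] = (version, payload)


def gossip_round(nodes: List[State], fanout: int, rng: random.Random) -> None:
    """Each node pushes its state to `fanout` randomly chosen other nodes."""
    for i, state in enumerate(nodes):
        peers = [j for j in range(len(nodes)) if j != i]
        for j in rng.sample(peers, min(fanout, len(peers))):
            merge(nodes[j], state)


def rounds_until_converged(
    nodes: List[State], fanout: int = 2, seed: int = 0, max_rounds: int = 100
) -> int:
    """Run gossip rounds until every node holds identical state."""
    rng = random.Random(seed)
    for round_no in range(1, max_rounds + 1):
        gossip_round(nodes, fanout, rng)
        if all(state == nodes[0] for state in nodes):
            return round_no
    raise RuntimeError("nodes did not converge within max_rounds")
```

Each node talks to only a few peers per round, so per-node traffic stays bounded as the cluster grows, at the cost of updates taking several rounds to reach everyone.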
Hi, |
Hi @pranav4501, yes, I started working on this 2 weeks ago but didn't have time in between: https://github.com/dhruvmalik007/exo/tree/feat/adding_KV_broadcast_cache I will try to finish by the end of this week, but if you are interested in implementing it before then, I can assign you.
Hi @dhruvmalik007 , |
Hello, @dhruvmalik007 |
Hi, you mean that in the |
#23 (comment)
Perhaps after each inference, we synchronise the full KV cache between all nodes. This should be fairly straightforward: we can broadcast the entire cache.
This would allow for saving context even when a node goes down.
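A toy sketch of the proposal above, assuming an in-process "cluster" and a cache that is just a list of floats. The `Node` class and its methods are hypothetical stand-ins and do not reflect exo's actual classes or networking.

```python
import json
from typing import Dict, List


class Node:
    """Hypothetical stand-in for a cluster node; not exo's actual Node class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["Node"] = []
        # Caches for requests this node is actively serving.
        self.caches: Dict[str, List[float]] = {}
        # Last full snapshot received from any peer, per request.
        self.replicas: Dict[str, bytes] = {}

    def run_inference(self, request_id: str, token_value: float) -> None:
        """Toy inference step: grow the cache, then broadcast the whole thing."""
        cache = self.caches.setdefault(request_id, [])
        cache.append(token_value)
        snapshot = json.dumps(cache).encode("utf-8")
        for peer in self.peers:
            peer.receive_snapshot(request_id, snapshot)

    def receive_snapshot(self, request_id: str, snapshot: bytes) -> None:
        """Store the latest full snapshot for a request served elsewhere."""
        self.replicas[request_id] = snapshot

    def take_over(self, request_id: str) -> bool:
        """Restore a request's cache from the last snapshot, e.g. after a peer dies."""
        snapshot = self.replicas.get(request_id)
        if snapshot is None:
            return False
        self.caches[request_id] = json.loads(snapshot.decode("utf-8"))
        return True
```

If node `a` broadcasts to node `b` after every step and then goes down, `b.take_over(request_id)` restores the context up to the last completed step. Sending the entire cache each time is simple, but the payload grows with context length, so this trades bandwidth for simplicity.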