feat: Model merging with delta objects #4177
…into model_delta_merging
It looks like `model_delta` is just a wrapper around a `workspace` that is used to communicate that this holds a delta.

Wondering if the `+`/`-` operators should be operators of the `workspace` class, leading to a more intuitive API for those operators. It would also mean that we avoid wrapping `workspace`s into deltas in order to merge them.

If we need to indicate that a workspace is a diff, we could add a bool field indicating that it is a diff, or maybe `model_delta` could be derived from `workspace`.

Thoughts?
Yes, as it is right now [...]. The [...]. I specifically do not want [...].

The case of "wrapping workspaces into deltas in order to merge them" isn't as bad as it seems at first. In federated learning, we would be directly receiving deltas to pass to the merge function, so having to merge workspaces via deltas would be the less common use case.

It also simplifies the merge logic inside each reduction, because there is no longer a requirement to check for a base workspace and subtract it if it exists.
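To make the last point concrete, here is a rough sketch of the two shapes a per-reduction merge could take. It is not the code from this PR: `weights`, `merge_models_sketch`, and `merge_deltas_sketch` are hypothetical names, a reduction's state is reduced to a plain vector of floats, and combining by summation is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the state of one reduction.
using weights = std::vector<float>;

// Workspace-style merge: each reduction must handle the optional base itself.
inline weights merge_models_sketch(const weights* base, const std::vector<weights>& models)
{
  assert(!models.empty());
  weights out(models.front().size(), 0.f);
  assert(base == nullptr || base->size() == out.size());
  for (const auto& m : models)
  {
    assert(m.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i)
    {
      // Subtract the base, if present, so shared state is not counted once per model.
      out[i] += (base != nullptr) ? m[i] - (*base)[i] : m[i];
    }
  }
  if (base != nullptr)
  {
    // Add the shared base back exactly once.
    for (std::size_t i = 0; i < out.size(); ++i) { out[i] += (*base)[i]; }
  }
  return out;
}

// Delta-style merge: inputs are already differences, so there is no base to check.
inline weights merge_deltas_sketch(const std::vector<weights>& deltas)
{
  assert(!deltas.empty());
  weights out(deltas.front().size(), 0.f);
  for (const auto& d : deltas)
  {
    assert(d.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i) { out[i] += d[i]; }
  }
  return out;
}
```

In the delta-style version the base check disappears from the merge itself; whoever produces the deltas has already done the subtraction.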
- `VW::model_delta` object that internally keeps a `VW::workspace`
- `+` and `-` operators such that deltas are created by subtracting two workspaces, and can be added to update a workspace
- `merge_delta` function that combines multiple deltas into a single delta
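The intended semantics of these three pieces can be illustrated with a toy model. This is a minimal sketch, not the `VW::model_delta` implementation: `toy_workspace`, `toy_delta`, and `merge_toy_deltas` are hypothetical names, a workspace is reduced to a dense weight vector, and combining deltas by plain averaging is an assumption (the real merge logic is defined per reduction).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a workspace: just a dense weight vector.
struct toy_workspace
{
  std::vector<float> weights;
};

// Hypothetical stand-in for a delta: per-weight differences.
struct toy_delta
{
  std::vector<float> diff;
};

// Subtracting two workspaces creates a delta.
inline toy_delta operator-(const toy_workspace& updated, const toy_workspace& base)
{
  assert(updated.weights.size() == base.weights.size());
  toy_delta d;
  d.diff.reserve(base.weights.size());
  for (std::size_t i = 0; i < base.weights.size(); ++i)
  {
    d.diff.push_back(updated.weights[i] - base.weights[i]);
  }
  return d;
}

// Adding a delta to a workspace yields the updated workspace.
inline toy_workspace operator+(const toy_workspace& base, const toy_delta& d)
{
  assert(base.weights.size() == d.diff.size());
  toy_workspace out = base;
  for (std::size_t i = 0; i < out.weights.size(); ++i) { out.weights[i] += d.diff[i]; }
  return out;
}

// Combine several deltas into one. Plain averaging is an assumption of this sketch.
inline toy_delta merge_toy_deltas(const std::vector<toy_delta>& deltas)
{
  assert(!deltas.empty());
  toy_delta merged;
  merged.diff.assign(deltas.front().diff.size(), 0.f);
  for (const auto& d : deltas)
  {
    assert(d.diff.size() == merged.diff.size());
    for (std::size_t i = 0; i < merged.diff.size(); ++i) { merged.diff[i] += d.diff[i]; }
  }
  for (auto& v : merged.diff) { v /= static_cast<float>(deltas.size()); }
  return merged;
}
```

With these definitions, a caller holding a base and two locally trained copies would compute `base + merge_toy_deltas({a - base, b - base})`, while a caller that already receives deltas can skip the subtraction and pass them to the merge directly.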