Document version fetching #218
Comments
Great proposal, @alecgibson!
API
An alternative would be to support passing options to
Implementation details
Data flow
It's not completely clear to me what data you're planning to send to the client when a specific version is requested. It would be nice if the server sent the requested snapshot, rather than the operations leading up to that snapshot. It could drastically limit the amount of data that needs to be transferred and would enable adding the optimizations that are currently out of scope.
Discussion
Read-only snapshots
Even if a "historical" snapshot is loaded into a
It would avoid the problem of shared documents mentioned below. It would also prevent
Overall I like this idea. What kind of API do you have in mind? Whatever it is, I think it should be consistent with
Sharing of Documents
Another thing to consider is sharing of the
Doing the work
I'm certainly available to help with the implementation. :-) |
Okay, so I definitely think we should move this logic away from the
New API suggestion
How about we put it on
This will return a new
class Snapshot {
  data: any;
  id: string;
  collection: string;
  type: any;
  version: number;
  timestamp: Date; // NB: Not actually currently on Doc, but would be useful
}
Naming
I'm aware that
That being said, I'm all ears for an alternative name for this object to reduce any confusion or assuage anyone's fears. Possible alternatives:
|
Looks great to me! Just to clarify the API, is it like this: What do you think of |
Yes, that's exactly what I'm picturing for API. I was thinking that a
Of course internally we could convert that |
I'd prefer to use a number internally (client-server communication) to avoid the overhead of generating and parsing date strings. In the public API maybe we could support a |
Yeah sounds good. We can discuss time interface properly on the pull request. I'll probably start with just the |
Number and Date would be a good solution. Personally, I would not add momentjs: it is a great library but, like many others, it is overused, and I would not encourage people to use it or, even worse, introduce it as a dependency of sharedb, even as a dev dependency. |
@gkubisa I've started looking into this, and thinking about it a bit more. I'm not sure that defining the version through an object is the correct thing to do. For example, what happens when a user calls the function with the following:
{
  timestamp: new Date("2018-06-29"),
  version: 4
}
We could return an error, but if the point is extensibility, then error checking surely gets more and more annoying the more ways we have of defining the version:
if ( (version && timestamp) || (version && newDefinition) || (timestamp && newDefinition) ) return callback(error);
If we really don't want to have multiple methods, how about we just have our argument accept either a version
Connection.prototype.getSnapshot = function(collection, id, version, callback) {
  if (typeof version === 'number') {
    // it's a version number
  } else if (version instanceof Date) {
    // it's a date
  } else {
    return callback({ message: 'Version must be a version number or a Date object' });
  }
};
While we're here - how do error codes work? If an error doesn't fit into an existing code, do I just claim a new one and record it in the |
Also, regarding deleted documents, what's the expected behaviour? If a document is currently deleted (i.e. its final op is a deletion op), do we tell the client that the document is deleted, regardless of the version they try to fetch? |
Okay, I've raised a PR here: #220 |
This feature was added in this PR: #220 |
Problem statement
As a ShareDb client, I want to fetch a read-only snapshot of my document at a given version number or timestamp.
Motivation
By the time we're going to all the effort of storing a set of deltas between document versions, it's only natural that a client would wish to leverage this power to view a document at any point in its history.
The problem statement mentions fetching a document at a given timestamp, because it is far more natural to request a document at a particular time, than at a given (arbitrary) version number.
API
The proposed API is to add two methods to the Doc class:
Doc.prototype.fetchVersion(version, callback) takes version, which is a number, and recreates the snapshot up to that version number. The result is stored in doc.data
Doc.prototype.fetchAtTime(time, callback) takes time, which is a Date, and recreates the snapshot using ops whose timestamps are up-to-and-including that Date. The result is stored in doc.data
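To make the two lookups above concrete, here is a minimal sketch of rebuilding a snapshot by replaying ops up to a version number or up to a Date. It is not the proposed implementation: the helper names (buildSnapshot, snapshotAtVersion, snapshotAtTime), the toy counterType, and the op shape ({ v, create, op, del, m: { ts } }) are all assumptions made for illustration.

```javascript
// Toy OT type (hypothetical): the snapshot is a number and each op adds to it.
const counterType = {
  create: (initial) => initial || 0,
  apply: (snapshot, op) => snapshot + op,
};

// Replays ops in order and stops as soon as shouldStop(op) is true.
// The op shape ({ v, create, op, del, m: { ts } }) is an assumption of this sketch.
function buildSnapshot(ops, type, shouldStop) {
  let data;
  let version = 0;
  for (const op of ops) {
    if (shouldStop(op)) break;
    if (op.create) data = type.create(op.create.data);
    else if (op.del) data = undefined;
    else data = type.apply(data, op.op);
    version = op.v + 1; // applying the op at version v yields snapshot version v + 1
  }
  return { data, version };
}

// Snapshot at a version number: apply every op whose version is below it.
function snapshotAtVersion(ops, type, version) {
  return buildSnapshot(ops, type, (op) => op.v >= version);
}

// Snapshot at a time: apply every op stamped up to and including that Date.
function snapshotAtTime(ops, type, time) {
  return buildSnapshot(ops, type, (op) => op.m.ts > time.getTime());
}

// Example op log, oldest first, with millisecond timestamps.
const ops = [
  { v: 0, create: { data: 1 }, m: { ts: 1000 } },
  { v: 1, op: 2, m: { ts: 2000 } },
  { v: 2, op: 3, m: { ts: 3000 } },
];
```

With this log, asking for version 2 and asking for the time of the second op both stop before the third op, so the two lookups land on the same snapshot.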
Implementation details
Data flow
The request for the document version will be submitted like the existing fetch function - by submitting an event to the server, and attaching a callback.
The message will be picked up by Agent._handleMessage, which will then leverage Backend.getOps to fetch the requested ops.
We may need to make a small change to Backend.getOps to let us request metadata from the backend using the options object. As discussed in this Pull Request, this will be done in such a way that keeps the options object out of the public API (probably by creating an internal Backend._getOps method that can take an options object, and calling it with null from Backend.getOps).
Discussion
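Going back to the Backend.getOps change described under Data flow, the public/internal split could look roughly like the sketch below. SketchBackend and FakeDb are hypothetical stand-ins with simplified signatures (the real methods take more arguments), and the metadata option name is an assumption.

```javascript
// Hypothetical in-memory store standing in for a real database adapter.
class FakeDb {
  constructor(ops) {
    this.ops = ops;
  }

  // Returns ops with from <= v < to (no upper bound when `to` is null).
  // Metadata is stripped unless options.metadata is truthy.
  getOps(from, to, options, callback) {
    const includeMetadata = Boolean(options && options.metadata);
    const result = this.ops
      .filter((op) => op.v >= from && (to == null || op.v < to))
      .map((op) => (includeMetadata ? op : { v: op.v, op: op.op }));
    callback(null, result);
  }
}

// Simplified stand-in for Backend.
class SketchBackend {
  constructor(db) {
    this.db = db;
  }

  // Public method: no options parameter, so the public API stays unchanged.
  getOps(from, to, callback) {
    this._getOps(from, to, null, callback);
  }

  // Internal method: accepts options such as { metadata: true }.
  _getOps(from, to, options, callback) {
    this.db.getOps(from, to, options, callback);
  }
}
```

Existing callers of getOps keep working unchanged, while internal code that needs op timestamps can call _getOps with the metadata option.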
Read-only snapshots
Using the Doc class is potentially a leaky concern, given that it will also have Doc.prototype.submitOp, which doesn't really make sense when fetching an historical document.
A possible alternative could be to expose these functions instead on the Connection class? That way it should be very clear that the consumer is receiving a snapshot, and not a full-blown Doc?
Out of scope
The following possibilities are deemed out-of-scope for the initial solution.
Optimising for reversible types
Making a type reversible is optional. As such, any solution must at least be able to construct a document from its initial version, and build up. However, given the nature of documents, it is highly likely that users will wish to return to more recent versions, where it will probably be faster to start from the current version and work backwards.
This is deemed out-of-scope.
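For illustration only, working backwards with a reversible type might look like the sketch below. It assumes the type exposes invert(op) alongside apply, that ops are sorted oldest first, and that no create or del ops fall inside the rewound range; rewind and reversibleCounter are hypothetical names.

```javascript
// Toy reversible type (hypothetical): ops are numbers added to the snapshot.
const reversibleCounter = {
  apply: (snapshot, op) => snapshot + op,
  invert: (op) => -op,
};

// Rewinds `current` ({ data, version }) back to `version` by applying
// inverted ops newest-first. `ops` is sorted oldest first.
function rewind(current, ops, type, version) {
  let data = current.data;
  for (let i = ops.length - 1; i >= 0; i--) {
    const op = ops[i];
    if (op.v < version) break; // everything older is already part of the target
    if (op.v >= current.version) continue; // not yet applied to `current`
    data = type.apply(data, type.invert(op.op));
  }
  return { data, version };
}
```

Rewinding only touches the ops between the target version and the current one, which is why it should be cheaper than a full replay when the target is recent.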
Caching ops
Fetching ops can be expensive. Ideally we would cache the ops for a given document, and - so long as the requested version/timestamp is lower than the latest op - then we could simply read ops back from the cache.
This is deemed out-of-scope.
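As a rough idea of what such a cache could look like, the sketch below keeps ops per document and only fetches the ones it is missing. OpCache and its fetchOps callback are hypothetical, the fetch is synchronous for brevity, and it assumes op versions are contiguous from 0 so that an op's array index equals its version.

```javascript
class OpCache {
  // fetchOps is a hypothetical callback: (id, from) => array of ops with v >= from.
  constructor(fetchOps) {
    this.fetchOps = fetchOps;
    this.cache = new Map();
  }

  // Returns ops with version below `to`, fetching only what the cache lacks.
  getOps(id, to) {
    const cached = this.cache.get(id) || [];
    if (cached.length < to) {
      cached.push(...this.fetchOps(id, cached.length));
      this.cache.set(id, cached);
    }
    return cached.slice(0, to);
  }
}
```

A request for a version at or below the highest cached op is then served without touching the backend at all.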
Other starting snapshot optimisations
It could be possible to fetch the latest create op, and start building from there, instead of the very beginning. It might also (theoretically) be possible to store intermittent snapshots of the document for faster reconstruction at a trade-off with space.
These sorts of optimisations are also deemed out-of-scope.
Doing the work
Given that we need this functionality, I'm happy to undertake the majority of the work on this, but I haven't developed in this codebase before, so may need some assistance (especially because I haven't really worked with all the features of ShareDb, such as projections).