Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch snapshot by time #262

Merged
merged 3 commits into from
Jan 30, 2019
Merged

Fetch snapshot by time #262

merged 3 commits into from
Jan 30, 2019

Conversation

alecgibson
Copy link
Collaborator

This change adds the ability fetch a snapshot by time. The motivation
for this is that fetching a document by time is quite a "natural" way
to think about document history, and allows us to - for example - fetch
multiple documents as they were at a given time, without having to
look up their exact version numbers first.

We add a new Connection.fetchSnapshotByTimestamp method, which
follows a very similar route to Connection.fetchSnapshot, and where
possible, as much code is re-used as possible:

  • both methods use a subclassed child of SnapshotRequest
  • both methods have their requests handled by the same machinery in
    Connection
  • both methods in the Backend have ops applied by a common method,
    but use their own methods for calls to middleware

In order to make this feature possible at scale, this change also adds
two new methods to the MilestoneDB interface:

  • getMilestoneSnapshotAtOrBeforeTime
  • getMilestoneSnapshotAtOrAfterTime

These methods are used to fetch milestone snapshots either side of the
requested timestamp, which means we only need to fetch the ops between
the two of them to reach the desired timestamp.

In the case where a milestone database is not being used, then fetching
a snapshot by timestamp is still possible, but it will fetch all the ops
for a document, and keep applying them from v0 until the timestamp is
reached, which is not particularly scalable.

@coveralls
Copy link

coveralls commented Nov 30, 2018

Coverage Status

Coverage increased (+0.2%) to 95.848% when pulling 8e46271 on snapshot-by-timestamp into 95c81b8 on master.

@alecgibson alecgibson force-pushed the snapshot-by-timestamp branch 4 times, most recently from bd8763f to 513f6b6 Compare December 3, 2018 15:08
This change adds the ability fetch a snapshot by time. The motivation
for this is that fetching a document by time is quite a "natural" way
to think about document history, and allows us to - for example - fetch
multiple documents as they were at a given time, without having to
look up their exact version numbers first.

We add a new `Connection.fetchSnapshotByTimestamp` method, which
follows a very similar route to `Connection.fetchSnapshot`, and where
possible, as much code is re-used as possible:

  - both methods use a subclassed child of `SnapshotRequest`
  - both methods have their requests handled by the same machinery in
    `Connection`
  - both methods in the `Backend` have ops applied by a common method,
    but use their own methods for calls to middleware

In order to make this feature possible at scale, this change also adds
two new methods to the `MilestoneDB` interface:

  - `getMilestoneSnapshotAtOrBeforeTime`
  - `getMilestoneSnapshotAtOrAfterTime`

These methods are used to fetch milestone snapshots either side of the
requested timestamp, which means we only need to fetch the ops between
the two of them to reach the desired timestamp.

In the case where a milestone database is not being used, then fetching
a snapshot by timestamp is still possible, but it will fetch all the ops
for a document, and keep applying them from v0 until the timestamp is
reached, which is not particularly scalable.
Copy link
Contributor

@ericyhwang ericyhwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the notes I took during the PR review meeting today.

I haven't looked at the tests yet, but the actual code itself looks great. Glad you could find a clean abstraction boundary.

lib/backend.js Outdated Show resolved Hide resolved
lib/backend.js Outdated Show resolved Hide resolved
lib/client/connection.js Outdated Show resolved Hide resolved
This change removes or renames `shouldBreak` calls. In `Backend`, for
clarity we instead pre-filter ops, and just pass around the ops we want
to be applied to a snapshot.

In the `MemoryMilestoneDB`, these functions are extracted and renamed to
more descriptive break condition names.
The function for building a snapshot from ops is useful, and has no
dependencies on `Backend`. This change moves it into the `ot` module,
where it will be a bit more discoverable and can be reused.
@alecgibson
Copy link
Collaborator Author

@nateps / @ericyhwang as discussed, I've moved the snapshot building function into ot.js.

@ericyhwang
Copy link
Contributor

LGTM from me and Nate

@ericyhwang ericyhwang merged commit f507ca1 into master Jan 30, 2019
@alecgibson alecgibson deleted the snapshot-by-timestamp branch January 30, 2019 17:27
alecgibson pushed a commit to share/sharedb-milestone-mongo that referenced this pull request Jan 30, 2019
This change adds [new milestone database methods][1] for fetching
milestones by timestamp.

In order to make this performant, this change also adds a new index to
the `m.mtime` (modified timestamp) field.

[1]: share/sharedb#262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants