Similar Versioning Pattern #3

Closed
jorroll opened this issue Oct 4, 2017 · 2 comments

Comments

@jorroll

jorroll commented Oct 4, 2017

Hi There!

I just learned about this project, and I was immediately struck by how similar your versioning pattern is to the one I came up with for a volunteer project I'm working on. I thought I'd share my own pattern with you, so you could see how someone else solved a very similar problem. A few caveats: while I've been working on my pattern for about 4 months now, like yours, mine is still a work in progress. At this point, I'm pretty confident the decisions I've made are right for my project, but my project has also not reached the production stage yet.

Summary

My goal is to provide a deep level of "undo" / "undelete" functionality throughout my app:

  1. I want to be able to track changes to node properties, but not necessarily track changes to every property on a node
  2. I want to be able to "undelete" destroyed relationships
  3. I want to be able to "undelete" destroyed nodes
  4. I want to only retain the diffs between versions, so as to not take up too much database space.

These needs resulted in a few distinct database patterns:

Pattern 1: for tracking changes to node properties

[Image: screen shot 2017-10-03 at 6 05 16 pm]

A tracked node has a constellation of associated :History nodes (node)-[:HISTORY]->(:History). It has one synced history node (node)-[:SYNCED_HISTORY]->(:History). I elected to not bother with (:History)-[:PREVIOUS]->(:History) relations. Instead, each :History node has a created_at timestamp which orders them.

Like your implementation, the node being tracked always contains the current state. Mine differs from yours, however, in that the :History node has a single log property holding a JSON string with the changes needed to move from that state to the previous state. That is, to roll back the current state to its previous state, you would load the -[:SYNCED_HISTORY]-> node and merge the changes in its serialized log property into the current node. The return value would be the previous state.
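To make the reverse-diff idea concrete, here is a minimal in-memory sketch in Python (not the Ruby / Neo4jrb code mentioned later in this post). The names TRACKED, update_node and rollback are hypothetical, nodes are plain dicts, and a JSON null in the log is assumed to mean "this property did not exist before".

```python
import json

# Assumption: only these properties are versioned; everything else is
# updated without leaving a trace in the log.
TRACKED = {"name", "email"}

def update_node(node, changes):
    """Apply `changes` to `node` in place and return the serialized reverse diff."""
    reverse = {}
    for key, new_value in changes.items():
        old_value = node.get(key)
        if key in TRACKED and old_value != new_value:
            # Store the OLD value; None (JSON null) marks "was absent".
            reverse[key] = old_value
        node[key] = new_value
    return json.dumps(reverse, sort_keys=True)

def rollback(node, log):
    """Return the previous state by merging a serialized reverse diff into `node`."""
    previous = dict(node)
    for key, old_value in json.loads(log).items():
        if old_value is None:
            previous.pop(key, None)
        else:
            previous[key] = old_value
    return previous

node = {"name": "Ann", "email": "ann@example.org", "last_seen": 1}
log = update_node(node, {"name": "Anna", "last_seen": 2})
print(log)                  # {"name": "Ann"}
print(rollback(node, log))  # name is restored; untracked last_seen stays at 2
```

In the graph, the returned string would be what gets stored in the log property of the :History node behind -[:SYNCED_HISTORY]->.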

In common usage, I expect the user will either want to undo the most recent action (by grabbing the -[:SYNCED_HISTORY]->), or they will want to view all the changes in chronological order and revert to an arbitrary one. Retaining only diffs means that, in order to move from the current state to a state several versions back, I'll need to load every intervening version and apply the changes in order. I think this is an acceptable performance trade-off because I do not expect users to take advantage of this feature often.
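Moving back several versions could look roughly like the Python sketch below. It assumes each :History node is loaded as a dict with created_at and log keys, that created_at values are unique, and that logs only restore old values (property removal is ignored here). revert_before is a hypothetical name.

```python
import json

def revert_before(node, histories, target_created_at):
    """Return the state as it was just before the change recorded at
    `target_created_at`, by undoing changes from newest to oldest."""
    state = dict(node)
    newest_first = sorted(histories, key=lambda h: h["created_at"], reverse=True)
    for history in newest_first:
        if history["created_at"] < target_created_at:
            break  # everything older than the target stays applied
        state.update(json.loads(history["log"]))
    return state

current = {"title": "v3", "body": "c"}
histories = [  # deliberately unordered; created_at decides the order
    {"created_at": 2, "log": '{"title": "v2", "body": "b"}'},
    {"created_at": 1, "log": '{"title": "v1"}'},
]
print(revert_before(current, histories, 2))  # {'title': 'v2', 'body': 'b'}
print(revert_before(current, histories, 1))  # {'title': 'v1', 'body': 'b'}
```

The cost is linear in the number of versions between the current state and the target, which is the trade-off described above.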

Pattern 2: for tracking previously destroyed nodes

To track destroyed nodes, every node which supports tracking has either an additional :Current label or an additional :Destroyed label. These nodes also have an updated_at timestamp. How this is used: say you have a :ShoppingCart and the :ShoppingCart has many :Items. To find all of the current items: MATCH (:ShoppingCart)-->(item:Item:Current) RETURN item. To find the most recently destroyed item: MATCH (:ShoppingCart)-->(item:Item:Destroyed) RETURN item ORDER BY item.updated_at DESC LIMIT 1

Neo4j makes tracking destroyed nodes very nice, because you can just swap out the :Current label for a :Destroyed label and otherwise retain all the proper relations & properties of the node.
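The label swap can be illustrated with a small in-memory Python sketch. Nodes are dicts with a labels set, and destroy, undelete and latest_destroyed are hypothetical helper names; in the real graph these would be Cypher label updates and the ORDER BY query shown above.

```python
def destroy(node, now):
    """Soft-delete: swap :Current for :Destroyed, keep everything else."""
    node["labels"] = (node["labels"] - {"Current"}) | {"Destroyed"}
    node["updated_at"] = now

def undelete(node, now):
    """Reverse of destroy: swap :Destroyed back to :Current."""
    node["labels"] = (node["labels"] - {"Destroyed"}) | {"Current"}
    node["updated_at"] = now

def latest_destroyed(items):
    """Mirror of ORDER BY updated_at DESC LIMIT 1 over :Destroyed items."""
    destroyed = [n for n in items if "Destroyed" in n["labels"]]
    return max(destroyed, key=lambda n: n["updated_at"], default=None)

apple = {"name": "apple", "labels": {"Item", "Current"}, "updated_at": 1}
pear = {"name": "pear", "labels": {"Item", "Current"}, "updated_at": 2}
plum = {"name": "plum", "labels": {"Item", "Current"}, "updated_at": 3}
cart_items = [apple, pear, plum]

destroy(apple, now=5)
destroy(plum, now=7)
print(latest_destroyed(cart_items)["name"])  # plum

undelete(plum, now=9)
print(latest_destroyed(cart_items)["name"])  # apple
```

Properties and relationships are untouched by either operation, which is why an undelete needs nothing more than the label swap.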

Pattern 3: for tracking previously destroyed relationships / changes to relationships

To track relationships, it is necessary to model the relationship with a node: (node)-[:RELATION]->(:Relation:Current)-[:RELATION]->(second_node). To destroy the relation but keep the historical record of it: (node)-[:RELATION]->(:Relation:Destroyed)-[:RELATION]->(second_node). This also allows you to track changes to the relation's properties, i.e. (:Relation:Current)-[:HISTORY]->(:History).

:History nodes associated with a :Relation are almost the same as history nodes associated with a standard node (i.e. they also have a JSON serialized log property), except that they can also have a single -[:RELATION_HISTORY]-> relation, which allows me to track changes to the relation itself. For example, say you have (node_a)-[:RELATION]->(:Relation:Current)-[:RELATION]->(node_b), but then you update it to (node_a)-[:RELATION]->(:Relation:Current)-[:RELATION]->(node_c). You want to track the change from (node_a)-->()-->(node_b) to (node_a)-->()-->(node_c). The persisted pattern, including the current :History node, is:

(node_a)-[:RELATION]->(rel :Relation:Current)-[:RELATION]->(node_c)
(rel)-[:SYNCED_HISTORY]->(:History)-[:RELATION_HISTORY]->(node_b)

The -[:RELATION_HISTORY]-> records the step necessary to move backwards to the previous state (i.e. the previous node this relation was pointed at).
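As a rough Python sketch of the same bookkeeping: the :Relation node is a dict, relation_history stands in for the -[:RELATION_HISTORY]-> pointer, and retarget / undo_retarget are hypothetical names. Dropping the :History entry once it has been undone is an assumption of this sketch, not something stated above.

```python
import itertools

_clock = itertools.count(1)  # stand-in for created_at timestamps

def retarget(rel, new_target):
    """Point the relation at `new_target`, remembering the old target."""
    history = {
        "created_at": next(_clock),
        "log": "{}",                        # no property changes in this example
        "relation_history": rel["target"],  # the step back: the previous target
    }
    rel["histories"].append(history)
    rel["synced_history"] = history
    rel["target"] = new_target

def undo_retarget(rel):
    """Follow the synced history's relation_history back one step."""
    history = rel["synced_history"]
    if history is None:
        return
    rel["target"] = history["relation_history"]
    rel["histories"].remove(history)  # assumption: an undone entry is consumed
    rel["synced_history"] = max(
        rel["histories"], key=lambda h: h["created_at"], default=None
    )

rel = {"source": "node_a", "target": "node_b",
       "labels": {"Relation", "Current"},
       "histories": [], "synced_history": None}

retarget(rel, "node_c")
print(rel["target"], rel["synced_history"]["relation_history"])  # node_c node_b

undo_retarget(rel)
print(rel["target"])  # node_b
```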

[Image: screen shot 2017-10-03 at 6 14 30 pm]

Anyway, there are a few more details, but this is the bulk of it.

All of my change tracking logic has been written in Ruby, specifically for the Neo4jrb gem, so I won't share it here. This being said, if someone's interested, I'm happy to. (At some point when I have the time, probably next year, I plan on open sourcing my module and sharing it with @cheerfulstoic & the Neo4jrb project. It definitely doesn't fit into that project's "standard" library, but I could see a lot of people benefitting from it as an add-on)

Anyway, I'm not sure how clear all of this is. Let me know if you'd like me to add more info / clarify anything. I'm really intrigued by your project. I'd be interested to know if you considered a pattern similar to mine and decided against it for some reason? Also, am I correct in thinking that this library doesn't currently support tracking the destroyed / non-destroyed state of an entire node? Or of tracking relationship versions?

PS: I notice you have a "use cases" wiki. I would vote to add CRM (e.g. Salesforce, which, if you're not familiar, is a fancy address book) as a great use case. My use case is a volunteer coordination application (amazingly, all the current options available to nonprofits suck).

@mfalcier
Member

mfalcier commented Oct 4, 2017

Hi @thefliik !
Thank you for dedicating some of your time to our project! 👍
I think I've understood your model, which is an interesting one with regard to tracking deleted nodes. Our Core Versioner was born to give people an out-of-the-box tool for versioning the usual Entity-State model.
We picked a different model from yours because, if we wanted to list the States related to our Entity, the linked list was the best option, thanks to Neo4j's index-free adjacency. Actually, we didn't think about storing timestamps inside our State nodes, since the linked list simply came out first as our final solution.
We didn't focus on deleting nodes, since we didn't want to create a model which forces users into a pattern or limits their creativity (I'm not saying yours does, since, if I understood correctly, your model was aiming at that specific use case): just by giving users the chance to add custom labels to a State, they can give different meanings to different States.
Just a few days ago we scheduled our next step, since we want to improve our model by also versioning relationships: we will create a project here on GitHub to publish some public milestones.
Thank you also for the hint about the use case, we will take a look at that!

@jorroll
Author

jorroll commented Oct 18, 2017

Hey, I got busy with another project, but thanks for the prompt reply! What you've said makes sense. I'll definitely keep an eye on this library.
