MongoDB-Gremlin compiles MongoDB insert and query documents to Gremlin bytecode. In doing so, JSON documents are embedded in the underlying Apache TinkerPop-enabled graph system.
This project currently serves as a proof-of-concept and should not be used in any real, production capacity. It was created to provide a technical foundation for the The Von Gremlin Architecture article. If anyone is interested in taking over this project and flushing out the remaining CRUD operations of MongoDB (e.g. delete and update), please feel free to do so. There are a few peculiarities that were introduced for the sake of providing a quick POC. These should be rectified in a more thorough implementation and can be discussed with the interested developer.
NOTE: Please see SQL-Gremlin and SPARQL-Gremlin for other examples of using common data structures and query languages with Gremlin.
The JSON-to-graph encoding is as follows:
-
Every JSON object is a vertex.
-
Every JSON primitive field is a vertex property.
-
Every JSON array must either be all primitives or all objects.
-
Primitive arrays are represented using vertex multi-properties.
-
Object arrays are represented using edges to vertices.
-
-
JSON arrays may not contain JSON arrays.
It is important to note that appropriate graph-based indices should be defined to enable quick retrieval based on the abstract structure defined above.
~/software/tinkerpop bin/gremlin.sh
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :install com.datastax.tinkerpop mongodb-gremlin 0.1-SNAPSHOT
==>Loaded: [com.datastax.tinkerpop, mongodb-gremlin, 0.1-SNAPSHOT]
gremlin> import com.datastax.tinkerpop.mongodb.MongoDBTraversalSource
==>...
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> db = graph.traversal(MongoDBTraversalSource)
==>mongodbtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> db.insertOne("""
......1> {
......2> "~id" : "71223bf3-9dcc-4de1-b95a-13fdb8aba9e0",
......3> "~label" : "person",
......4> "name" : "Gremlin",
......5> "hobbies" : ["traversing", "reflecting"],
......6> "birthyear" : 2009,
......7> "alive" : true,
......8> "languages" : [
......9> {
.....10> "name" : "Gremlin-Java",
.....11> "language" : "Java8"
.....12> },
.....13> {
.....14> "name" : "Gremlin-Python",
.....15> "language" : "Python"
.....16> },
.....17> {
.....18> "name" : "Ogre",
.....19> "language" : "Clojure"
.....20> }
.....21> ]
.....22> }
.....23> """)
==>[acknowledged:true,insertId:71223bf3-9dcc-4de1-b95a-13fdb8aba9e0]
gremlin> db.find('{"name" : "Gremlin"}')
==>[~id:71223bf3-9dcc-4de1-b95a-13fdb8aba9e0,
~label:person,
birthyear:2009,
alive:true,
hobbies:[traversing,reflecting],
name:Gremlin,
languages:
[[~label:vertex,~id:2,name:Gremlin-Java,language:Java8],
[~label:vertex,~id:6,name:Gremlin-Python,language:Python],
[~label:vertex,~id:10,name:Ogre,language:Clojure]]]
Finally, it is important to note that because the JSON document is embedded in the graph as vertices, edges, and properties,
it is possible to use GraphTraversal
to query the graph structure as well.
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:4 edges:3], standard]
gremlin> g.V().has('name','Gremlin').out('languages').has('name','Ogre').values('language')
==>Clojure