Skip to content

Latest commit

 

History

History
71 lines (63 loc) · 3.3 KB

versioning.md

File metadata and controls

71 lines (63 loc) · 3.3 KB

Versioning

This is an experimental feature. Please try it out in a test environment and provide feedback.

Since Datahike has a persistent memory model it can be used similarly to git. While using databases with different underlying stores is the general way to combine data in Datahike and should be preferred for separate data sets, in cases where you want to evolve a single database the structural sharing of its indices has unique advantages. Git is efficient and fast because it does not need to copy shared data on each operation. So in cases where you want to evolve a database with new data, but don't want to write it directly into the main database, you can branch! and evolve a copy of the database that behaves like the main branch under :db. After you have evolved the database you can decide what data to retain and then merge! it back. You can also take any in-memory DB value and dump it into a durable branch with force-branch!. To inspect the write history use branch-history.

You can see the following example as an example,

(require '[superv.async :refer [<?? S]]
         '[datahike.api :as d]
         '[datahike.experimental.versioning :refer [branch! branch-history delete-branch! force-branch! merge!
                                                    branch-as-db commit-as-db parent-commit-ids]])

(let [cfg    {:store              {:backend :file
                                   :path    "/tmp/dh-versioning-test"}
              :keep-history?      true
              :schema-flexibility :write
              :index              :datahike.index/persistent-set}
      conn   (do
              (d/delete-database cfg)
              (d/create-database cfg)
              (d/connect cfg))
      schema [{:db/ident       :age
               :db/cardinality :db.cardinality/one
               :db/valueType   :db.type/long}]
      _      (d/transact conn schema)
      store  (:store @conn)]
  (branch! conn :db :foo) ;; new branch :foo, does not create new commit, just copies
  (let [foo-conn (d/connect (assoc cfg :branch :foo))] ;; connect to it
    (d/transact foo-conn [{:age 42}]) ;; transact some data
    ;; extracted data from foo by query
    ;; ...
    ;; and decide to merge it into :db
    (merge! conn #{:foo} [{:age 42}]))
  (count (parent-commit-ids @conn)) ;; => 2, as :db got merged from :foo and :db
  ;; check that the commit stored is the same db as conn
  (= (commit-as-db store (commit-id @conn)) (branch-as-db store :db) @conn) ;; => true
  (count (<?? S (branch-history conn))) ;; => 4 commits now on both branches
  (force-branch! @conn :foo2 #{:foo}) ;; put whatever DB value you have created in memory
  (delete-branch! conn :foo))

Here we create a database as usual, but then we create a branch :foo, write to it and then merge it back. A simple query to extract all data in transactable form that is in a branch1 db but not in branch2 is

(d/q [:find ?db-add ?e ?a ?v ?t
      :in $ $2 ?db-add
      :where
      [$ ?e ?a ?v ?t]
      [(not= :db/txInstant ?a)]
      (not [$2 ?e ?a ?v ?t])]
      branch1 branch2 :db/add)

but you might want to be more selective when creating the data for merge!. We are very interested in what you are planning to do with this functionality, so please reach out if you have ideas or experience problems!