Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Library dbvar for delta encodings and mutable DBVar variable #2841

Merged
merged 20 commits into from
Nov 1, 2021

Conversation

HeinrichApfelmus
Copy link
Contributor

@HeinrichApfelmus HeinrichApfelmus commented Aug 24, 2021

Issue number

ADP-1100, ADP-1102, ADP-1111, ADP-1112, ADP-1134

Overview

This is the initial implementation of a library dbvar about delta encodings.

The Cardano Wallet backend becomes faster if we store only the deltas between checkpoints of the wallet state in the database. The wallet state is essentially a List of known addresses and a Set of UTxOs. The library implemented here can handle this use case. In addition, the use of side effects in the wallet can be significantly reduced with the new DBVar abstraction.

The library provides

  • A mutable variable type DBVar which represents values that are kept in memory, but which are written to the hard drive on every update.
  • Delta encodings and combinators on them (Delta type class, Embedding type).
  • An in-memory type for database tables and delta encodings for it.
  • A Chain data type which can store checkpoints of the wallet state when combined with delta encodings

Progress

  • The API and implementation are now stable enough to be merged into the repository and be a dependency of other pull requests.
  • The API is still preliminary and will see some streamlining when we use it in anger.
  • The modules Database.Schema and Demo.Database are still experimental.
  • Property tests are not provided yet. We will rely on existing unit tests of the wallet while integrating, up until the point where the database format is changed.

Comments

  • As this library is of general utility, I would like to release it on hackage eventually.
    • Before that, I would like to reorganize it slightly and rename it to delta-encodings.

@HeinrichApfelmus
Copy link
Contributor Author

As discussed today, I would like to start adding this library into our codebase, so that the database layer can be refactored and eventually switched to the delta encoding of the wallet state checkpoints. The API abstractions are stable, but I intend to postpone property tests until just before the switch in order to accommodate minor API changes that arise during refactoring. With that in mind, reviews welcome! 😊

@HeinrichApfelmus HeinrichApfelmus changed the title Delta encodings for more efficient "write-dominant" database layer Library dbvar for delta encodings and mutable DBVar variable Oct 22, 2021
@HeinrichApfelmus HeinrichApfelmus marked this pull request as ready for review October 22, 2021 11:42
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch from 591dfa9 to a08ab95 Compare October 22, 2021 12:00
lib/dbvar/README.md Outdated Show resolved Hide resolved
lib/dbvar/README.md Outdated Show resolved Hide resolved
to the store was unsuccesful.
-}
data Store m da = Store
{ loadS :: m (Maybe (Base da))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to have Either here. If someone wants Maybe then one can use for example https://hackage.haskell.org/package/either-5.0.1.1/docs/Data-Either-Combinators.html#v:leftToMaybe

Copy link
Contributor Author

@HeinrichApfelmus HeinrichApfelmus Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. This seemingly minor change has larger ramifications. In particular, if Store has an additional type parameter e that keeps track of potential error messages, then Embedding should have one, too.

But then — which of the following type signatures should we give embedStore?

embedStore_1 :: (…) => Embedding da db -> Store m db -> Store m da
embedStore_2 :: (…) => Embedding da db -> Store e m db -> Store e m da
embedStore_3 :: (…) => Embedding e da db -> Store e m db -> Store e m da
embedStore_4 :: (…) => Embedding e1 da db -> Store e2 m db -> Store (e1 :+: e2) m da

Taking the union of the error messages, solution No. 4, seems like the most general choice, but this solution would require tools for managing unions of error types, which we currently do not employ.

Another solution would be to use the most general union of error types, SomeException, and use the type signature

loadS5 :: Store m da -> m (Either SomeException (Base da))

In this way, the error is not visible in the type of Store, but at least the error message is not lost. I think that I prefer this approach, actually.

Yet another option would be to use the underlying monad m to signal a failure case. But that would be unsuitable for Embedding.

On that note, I should go back and make sure that the DBVar are capable of handling asynchronous exceptions. This may not be the case as I originally hoped for. 🤔

Copy link
Contributor

@paweljakubas paweljakubas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice abstraction introduced here! Especially mesmerized by servant like adoption to get resources. I am in favor of merging this as it is, and then add in separate PRs testing for this (an interesting instances of concepts, checking properties of many algebra laws that should be obeyed, etc). I bet some abstractions could be polished, comments improved as those tests and practical realizations at work (in tests) are tried. Also, some cleaning of commit history, squashing would do no harm!

@HeinrichApfelmus
Copy link
Contributor Author

Thanks for taking the time to review this, Pawel! 😊 I'm also in favor of merging this as it is, as I agree that writing tests and actually using this library will help the most with polishing the abstractions and comments.

I'll wrap this up by

  • fixing robustness in the presence of asynchronous exceptions
  • replacing Maybe with Either SomeException in the load* functions
  • squashing the commit history a bit

and then merging the pull request. Thanks!

I'm also fond of the servant-like database row types, though they will probably the last feature to be integrated into the wallet code, as Persistent does a good-enough job for the time being.

@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch from 3498dbb to c665594 Compare November 1, 2021 11:46
Hides key management from the API user.
Also: Minor clean up.
WARNING: Unfortunately, this version has a bug relating to management of primary keys.
* Clarify that an `Embedding `can be *redundant*. Fix the `compose` function by providing an additional parameter to the `update` function.
* Remove the base type parameters and make `Embedding` an instance of `Semigroupoid` (aka `Category` without identities.)
Rows in a database table are unordered. As the order of operations in a Semigroup matters, we need to make this issue explicit when saving deltas as rows in a database.
Nasty things happen if the unique ID supply of the table does not match the supply that we would get if were to load the table afresh from the database. For example, the definition composition of `Embedding` currently relies on this equivalence.
Useful for highlighting that rows in a database table are essentially unordered.
* Implements cancellation of 'Insert' and 'Delete'. This will eventually translate into the deletion of spent UTxOs from the database.
* Prepares for storing `DeltaSet` as a list of rows in the database, which do *not* have a well-defined ordering.
using "Mealy" machines aka "stream transformers". Interestingly, I think that although composition appears to give an unwieldy tower of machine, it very much corresponds to what we would get if we were to compose updates on mutable variables.
* Use `persistent` as a backend for raw SQL queries. Eventually, we may want to move to a more bare-bones backend such as `sqlite-simple`.
Bypasses the use of TemplateHaskell and the creation of an intermediary `PersistEntity` instance.

* Move database `Store` into separate module
* Also rename using type synonym `SqlPersistM` which is kindly provided by `persistent` and already used in cardano-wallet
* instance of Semigroup for (Replace a)
* instances for higher tuples for the `Delta` class
* `pair` for `Embedding`
* `nodes` for retrieving all nodes of a `Chain`
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch from c665594 to 8201610 Compare November 1, 2021 12:36
… to cover the case where we want to make no modifications.

Even though `Maybe …` is now an instance of `Delta`, it is not useful to use this as the base delta type for a DBVar, hence the new helper function.

That said however, one could try to add a function that lifts `DBVar m da` into `DBVar m (Maybe da)` by introducing the concept of "delta transformer". `Maybe` would be one example of this. 🤔
* Rewrite update operations in terms of `modifyDBMaybe` to ensure this
* Fix `newWithCache` to
    * release the write-lock on exception
    * always update the cache when the update function succeeds
* Clarify assumptions on `updateS` and `writeS`
* Fix `embedStore` to always update the machine cache when the original update succeeds
* Add warning to `pairStore` concerning asynchronous exceptions
See NOTE [EitherSomeException]
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch from f6bec2c to fa3161d Compare November 1, 2021 13:04
@HeinrichApfelmus
Copy link
Contributor Author

bors merge

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Nov 1, 2021

Build succeeded:

@iohk-bors iohk-bors bot merged commit b9d8007 into master Nov 1, 2021
@iohk-bors iohk-bors bot deleted the HeinrichApfelmus/ADP-1043/dbvar branch November 1, 2021 17:57
iohk-bors bot added a commit that referenced this pull request Dec 8, 2021
2942: Refactor DBLayer to use in-memory Checkpoints r=HeinrichApfelmus a=HeinrichApfelmus

### Issue number

ADP-1169

### Overview

Previous work in epic ADP-1043 introduced delta encodings, DBVars, and an embedding of the wallet state and its delta encodings into a database table. It's time to integrate these tools with the wallet code. To facilitate code review, the integration proceeds in a sequence of refactorings that do not change functionality and pass all unit tests.

In this step, we introduce a type Checkpoints a which stores all checkpoints of the wallet state in memory. The current ad-hoc cache of the latest checkpoints is replaced by a DBVar that stores these checkpoints instead.

### Details

* Keep the `Checkpoints` type stupid and simple for now. It will be replaced later.
* This pull request will (temporarily) increase the memory requirements for the wallet, as the collection of checkpoints is kept in-memory. However, memory usage will go back to normal (and better) once delta encodings are integrated.

### Comments

Merge PR #2841 before this one, because this pull request is based on the branch of the former.


Co-authored-by: Heinrich Apfelmus <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants