Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(wallet)!: rework persistence, changeset, and construction #1514

Merged
merged 10 commits into from
Jul 22, 2024

Conversation

evanlinjin
Copy link
Member

@evanlinjin evanlinjin commented Jul 17, 2024

Closes #1496
Closes #1498
Closes #1500

Description

Rework sqlite: Instead of only supported one schema (defined in bdk_sqlite), we have a schema per changeset type for more flexiblity.

  • rm bdk_sqlite crate (as we don't need bdk_sqlite::Store anymore).
  • add sqlite feature on bdk_chain which adds methods on each changeset type for initializing tables, loading the changeset and writing.

Rework changesets: Some callers may want to use KeychainTxOutIndex where K may change per descriptor on every run. So we only want to persist the last revealed indices by DescriptorId (which uniquely-ish identifies the descriptor).

  • rm keychain_added field from keychain_txout's changeset.
  • Add keychain_added to CombinedChangeSet (which is renamed to WalletChangeSet).

Rework persistence: add back some safety and convenience when persisting our types. Working with changeset directly (as we were doing before) can be cumbersome.

  • Intoduce struct Persisted<T> which wraps a type T which stores staged changes to it. This adds safety when creating and or loading T from db.
  • struct Persisted<T> methods, create, load and persist, are available if trait PersistWith<Db> is implemented for T. Db represents the database connection and PersistWith should be implemented per database-type.
  • For async, we have trait PersistedAsyncWith<Db>.
  • Wallet has impls of PersistedWith<rusqlite::Connection>, PersistedWith<rusqlite::Transaction> and PersistedWith<bdk_file_store::Store> by default.

Rework wallet-construction: Before, we had multiple methods for loading and creating with different input-counts so it would be unwieldly to add more parameters in the future. This also makes it difficult to impl PersistWith (which has a single method for load that takes in PersistWith::LoadParams and a single method for create that takes in PersistWith::CreateParams).

  • Introduce a builder pattern when constructing a Wallet. For loading from persistence or ChangeSet, we have LoadParams. For creating a new wallet, we have CreateParams.

Notes to the reviewers

TODO

Changelog notice

### Added

- Add `sqlite` feature to `bdk_chain` which introduces methods on changeset types that encode/decode changesets to SQLite database.
* Introduce `PersistWith<Db>` and `PersistAsyncWith<Db>` traits and a `Persisted` wrapper. This ergonomically makes sure user inits the db before reading/writing to it.

### Changed

- Moved `bdk_chain::CombinedChangeSet` to `bdk_wallet::ChangeSet` and added `keychain_added` field.
- `bdk_wallet::Wallet` construction now uses a builder API using the newly introduced `CreateParams` and `LoadParams`.

### Removed

- Remove `keychains_added` field from `bdk_chain::keychain_txout::ChangeSet`.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

@evanlinjin evanlinjin changed the title Flexible sqlite, rework persistence, refactor changeset Flexible sqlite, rework persistence, refactor changeset, refactor wallet construction Jul 17, 2024
@evanlinjin evanlinjin force-pushed the flexible_sqlite branch 4 times, most recently from 47fb0ac to 5146a3b Compare July 17, 2024 14:51
@evanlinjin evanlinjin self-assigned this Jul 17, 2024
@notmandatory
Copy link
Member

Does the sqlite module really need to be in the chain crate? I know it's feature flagged but since the chain crate should be decoupled from the different persistence mechanisms shouldn't they live in their own crates?

Rework sqlite: Instead of only supported one schema (defined in
`bdk_sqlite`), we have a schema per changeset type for more flexiblity.

* rm `bdk_sqlite` crate (as we don't need `bdk_sqlite::Store` anymore).
* add `sqlite` feature on `bdk_chain` which adds methods on each
  changeset type for initializing tables, loading the changeset and
  writing.

Rework changesets: Some callers may want to use `KeychainTxOutIndex`
where `K` may change per descriptor on every run. So we only want to
persist the last revealed indices by `DescriptorId` (which uniquely-ish
identifies the descriptor).

* rm `keychain_added` field from `keychain_txout`'s changeset.
* Add `keychain_added` to `CombinedChangeSet` (which is renamed to
  `WalletChangeSet`).

Rework persistence: add back some safety and convenience when persisting
our types. Working with changeset directly (as we were doing before) can
be cumbersome.

* Intoduce `struct Persisted<T>` which wraps a type `T` which stores
  staged changes to it. This adds safety when creating and or loading
  `T` from db.
* `struct Persisted<T>` methods, `create`, `load` and `persist`, are
  avaliable if `trait PersistWith<Db>` is implemented for `T`. `Db`
  represents the database connection and `PersistWith` should be
  implemented per database-type.
* For async, we have `trait PersistedAsyncWith<Db>`.
* `Wallet` has impls of `PersistedWith<rusqlite::Connection>`,
  `PersistedWith<rusqlite::Transaction>` and
  `PersistedWith<bdk_file_store::Store>` by default.

Rework wallet-construction: Before, we had multiple methods for loading
and creating with different input-counts so it would be unwieldly to add
more parameters in the future. This also makes it difficult to impl
`PersistWith` (which has a single method for `load` that takes in
`PersistWith::LoadParams` and a single method for `create` that takes in
`PersistWith::CreateParams`).

* Introduce a builder pattern when constructing a `Wallet`. For loading
  from persistence or `ChangeSet`, we have `LoadParams`. For creating a
  new wallet, we have `CreateParams`.
Remove returning `Result` for builder methods on `CreateParams` and
`LoadParams`.
@evanlinjin
Copy link
Member Author

evanlinjin commented Jul 18, 2024

Does the sqlite module really need to be in the chain crate? I know it's feature flagged but since the chain crate should be decoupled from the different persistence mechanisms shouldn't they live in their own crates?

Here are some perspectives from @LLFourn.

The sqlite feature can be thought of as a way to encode/decode. To have a separate crate just to impl encoding/decoding format is odd. E.g. supporting serde in a library should be a feature flag, not a separate crate.

From the other angle, lets see what will be required to have sqlite features in a separate crate...

  • To have methods on changesets impl-ed in a separate crate means we need an extension trait. To use extension traits for types in the same library is odd.
  • Instead of methods on changesets, we can just have loose functions instead. I.e. init_localchain_tables(&db_tx), load_localchain_from_tables(&db_tx) -> ChangeSet, persist_localchain_to_tables(&db_tx, &changeset). This is odd in it's own right.
  • It makes sense to upgrade the types and schemas together. It will be harder to reason about this is the types and schemas were in separate crates.

Is there a drawback to have a feature flag for sqlite on bdk_chain? I'm currently okay with both solutions. I will change my mind if a major drawback is presented for one of them.

@evanlinjin evanlinjin marked this pull request as ready for review July 18, 2024 05:42
@notmandatory
Copy link
Member

notmandatory commented Jul 18, 2024

We already know that there will be more than one SQL based store for ChangeSet, this PR only implements for rusqlite (blocking SQLite) but for server based apps we'll need one for SQLx (async SQLite, PostgreSQL, or MySQL). We might also have users who want to create some sort of custom cloud storage solution for Wallet ChangeSet data.

We don't want to make every data store a feature flagged options in the chain crate. Likely other data store crates won't even live in the bdk workspace so we should use the rusqlite data store to demonstrate now how to do a SQL based store as a separate crate.

Same goes for the bdk_wallet crate, with the current approach we'll need a new feature flag for each new SQL or other kind of data store for ChangeSet. It's be simpler to add a "wallet" feature flag on each new data store crate that adds a new impl on ChangeSet than to have to add a new optional feature for every data store impl in bdk_wallet.

We already have to update downstream blockchain client crates when the chain types change, I don't see it as any different for separate storage crates.

@evanlinjin
Copy link
Member Author

evanlinjin commented Jul 18, 2024

@notmandatory you can still implement custom persistence externally. It just makes sense from a maintenance point of view to update the encoding/decoding together with the types for the persistence backends that BDK plans to maintain.

Edit: My current stance is I'm okay with both. I'll see what @LLFourn 's arguments are tomorrow. If my stance doesn't change, I'll create a commit to split the sqlite stuff out and have everyone else decide which one to go forward with.

Edit: if we have feature flags for different DBs on bdk_chain, and we wish to update a dependency for that DB to a new major version, would we have to change the version of bdk_chain? The benefit of splitting the persistence stuff out is we can have multiple versions of persistence that are for the same bdk_chain version. Is this a true benefit? I'm not sure.

@notmandatory
Copy link
Member

notmandatory commented Jul 18, 2024

I do think we should be able to update the persistence related dependencies without having to bump the chain and wallet crates. I feel pretty strongly that sqlite dependencies, even optional, don't belong in the chain (or wallet) crate.

@LLFourn
Copy link
Contributor

LLFourn commented Jul 19, 2024

These kinds of choices as much of an art as they are a science so intuition is important. I think it's important from the outset to understand that it's easy to go to an external crate from feature flags in a major release if we belive it's going to be stable for a while. Also it's not an either/or decision. You can have feature flags on bdk_chain for sqlite support and none on bdk_wallet where you have to use bdk_wallet_rusqite.

In so far as there is a science to making these decisions here are the considerations I see:

  1. You can't implement traits on foreign types. Of note in this case is the ToSql/FromSql on the chain types from rusqlite. You have to wrap them if you don't implement them in bdk_chain. Then every other sql crate has to create their own wrapper type to implement the trait. The design Evan has made here also means each sql crate will need to make a wrapper to implement the PersistWith trait. When you implement it in the same crates that define the changeset you can implement PersistWith<rusqlite::Connection> for ChangeSet without creating wrappers.
  2. Logic that is tightly coupled with the type should be implemented in the same crate as the type. When we are changing the changeset types there is a 1-to-1 corresponding change that needs to happen within the SQL crates. They are very tightly coupled. So ask yourself does the code that implements sqlite support depend more on the version of rusqlite we are using or which version of bdk_chain we're using? I think it's bdk_chain by a long shot which gives me a bias that this logic should be in bdk_chain.
  3. Upgrading dependencies is a breaking change (unless you make a new feature flag and support multiple versions of the dependency). This means you're going to have to make a major version increment to add support for new persistence backend with the feature flags approach. This biases things to keeping things in separate crates if we don't want to increment versions of bdk_wallet very often for example.

Taking these into account, I think bdk_chain should do feature flags for the time being since all the intricate logic to serialie/deserialze things to an sql database is tightly coupled to the changesets. We want to be able to quickly iterate on bdk_chain for a while. These iterations will be breaking and probably need changes to sql tables etc. It will be easier to keep bdk_chain up to date with all its dependencies if all the implementations are in the bdk_chain crate.

For bdk_wallet I actually expect upgrading rusqlite + whatever many other persistence backends at the same time as we upgrade other dependencies (in a major release) to be fine for a while. Eventually I'd expect bdk_wallet to be stable enough that you do want to shift things to an external crate and any new persistence backend will start as a new crate (perhaps outside the workspace). Shifting things out can be done in a major bdk_wallet release. For now bdk (especially bdk_chain) is more experimental than most of the persistence backends that people want to support but eventually that will change.

We already know that there will be more than one SQL based store for ChangeSet, this PR only implements for rusqlite (blocking SQLite) but for server based apps we'll need one for SQLx (async SQLite, PostgreSQL, or MySQL). We might also have users who want to create some sort of custom cloud storage solution for Wallet ChangeSet data.

Just noting, this works without too much friction: For each of these general databases you make a feature flag. For custom project specific databases it's easy since you can implement PersistWith<MyCustomDb> for bdk_wallet::Wallet as long as you do it in the same crate as MyCustomDb.

@notmandatory
Copy link
Member

Ok let's go ahead and start as @evanlinjin has it and we can re-evaluate for the next major release if we like it or want to move the rusqlite store into it's own crate.

Post beta release I'll help @matthiasdebernardini adapt his sqlx+postgres db store and see how much of a hassle wrapping the sqlx db connection pool is.

crates/chain/src/sqlite.rs Outdated Show resolved Hide resolved
Previously, `Persist{Async}With::persist` can be directly called as a
method on the type (i.e. `Wallet`). However, the `db: Db` (which is an
input) may not be initialized. We want a design which makes it harder
for the caller to make this mistake.

We change `Persist{Async}With::persist` to be an "associated function"
which takes two inputs: `db: &mut Db` and `changeset`. However, the
implementer cannot take directly from `Self` (as it's no longer an
input). So we introduce a new trait `Staged` which defines the staged
changeset type and a method that gives us a `&mut` of the staged
changes.
Comment on lines +112 to +128

/// A wrapper that we use to impl remote traits for types in our crate or dependency crates.
pub struct Impl<T>(pub T);

impl<T> From<T> for Impl<T> {
fn from(value: T) -> Self {
Self(value)
}
}

impl<T> core::ops::Deref for Impl<T> {
type Target = T;

fn deref(&self) -> &Self::Target {
&self.0
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LLFourn would you be okay with this?

Also fix imports and rename `sqlite` module to `rusqlite_impl`.
@evanlinjin
Copy link
Member Author

Things to do later:

We need docs for how to impl PersistWith<CustomDb> for wallet and be able to construct wallet with the builder pattern.

Basically, the following two constraints need to be met for the associated types: PersistWith::LoadParams = bdk_wallet::LoadParams and PersistWith::CreateParams = bdk_wallet::CreateParams.

Copy link
Member

@notmandatory notmandatory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 2cf07d6

I primarily focused on the Wallet and persistence API changes and overall this is a big improvement. I added a few small nits, feel free to ignore them, they can be fixed after the beta release.

The comments from @ValuedMammal should be addressed before merging.

self.index.index_txout(outpoint, txout);
}

self.graph.apply_changeset(changeset.graph);
self.graph.apply_changeset(changeset.tx_graph);
}

/// Determines the [`ChangeSet`] between `self` and an empty [`IndexedTxGraph`].
pub fn initial_changeset(&self) -> ChangeSet<A, I::ChangeSet> {
let graph = self.graph.initial_changeset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ small nit.

Suggested change
let graph = self.graph.initial_changeset();
let tx_graph = self.graph.initial_changeset();

@@ -89,21 +94,30 @@ where
pub fn apply_update(&mut self, update: TxGraph<A>) -> ChangeSet<A, I::ChangeSet> {
let graph = self.graph.apply_update(update);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ same small nit.

Suggested change
let graph = self.graph.apply_update(update);
let tx_graph = self.graph.apply_update(update);

ChangeSet {
tx_graph: graph,
indexer,
}
}

/// Insert a floating `txout` of given `outpoint`.
pub fn insert_txout(&mut self, outpoint: OutPoint, txout: TxOut) -> ChangeSet<A, I::ChangeSet> {
let graph = self.graph.insert_txout(outpoint, txout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ same nit.

Suggested change
let graph = self.graph.insert_txout(outpoint, txout);
let tx_graph = self.graph.insert_txout(outpoint, txout);

ChangeSet {
tx_graph: graph,
indexer,
}
}

/// Insert and index a transaction into the graph.
pub fn insert_tx(&mut self, tx: Transaction) -> ChangeSet<A, I::ChangeSet> {
let graph = self.graph.insert_tx(tx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ nit and same for rest of file.

Suggested change
let graph = self.graph.insert_tx(tx);
let tx_graph = self.graph.insert_tx(tx);

pub struct ChangeSet<K> {
/// Contains the keychains that have been added and their respective descriptor
pub keychains_added: BTreeMap<K, Descriptor<DescriptorPublicKey>>,
pub struct ChangeSet {
Copy link
Member

@notmandatory notmandatory Jul 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📖 docs need to be updated since K was removed no more K to descriptor mapping.


use super::{ChangeSet, LoadError, PersistedWallet};

/// This atrocity is to avoid having type parameters on [`CreateParams`] and [`LoadParams`].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I'd avoid using the word "atrocity" you're only doing what has to be done. 🙂

@LLFourn
Copy link
Contributor

LLFourn commented Jul 20, 2024

I am working on cleaning up a few things on this PR. I am not changing anything about the sqlite queries that are actually run. My main goal is to have Wallet, PersistWith trait implementation to be implemented by calling PersistWith impls on its component parts.

This is mostly done but I need another 24h to modify the API builder things that evan created to work with this new approach.

@LLFourn
Copy link
Contributor

LLFourn commented Jul 21, 2024

After quite some work I've decided to abandon the work I was doing for now. Although I made a lot of progress on making things nicer internally I don't think it's worth investing the time to finish the job -- in particular the tests are going to take me another few hours that I don't have right now.

In any case, this PR can be merged as there are no severe issues and it's a big step forward and only a few smalls steps back which can be addressed soonish.

I fixed the bugs discovered by @ValuedMammal above. I will leave testing them for now. The merge bug is no longer really expressible in the PR I was working on due to some improvements I made FWIW.

ACK: 64eb576


other comments:

While developing I very much turned against the PersistWIth trait definition. It was a good starting point but I think it's created an overwrought set of associated types and some think layers of abstraction.

As I discussed with @evanlinjin I think the long term solution here is for every method that wants to persist changes to take a &mut ChangeSet. Then it is the persistence backend job to just persist the changesets.

E.g.:

let db = ...;
let mut wallet = db.persist(|changeset| {
    Wallet::new(params, changeset)
});
// now i want to get an address
let address = db.persist(|changeset| {
    wallet.reveal_next_address(KeychainKind::External, changeset)
});

So the idea would be that .persist is taking the &mut changeset it passed into the FnOnce(&mut ChangeSet) you provided and persiting it after that. This would dramatically simplify the trait definitions for a persistence backend. Importantly the trait would be over the ChangeSet only and not the type you are persisting.

@notmandatory notmandatory mentioned this pull request Jul 21, 2024
30 tasks
@notmandatory notmandatory changed the title Flexible sqlite, rework persistence, refactor changeset, refactor wallet construction refactor(wallet)!: rework persistence, changeset, and construction Jul 21, 2024
Copy link
Member

@notmandatory notmandatory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re ACK 64eb576

@evanlinjin and @LLFourn thanks for moving the rusqlite impl code into it's own module and final cleaning for the issues @ValuedMammal found.

@notmandatory notmandatory merged commit 0c8ee1d into bitcoindevkit:master Jul 22, 2024
12 checks passed
@ValuedMammal ValuedMammal mentioned this pull request Aug 8, 2024
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api A breaking API change module-wallet
Projects
Archived in project
4 participants