-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redesigning bdk_chain around a "ChainOracle" #895
Comments
I am not following closely lately so I may miss some pieces. I would like to understand if the following "sync mode" could be supported by this design. I have a completely offline device, I physically bring the full utxo-set at some height (eg. the file produced by I don't think not knowing the height is a big issue, but wanted to share it so that could be considered at this design stage. |
Hey @RCasatta. So in this case you already have a utxo set so all of this tooling is irrelevant. The thing you want is to be able to build transactions without having to use This issue which will soon be ported to the bdk repo is probably the best description of how this will work: LLFourn/bdk_core_staging#40 You'll probably need to provide a descriptor for each UTXO though. I think the coin selection stuff will just require you to know the witness weight. |
I have another usecase, similar to what @RCasatta has mentioned, but more tailored for the If we are running a pruned node, the initial wallet scan will need to depend on Does I jsut thought of the solution. The solution to this is to get the tip's blockhash before the scan, and ensure the blockhash still exists after our scan + after we have all the blockhashes that we need. Is there a better way to do this? |
Nope I think this is what we have to do. The correct long term solution IMO is to fix bitcoind and electrum because height by itself is very ambiguous. |
b799a57 [bdk_chain_redesign] Add tests for `IndexedTxGraph` with `LocalChain` (rajarshimaitra) 8cd0328 [bdk_chain_redesign] Implement `OwnedIndexer` for indexers (rajarshimaitra) 911af34 [bdk_chain_redesign] Fix calculation bugs. (rajarshimaitra) e536307 [bdk_chain_redesign] Fix `tx_graph::Additions::append` logic (志宇) f101dde [bdk_chain_redesign] Fix `tx_graph::Additions::append` logic (志宇) 1b15264 [bdk_chain_redesign] Change `insert_relevant_txs` method (志宇) ecc74ce [bdk_chain_redesign] Docs for `is_mature` and `is_confirmed_and_spendable` (志宇) ac336aa [bdk_chain_redesign] Make `insert_relevant_txs` topologically-agnostic (志宇) 165b874 [bdk_chain_redesign] Add test for `insert_relevant_txs` (志宇) f3e7b67 [bdk_chain_redesign] Various tweaks and fixes (志宇) 03c1283 [bdk_chain_redesign] Revert changes to `SparseChain` (志宇) 34a7bf5 [bdk_chain_redesign] Rm unnecessary code and premature optimisation (志宇) 6c49570 [bdk_chain_redesign] Rm `HashSet` from `TxGraph::relevant_heights` (志宇) 1003fe2 [bdk_chain_redesign] Test `LocalChain` (志宇) 7175a82 [bdk_chain_redesign] Add tests for `TxGraph::relevant_heights` (志宇) 8e36a2e [bdk_chain_redesign] Remove incomplete logic (志宇) 81436fc [bdk_chain_redesign] Fix `Anchor` definition + docs (志宇) 001efdd Include tests for new updates of TxGraph (rajarshimaitra) 10ab77c [bdk_chain_redesign] MOVE `TxIndex` into `indexed_chain_graph.rs` (志宇) 7d92337 [bdk_chain_redesign] Remove `IndexedTxGraph::last_height` (志宇) a7fbe0a [bdk_chain_redesign] Documentation improvements (志宇) ee1060f [bdk_chain_redesign] Simplify `LocalChain` (志宇) 611d2e3 [bdk_chain_redesign] Consistent `ChainOracle` (志宇) bff80ec [bdk_chain_redesign] Improve `BlockAnchor` docs (志宇) 24cd8c5 [bdk_chain_redesign] More tweaks and renamings (志宇) ddd5e95 [bdk_chain_redesign] Modify signature of `TxIndex` (志宇) da4cef0 [bdk_chain_redesign] Introduce `Append` trait for additions (志宇) 89cfa4d [bdk_chain_redesign] Better names, comments and generic bounds (志宇) 6e59dce [bdk_chain_redesign] `chain_oracle::Cache` (志宇) a7eaebb [bdk_chain_redesign] Add serde support for `IndexedAdditions` (志宇) c09cd2a [bdk_chain_redesign] Added methods to `LocalChain` (志宇) 7810059 [bdk_chain_redesign] `TxGraph` tweaks (志宇) a63ffe9 [bdk_chain_redesign] Simplify `TxIndex` (志宇) a1172de [bdk_chain_redesign] Revert some API changes (志宇) 8c90617 [bdk_chain_redesign] Make default anchor for `TxGraph` as `()` (志宇) 468701a [bdk_chain_redesign] Initial work on `LocalChain`. (志宇) 34d0277 [bdk_chain_redesign] Rm anchor type param for structs that don't use it (志宇) 3440a05 [bdk_chain_redesign] Add docs (志宇) 236c50f [bdk_chain_redesign] `IndexedTxGraph` keeps track of the last synced height (志宇) e902c10 [bdk_chain_redesign] Fix `apply_additions` logic for `IndexedTxGraph`. (志宇) 313965d [bdk_chain_redesign] `mut_index` should be `index_mut` (志宇) db7883d [bdk_chain_redesign] Add balance methods to `IndexedTxGraph` (志宇) d0a2aa8 [bdk_chain_redesign] Add `apply_additions` to `IndexedTxGraph` (志宇) 6cbb18d [bdk_chain_redesign] MOVE: `IndexedTxGraph` into submodule (志宇) 784cd34 [bdk_chain_redesign] List chain data methods can be try/non-try (志宇) 43b648f [bdk_chain_redesign] Add `..in_chain` methods (志宇) 61a8606 [bdk_chain_redesign] Introduce `ChainOracle` and `TxIndex` traits (志宇) 5ae5fe3 [bdk_chain_redesign] Introduce `BlockAnchor` trait (志宇) Pull request description: ### Description This is part of #895 The initial `bdk_chain` structures allowed updating to be done without blocking IO (alongside many other benefits). However, the requirement to have transactions "perfectly positioned" in our `SparseChain` increased complexity of the syncing API. Updates needed to be meticulously crafted to properly connect with the original `SparseChain`. Additionally, it forced us to keep a local copy of the "best chain" data (which may not always be needed depending on the chain source). The redesigned structures, as introduced by this PR, fixes these shortcomings. 1. Instead of `SparseChain`, we introduce the ability to `Anchor` a transaction to multiple blocks that may or may not be in the same chain history. We expand `TxGraph` to records these anchors (while still maintaining the *monotone* nature of `TxGraph`). When updating our new `TxGraph`, we don't need to complicated *are-we-connected* checks that we need for `SparseChain`. 2. The chain source, in combination with our new`TxGraph` is all that we need to determine the "chain position" of a transaction. The chain source only needs to implement a single trait `ChainOracle`. This typically does not need to be updated (as it is remote), although we can have a special `ChainOracle` implementation that is stored locally. This only needs block height and hash information, reducing the scope of *non-monotine* structures that need to be updated. **What is done:** * `TxGraph` includes anchors (ids of blocks that the tx is seen in) and last-seem unix timestamp (for determining which non-confirmed tx we should consider as part of "best chain" if there are conflicts). This structure continues to be fully "monotone"; we can introduce data in any manner and not risk resulting in an inconsistent state. * `LocalChain` includes the "checkpoint" logic of `SparseChain` but removes the `txid` data. `LocalChain` implements the `ChainOracle` trait. Any blockchain-source can also implement the `ChainOracle` trait. * `IndexedTxGraph` is introduced and contains two fields; an internal `TxGraph` struct and a `TxIndex` implementation. These two fields will be updated atomically and can replace the functionality of `keychain::Tracker`. **What is in-progress:** * ~@evanlinjin: The `TxIndex` trait should not have `is_{}_relevant` methods as we cannot guarantee this across all transaction indexes. We should introduce extension traits for these (#926 (comment) * ~@evanlinjin: `BlockAnchor` should be defined as "if this block is in chain, then this tx must be in chain". Therefore, the anchor does not provide us with the exact confirmation height of the tx. We need to introduce an extension trait `ExactConfirmationHeightAnchor` for certain operations (#926 (comment) **What will be done external to this PR:** * @rajarshimaitra: Persisting `indexed_tx_graph::Additions` (#937). * @notmandatory: Update examples to use redesigned structures. * Update `bdk::Wallet` to use the redesigned structures. ### Changelog notice * Initial implementation of the `bdk_chain` redesign (as mentioned in #895). Currently, only the `bdk_chain` core structures are implemented. Persistence and updates to the current examples and `bdk::Wallet` will be done in separate PRs. ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo fmt` and `cargo clippy` before committing #### New Features: * [x] I've added tests for the new feature * [x] I've added docs for the new feature ACKs for top commit: rajarshimaitra: ACK b799a57 danielabrozzoni: Partial ACK b799a57 - good job! I haven't looked at the tests or at the methods implementations in depth, but I have reviewed the architecture and it looks good. I have a few non-blocking questions. Tree-SHA512: 8c386354cbd02f0701b5134991b65d206a54d19a2e78ab204e6ff1fa78a18f16299051bc0bf4ff4d2f5a0adab9b15658fa53cd0de2ca16969f4bf6a485225082
4963240 Add more `impl`s for `Append` and docs for file store `magic` (志宇) 2aa08a5 [persist_redesign] Introduce redesigned `persist` types (志宇) Pull request description: ### Description This is part of #895 and #971 * Introduce a more generic version of the `keychain::persist::*` structures that only needs a single generic for the changeset type. Additional changes: * The `Append` trait has a new method `is_empty`. * Introduce `Store` structure for `bdk_file_store` (which implements `PersistBackend`). ### Changelog notice ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo fmt` and `cargo clippy` before committing #### New Features: * [x] I've added tests for the new feature * [x] I've added docs for the new feature Top commit has no ACKs. Tree-SHA512: 0211fbe7d7e27805d3ed3a80b42f184cdff1cebb32fd559aa9838e4a7f7c7e47b6c366b6ef68e299f876bafed549b8d1d8b8cc0366bf5b61db079504a565b9b4
Should be fixed in #976 |
@LLFourn can we close this issue and/or move remaining redesign into separate issues? |
fixed in #976 |
…o scantxoutset output 17845e7 rpc: add utxo's blockhash and number of confirmations to scantxoutset output (Luis Schwab) Pull request description: This PR resolves #30478 by adding two fields to the `scantxoutset` RPC: - blockhash: the blockhash that an UTXO was created - confirmations: the number of confirmations an UTXO has relative to the chaintip. The rationale for the first field is that a blockhash is a much more reliable identifier than the height: > When using the scantxoutset RPC, the current behaviour is to show the block height of the UTXO. This is not optimal, as block height is ambiguous, especially in the case of a block reorganization happening at the same instant of the query. In this case, an UTXO that does not exist would be assumed to exist, unless the chain's tip hash is recorded before the scan, and make sure it still exists after, as per bitcoindevkit/bdk#895 (comment) comment by evanlinjin. The second one was suggested by maflcko, and I agree it's useful for human users: > While touching this, another thing to add could be the number of confirmations? I understand that this wouldn't help machine consumers of the interface, but human callers may find it useful? This will yield an RPC output like so: ```diff bitcoin-cli scantxoutset start "[\"addr(bc1q5q9344vdyjkcgv79ve3tldz4jmx4lf7knmnx6r)\"]" { "success": true, "txouts": 185259116, "height": 853622, "bestblock": "00000000000000000002e97d9be8f0ddf31829cf873061b938c10b0f80f708b2", "unspents": [ { "txid": "fae435084345fe26e464994aebc6544875bca0b897bf4ce52a65901ae28ace92", "vout": 0, "scriptPubKey": "0014a00b1ad58d24ad8433c56662bfb45596cd5fa7d6", "desc": "addr(bc1q5q9344vdyjkcgv79ve3tldz4jmx4lf7knmnx6r)#smk4xmt7", "amount": 0.00091190, "coinbase": false, "height": 852741, + "blockhash": "00000000000000000002eefe7e7db44d5619c3dace4c65f3fdcd2913d4945c13", + "confirmations": 882 } ], "total_amount": 0.00091190 } ``` ACKs for top commit: sipa: utACK 17845e7 Eunovo: ACK 17845e7 tdb3: ACK 17845e7 Tree-SHA512: 02366d0004e5d547522115ef0efe6794a35978db53dda12c675cfae38197bf43f0bf89ca99a3d79e3d2cff95186015fe1ab764abb8ab82bda440ae9302ad973b
BDK currently depends heavily on a
SparseChain
to tell it whether transactions are in the chain or not. This proposal is to redesign it around a more genericChainOracle
trait which can tell you whether a(height, block_hash)
is in the chain or not and use this information to decide whether a transaction is in the chain. We can do this because for each transaction we can always know a block header and hash that implies that it is in the chain.A
SparseChain
can also fill the role of aChainOracle
but this change would allow us use other sources of chain data to fulfil the need. For example with our pendingnakamoto
cbf implementation we could use the on-chain block headers as theChainOracle
.This proposal is to remove
SparseChain
as a core mechanism and instead pass in aimpl ChainOracle
into methods that need to know about the current state of the chain.Removing transactions from
SparseChain
I think we should remove transactions from the sparse chain. Without needing to "re-org" the chain there is no real need to have transaction location so closely coupled to the blocks.
Instead of storing the "position" of a transaction within
SparseChain
we could easily store it inTxGraph
in a monotone way. Imagine a field like this inTxGraph
:Then when you want to figure out if
txid
is in the chain you can query the heights and hashes that it has been seen in and see if any of those blocks are in the canonical chain.Note that we don't need this position to be the blockhash the transaction is in. We could call this the "anchor block" -- if it's in the chain then this txid is in the chain. This means we don't need to make additional queries to electrum/esplora to figure out the block hash that it's actually in.
So maybe it could be more like this:
where
A
does theAnchor
trait which has one method to return the block hash and height whose presence implies the existence -- it may be the height the transaction is actually at or it may be simple the tip that implies its existence. This allows different applications to store different data about their transaction's positions.Indexing the chain data
Now that the chain is out of the way and outside of the transaction system we can put the indexing of transactions next to the graph.
Getting a UTXO set
How do we get a a UTXO set from a set of TxOuts (and outpoints) owned by the wallet, a transaction graph and information about the blocks in the longest chain. The first thing to note is that we can built a predicate
is_in_chain(chain: &_, tx_graph: &TxGraph, txid)
from the information about the longest chain. We simply iterate through all positions of the txid (stored inTxGraph
) and return true if one ofthose blocks exist in the longest chain. Keep in mind that the longest chain can be modelled a
SparseChain
or could be a header chain etc.The naive algorithm
Given
is_in_chain
we can easily create a predicateis_unspent
for anyOutPoint
. Withis_unspent
we can just call it on everyOutPoint
in that we own to find a utxo set.Note that this algorithm has complexity of
O(T)
whereT
is the number of txouts you've ever owned. It also makes multiple calls tois_in_chain
per outpoint. This might be slow especially if you have to do disk access or query an external service. To make it a bit faster you could cache the result ofis_in_chain
and apply positive results to each of the transaction's parents (a child being in chain implies a parent).Optimized algorithms for later development
There should be faster algorithms for discovering this that don't require iterating over the entire txout set. For example, the
TxGraph
could index the set of UTXOs. From there you can visit each one and check if the tx containing it is in the chain. If so, add it to the UTXO set, if not add the parent outputs of the transaction to the candidates list.How to handle unconfirmed transactions
We can model unconfirmed transactions as those with no anchors to blocks that are in the chain. We of course need to distinguish between those unconfirmed transactions that are invalid because they double spend (or their parents double spend) a tx that is in the chain and those that could be in the mempool.
To the naive algorithm above, when you encounter a output that is not in the chain you can add it to "unconfirmed" list and then from the unconfirmed list decide which of these outputs can exist in the mempool by resolving conflicts in the graph.
Work plan
While writing this document I discovered how I felt about priorities:
TxGraph
but also has the "anchor" positions. It could just be done inTxGraph
itself (e.g.TxGraph<Anchor>
).ChainOracle
trait which has a single method to return whether a tx is confirmed given a (block hash, height).*_in_chain
functions that exist onChainGraph
toTxGraph
but take a something implementing theChainOracle
trait.TxGraph
and something implementing theTxIndexer
trait.Every time a tx is inserted into this type it will add it to the underlying graph and the
TxIndexer
(e.g.IndexedTxGraph<Anchor, Indexer>
). ImplementTxIndexer
forKeychainTxOutIndex
andSpkTxOutIndex
.KeychainTracker
to an impl ofIndexedTxGraph<Anchor, KeychainTxOutIndex>
. wdFor things that need chain data like
full_utxos
you'll need to pass in aChainOracle
.SparseChain
without the transactions (SparseChain2
). ImplementChainOracle
forSparseChain2
.TxGraph<A>
and aSparseChain2
as their updatebdk
's Wallet around thisSparseChain
(or rather renameSparseChain2
toSparseChain
),ChainGraph
andKeychainTracker
Downsides
The downsides of this idea are:
It's harder to use the information about what's in the mempool from esplora and electrum. Right now we are able to evict transactions that conflict with things that we find in the mempool from the
SparseChain
. With this change we don't really have a way to use this information so which of the conflicting unconfirmed txs you choose as canonical will be somewhat arbitrary. To fix this we could consider having the "last-time-seen" in mempool as part of the TxGraph position for that. This can be monotone so whenever you see it again the in the mempool you replace the existing value if it's larger. Then when you are trying to figure out who's in the mempool and who isn't (from your limited view) you can choose the one that was seen most recently between two conflicts.It's hard to see how this improves
Wallet
. This gives much greater flexibility to users ofbdk_chain
but I'm not sure if we'll be able to realize this improvements for users ofWallet
. To do so we'd have to not model the canonical chain inside wallet and leave it to the user to provide aimpl ChainOracle
. The biggest problem here is that if the user was using aSparseChain
they'd have to persist changes to it separately which muddies the "batteries included" design goal ofWallet
.Some
ChainOracle
's will have to returnResult<bool, _>
rather than justbool
for whether the block hash and height are in the chain. Some of them might even want to be async! So we're kind of back with offering multiple method for each method that needs to know what's in chain e.g.balance
,try_balance
andtry_balance_async
!The text was updated successfully, but these errors were encountered: