
fix: split state validation status into kernel and rproof updates. #3096

Merged: 11 commits, Nov 17, 2019

Conversation

@JosephGoulden (Contributor)

Fixes #3084

I split the sync validation status into separate kernel and range proof validation statuses and added these as extra sync steps. Maybe this is too much information, but I feel it's nice to see progress being made when syncing and it helps to understand what's going on.
The percentage completion for range proofs and kernels is also now accurate from 0 to 100%.

There are a couple of things I'm not sure about though and could use some help.

  1. The way I fixed the status was to add a function to the backend PMMR to get the leaf_set length, instead of using pmmr::n_leaves. Is this okay? I don't understand why the two functions give different results. Can't we just use the length of the leaf_set in all places instead of calculating leaves based on size in pmmr::n_leaves?

  2. Breaking change in the server API: TxHashsetValidation is replaced by TxHashsetRangeProofsValidation and TxHashsetKernelsValidation. Is this okay for a major version increment like 3.0.0, or do we need to deprecate first? (See the sketch below for the rough shape of the new states.)
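
For a rough idea of the shape of the change (the field names here are assumed for illustration, not copied from the PR), the single validation state splits into two progress-carrying variants along these lines:

```rust
/// Sketch only: the two validation states described above, each carrying a
/// current count and a total so the TUI can render an accurate 0-100%.
/// Field names are illustrative assumptions, not the PR's exact definitions.
pub enum SyncStatus {
    // ... other sync states elided ...
    /// Validating the range proofs of the downloaded txhashset.
    TxHashsetRangeProofsValidation { rproofs: u64, rproofs_total: u64 },
    /// Validating the kernel signatures.
    TxHashsetKernelsValidation { kernels: u64, kernels_total: u64 },
}
```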

@antiochp (Member)

The way I fixed the status was to add a function to the backend PMMR to get the leaf_set length, instead of using pmmr::n_leaves. Is this okay? I don't understand why the two functions give different results.

Notice that pmmr::n_leaves() does not operate on any specific pmmr instance. It is simply a way of determining how many leaves are in an MMR (any MMR) of a particular size.
i.e.

  • An MMR with 3 elements in it will always have 2 leaves (peak at height 1 with 2 leaves under it).
  • An MMR with 4 elements in it will always have 3 leaves (peak at height 1, peak at height 0).
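
For illustration, here is a minimal, hedged sketch of that size-to-leaf-count rule (not the grin implementation; just the arithmetic the bullets describe, peeling peaks off from largest to smallest):

```rust
/// Illustrative only: number of leaves in an (unpruned) MMR of `size`
/// positions. A perfect binary tree with `l` leaves occupies `2*l - 1`
/// positions, so we repeatedly peel off the largest peak that still fits.
fn n_leaves(size: u64) -> u64 {
    let mut remaining = size;
    let mut leaves = 0;
    while remaining > 0 {
        let mut peak_leaves = 1u64;
        // Grow the peak while a peak twice as wide would still fit.
        while 2 * (peak_leaves * 2) - 1 <= remaining {
            peak_leaves *= 2;
        }
        leaves += peak_leaves;
        remaining -= 2 * peak_leaves - 1;
    }
    leaves
}

fn main() {
    assert_eq!(n_leaves(3), 2); // peak at height 1 with 2 leaves under it
    assert_eq!(n_leaves(4), 3); // height-1 peak plus a lone height-0 peak
}
```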

The leaf_set is not generic in this way - it represents a particular MMR instance, specifically one that can be pruned, with leaves being removed.
So the leaf_set represents a subset of all leaves for that specific MMR that have not been removed. For the output MMR this represents the "unspent" outputs. All other outputs have been spent (removed) and potentially pruned away.

In the simple examples above we may have an MMR of size 4 (3 leaves) where only a single leaf position remains in the leaf_set: say the leaf at the last pos 4 in the MMR, with the outputs at pos 1 and pos 2 having been removed.

   3                    3
  / \        ->        / \
 1   2   4            .   .   4

Note: outputs can be removed but not yet pruned because we need to support "rewind" in a fork situation where we undo a block and reapply a potentially different set of transactions to the chain state.
We may have pruned the outputs at pos 1 and pos 2 in the example above, leaving only the peak hash at pos 3. Or we may not have pruned them yet; they may simply be removed (removed from the leaf_set but with the data still present in the underlying MMR files).
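
To make the distinction concrete, here is a small hedged sketch of the size-4 example (a HashSet stands in for whatever structure the real leaf_set uses): three leaves were ever appended, but only the leaf at pos 4 is still present.

```rust
use std::collections::HashSet;

fn main() {
    // The size-4 MMR from the diagram: leaves at pos 1, 2 and 4, peak at pos 3.
    // pmmr::n_leaves(4) reports 3 because three leaves were ever appended.
    let total_leaves: u64 = 3;

    // Pos 1 and pos 2 have been removed (spent outputs), so the leaf_set
    // only contains the single remaining unspent leaf at pos 4.
    let leaf_set: HashSet<u64> = [4u64].into_iter().collect();

    assert_eq!(total_leaves, 3);   // derived from size, ignores removals
    assert_eq!(leaf_set.len(), 1); // only the "not removed" leaf positions
}
```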

Can't we just use the length of the leaf_set in all places instead of calculating leaves based on size in pmmr::n_leaves?

Not necessarily, because we do not always want to know the number of leaves in the entire MMR. We have many instances where we need to know the number of leaves in a particular subtree or beneath a particular peak within an MMR.
And for the entire MMR we often need to know what the number of leaves would be if the MMR were not pruned (so we can calculate offsets etc. into the underlying files), and we cannot use the leaf_set for this.
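
As a hypothetical illustration of the offsets point (this is not grin's actual storage layout): with fixed-width records in an append-only file, a leaf's byte offset depends on its insertion index in the unpruned MMR, not on how many leaves remain in the leaf_set.

```rust
/// Hypothetical fixed record width, purely for illustration.
const HASH_SIZE: u64 = 32;

/// Offset of the record for the i-th leaf ever appended (0-based insertion
/// index). Removing earlier leaves from the leaf_set does not shift this.
fn leaf_record_offset(leaf_insertion_index: u64) -> u64 {
    leaf_insertion_index * HASH_SIZE
}

fn main() {
    // In the size-4 example, the leaf at pos 4 is the third leaf ever
    // appended (insertion index 2), so its record sits at byte 64 even
    // though the leaf_set now holds only one entry.
    assert_eq!(leaf_record_offset(2), 64);
}
```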

Hope this makes sense. The internals of the MMR data structure are pretty complex, but the complexity is worth it: the immutable, append-only semantics it gives us are very well aligned with the data we need to store on disk.

@@ -55,6 +55,9 @@ pub trait Backend<T: PMMRable> {
/// Iterator over current (unpruned, unremoved) leaf positions.
fn leaf_pos_iter(&self) -> Box<dyn Iterator<Item = u64> + '_>;

/// Number of leaves
fn n_leafs(&self) -> u64;
Member

We should think about a better name for this. It's not the number of leaves; it's the number of "not removed" leaf positions, but that's a pretty awkward name.
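
One possible direction, sketched with illustrative names only (neither the trait nor the method name below is taken from the PR): name the method after what it actually counts and express it in terms of the existing leaf_pos_iter().

```rust
/// Sketch of a backend trait fragment; names are assumptions for discussion.
pub trait LeafSetBackend {
    /// Iterator over current (unpruned, unremoved) leaf positions.
    fn leaf_pos_iter(&self) -> Box<dyn Iterator<Item = u64> + '_>;

    /// Number of leaf positions still present in the leaf_set, i.e. the
    /// "not removed" leaves rather than all leaves ever appended.
    fn n_unpruned_leaves(&self) -> u64 {
        self.leaf_pos_iter().count() as u64
    }
}
```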

@@ -192,7 +192,8 @@ impl SyncRunner {
match self.sync_state.status() {
SyncStatus::TxHashsetDownload { .. }
| SyncStatus::TxHashsetSetup
| SyncStatus::TxHashsetValidation { .. }
Member

@quentinlesceller What implications are there for changing the set of sync states here? Is this something we can safely do or should we maintain the existing set of states?

@JosephGoulden (Contributor, Oct 30, 2019)

We could add the TxHashsetValidation status back to maintain compatibility for any consumers. I was thinking we could then remove it in a major release, e.g. 3.0.0.

Member

I think it's pretty safe to move forward and add these two new states. That could potentially break wallets that consume the status if they are doing strong type checks, but I presume it's okay for 3.0.0.

store/src/pmmr.rs: review comment (outdated, resolved)
@antiochp (Member) left a comment

See my comment for a (hopefully useful) explanation of n_leaves

I like this - but there are a couple of things to resolve.

@antiochp added this to the tentative 3.0.0 milestone on Oct 23, 2019
@lehnberg modified the milestones: tentative 3.0.0 to 3.0.0 on Oct 30, 2019
@lehnberg added the "P3: Fix if time" label (Nice to have, does not block release) on Oct 30, 2019
@antiochp (Member)

Just testing this locally and the TUI is just showing the following during range proof and signature validation:

Sync step 3/7 - Preparing chain state for validation

I don't see it progress to 4/7 or 5/7.

@JosephGoulden (Contributor, Author)

Just testing this locally and the TUI is just showing the following during range proof and signature validation:

Sync step 3/7 - Preparing chain state for validation

I don't see it progress to 4/7 or 5/7.

I think this might be broken on master. It seems to be related to getting header stats while txhashset validation/rebuilding is going on. Looking into a fix.

@JosephGoulden (Contributor, Author) commented Nov 14, 2019

I think it was broken by #3045.

Not sure how to get the status now. There is quite a long-lived write lock on pmmr_header which prevents us from reading it when updating the stats. Will have a think.
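
One common way around that kind of conflict, sketched here purely as an illustration (simplified types, not what this PR necessarily ended up doing): take a non-blocking read and fall back to a cached value while the long-lived write guard is held.

```rust
use std::sync::RwLock;

/// Simplified stand-in for whatever the header PMMR stats actually are.
struct HeaderStats {
    height: u64,
}

/// Return the latest height if the lock is free, otherwise the cached value.
fn latest_height(pmmr_header: &RwLock<HeaderStats>, cached: u64) -> u64 {
    match pmmr_header.try_read() {
        Ok(stats) => stats.height,
        // The long-lived write lock (e.g. during txhashset validation) is
        // held elsewhere; report the cached value instead of blocking.
        Err(_) => cached,
    }
}

fn main() {
    let stats = RwLock::new(HeaderStats { height: 42 });
    assert_eq!(latest_height(&stats, 0), 42);
}
```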

@JosephGoulden (Contributor, Author)

@antiochp this is working now.

Another potential future improvement would be to have a separate state for catching up on the latest blocks, rather than flicking between sync state 1 and 7 (or 1 and 4 as it is now).

@quentinlesceller (Member) left a comment

Just synced from scratch with it. Worked great. Awesome work @JosephGoulden. Minor comments below.

api/src/handlers/server_api.rs: review comment (outdated, resolved)
chain/src/types.rs: review comment (outdated, resolved)
@JosephGoulden (Contributor, Author)

Fixed those, thanks @quentinlesceller.

@hashmap merged commit 6d864a8 into mimblewimble:master on Nov 17, 2019
@antiochp mentioned this pull request on Dec 11, 2019
Labels: enhancement; P3: Fix if time (Nice to have, does not block release)

Successfully merging this pull request may close these issues.

[TUI] Initial sync status/progress could be improved