Generate txhashset archives on 720 block intervals. #2813
Conversation
Finally, all 3 brothers are here 😃 👍
Hello brother.
I think once this is parameterized we may want to extend this to a 24 hour period (1,440 blocks) rather than 250 blocks (at least for mainnet and floonet).
We have attempted to keep all "time" based values to round numbers in terms of hours, days, weeks etc rather than 100s, 1000s of blocks.
It might be less intuitive in the code but it makes it easier to reason about ("time is money").
For an extreme example - https://www.grin-forum.org/t/wtb-one-hour-in-atomic-swap-for-1-8-btc/4831 where John Tromp is attempting to purchase an "hour" of grin...
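To illustrate the convention being suggested, a 12-hour archive interval can be written in terms of hour/day heights rather than as a raw block count. A minimal sketch, assuming grin's 60-second block time; the constant names echo the consensus module but are defined here purely for illustration:

```rust
// Illustrative constants (not the PR's code): heights derived from a
// 60-second block time, so "time" values stay visible in the source.
const BLOCK_TIME_SEC: u64 = 60;
const HOUR_HEIGHT: u64 = 3600 / BLOCK_TIME_SEC; // 60 blocks per hour
const DAY_HEIGHT: u64 = 24 * HOUR_HEIGHT;       // 1,440 blocks per day

// Expressing the archive interval as "half a day" keeps the time meaning
// explicit, rather than a bare 720.
const TXHASHSET_ARCHIVE_INTERVAL: u64 = DAY_HEIGHT / 2; // 720 blocks (~12 hours)

fn main() {
    assert_eq!(TXHASHSET_ARCHIVE_INTERVAL, 720);
    println!("archive every {} blocks", TXHASHSET_ARCHIVE_INTERVAL);
}
```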
This is ready for review whenever you get a chance @antiochp
```
@@ -251,15 +251,17 @@ impl MessageHandler for Protocol {
			sm_req.hash, sm_req.height
		);

		let txhashset = self.adapter.txhashset_read(sm_req.hash);
```
We need to think this through a bit. Our peer sends us a height and hash (the sm_req) but we are now effectively ignoring this and just sending the most recent snapshot.
I wonder if we need to take the height sent by our peer and quantize it to the nearest previous snapshot (i.e. the closest 720-block snapshot to that height).
Alternatively, maybe we should be ignoring the values passed in by our peer - all we care about is serving the most recent snapshot?
We just need to be careful that our peer receives consistent data in terms of -
- the full set of headers, and
- the corresponding txhashset archive
It is critical that these are on the same fork. If for whatever reason they are on a fork (i.e. another peer sent them the "wrong" latest headers) then they need to be able to rewind headers to the point where this txhashset exists on the chain. If they cannot, then they cannot recover from this and the full txhashset is wasted.
I don't think this is a problem in practice because the txhashset is at least n blocks old (based on the horizon).
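One way to make the quantization idea concrete is plain integer division by the interval. A minimal sketch with a hypothetical helper (not the PR's actual code):

```rust
/// Hypothetical helper illustrating the quantization idea: snap a peer's
/// requested height down to the most recent archive boundary.
fn archive_height(requested_height: u64, interval: u64) -> u64 {
    // Integer division floors, so heights 1..719 map to 0, 720..1439 to 720, etc.
    (requested_height / interval) * interval
}

fn main() {
    let interval = 720;
    assert_eq!(archive_height(1_000, interval), 720);
    assert_eq!(archive_height(720, interval), 720);
    assert_eq!(archive_height(719, interval), 0);
}
```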
I thought about it for a while, but I don't think we would want to provide them any other txhashset. I considered verifying that the requested header is within an expected range, but it seemed unlikely enough to not justify any additional complexity.
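For reference, the kind of range check that was considered (and rejected as not worth the complexity) might look roughly like this; the helper name and tolerance are hypothetical, not anything from the PR:

```rust
/// Hypothetical sanity check: only serve an archive if the requested height
/// is not unreasonably far from our own chain head.
fn request_in_expected_range(requested: u64, head: u64, tolerance: u64) -> bool {
    requested <= head && head - requested <= tolerance
}

fn main() {
    let head = 100_000;
    assert!(request_in_expected_range(99_500, head, 1_440));
    assert!(!request_in_expected_range(10_000, head, 1_440));
}
```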
I need to think through the horizon behavior here a bit more but looks good so far.
Sure, take your time. This does effectively change the horizon temporarily for newly synced nodes. Once most nodes have this change, it might make sense to just make txhashset_archive_header the new cut_through_horizon. Perhaps as part of the first hard fork?
Currently we make some guarantees/assumptions about what blocks are available from peers. We maintain 7 days of full block history (based on the cut-through horizon), and every running node has at least 48 hours of full blocks after a successful sync. One of the edge cases we need to account for here is when we ask for full blocks during fast sync from another node that has recently sync'd, i.e. other nodes (possibly a majority of nodes on the network) are only guaranteed to have blocks from 48 hours ago. Currently this is exactly 48 hours ago. So there are a couple of conflicting requirements here pulling in competing directions.
So currently we are kind of stuck doing the same thing as all other peers and setting our sync_threshold to 48 hours. If we were to deploy a node with different behaviour, for example using more of a "step function" in terms of 720-block periods, then we risk being a bad citizen with respect to these conflicting requirements. We either have too few blocks locally (temporarily) or we are asking for too many blocks and cannot reliably retrieve them (if our peers are also syncing).
The way it's coded in this PR, it would be the latter. This was coded that way intentionally, for the reasons you listed above. Although initial sync might be delayed very slightly if it requests blocks from a peer that just synced, it shouldn't be a large problem because the significant majority of peers would have more than 48 hours worth of blocks.
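A rough sketch of the trade-off described above, assuming a 48-hour sync threshold and a 720-block archive step; the constants and helper below are illustrative, not taken from the PR:

```rust
// Illustrative constants mirroring the discussion: 60 blocks per hour.
const HOUR_HEIGHT: u64 = 60;
const SYNC_THRESHOLD: u64 = 48 * HOUR_HEIGHT;   // ~48 hours of full blocks
const ARCHIVE_INTERVAL: u64 = 12 * HOUR_HEIGHT; // 720-block archive step

/// Hypothetical request height: the latest archive boundary at or below
/// head minus the 48-hour threshold.
fn requested_snapshot_height(head: u64) -> u64 {
    (head.saturating_sub(SYNC_THRESHOLD) / ARCHIVE_INTERVAL) * ARCHIVE_INTERVAL
}

fn main() {
    let head = 200_123;
    let snapshot = requested_snapshot_height(head);
    let blocks_needed = head - snapshot;
    // The syncing node needs between 48 and 60 hours of full blocks from its
    // peers, i.e. possibly more than the 48 hours a freshly synced peer holds.
    assert!(blocks_needed >= SYNC_THRESHOLD);
    assert!(blocks_needed < SYNC_THRESHOLD + ARCHIVE_INTERVAL);
    println!("snapshot at {}, {} full blocks still needed", snapshot, blocks_needed);
}
```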
@cadmuspeverell thanks for your work, and how are you doing? I would like to take a look and test this PR. Could you please resolve the conflict? Thanks.
👍
Q) If we only produce a new txhashset every 12 hours - do we now reuse one already generated if we are asked for it more than once in a 12-hour period? I don't remember if we implemented de-duping behavior before (we had a smaller window before).
Yep! That logic was already there - and seemed to work well when I was doing my testing. I'll resolve that darned conflict now and then maybe @garyyu can try it out to make sure I didn't miss anything? 😸
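The reuse ("de-duping") behaviour being referred to can be pictured roughly as follows; the file naming and layout here are hypothetical sketches, not grin's actual zip handling:

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical path for the zip covering a given archive height.
fn archive_zip_path(root: &Path, archive_height: u64) -> PathBuf {
    root.join(format!("txhashset_snapshot_{}.zip", archive_height))
}

/// Sketch of the de-duping idea: if a zip for the requested archive height
/// already exists on disk, serve it again rather than taking a new snapshot.
fn get_or_create_archive(root: &Path, archive_height: u64) -> std::io::Result<PathBuf> {
    let path = archive_zip_path(root, archive_height);
    if path.exists() {
        // Reuse: another peer already requested this 12-hour snapshot.
        return Ok(path);
    }
    // Placeholder for the expensive step: snapshot the txhashset and zip it.
    fs::write(&path, b"zip bytes would go here")?;
    Ok(path)
}

fn main() -> std::io::Result<()> {
    let root = std::env::temp_dir();
    let first = get_or_create_archive(&root, 720)?;
    let second = get_or_create_archive(&root, 720)?; // served from the existing file
    assert_eq!(first, second);
    Ok(())
}
```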
As of v0.5.3, Grin++ gracefully handles receiving a different TxHashSet height than requested. Since grin already handles this, merging this code shouldn't adversely affect the network. I didn't test every scenario, but I did run the code for a day or so, and didn't see any unusual bans. There were only ever 2 or 3 zips in the temp folder, even though dozens of peers synced from my node. I'm happy with it.
After my 2nd reading everything is good in my view, and thanks @cadmuspeverell for this 1st PR 👍 and a long review :-)
Looks like this PR was merged into […]. If we do need to push […], I'm not 100% sure if this is the approach we will always take, but we have some tight time constraints between […].
But it was tagged with the 1.1.1 milestone before; you can see the history above.
@garyyu There is no 1.1.1 milestone anymore; in recent meetings we've collectively decided that the next release will be 2.0.0, and that there will only be a 1.1.1 if a critical issue is uncovered that warrants it.
Also, given the time constraints we should only be pushing hard-fork related PRs to the branch, and labeling any other fixes or changes as post-2.0.0 (I've created a 2.x.x milestone for that).
Sorry this wasn't communicated properly, as there have been one or two extra meetings scheduled over the past days, but we really feel it's important that everyone stick to this plan for the next few weeks.
* generate txhashset archives on 250 block intervals.
* moved txhashset_archive_interval to global and added a simple test.
* cleaning up the tests and adding license.
* increasing cleanup duration to 24 hours to prevent premature deletion of the current txhashset archive
* bug fixes and changing request_state to request height using archive_interval.
* removing stopstate from chain_test_helper to fix compile issue
This is the first of several PRs to resolve #2740. With this PR, txhashset archives will only be generated for blocks 250, 500, 750, etc. This inadvertently solves #2806 as well.
Marked as WIP to encourage review and discussion while I write tests.
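A minimal sketch of the generation rule described above (archives only at interval boundaries); the helper name is hypothetical, and the interval was later changed from 250 to 720:

```rust
/// Hypothetical check: only heights that fall exactly on the archive
/// interval are treated as archive points.
fn is_archive_height(height: u64, interval: u64) -> bool {
    height > 0 && height % interval == 0
}

fn main() {
    let interval = 250;
    assert!(is_archive_height(250, interval));
    assert!(is_archive_height(500, interval));
    assert!(!is_archive_height(251, interval));
}
```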