Bittorrent support for txhashset zip download #2740

Closed
ignopeverell opened this issue Apr 10, 2019 · 10 comments · Fixed by #2813

Comments

@ignopeverell
Contributor

I've had this in mind for quite a while, and it was actually the main reason to opt for a simple package file. The idea is simply to have peers download the file over bittorrent to speed it up, instead of chunking with file-specific formats (i.e. kernels, then outputs, etc.). The risk that comes with depending on a bittorrent client or library (https://github.com/Luminarys/synapse or https://github.com/GGist/bip-rs come to mind) can be limited by simply running the download in a different process.
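
As an illustration of the process-isolation idea, here is a minimal sketch in Rust using only the standard library. The helper binary name `torrent-fetch` and its `--torrent` / `--output` flags are hypothetical placeholders; they do not refer to synapse, bip-rs, or any existing tool.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Builds the command line for a hypothetical external download helper.
/// `torrent-fetch` and its flags are placeholders, not a real tool.
pub fn build_download_command(torrent: &Path, out_dir: &Path) -> Command {
    let mut cmd = Command::new("torrent-fetch");
    cmd.arg("--torrent").arg(torrent).arg("--output").arg(out_dir);
    cmd
}

/// Runs the helper as a child process and waits for it to exit.
/// A crash in the helper does not take the node down; the node only
/// sees an exit status and whatever files were written to `out_dir`.
pub fn run_isolated_download(torrent: &Path, out_dir: &Path) -> io::Result<ExitStatus> {
    build_download_command(torrent, out_dir).status()
}
```

The node would still have to validate the downloaded archive itself; the separate process only keeps failures in the download code out of the node's own process.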

Note that this would also require some sort of regular snapshotting of the state. Right now a client asks for the txhashset at a specific block, and if another client asks a couple of blocks later, another archive has to be produced. But there is no reason for it to work this way: archives could just be produced on demand and reused over 500 or 1000 blocks.
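
One way to picture the snapshot reuse is to round every requested height down to a shared boundary, so that requests a few blocks apart map to the same archive. The sketch below assumes a fixed interval; the constant and function names are hypothetical and not taken from the Grin codebase.

```rust
/// Hypothetical snapshot interval, in blocks (the comment suggests 500 or 1000).
pub const SNAPSHOT_INTERVAL: u64 = 1_000;

/// Returns the most recent snapshot height at or below `height`.
pub fn snapshot_height(height: u64, interval: u64) -> u64 {
    assert!(interval > 0, "interval must be non-zero");
    height - (height % interval)
}
```

With an interval of 1,000, requests made at heights 100,002 and 100,650 would both be served from the archive built at height 100,000.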

@antiochp
Member

antiochp commented Apr 10, 2019

Related - #2743 may be redundant now 😞...

I think there is still a case to be made for splitting the txhashset into two separate downloadable payloads -

  • UTXO set (outputs and rangeproofs)
  • Kernels

The reason I'm arguing this is because of how different these two chunks of data behave - specifically one is prunable and compactable (outputs + rangeproofs) vs. the kernels which are not.
Kernel download can be simplified significantly (see #2743) due to the unprunable nature of the data - we do not need the hash file, just the data file.

tl;dr Are we open to considering two (separate but related) chunks of data downloaded via bittorrent?

@ignopeverell
Contributor Author

ignopeverell commented Apr 10, 2019

I don't disagree and I think #2743 is still good to have. Torrenting 2 files (or more) vs one doesn't make much difference. I also opened this issue to start discussion, not necessarily to push a solution forward.

@antiochp
Member

👍 Just wanted to make sure #2743 wasn't going down the wrong path. I think a couple of torrents would actually make a lot of sense - the kernel torrent being simply the pmmr_data.bin file at a predefined set of "snapshot" block heights (every 1,000 blocks, say).

I'm not that familiar with bittorrent but I'm assuming if I torrent a file (say kernels up to height 100,000) and you independently build the "same" file then they would both generate the same file hash and be seen as duplicates?

@ignopeverell
Contributor Author

Yes, that's correct. Bittorrent also has an envelope format so you can download the files all at once or individually.
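
To make the "same file, same hash" point concrete: in BitTorrent v1 a torrent is identified by a hash over its metadata, which includes the file name, the piece length, and a SHA-1 hash of every piece. Two peers who independently build byte-identical files therefore end up with the same torrent as long as they also agree on those parameters. The sketch below shows only the per-piece part of that idea. It uses FNV-1a as a dependency-free stand-in for SHA-1, and the function names are hypothetical.

```rust
/// FNV-1a (64-bit), used here only as a dependency-free stand-in.
/// BitTorrent v1 actually hashes each piece with SHA-1.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Splits `data` into fixed-size pieces and digests each one.
/// The last piece may be shorter than `piece_len`.
pub fn piece_digests(data: &[u8], piece_len: usize) -> Vec<u64> {
    assert!(piece_len > 0, "piece length must be non-zero");
    data.chunks(piece_len).map(fnv1a64).collect()
}
```

Identical inputs with the same piece length always give the same list of digests, while a single differing byte changes the digest of the piece that contains it.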

@cadmuspeverell
Contributor

@ignopeverell I fail to understand how bittorrent gains us anything. How would we maintain consensus on the metadata in what would be our equivalent of a torrent file? Torrents just give us a structure committed to by cryptographic hashes. We already have such a structure: Merkle Mountain Ranges.

I spent the last few days thinking about this, and I believe there are a number of problems with the approach suggested here. They may be solvable problems, and perhaps I'm overthinking everything, but I'm quite confident we can make parallel downloads work without adding a bittorrent dependency. To avoid writing a book here, I will make myself available in the dev gitter channel and via e-mail if anyone wants to discuss in more detail.

@ignopeverell
Contributor Author

ignopeverell commented May 26, 2019

Sorry for the late reply. It just seems that making our own parallel download would just be duplicating a whole bunch of bittorrent functionalities as well as re-inventing our own protocol for that. But I'd be happy to be proven wrong.

Regarding consensus on the metadata, I'd note that the current protocol does not do that either. The archive downloaded has no hash that we commit to. What's committed to are the resulting trees, once the archive is processed.

@hashmap
Contributor

hashmap commented May 26, 2019

We may need only a subset of bittorrent functionality (I’m not sure, but we already have peer discovery and we basically need to download one resource). In this case we can have a much simpler protocol and hence a smaller attack surface. We had a few issues with the zip format.

@cadmuspeverell
Contributor

> It just seems that making our own parallel download would just be duplicating a whole bunch of bittorrent functionalities as well as re-inventing our own protocol for that.

Then why commit to 4 mmr roots in the header if you don't intend on taking full advantage of their cryptographic properties? Just commit to the merkle root of the 4 roots and save yourself 96 bytes a minute if you don't intend on using them individually.
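
The 96-byte figure can be checked with a little arithmetic, assuming 32-byte hashes and one header per minute (Grin's target block time). The constants and function below are illustrative names only.

```rust
/// Size of one root hash in bytes (assumed: 32-byte hashes).
pub const HASH_BYTES: u64 = 32;
/// Number of MMR roots mentioned in the comment.
pub const MMR_ROOTS: u64 = 4;

/// Bytes saved over `headers` headers if the separate roots were
/// replaced by a single combined root of the same size.
pub fn bytes_saved(headers: u64) -> u64 {
    (MMR_ROOTS * HASH_BYTES - HASH_BYTES) * headers
}
```

Four roots take 128 bytes and a single combined root takes 32, hence 96 bytes saved per header.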

> Regarding consensus on the metadata, I'd note that the current protocol does not do that either. The archive downloaded has no hash that we commit to. What's committed to are the resulting trees, once the archive is processed.

I've had state verification fail before, which is frustrating now but will only get a lot worse as Grin ages and its state grows. I submitted a proposal (#2837) for a way to verify as we download, which should scale much better. Concerns were raised there that deserve a thorough response when I get the time, although I don't see anything mentioned that would be unsolvable.

@hashmap
Contributor

hashmap commented Jun 6, 2019

#2813 was just one step in that direction; reopening.

@lehnberg
Collaborator

Closing this, as we're going in a different direction now (#3471)
