feat: Streaming write for PrivateFile #163
It can always be derived from `revision_key`. Storing it will only make it possible for `revision_key` and `content_key` to get out-of-sync.
instead of an encrypted set of Cids
So:
- `PrivateNodeHeader` gets its own block
- `PrivateFile` and `PrivateDirectory` refer back to the header via a CID
- `PrivateRef` gets its own "disambiguation pointer" `content_cid`
- `PrivateForest` now resolves `PrivateRef`s
- `PrivateRef`s always refer to pre-existing content, never to "open slots"
instead of `RevisionKey` and `ContentKey`, respectively.
Also, make use of `Rc::make_mut`, accordingly.
(it had some good bugfixes regarding FinalizationRegistries recently)
For the file to test in the fixture, I went with a recording of Clara Schumann's Scherzo No. 2. The recording is by Luis Sarro and I downloaded it from musopen, so it should be public domain. It's a classical music piece and Clara Schumann rocks (even though we've forgotten about her nowadays and her husband took the spotlight in today's history books). Honestly, I'm up for other ideas though! If anyone knows some cool fixture file to use that is >1MB and <10MB, let me know.
Closes #152
This is a minimal effort PR for importing big data into WNFS without insane memory requirements by streaming it in during file creation.
The incoming stream is split into private file chunks and encrypted & serialized one-by-one. Each chunk gets written before the next chunk gets fetched, thus this should take only roughly constant memory.
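To illustrate the chunk-by-chunk idea, here is a minimal synchronous sketch using only the standard library. It is not the code in this PR: `MemoryStore`, `encrypt_chunk`, `read_chunk` and `stream_into_store` are hypothetical names, the XOR "encryption" is a placeholder for a real cipher, and the actual implementation works on async streams and a real block store. The point is only that a single reusable buffer of `chunk_size` bytes is held on the reading side at any time.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a block store: it just collects written blocks.
#[derive(Default)]
pub struct MemoryStore {
    pub blocks: Vec<Vec<u8>>,
}

impl MemoryStore {
    /// Stores a block and returns its index (a stand-in for a CID).
    pub fn put_block(&mut self, block: Vec<u8>) -> usize {
        self.blocks.push(block);
        self.blocks.len() - 1
    }
}

/// Placeholder "encryption": XOR with one byte. NOT real cryptography,
/// it only marks where the per-chunk encryption step would happen.
fn encrypt_chunk(chunk: &[u8], key: u8) -> Vec<u8> {
    chunk.iter().map(|b| b ^ key).collect()
}

/// Fills `buf` as far as possible. Returns the number of bytes read,
/// which is less than `buf.len()` only when the reader hit end-of-stream.
fn read_chunk<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Reads `reader` in chunks of `chunk_size` bytes, encrypting and writing
/// each chunk before the next one is fetched. Returns the block indices.
pub fn stream_into_store<R: Read>(
    mut reader: R,
    chunk_size: usize,
    key: u8,
    store: &mut MemoryStore,
) -> io::Result<Vec<usize>> {
    let mut buf = vec![0u8; chunk_size]; // the only read buffer, reused per chunk
    let mut ids = Vec::new();
    loop {
        let n = read_chunk(&mut reader, &mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        ids.push(store.put_block(encrypt_chunk(&buf[..n], key)));
    }
    Ok(ids)
}
```

Calling `stream_into_store(&data[..], 4, 0x5a, &mut store)` on ten bytes of input writes three blocks of 4, 4 and 2 bytes. The in-memory store itself grows with the input here; with a store that persists blocks elsewhere, memory use on the writing side stays roughly constant.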
In the future we should look into two more things:
- `PrivateDirectory::write`, which currently only takes a `Vec<u8>` of bytes. We probably need to figure out something akin to `PrivateDirectory::open`, which returns a `PrivateFile` that can later be used in `PrivateDirectory::write` (perhaps keeping the old version, too, because it's a nice shortcut for small files).
- A `PrivateFile` that implements `AsyncRead + AsyncWrite + AsyncSeek`. However, to make that performant on big files we need content-defined chunking.
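For readers unfamiliar with content-defined chunking, here is a rough sketch of the general technique, not a proposal for what WNFS should adopt. Chunk boundaries are chosen from a rolling hash over the data instead of fixed offsets, so a local edit tends to change only nearby chunks. The function names, the gear-style hash, and the `min`/`max`/`mask` parameters are all assumptions made for illustration.

```rust
/// Deterministic pseudo-random 64-bit value per byte (splitmix64-style mixer).
/// A real chunker would typically use a precomputed 256-entry table.
fn gear(byte: u8) -> u64 {
    let mut z = (byte as u64).wrapping_add(0x9E3779B97F4A7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

/// Returns the end offsets (exclusive) of each chunk in `data`.
/// A cut is made once the chunk has at least `min` bytes and the rolling
/// hash has all `mask` bits zero, or when the chunk reaches `max` bytes.
/// Assumes `min <= max`.
pub fn cdc_boundaries(data: &[u8], min: usize, max: usize, mask: u64) -> Vec<usize> {
    let mut cuts = Vec::new();
    let mut start = 0;
    let mut hash: u64 = 0;
    for (i, &b) in data.iter().enumerate() {
        // Gear-style rolling hash: older bytes are shifted out over time.
        hash = (hash << 1).wrapping_add(gear(b));
        let len = i + 1 - start;
        if (len >= min && hash & mask == 0) || len >= max {
            cuts.push(i + 1);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        cuts.push(data.len()); // trailing partial chunk
    }
    cuts
}
```

With fixed-size chunks, inserting one byte near the start of a file shifts every later chunk; with boundaries derived from content, the chunker tends to resynchronize after the edit, which is what would make seek-and-write on big files cheaper.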