feat: non-bufferring multipart body encoder #3151
Reminder to myself to include tests discussed in #3138 here |
|
@hugomrdias there is one issue that I'm not sure how to resolve. It appears that |
|
All tests except the example one (that also fails on master) are passing now. I think this is ready for the review. |
|
The test was failing due to a temporary infrastructure problem. All good now. |
If merged, this is a great performance win for browser devs and users, and also for IPFS Desktop users (ipfs/ipfs-webui#1529).
Side concerns:
- I am worried about added complexity and potential regression in the future.
  Are we able to add tests/benchmarks that safeguard browser-related improvements?
- As noted during our review call, problematic metadata is only supported by js-ipfs, we may want to look into tweaking HTTP API separately from this PR, to fix it before go-ipfs implements it.
That worries me as well. I am also not happy with increased complexity. Only other way I can imagine going about this (that would not involve API changes) is to have a
I think there is opportunity to simplify this approach a bit by using our custom types instead of
I was trying to come up with some approach here, e.g.
However, I do not think we have a way to tell whether the browser did any buffering or not. The only thing I could come up with is to generate a fragment of data and have the echo server stop writing until the corresponding put occurs on the other endpoint. However, that is really complex and we would need to jump through some hoops. There is also no guarantee that the browser doesn't read, say, two chunks at a time. I think a better strategy is to test that when we put in blobs (and the like), what we get on the other end is blobs (not objects with async iterable content). That is a lot easier to test and will not break when the browser changes (e.g. how much it fetches before it starts the upload). |
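The blob-in, blob-out check suggested above could be sketched roughly as follows. `normaliseContent`, `isBufferFree` and `chunks` are hypothetical stand-ins invented for illustration, not the real js-ipfs functions. The point is only the shape of the assertion: it inspects the type of what comes out rather than trying to observe whether the browser buffered.

```javascript
// Hypothetical stand-in for the normalisation step (NOT the real js-ipfs code).
// Blob inputs pass through untouched and strings are wrapped in a Blob.
// Anything else becomes a stream-like object whose `content` has to be iterated.
function normaliseContent (input) {
  if (input instanceof Blob) return input
  if (typeof input === 'string') return new Blob([input])
  return { content: input }
}

// The kind of check a test would make: did we stay on the Blob path?
function isBufferFree (output) {
  return output instanceof Blob
}

// A stream-like input that cannot be represented as a Blob without reading it.
async function * chunks () {
  yield new Uint8Array([1, 2, 3])
}

console.log(isBufferFree(normaliseContent(new Blob(['hello'])))) // true
console.log(isBufferFree(normaliseContent('hello'))) // true
console.log(isBufferFree(normaliseContent(chunks()))) // false
```

A check like this does not depend on how much data a particular browser reads ahead, which is the fragility the echo-server idea would have.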
|
Added more tests to ensure that result of
There is a caveat: this will not catch all regressions, e.g. if for some reason |
|
Tests are failing now due to #3169 |
|
I had a conversation with @achingbrain earlier today and we have decided:
I think it might also make sense to factor out introduced |
|
Externalized File and Blob implementations. |
|
I've merged /pull/3184 in favour of this. I hope that it's taken on some of the good ideas from this PR. It bums me out a little, because you've clearly spent a lot of time and effort on this, but ultimately I think requiring people to use non-standard Blob/FormData/etc implementations to use our HTTP API is a step too far, and taking on the long-term maintenance burden of those custom implementations is not something we should be doing given the available dev capacity. |
Status
Overview
Normalization
Before
`normaliseInput` used to normalize arbitrary input taken by `ipfs.add` into `AsyncIterable<FileObject>`.
There was an (implicit) invariant that if a `FileObject` doesn't have `content`, it represents a directory.
However, representing `content` as `AsyncIterable<ArrayBufferView|ArrayBuffer>` is what led to buffering in the browser, as `fetch` still does not support a stream body.
After
This patch changes `normaliseInput` to produce a different output: `AsyncIterable<ExtendedFile|FileStream|Directory>` where
- `Directory` is just like `FileObject` and does not have `content`.
- `ExtendedFile` represents a `FileObject` with known size:
  - `File`
    - `File` is used in node
    - `Blob` is used in node
  - `mtime`, `mode` and `path` properties (assumed by ipfs-unixfs-importer).
  - `content` getter which returns an `AsyncIterable<Uint8Array>` of its parts, which creates compatibility with the `FileObject` interface.
- `FileStream` is just like a `FileObject` that does have a `content`.
  - Its `content` is an `AsyncIterable<*>`.
  - It is kept distinct from `ExtendedFile` because `multipartRequest` can't add it to the `FormData` without buffering its body, while it can do that with `ExtendedFile`.
Multipart Encoder
A new `FormDataEncoder` class was added that can encode `AsyncIterable<Part>` into `AsyncIterable<BlobPart>` representing the body of the multipart request.
The `to-stream` module has been replaced by `to-body`, which turns `AsyncIterable<BlobPart>` into a readable stream on node and into a `Blob` in the browser.
With the above pieces in place, `multipartRequest` now:
- normalizes input into `AsyncIterable<ExtendedFile|FileStream|Directory>`
- maps that into `AsyncIterable<Part>` (and ensures that an `ExtendedFile` is passed as content instead of passing its content, to avoid buffering)
- encodes it into `AsyncIterable<BlobPart>` via `FormDataEncoder`
- turns the result into a body with `toBody` (which in node produces a readable stream and in the browser produces a blob).
Result
- `ipfs.add` can continue using `normaliseInput`, as changes to it should be API (backwards) compatible.
- ipfs-http-client on node should continue using streams. The only thing that changed there is that some inputs are turned into `Blob`s instead of `AsyncIterator`s, but during form data encoding it all gets flattened anyway.
- ipfs-http-client in the browser will not buffer as long as the input passed in isn't a stream, and will fall back to buffering otherwise. E.g.
  - `ipfs.add([ 'hello', await (await fetch(url)).blob(), { path: '/foo/bar', content: droppedFile } ])` will not incur buffering
  - `ipfs.add([ 'hello', { path: '/foo', content: droppedFile.stream() }, await (await fetch(url)).blob() ])` will only buffer the contents of the `droppedFile` and use the other pieces as is.
I am not super happy with the complexity of all this, nor with the fact that a user can accidentally fall off the happy path and incur buffering, but I do not believe there is a better option without changing the API.
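To make the encoder idea from the Multipart Encoder section concrete, here is a minimal sketch of encoding parts into `BlobPart`s without reading any `Blob` content. The names `encodeParts` and `toBlob`, the part shape (`name`, `filename`, `type`, `content`) and the header layout are assumptions made for illustration; they are not the actual `FormDataEncoder` or `to-body` API from this patch.

```javascript
// Illustrative sketch only: names and part shape are hypothetical.
const CRLF = '\r\n'

// Encodes an (async) iterable of parts into an async iterable of BlobParts.
// Blob content is yielded by reference, so its bytes are never read here.
async function * encodeParts (parts, boundary) {
  for await (const part of parts) {
    yield `--${boundary}${CRLF}`
    yield `Content-Disposition: form-data; name="${part.name}"; filename="${encodeURIComponent(part.filename)}"${CRLF}`
    yield `Content-Type: ${part.type || 'application/octet-stream'}${CRLF}${CRLF}`
    if (part.content instanceof Blob) {
      yield part.content
    } else if (part.content != null) {
      // Stream-like content has to be consumed chunk by chunk.
      for await (const chunk of part.content) yield chunk
    }
    yield CRLF
  }
  yield `--${boundary}--${CRLF}`
}

// Browser-style body: collect the BlobParts and compose them into one Blob.
async function toBlob (blobParts) {
  const collected = []
  for await (const blobPart of blobParts) collected.push(blobPart)
  return new Blob(collected)
}

// Usage: one directory entry (no content) and one file backed by a Blob.
const exampleParts = [
  { name: 'file', filename: 'dir', type: 'application/x-directory' },
  { name: 'file', filename: 'dir/hello.txt', type: 'text/plain', content: new Blob(['hello']) }
]

toBlob(encodeParts(exampleParts, 'example-boundary'))
  .then(body => console.log('encoded body size:', body.size))
```

Only the stream branch pulls data through JavaScript, which mirrors the point above that buffering is limited to inputs passed as streams. On node, the same `BlobPart` sequence would be turned into a readable stream rather than a `Blob`.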
attempt to fix #3029