Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement multipart and make it stream the file upload #846

Closed
wants to merge 6 commits into from
Closed

Reimplement multipart and make it stream the file upload #846

wants to merge 6 commits into from

Conversation

paolobarbolini
Copy link
Contributor

@paolobarbolini paolobarbolini commented May 8, 2021

Hello, seeing how long #323 has been open I decided to give it a go myself. I should probably have asked in the issue or on Discord, but I really wanted to give this issue a shot, so here I am.

This rewrites the multipart filter to do all of the multipart decoding by itself, dropping the dependency on the multipart crate. This reduces the number of dependencies, and makes it so the body is streamed instead of having to read it entirely into memory (see #323).

No changes have been done to the public API. The only breaking change will be observed by implementations that try to read from Parts obtained before the last call to FormData::poll_next. In those cases an error will be returned.

The decoder is mostly zero copy. The only case in which data is copied is when the underlying body yields very small chunks, since the boundary searcher expects that two consecutive chunks have a combined length greater than the boundary length.

@jxs jxs added the waiting-on-author awaiting some action (such as code changes) from the PR or issue author. label Jun 10, 2021
@paolobarbolini paolobarbolini marked this pull request as ready for review June 12, 2021 05:40
@seanmonstar
Copy link
Owner

So, I'm beyond excited at the work here! Thanks so much, this is super cool. I'd really like to fix warp's multipart filters to use full streaming. I have two meta thoughts about this though:

  • Would this make sense as a separate crate, so that others could utilize it outside of warp?
  • To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.

@paolobarbolini
Copy link
Contributor Author

Would this make sense as a separate crate, so that others could utilize it outside of warp?

A separate crate could be interesting here. Looking around crates.io there are a few other implementations but they seem a bit bloated to me, so a new crate could change that. I could move this implementation to a separate crate and update the PR when I get around to publishing it.

To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.

That's definitely something that's missing now and that I would add before calling this PR ready or publishing a crate. Another advantage of having this be a separate crate could be avoiding bloating warp's tests.

@paolobarbolini
Copy link
Contributor Author

paolobarbolini commented Jun 27, 2021

I published the implementation as a separate crate and updated this PR to use it. Since I see you use the squash and merge GitHub feature I thought it would be fine to keep the commit history, so I just made a commit that removes the implementation from warp and wraps my crate instead.

@jxs jxs added waiting-on-reviewer Awaiting review from the assignee but also interested parties. and removed waiting-on-author awaiting some action (such as code changes) from the PR or issue author. labels Jul 22, 2021
@Eragonfr
Copy link

Is there an update on this, I feel like this PR gets some importance as it allow to drop multipart dependencies. As it is no longer updated and now it also gives a new warning in rust nightly

warning: the following packages contain code that will be rejected by a future version of Rust: buf_redux v0.8.4, multipart v0.18.0

The GitHub repo of multipart and buf_redux has been archived, as far as I know there is no current plan to update them.

@willclevine
Copy link

Following up on @Eragonfr 's comment. Is there any update on this? There is now a vulnerability in remove_dir_all that will go away if multipart is removed:

error[vulnerability]: Race Condition Enabling Link Following and Time-of-check Time-of-use (TOCTOU)
...
    = Solution: Upgrade to >=0.8.0
    = remove_dir_all v0.5.3
      └── tempfile v3.3.0
          └── multipart v0.18.0
              └── warp v0.3.3

@jaysonsantos
Copy link
Contributor

Hey folks, what do you think about using multer for multipart? It seems more up-to date and maintained than the other options.

@seanmonstar
Copy link
Owner

I tried fixing up the conflicts, and it just got annoying. I then tried to cherry-pick manually, which made it worse. So now, I have a new PR where I just copy and pasted the changes into the files, and made it a new commit. That's in #1031.

(Using multer would be fine, I tried for a few minutes, until I realized it only has async fns, which means I'd need to use stream::unfold and a BoxStream. I'd merge a new PR doing that, if someone wanted to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-on-reviewer Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants