Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for multipart/mixed content type #194

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

npryce
Copy link

@npryce npryce commented Apr 18, 2018

No description provided.

Copy link
Contributor

@dougwilson dougwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you so much! There are a few things I don't understand though if you can clarify since the PR has no description:

  1. Can you describe the background to why you're making the PR? This module is just a form-data module so if we're going to expand the scope, just want to understand better what additional support I'm going to take on.
  2. Nothing in the parsing was changed. Are you suggesting that the parsing of mixed is identical in every way to form-data?
  3. It looks like you tried to add a test, but it is just another form-data body. Can you add mixed bodies to test against?
  4. Can you update the readme?

Thanks!

@npryce
Copy link
Author

npryce commented Apr 18, 2018

Currently the module rejects any multipart content that is not multipart/form-data or multipart/related. We need to write a service that receives data from a client that posts data with the `multipart/mixed' content type. Multiparty is the only JS library we could find for parsing multipart data that exposes part headers, which are required for resolving CID URL links between parts.

Why multipart/mixed? Multipart/related is intended for content in which the parts are organised into a hierarchy, with parent/child relationships represented by CID URLs and Content-ID headers (the content type can have an optional start parameter that identifies the root of the hierarchy). Multipart/mixed is for content that contains multiple parts that are not organised into a hierarchy, but may be interlinked.

All multipart content types have the same basic format, but different content types indicate different semantics of how a receiver is meant to interpret the parts. So no changes to the parser code is required to support multipart/mixed.

I can add a test for mixed bodies, but the content type doesn't need different parsing logic, so I picked a single simple example to run through with the multipart/mixed content type to avoid duplication. I looked into making all the tests for success cases run for all supported content types, but considered it to be too much work (given my time constraints) and too large a change to send in one PR.

@dougwilson
Copy link
Contributor

I'm not sure it is true that they are all the same. #175 seems to say otherwise and we were going to drop the related one since it doesn't actually work with this module unless someone is willing to make the parser changes so it works. Can you link me to the specification for mixed?

@npryce
Copy link
Author

npryce commented Apr 18, 2018

Multipart/mixed is defined here: https://tools.ietf.org/html/rfc2046#section-5.1.3

@dougwilson
Copy link
Contributor

Awesome, thanks! I'll read over this this weekend and get back to you 👍

@npryce
Copy link
Author

npryce commented Apr 18, 2018

Thanks.

README.md Outdated
@@ -6,7 +6,7 @@
[![Build Status][travis-image]][travis-url]
[![Test Coverage][coveralls-image]][coveralls-url]

Parse http requests with content-type `multipart/form-data`, also known as file uploads.
Parse http requests with content-type `multipart/form-data`, also known as file uploads, `multipart/mixed` or `multipart/related`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please not add multipart/related? As I referenced, we don't correctly support it and will remove this type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@dougwilson
Copy link
Contributor

Another question for you real quick is how can I replicate making the requests from the browser upload? I'm curious for two reasons: (1) to make sure I understand how to reproduce for future support requests and also (2) because the way this module is currently parsing filenames is actually completely off the spec because web browsers didn't actually correctly follow it so we compensated through trial and error. Does the same broken parsing apply to mixed as well, or should we have mixed follow the spec?

@npryce
Copy link
Author

npryce commented Apr 18, 2018

To test our use of Multiparty, I generated a multipart/mixed request in Java using the MultipartRequestEntity class of Apache HttpClient. Multiparty parsed it perfectly, the filenames were parsed correctly, and content ids were accessible via the part headers, which was the key feature we needed.

@dougwilson
Copy link
Contributor

Can you share the Java code and how I can use it? What happens if your filename has "%22" in the name with the mixed type? Does it parse correctly? I would assume it should parse through as %22. For example parse your mixed response with some software that currently parses mixed and see if it sees the filename the same way this module does.

@npryce
Copy link
Author

npryce commented Apr 18, 2018

I'll check and let you know.

@dougwilson
Copy link
Contributor

Awesome. Maybe can you teach me how to fish, so to say? I can help and I'll need to know in the future anyway to maintain this and field issues on it as well 👍i would try all possible characters in the filename.

@adamhaeger
Copy link

Hi. I am struggling with receiving a multipart/related request that contains a binary zip file. No matter how I extract the file from the incoming request, the file is corrupted. I was hoping that this PR might do it, but I can see that its not really implemented. If anyone has a general idea of how to implement parsing of multiplart/related, I could have a crack at writing it.

Any ideas, or pointers in the right direction?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants