Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird behavior when returning response without consuming multipart body [REPRO] #2850

Open
fs-99 opened this issue Jul 26, 2024 Discussed in #2445 · 2 comments
Open

Weird behavior when returning response without consuming multipart body [REPRO] #2850

fs-99 opened this issue Jul 26, 2024 Discussed in #2445 · 2 comments

Comments

@fs-99
Copy link

fs-99 commented Jul 26, 2024

A bit more explanation and instructions on how to see what I mean directly can be found in this reproduction:
https://github.com/fs-99/axum-multipart-bug-repro

The problem I'm facing appears when verifying size limits on very large file uploads with axum.
The body/request size limit holds, but if the body is not buffered completely beforehand, the frontend does not receive a "413 Payload Too Large" error, just a "network error".
This might be because closing a stream does not sit well with browsers, in which case this is a browser issue.
Postman does not seem to have this problem, it rightfully receives a 413 from the server.

As Discussed in #2445

Originally posted by momozahara December 25, 2023

(there was another one)

As Discussed in #1650

@CmdrMoozy
Copy link

If it helps, I can post the code I wrote to work around the issue when I ran into it in #2445. It's a trivial function, feel free to use it as public domain / with no restrictions.

In my case I only cared about handling file uploads, I wasn't worried about enforcing size limits on other kinds of requests. So I ended up not using the 'built-in' axum middleware for enforcing size limits at all, and wrote the following upload handler:

async fn receive_upload(mut multipart: extract::Multipart) -> Result<Vec<Upload>> {
    let mut uploads = Vec::new();
    let mut err = None;
    let mut total_upload_bytes = 0;

    while let Some(mut field) = multipart.next_field().await? {
        let mut buf = BytesMut::new();
        while let Some(chunk) = field.chunk().await? {
            if total_upload_bytes + chunk.len() > MAX_UPLOAD_BYTES {
                err = Some(anyhow!(
                    "file upload too large, limit is {} bytes",
                    MAX_UPLOAD_BYTES
                ));
            }
            
            if err.is_some() {
                continue;
            }
            
            total_upload_bytes += chunk.len();
            buf.put(chunk);
        }

        if err.is_some() {
            continue;
        }
        
        uploads.push(Upload {
            field_name: field.name().map(|s| s.to_owned()),
            filename: field.file_name().map(|s| s.to_owned()),
            data: buf.freeze(),
        });
    }

    if let Some(err) = err {
        return Err(err);
    }

    Ok(uploads)
}

The basic idea is to go ahead and 'consume' all of the chunks / fields, even though we don't actually do anything with them once the error condition has been triggered. Then at the very end, send an error response if appropriate.

As discussed in #2445 this appeases (some?) browsers, and gives the right behavior. Other clients like cURL don't mind receiving a response before they are done sending their request, and so either approach works.

@fs-99
Copy link
Author

fs-99 commented Jul 28, 2024

@CmdrMoozy thank you, that's a good workaround in "userspace".
I experimented a bit and think I kinda did the same thing as a patch in limited.rs:

impl<B> Body for Limited<B>
where
    B: Body,
    B::Error: Into<Box<dyn Error + Send + Sync>>,
{
    type Data = B::Data;
    type Error = Box<dyn Error + Send + Sync>;

    fn poll_frame(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
        let mut this = self.project();
        let res = match this.inner.as_mut().poll_frame(cx) {
            Poll::Pending => return Poll::Pending,
            Poll::Ready(None) => None,
            Poll::Ready(Some(Ok(frame))) => {
                if let Some(data) = frame.data_ref() {
                    if data.remaining() > *this.remaining {
                        *this.remaining = 0;
                        // >>>>>>>>> CHANGES START HERE <<<<<<<<<
                        // NOTE: returning Pending directly here does not work, since we need to error on Ready(None)
                        // println!("PATCH: polling remaining frames!");
                        match this.inner.as_mut().poll_frame(cx) {
                            Poll::Pending => {
                                print!("pending ");
                                return Poll::Pending
                            },
                            Poll::Ready(None) => {
                                println!("ready none ");
                                return Poll::Ready(Some(Err(LengthLimitError.into())))
                            }
                            // continue polling until None reached
                            // NOTE: this codepath was never triggered with my test file
                            Poll::Ready(Some(Ok(frame))) => {
                                print!("frame ");
                                return Poll::Pending;
                            },
                            // NOTE: this codepath was never triggered with my test file
                            Poll::Ready(Some(Err(err))) => {
                                let err = err.into();
                                println!("some err: {err}");
                                return Poll::Ready(Some(Err(err)))
                            },
                        }
                    } else {
                        *this.remaining -= data.remaining();
                        Some(Ok(frame))
                    }
                } else {
                    Some(Ok(frame))
                }
            }
            Poll::Ready(Some(Err(err))) => Some(Err(err.into())),
        };

        Poll::Ready(res)
    }

It just polls again the moment it finds the body to be larger, then repeats the cycle. It ends up a bunch of times in Pending and then goes to Ready(None) (which means "end of stream"). I'm trying to learn about this only by pattern matching with my brain and navigating this code a bit. There might be a cleaner way to do this, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants