
flush() after writing to gzip_file#2753

Closed
vin wants to merge 2 commits into Kludex:master from vin:patch-1

Conversation

vin commented Nov 15, 2024

Summary

In order to better support streaming responses where the chunks are smaller than the file buffer size, we flush after writing.

Without the explicit flush, writes are buffered and subsequent reads see an empty self.gzip_buffer until the file flushes automatically, either because (1) the 32KiB write buffer [1] fills or (2) the file is closed when the streaming response completes.

Without flushing, GZipMiddleware doesn't work as expected for streaming responses, and especially not for Server-Sent Events, which clients expect to receive immediately. The code as written appears to intend to send immediately rather than buffer, since it calls await self.send(message) right away, but in practice that message is often empty.
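The buffering and the effect of flush() can be demonstrated with the standard library alone. This is a minimal sketch mirroring the gzip.GzipFile-over-io.BytesIO setup the middleware uses, not the middleware code itself:

```python
import gzip
import io
import zlib

# Mirror the middleware's setup: a GzipFile writing into an in-memory buffer.
buffer = io.BytesIO()
gzip_file = gzip.GzipFile(mode="wb", fileobj=buffer)

# A chunk far smaller than GzipFile's internal write buffer; on Python 3.12+
# it sits in that buffer, leaving the underlying BytesIO essentially empty.
chunk = b'event: ping\ndata: {"i": 0}\n\n'
gzip_file.write(chunk)

# flush() pushes the pending bytes through zlib (Z_SYNC_FLUSH by default),
# so the BytesIO now holds a decodable prefix of the gzip stream.
gzip_file.flush()
compressed = buffer.getvalue()

# A streaming decompressor recovers the chunk without waiting for close().
decompressor = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
assert decompressor.decompress(compressed) == chunk
```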

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

Footnotes

  1. https://github.com/python/cpython/blob/main/Lib/gzip.py#L26

Kludex (Owner) commented Nov 15, 2024

Can we add a test to prove your point?

vin (Author) commented Nov 15, 2024

I'll work on that today. I have a small repro case I'll share, but I need to turn it into a test.

vin (Author) commented Nov 16, 2024

I've added a test, but it's a bit complicated. Without the flush, the full contents of the response are still correct; to show that chunks are received iteratively rather than all at once, I use a wrapping middleware that asserts GZipMiddleware isn't sending empty message bodies, which is what it does without the flush.
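The wrapping-middleware idea can be sketched as a plain ASGI callable. The names here (BodySizeRecorder and friends) are hypothetical illustrations, not the actual test code:

```python
# Hypothetical sketch: an outer ASGI middleware that records the size of
# every http.response.body chunk the wrapped app (e.g. GZipMiddleware)
# emits, so a test can assert that none of the bodies are empty.
class BodySizeRecorder:
    def __init__(self, app, sizes):
        self.app = app
        self.sizes = sizes  # list shared with the test

    async def __call__(self, scope, receive, send):
        async def recording_send(message):
            if message["type"] == "http.response.body":
                self.sizes.append(len(message.get("body", b"")))
            await send(message)

        await self.app(scope, receive, recording_send)
```

A test would wrap the app under test, run a request through it, and assert on the recorded sizes.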

vin (Author) commented Nov 25, 2024

@Kludex could you take another look or recommend a good reviewer? Thanks!

Kludex (Owner) commented Dec 5, 2024

Not related.

vin (Author) commented Dec 13, 2024

Any concerns here, or how can we best move this forward?

Kludex (Owner) commented Dec 13, 2024

> Any concerns here, or how can we best move this forward?

The best way to move forward would be to present the problem first, with an MRE, and references to other issues where other people had the same problem.

I think the current behavior is intentional, so I need to gather more references before reviewing this.

vin (Author) commented Dec 16, 2024

Best observed in a browser, but here's a small repro:

```python
import time

from fastapi import FastAPI
from fastapi.middleware.gzip import GZipMiddleware
from fastapi.responses import StreamingResponse

app = FastAPI()
app.add_middleware(GZipMiddleware)

@app.get("/")
def streaming_response():
    def generate():
        for i in range(10):
            yield f'event: ping\ndata: {{"i": {i}}}\n\n'
            time.sleep(1)

    return StreamingResponse(generate(), media_type="text/event-stream")
```

Before this PR (first screenshot), all ten events appear together after 10 seconds; after it (second screenshot), a new event appears every second.

vin (Author) commented Dec 16, 2024

If the behavior is intentional, we really need to update the documentation here:
https://fastapi.tiangolo.com/advanced/middleware/#gzipmiddleware

The middleware will handle both standard and streaming responses.

to indicate that this will cause streaming responses to be buffered 32KiB at a time. This was certainly a surprising result for us, and one that caused our users to report that our app appeared broken when realtime status updates stopped working.

Kludex (Owner) commented Jan 23, 2025

This seems a valid PR.

It seems other web frameworks have the same issue. I'm confused as to how no one noticed... It's even hard to find references about people having issues with this. 🤔

Kludex (Owner) commented Jan 24, 2025

Okay. I looked around and, based on what you said, I understand we have a real problem with SSEs.

It seems we should either:

  • Ignore the middleware on SSE.
  • Flush the buffer on every message, otherwise we can't see the ping.

I'm not so sure about applying this to all streaming responses, though.


Reference to my future self: https://www.npmjs.com/package/compression#server-sent-events

vin added 2 commits January 24, 2025 12:55
vin (Author) commented Jan 24, 2025

I think having both GZip + SSE should be supportable. The most correct behavior is to send each message as soon as it becomes available, even if that does not achieve optimal compression. Still, enabling compression for streaming responses (including SSE) can be a reasonable choice: individual messages could theoretically exceed 32KiB and would benefit from compression just as non-streaming responses do.

It seems the current implementation (without the flush) favors optimization of the compression at the expense of timely delivery. Enabling the flush allows the developer to accept the suboptimal compression and get the expected timely delivery, or to choose to trade off timely delivery for potential compression improvements by implementing their own buffering.
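The compression-ratio cost of per-message flushing can be measured directly with zlib. An illustrative sketch, not middleware code:

```python
import zlib

# 100 small SSE-style messages, as in the repro above.
chunks = [f'event: ping\ndata: {{"i": {i}}}\n\n'.encode() for i in range(100)]

# One-shot: buffer everything and flush once at the end (current behavior).
one_shot = zlib.compressobj(wbits=31)  # wbits=31 -> gzip container
buffered = one_shot.compress(b"".join(chunks)) + one_shot.flush()

# Per-message: sync-flush after every chunk so each one is deliverable
# immediately, at the cost of extra flush overhead in the output.
streaming = zlib.compressobj(wbits=31)
per_chunk = b"".join(
    streaming.compress(c) + streaming.flush(zlib.Z_SYNC_FLUSH) for c in chunks
)
per_chunk += streaming.flush()

# Both decompress to the same payload; the per-chunk stream is somewhat
# larger, which is the price of timely delivery.
assert zlib.decompress(buffered, wbits=31) == zlib.decompress(per_chunk, wbits=31)
```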

vin (Author) commented Jan 25, 2025

Similar conversation happening at tuffnatty/zstd-asgi#5. I argue that the least-surprising behavior of the middleware is to deliver small messages immediately rather than holding them in a buffer for an indefinite amount of time. As a user of the middleware, I'd rather read "compression won't be as effective for small messages; buffer them into larger chunks if you wish to maximize compression" vs "small messages will be held in a buffer and delivered 32kB at a time; if you don't want this, don't use this middleware"

Kludex (Owner) commented Jan 25, 2025

Is there any web framework, in any language, that gets this right at the middleware level?

@tuffnatty

As I suggested in tuffnatty/zstd-asgi#5, why not allow both behaviours selecting the desired one with a parameter?

Kludex (Owner) commented Jan 25, 2025

> As I suggested in tuffnatty/zstd-asgi#5, why not allow both behaviours selecting the desired one with a parameter?

Because that seems to take responsibility away from our side and hand it to the developers. I'm still trying to understand the implications of flushing on every message. It seems like a lot.

But for SSE, we need to send the ping messages, so there's no need for the user to configure it.

vin (Author) commented Jan 27, 2025

I agree that adding a parameter should be avoided if possible; I'd rather the middleware figure out the correct behavior and implement it, transparently and consistently.

Do we have any evidence that the current implementation was even intentional? Read-after-write without flushing is a common, easy-to-make programming error. Calling await self.send(message) at line 100, when message["body"] is empty, seems questionable.

Kludex (Owner) commented Feb 14, 2025

I did some research.

It seems we have 2 options:

  1. flush on every chunk
  2. disable compression on text/event-stream

I'll investigate other frameworks.

Kludex (Owner) commented Feb 15, 2025

On tokio-rs/axum#2728, we can see:

> Since the lack of compression is a bit painful when using SSE with htmx (streaming html without compression can use way too much bandwidth) [...]

But it's not supported out of the box, user needs to manually flush.

Kludex (Owner) commented Feb 15, 2025

Conclusions:

  1. I don't think we should force a flush on every chunk. Based on the frameworks above, everybody seems to agree that we shouldn't compress small chunks that often.
  2. This middleware makes Server-Sent Events unusable. We don't offer any API for the user to flush, or to exempt a single endpoint (you could apply the middleware per Route/Router, though), but it's still surprising to hit this issue.
  3. There are cases where SSE can still send big chunks, just not big enough to fill the buffer. As mentioned here.

So... I don't think we should merge this PR; in fact, I think we should write a test to prevent this change in the future.

I think we should start by ignoring the middleware on text/event-stream content. As for point 3 of my conclusions, I don't have an answer yet. Happy to read more thoughts.
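The content-type exemption described here could look roughly like this: a hypothetical helper against the ASGI http.response.start message shape, not the actual patch:

```python
# Hypothetical helper: decide from an ASGI http.response.start message
# whether the response declares itself an SSE stream that should bypass
# compression. ASGI headers are lists of (bytes, bytes) pairs.
def is_event_stream(message: dict) -> bool:
    headers = dict(message.get("headers", []))
    content_type = headers.get(b"content-type", b"")
    # Strip any parameters such as "; charset=utf-8" before comparing.
    return content_type.split(b";")[0].strip() == b"text/event-stream"
```

The middleware could call this on the first response message and fall back to passing messages through untouched when it returns True.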

vin (Author) commented Feb 17, 2025

Thanks for doing the thorough research into the state of other frameworks!
I'm glad to see an alternative that will work for our specific use case. I personally still think the flush is best for the framework, since it gives users the most flexibility:

  1. With large non-streaming responses, it works optimally
  2. With streaming responses, chunks are delivered without unnecessary delay
  3. With streaming responses, it still compresses, though suboptimally
  4. A user can choose to optimize for compression at the expense of latency by adding their own buffering
  5. Without the flush, the framework has chosen to optimize for compression at the expense of latency, with no way of overriding besides disabling compression altogether
  6. The text/event-stream detection streamlines the choice from (5), but still at the expense of disabling compression entirely rather than accepting suboptimal compression. I think it also just makes the middleware more surprising: it behaves one way in some cases and another way in other cases.

All that said, I'm glad to see the framework will support our specific use case without us needing to apply handler-specific workarounds, once #2871 is in. Thanks again for your collaboration here.

Kludex (Owner) commented Feb 22, 2025

Thanks for the help. 🙏

Kludex closed this Feb 22, 2025