Detect blocking calls in coroutines using BlockBuster #2858

Open
cbornet wants to merge 2 commits into master from the blockbuster branch

@cbornet cbornet commented Feb 1, 2025

Summary

This PR uses the blockbuster library to detect blocking calls made in the asyncio event loop during unit tests.
Avoiding blocking calls is hard because they can be deeply buried in the code or made in third-party libraries.
BlockBuster makes them easier to detect by raising an exception when a known blocking function (e.g. time.sleep) is called.
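
To illustrate the idea, here is a minimal stdlib-only sketch of this kind of detection. It is not BlockBuster's implementation: the names `BlockingCallError`, `detect_blocking_sleep`, `bad_handler` and `good_handler` are hypothetical, and only `time.sleep` is patched.

```python
import asyncio
import time
from contextlib import contextmanager


class BlockingCallError(RuntimeError):
    """Raised when a guarded blocking function runs on an event loop thread."""


def _in_event_loop() -> bool:
    # get_running_loop() raises RuntimeError when no loop runs in this thread.
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


@contextmanager
def detect_blocking_sleep():
    """Temporarily replace time.sleep with a version that refuses to run in a loop."""
    original = time.sleep

    def guarded_sleep(seconds):
        if _in_event_loop():
            raise BlockingCallError("time.sleep called in the event loop")
        return original(seconds)

    time.sleep = guarded_sleep
    try:
        yield
    finally:
        time.sleep = original  # always restore the real function


async def bad_handler():
    time.sleep(0.01)  # blocks the loop: the guard raises here


async def good_handler():
    await asyncio.sleep(0.01)  # yields to the loop: allowed
    return "ok"
```

Inside `with detect_blocking_sleep():`, `asyncio.run(bad_handler())` raises `BlockingCallError`, while `asyncio.run(good_handler())` completes normally.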

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

@@ -438,7 +438,7 @@ async def write(self, data: bytes) -> None:
         if self.size is not None:
             self.size += len(data)

-        if self._in_memory:
+        if self._in_memory and self.file.tell() + len(data) <= self.file._max_size:
Author

The SpooledTemporaryFile rollover is blocking

@@ -288,7 +288,7 @@ async def receive() -> Message:
 await response_complete.wait()
 return {"type": "http.disconnect"}

-body = request.read()
+body = await anyio.to_thread.run_sync(request.read)
Author

httpx's request.read can be blocking (e.g. when reading a multipart file).
await request.aread() can't be used as-is, since the request stream is a SyncByteStream, not an AsyncByteStream.
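
As a rough illustration of the pattern in this change, the sketch below drains a synchronous byte stream in a worker thread so the event loop stays free. It swaps in the standard library's `asyncio.to_thread` (Python 3.9+) for the `anyio.to_thread.run_sync` call used in the PR, and `read_all` and `receive_body` are hypothetical stand-ins, not httpx or Starlette APIs.

```python
import asyncio
from typing import Iterable


def read_all(chunks: Iterable[bytes]) -> bytes:
    """Blocking stand-in for a synchronous read: drains the whole stream."""
    return b"".join(chunks)


async def receive_body(chunks: Iterable[bytes]) -> bytes:
    # The blocking read runs in a worker thread; the coroutine only awaits it.
    return await asyncio.to_thread(read_all, chunks)
```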

@@ -23,7 +23,7 @@ def test_templates(tmpdir: Path, test_client_factory: TestClientFactory) -> None
     with open(path, "w") as file:
         file.write("<html>Hello, <a href='{{ url_for('homepage') }}'>world</a></html>")

-    async def homepage(request: Request) -> Response:
+    def homepage(request: Request) -> Response:
Author

@cbornet cbornet Feb 1, 2025

Is it OK to change this?
templates.TemplateResponse is blocking.

Member

Why is templates.TemplateResponse blocking?

Author

It loads index.html from the file system using jinja2.FileSystemLoader.get_source(), which makes blocking calls.
The call path is:

test_templates.py:27: in homepage
    return templates.TemplateResponse(request, "index.html")
../starlette/templating.py:208: in TemplateResponse
    template = self.get_template(name)
../starlette/templating.py:131: in get_template
    return self.env.get_template(name)
../venv/lib/python3.11/site-packages/jinja2/environment.py:1016: in get_template
    return self._load_template(name, globals)
../venv/lib/python3.11/site-packages/jinja2/environment.py:975: in _load_template
    template = self.loader.load(self, name, self.make_globals(globals))
../venv/lib/python3.11/site-packages/jinja2/loaders.py:126: in load
    source, filename, uptodate = self.get_source(environment, name)
../venv/lib/python3.11/site-packages/jinja2/loaders.py:204: in get_source
    if os.path.isfile(filename):

os.path.isfile is a blocking call because it calls os.stat.
The get_source code also makes a blocking file.read() call later.
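
Changing the endpoint from `async def` to `def` helps because Starlette runs plain `def` endpoints in a worker thread instead of on the event loop. The sketch below shows that dispatch idea using only the standard library. `call_endpoint` and the two endpoints are hypothetical, and this is a simplified illustration rather than Starlette's actual routing code.

```python
import asyncio
import inspect
import threading


async def call_endpoint(endpoint, *args):
    """Await coroutine endpoints directly; push plain functions to a worker thread."""
    if inspect.iscoroutinefunction(endpoint):
        return await endpoint(*args)
    return await asyncio.to_thread(endpoint, *args)


async def async_endpoint() -> int:
    return threading.get_ident()  # runs on the event loop thread


def sync_endpoint() -> int:
    return threading.get_ident()  # runs on a worker thread


async def main():
    loop_thread = threading.get_ident()
    on_loop = await call_endpoint(async_endpoint)
    off_loop = await call_endpoint(sync_endpoint)
    return loop_thread, on_loop, off_loop
```

With this dispatch, blocking work inside `sync_endpoint` (such as loading a template from disk) no longer runs on the event loop thread.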

@pytest.fixture(autouse=True)
def blockbuster(request):
    with blockbuster_ctx("starlette") as bb:
        bb.functions["os.stat"].can_block_in("/mimetypes.py", "init")
Author

FileResponse's constructor calls mimetypes.guess_type, which blocks the first time it is called.
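
For context, the first call blocks because the `mimetypes` module initializes itself lazily, reading the system MIME type files if it has not done so yet. The sketch below shows one possible alternative to an exemption, which is not what this PR does: initialize the module once at startup. `guess_media_type` is a hypothetical helper.

```python
import mimetypes

# One-time initialization, done up front (for example at import or startup)
# instead of lazily on the first guess_type() call.
mimetypes.init()


def guess_media_type(path: str) -> str:
    """Return the guessed media type, falling back to text/plain."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type or "text/plain"
```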

@cbornet cbornet force-pushed the blockbuster branch 4 times, most recently from 09e27f0 to 8a12d86 Compare February 1, 2025 12:01
@cbornet cbornet closed this Feb 1, 2025
@cbornet cbornet reopened this Feb 1, 2025
@cbornet cbornet closed this Feb 1, 2025
@cbornet cbornet reopened this Feb 1, 2025
@cbornet cbornet closed this Feb 1, 2025
@cbornet cbornet reopened this Feb 1, 2025
@cbornet cbornet closed this Feb 1, 2025
@cbornet cbornet reopened this Feb 1, 2025
@cbornet cbornet force-pushed the blockbuster branch 6 times, most recently from 8a12d86 to d9da912 Compare February 1, 2025 13:30
Member

@Kludex Kludex left a comment

I'm not keen to add a dependency, but it seems like a cool library.

@@ -438,7 +439,7 @@ async def write(self, data: bytes) -> None:
         if self.size is not None:
             self.size += len(data)

-        if self._in_memory:
+        if self._in_memory and self.file.tell() + len(data) <= getattr(self.file, "_max_size", sys.maxsize):
Member

what's this about?

Author

If the data to write makes the file exceed the SpooledTemporaryFile _max_size, self.file.write will do a blocking rollover operation.
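
The rollover can be observed with a small standalone sketch. It assumes CPython's `tempfile.SpooledTemporaryFile`, whose `_max_size` and `_rolled` attributes are private implementation details. `fits_in_memory` is a hypothetical helper that mirrors the condition in the diff, and `demo` exists only for illustration.

```python
import sys
import tempfile


def fits_in_memory(file, data: bytes) -> bool:
    """True if writing `data` keeps the spooled file at or under its max size."""
    max_size = getattr(file, "_max_size", sys.maxsize)
    return file.tell() + len(data) <= max_size


def demo():
    """Write two chunks and record (fits_before_write, rolled_after_write)."""
    states = []
    with tempfile.SpooledTemporaryFile(max_size=8) as spooled:
        for chunk in (b"1234", b"567890"):
            fits = fits_in_memory(spooled, chunk)
            spooled.write(chunk)  # the second write exceeds 8 bytes and rolls over
            states.append((fits, spooled._rolled))
    return states
```

The first chunk stays in memory. The second one pushes the file past `max_size`, so the write triggers the rollover to a real file on disk, which is the blocking step the check tries to keep off the event loop.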

@@ -468,7 +468,7 @@ async def cancel_on_disconnect(

 # A timeout is set for 0.1 second in order to ensure that
 # we never deadlock the test run in an infinite loop
-with anyio.move_on_after(0.1):
+with anyio.move_on_after(0.2):
Member

Why?

Author

@cbornet cbornet Feb 1, 2025

BlockBuster adds a small performance overhead.
The CI test machines are already slow, so the cancellation sometimes arrives too late and the test becomes flaky (I had no issue on my computer, only in CI).

Co-authored-by: Marcelo Trylesinski <[email protected]>
@adriangb
Member

adriangb commented Feb 1, 2025

@cbornet I'm curious about the choice to call out file IO as blocking. In my experience file IO is tricky:

  1. The issue with blocking calls is not whether they are theoretically blocking; it is how long they block for. E.g. sorting a list is blocking work, but as long as it takes <1ms it's probably fine. Things get ugly once you get into multi-millisecond blocks, since IO protocols that expect a regular tick may start timing out, causing cascading failures.
  2. File IO can of course block for a long time (e.g. reading GBs of data in one go), but it can also be very fast (a small operation on an SSD). And it's hard to know ahead of time which one it will be.
  3. The overhead of starting up threads is non-negligible, and anyio's to_thread has a relatively low semaphore (40 last time I checked) which is easy to accidentally exhaust, possibly leading to deadlocks.

Because of this, my general approach has been to be conservative about chucking things into threads, especially file IO, since it's often not a problem in practice.
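
One way to act on the "how long does it block" question is to measure it. The sketch below is a hypothetical helper (`record_if_slow` is not part of Starlette, anyio or BlockBuster) that records only the calls exceeding a chosen threshold.

```python
import time
from contextlib import contextmanager


@contextmanager
def record_if_slow(label, threshold_s, slow_calls):
    """Append (label, elapsed_seconds) to slow_calls if the block exceeds threshold_s."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed > threshold_s:
            slow_calls.append((label, elapsed))
```

Wrapping a suspect call in `with record_if_slow("read", 0.001, slow_calls):` flags it only when it takes longer than a millisecond. asyncio's debug mode offers a related built-in check: it logs callbacks that run longer than `loop.slow_callback_duration` (100 ms by default).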

@cbornet
Author

cbornet commented Feb 1, 2025

@adriangb thanks for your feedback.
I agree that the impact of a blocking call depends a lot on how long it blocks. But how can you know in advance? Things can run perfectly well on your laptop or in CI with SSDs and fall apart when you deploy to AWS with slow EFS network disks (a bad recent personal experience).
I’ve seen that file operations are generally deferred to threads (or, on Linux, you can use aiofile, which has true async support).
And I see that Starlette already does this in a number of places (e.g. UploadFile.write).
Starting a thread has a cost, but I think anyio uses a thread pool?
Anyway, if you don’t want to use threads, it’s possible to:

  • set exemptions in the places detected by BlockBuster. This way, you still have a kind of warning that something could be better.
  • or completely disable BlockBuster's detection of blocking file IO.
