This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add documentation for cancellation of request processing #12761
Merged
5 commits
c09f777
Add documentation for cancellation of request processing
48d91cf
Move docs/cancellation.md to docs/development/synapse_architecture/
64db888
Fix typo (thanks anoa)
3a6e68e
Expand on the @cancellable decorator
482f4dd
Apply suggestions from code review
@@ -0,0 +1 @@
Add documentation for cancellation of request processing.
@@ -0,0 +1,387 @@
# Cancellation
Sometimes, requests take a long time to service and clients disconnect
before Synapse produces a response. To avoid wasting resources, Synapse
can cancel request processing for select endpoints with the `@cancellable`
decorator.

Review comment: Also could you document here whether
Reply: Good spot, let's fix that. I'll write some more words on the decorator.

Synapse makes use of Twisted's `Deferred.cancel()` feature to make
cancellation work.

## Enabling cancellation for an endpoint
1. Check that the endpoint method, and any `async` functions in its call
   tree, handle cancellation correctly. See
   [Handling cancellation correctly](#handling-cancellation-correctly)
   for a list of things to look out for.
2. Apply the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE`
   method. It's not recommended to make non-`GET` methods cancellable,
   since cancellation midway through some database updates is less
   likely to be handled correctly.

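As a rough illustration of step 2, the sketch below marks one handler as cancellable and leaves another unmarked. Everything in it is a hypothetical stand-in: the `cancellable` function here only sets an attribute and is not Synapse's decorator, and `RoomStateServlet` and `is_cancellable` are names invented for the example.

```python
from typing import Any, Callable, Dict, Tuple, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def cancellable(method: F) -> F:
    """Hypothetical stand-in: flag a handler as safe to cancel."""
    method.cancellable = True  # type: ignore[attr-defined]
    return method


def is_cancellable(method: Callable[..., Any]) -> bool:
    """Hypothetical helper: report whether a handler carries the flag."""
    return getattr(method, "cancellable", False)


class RoomStateServlet:  # hypothetical servlet, for illustration only
    @cancellable
    async def on_GET(self, room_id: str) -> Tuple[int, Dict[str, Any]]:
        # Read-only handler: a reasonable candidate for cancellation.
        return 200, {"room_id": room_id}

    async def on_PUT(self, room_id: str) -> Tuple[int, Dict[str, Any]]:
        # Writes state, so it is deliberately left unmarked.
        return 200, {}


servlet = RoomStateServlet()
get_is_cancellable = is_cancellable(servlet.on_GET)
put_is_cancellable = is_cancellable(servlet.on_PUT)
```

Only the `GET` handler is marked, following the recommendation above to avoid making non-`GET` methods cancellable.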
## Mechanics
There are two stages to cancellation: downward propagation of a
`cancel()` call, followed by upwards propagation of a `CancelledError`
out of a blocked `await`.

Both Twisted and asyncio have a cancellation mechanism.

|               | Method              | Exception                               | Exception inherits from |
|---------------|---------------------|-----------------------------------------|-------------------------|
| Twisted       | `Deferred.cancel()` | `twisted.internet.defer.CancelledError` | `Exception` (!)         |
| asyncio       | `Task.cancel()`     | `asyncio.CancelledError`                | `BaseException`         |

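The asyncio row of the table can be checked with the standard library alone. The sketch below assumes Python 3.8 or later, where `asyncio.CancelledError` derives from `BaseException`. It shows that a broad `except Exception` does not intercept asyncio's cancellation, which is the opposite of the Twisted behaviour flagged with "(!)" above. The names `handler`, `main` and `outcome` are made up for the example.

```python
import asyncio

# On Python 3.8+, asyncio's CancelledError sits outside the Exception hierarchy.
cancelled_is_exception = issubclass(asyncio.CancelledError, Exception)
cancelled_is_base_exception = issubclass(asyncio.CancelledError, BaseException)


async def handler() -> str:
    try:
        await asyncio.sleep(3600)
    except Exception:
        # Not reached on cancellation: asyncio.CancelledError is not an Exception.
        return "swallowed"
    return "finished"


async def main() -> str:
    task = asyncio.ensure_future(handler())
    await asyncio.sleep(0)  # let `handler` start and block on its `await`
    task.cancel()
    try:
        return await task
    except asyncio.CancelledError:
        return "cancelled"


outcome = asyncio.run(main())
```

With Twisted's `CancelledError`, the same `except Exception` block would catch the cancellation, which is why the patterns later in this document re-raise it explicitly.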
### Deferred.cancel()
When Synapse starts handling a request, it runs the async method
responsible for handling it using `defer.ensureDeferred`, which returns
a `Deferred`.

```python
def do_something() -> Deferred[None]:
    ...

async def on_GET() -> Tuple[int, JsonDict]:
    d = make_deferred_yieldable(do_something())
    await d
    return 200, {}

request = defer.ensureDeferred(on_GET())
```

During cancellation, `Deferred.cancel()` is called on the `Deferred`
from `defer.ensureDeferred`, `request`. Twisted knows which `Deferred`
`request` is waiting on and passes the `cancel()` call on to `d`.

The `Deferred` being waited on, `d`, may have its own handling for
`cancel()` and pass the call on to other `Deferred`s.

Eventually, a `Deferred` handles the `cancel()` call by resolving itself
with a `CancelledError`.

### CancelledError
The `CancelledError` gets raised out of the `await` and bubbles up, as
per normal Python exception handling.

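The two stages can be observed with asyncio, which follows the same downward-then-upward shape. This is an analogue using `Task.cancel()` rather than Twisted's `Deferred.cancel()`, and the function names `on_get` and `do_something` merely mirror the example above.

```python
import asyncio
from typing import Dict, List, Tuple

events: List[str] = []


async def do_something() -> None:
    try:
        await asyncio.sleep(3600)  # the innermost blocked `await`
    except asyncio.CancelledError:
        events.append("do_something saw CancelledError")
        raise


async def on_get() -> Tuple[int, Dict[str, str]]:
    try:
        await do_something()
    except asyncio.CancelledError:
        events.append("on_get saw CancelledError")
        raise
    return 200, {}


async def main() -> None:
    request = asyncio.ensure_future(on_get())
    await asyncio.sleep(0)  # let the request run until it blocks
    # Stage 1: `cancel()` travels down to whatever the request is waiting on.
    request.cancel()
    try:
        await request
    except asyncio.CancelledError:
        # Stage 2: the error has travelled back up through every `await`.
        events.append("request was cancelled")


asyncio.run(main())
```

The recorded order of `events` runs from the innermost function outwards, which is the upward propagation described above.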
## Handling cancellation correctly
In general, when writing code that might be subject to cancellation, two
things must be considered:
 * The effect of `CancelledError`s raised out of `await`s.
 * The effect of `Deferred`s being `cancel()`ed.

Examples of code that handles cancellation incorrectly include:
 * `try-except` blocks which swallow `CancelledError`s.
 * Code that shares the same `Deferred`, which may be cancelled, between
   multiple requests.
 * Code that starts some processing that's exempt from cancellation, but
   uses a logging context from cancellable code. The logging context
   will be finished upon cancellation, while the uncancelled processing
   is still using it.

Some common patterns are listed below in more detail.

### `async` function calls
Most functions in Synapse are relatively straightforward from a
cancellation standpoint: they don't do anything with `Deferred`s and
purely call and `await` other `async` functions.

An `async` function handles cancellation correctly if its own code
handles cancellation correctly and all the `async` functions it calls
handle cancellation correctly. For example:
```python
async def do_two_things() -> None:
    check_something()
    await do_something()
    await do_something_else()
```
`do_two_things` handles cancellation correctly if `do_something` and
`do_something_else` handle cancellation correctly.

That is, when checking whether a function handles cancellation
correctly, its implementation and all its `async` function calls need to
be checked, recursively.

As `check_something` is not `async`, it does not need to be checked.

### CancelledErrors
Because Twisted's `CancelledError`s are `Exception`s, it's easy to
accidentally catch and suppress them. Care must be taken to ensure that
`CancelledError`s are allowed to propagate upwards.

<table width="100%">
<tr>
<td width="50%" valign="top">

**Bad**:
```python
try:
    await do_something()
except Exception:
    # `CancelledError` gets swallowed here.
    logger.info(...)
```
</td>
<td width="50%" valign="top">

**Good**:
```python
try:
    await do_something()
except CancelledError:
    raise
except Exception:
    logger.info(...)
```
</td>
</tr>
<tr>
<td width="50%" valign="top">

**OK**:
```python
try:
    check_something()
    # A `CancelledError` won't ever be raised here.
except Exception:
    logger.info(...)
```
</td>
<td width="50%" valign="top">

**Good**:
```python
try:
    await do_something()
except ValueError:
    logger.info(...)
```
</td>
</tr>
</table>

#### defer.gatherResults
`defer.gatherResults` produces a `Deferred` which:
 * broadcasts `cancel()` calls to every `Deferred` being waited on.
 * wraps the first exception it sees in a `FirstError`.

Together, this means that `CancelledError`s will be wrapped in
a `FirstError` unless unwrapped. Such `FirstError`s are liable to be
swallowed, so they must be unwrapped.

<table width="100%">
<tr>
<td width="50%" valign="top">

**Bad**:
```python
async def do_something() -> None:
    await make_deferred_yieldable(
        defer.gatherResults([...], consumeErrors=True)
    )

try:
    await do_something()
except CancelledError:
    raise
except Exception:
    # `FirstError(CancelledError)` gets swallowed here.
    logger.info(...)
```

</td>
<td width="50%" valign="top">

**Good**:
```python
async def do_something() -> None:
    await make_deferred_yieldable(
        defer.gatherResults([...], consumeErrors=True)
    ).addErrback(unwrapFirstError)

try:
    await do_something()
except CancelledError:
    raise
except Exception:
    logger.info(...)
```
</td>
</tr>
</table>

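The broadcasting half of this behaviour has a close analogue in `asyncio.gather`, which can be tried without Twisted. The sketch below covers only that half: asyncio has no `FirstError` wrapper, so the unwrapping concern is specific to `defer.gatherResults`. The names `worker` and `cancelled_workers` are invented for the example.

```python
import asyncio
from typing import List


async def worker(name: str, cancelled: List[str]) -> None:
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        cancelled.append(name)  # record that this child was cancelled
        raise


async def main() -> List[str]:
    cancelled: List[str] = []
    gathered = asyncio.gather(worker("a", cancelled), worker("b", cancelled))
    await asyncio.sleep(0)  # let both workers start and block
    gathered.cancel()  # one `cancel()` call, broadcast to every child
    try:
        await gathered
    except asyncio.CancelledError:
        pass
    return sorted(cancelled)


cancelled_workers = asyncio.run(main())
```

A single `cancel()` on the gathered awaitable reaches both workers, which is why every `Deferred` passed to `defer.gatherResults` needs to tolerate cancellation.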
### Creation of `Deferred`s
If a function creates a `Deferred`, the effect of cancelling it must be
considered. `Deferred`s that get shared are likely to have unintended
behaviour when cancelled.

<table width="100%">
<tr>
<td width="50%" valign="top">

**Bad**:
```python
cache: Dict[str, Deferred[None]] = {}

def wait_for_room(room_id: str) -> Deferred[None]:
    deferred = cache.get(room_id)
    if deferred is None:
        deferred = Deferred()
        cache[room_id] = deferred
    # `deferred` can have multiple waiters.
    # All of them will observe a `CancelledError`
    # if any one of them is cancelled.
    return make_deferred_yieldable(deferred)

# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
<td width="50%" valign="top">

**Good**:
```python
cache: Dict[str, Deferred[None]] = {}

def wait_for_room(room_id: str) -> Deferred[None]:
    deferred = cache.get(room_id)
    if deferred is None:
        deferred = Deferred()
        cache[room_id] = deferred
    # `deferred` will never be cancelled now.
    # A `CancelledError` will still come out of
    # the `await`.
    # `delay_cancellation` may also be used.
    return make_deferred_yieldable(stop_cancellation(deferred))

# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
</tr>
<tr>
<td width="50%" valign="top">
</td>
<td width="50%" valign="top">

**Good**:
```python
cache: Dict[str, List[Deferred[None]]] = {}

def wait_for_room(room_id: str) -> Deferred[None]:
    if room_id not in cache:
        cache[room_id] = []
    # Each request gets its own `Deferred` to wait on.
    deferred = Deferred()
    cache[room_id].append(deferred)
    return make_deferred_yieldable(deferred)

# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
</tr>
</table>

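The shared-waiter hazard can be reproduced with asyncio futures. In this analogue, `asyncio.shield` plays a role loosely comparable to `stop_cancellation`: the cancelled request still sees a `CancelledError`, but the shared future is left alone. It is a sketch of the idea, not of Synapse's helper, and `waiter` and `run` are hypothetical names.

```python
import asyncio
from typing import List


async def waiter(shared: "asyncio.Future[None]", protect: bool) -> str:
    # With `protect`, cancelling this waiter does not cancel `shared`.
    await (asyncio.shield(shared) if protect else shared)
    return "ok"


async def run(protect: bool) -> List[str]:
    shared = asyncio.get_running_loop().create_future()
    request_1 = asyncio.ensure_future(waiter(shared, protect))
    request_2 = asyncio.ensure_future(waiter(shared, protect))
    await asyncio.sleep(0)  # let both requests block on `shared`
    request_1.cancel()  # only request 1 is meant to be cancelled
    await asyncio.sleep(0)
    if not shared.done():
        shared.set_result(None)  # the awaited event finally happens
    results = await asyncio.gather(request_1, request_2, return_exceptions=True)
    return [type(r).__name__ if isinstance(r, BaseException) else r for r in results]


unprotected = asyncio.run(run(protect=False))
protected = asyncio.run(run(protect=True))
```

Without protection, cancelling request 1 also takes down request 2. With the shield in place, request 2 completes normally.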
### Uncancelled processing
Some `async` functions may kick off some `async` processing which is
intentionally protected from cancellation, by `stop_cancellation` or
other means. If the `async` processing inherits the logcontext of the
request which initiated it, care must be taken to ensure that the
logcontext is not finished before the `async` processing completes.

<table width="100%">
<tr>
<td width="50%" valign="top">

**Bad**:
```python
cache: Optional[ObservableDeferred[None]] = None

async def do_something_else(
    to_resolve: Deferred[None]
) -> None:
    await ...
    logger.info("done!")
    to_resolve.callback(None)

async def do_something() -> None:
    global cache
    if not cache:
        to_resolve = Deferred()
        cache = ObservableDeferred(to_resolve)
        # `do_something_else` will never be cancelled and
        # can outlive the `request-1` logging context.
        run_in_background(do_something_else, to_resolve)

    await make_deferred_yieldable(cache.observe())

with LoggingContext("request-1"):
    await do_something()
```
</td>
<td width="50%" valign="top">

**Good**:
```python
cache: Optional[ObservableDeferred[None]] = None

async def do_something_else(
    to_resolve: Deferred[None]
) -> None:
    await ...
    logger.info("done!")
    to_resolve.callback(None)

async def do_something() -> None:
    global cache
    if not cache:
        to_resolve = Deferred()
        cache = ObservableDeferred(to_resolve)
        run_in_background(do_something_else, to_resolve)
        # We'll wait until `do_something_else` is
        # done before raising a `CancelledError`.
        await make_deferred_yieldable(
            delay_cancellation(cache.observe())
        )
    else:
        await make_deferred_yieldable(cache.observe())

with LoggingContext("request-1"):
    await do_something()
```
</td>
</tr>
<tr>
<td width="50%">

**OK**:
```python
cache: Optional[ObservableDeferred[None]] = None

async def do_something_else(
    to_resolve: Deferred[None]
) -> None:
    await ...
    logger.info("done!")
    to_resolve.callback(None)

async def do_something() -> None:
    global cache
    if not cache:
        to_resolve = Deferred()
        cache = ObservableDeferred(to_resolve)
        # `do_something_else` will get its own independent
        # logging context. `request-1` will not count any
        # metrics from `do_something_else`.
        run_as_background_process(
            "do_something_else",
            do_something_else,
            to_resolve,
        )

    await make_deferred_yieldable(cache.observe())

with LoggingContext("request-1"):
    await do_something()
```
</td>
<td width="50%">
</td>
</tr>
</table>
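
The idea behind delaying cancellation can be sketched with asyncio: the `CancelledError` is held back until the protected work has finished, so nothing outlives the request's context. This is an analogue under stated assumptions, not Synapse's `delay_cancellation`. The helper name `delay_cancellation_sketch` is hypothetical, and unlike a production helper it does not guard against a second `cancel()` arriving while it waits.

```python
import asyncio
from typing import Any, Awaitable, List


async def delay_cancellation_sketch(awaitable: Awaitable[Any]) -> Any:
    """Hypothetical helper: postpone a CancelledError until `awaitable` is done."""
    task = asyncio.ensure_future(awaitable)
    try:
        return await asyncio.shield(task)
    except asyncio.CancelledError:
        # Let the protected work finish first, then propagate the cancellation.
        # A second `cancel()` during this wait would interrupt it.
        await asyncio.wait([task])
        raise


async def main() -> List[str]:
    log: List[str] = []

    async def background() -> None:
        await asyncio.sleep(0.01)
        log.append("background done")

    async def request() -> None:
        try:
            await delay_cancellation_sketch(background())
        except asyncio.CancelledError:
            log.append("request cancelled")
            raise

    req = asyncio.ensure_future(request())
    await asyncio.sleep(0)  # let the request start the background work
    req.cancel()
    try:
        await req
    except asyncio.CancelledError:
        pass
    return log


log = asyncio.run(main())
```

The request still ends up cancelled, but only after the background work has logged its completion, which mirrors the ordering the **Good** example relies on.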
Review comment: Could you drop this file into a new `docs/development/synapse_architecture` folder in accordance with the documentation structure? Not that the current documentation files in there are a great example 😇 (#11274).
Reply: Fixed! I was led astray by the existing documentation files 😢