MSC3871: Gappy timelines #3871
125 changes: 125 additions & 0 deletions proposals/3871-gappy-timelines.md
@@ -0,0 +1,125 @@
# MSC3871: Gappy timelines

`/messages` returns a linearized version of the event DAG. From any given
homeserver's perspective of the room, the DAG can have gaps where it is missing
events. This could be because the homeserver hasn't fetched them yet, or because
it failed to fetch them: the homeservers that have them are unreachable and no
one else knows about the events.

Currently, there is an unwritten expectation (TODO: better word) between the
server and client that the server will always return all contiguous events in
that part of the timeline. But the server sometimes has to break this promise
(TODO: match word above) when it doesn't have an event and is unable to get it
from anyone else. This MSC aims to change the dynamic so the server can give the
client feedback and an indication of where the gaps are.

This way, clients know where they are missing events and can even retry fetching
by perhaps adding some UI to the timeline like "We failed to get some messages
in this gap, try again."

This can also make servers faster to respond to `/messages`. For example,
currently, Synapse always tries to backfill and fill in the gap (even when it
has enough messages locally to respond). In big rooms like `#matrix:matrix.org`
(Matrix HQ), almost every place you ask for has gaps in it (thousands of
backwards extremities) and lots of those events are unreachable. So we try the
same thing over and over, hoping the response will be different this time, but
instead we just make the `/messages` response slow. With this MSC, we can
be more intelligent about backfilling in the background and just tell
the client about the gap, which it can retry fetching a little later.


## Proposal

Add an `m.timeline.gap` indicator that can be used in the `chunk` list of events
from a `GET /_matrix/client/v3/rooms/{roomId}/messages` response. There can be
multiple gaps per response.


### `m.timeline.gap`

key | type | value | description | required
--- | --- | --- | --- | ---
`gap_start_event_id` | string | Event ID | The event ID that the homeserver is missing where the gap begins | yes
`pagination_token` | string | Pagination token | A pagination token that represents the spot in the DAG after the missing `gap_start_event_id`. Useful when retrying to fetch the missing part of the timeline again via `/messages?dir=b&from=<pagination_token>` | yes

Pagination tokens are positions between events. This is already an established
concept, but the following diagram illustrates it:
```
                                                     pagination_token
                                                     |
<oldest-in-time> [0]<--[1] <gap> [gap_start_event_id]▼<--[4]<--[5]<--[6] <newest-in-time>
```
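
As a rough client-side sketch of the retry described in the table above, the following builds the `/messages` request path from a gap indicator. The function name `retry_messages_path` and the default `limit` of 20 are assumptions made for illustration; only `dir=b` and `from=<pagination_token>` come from this proposal.

```python
from urllib.parse import quote, urlencode


def retry_messages_path(room_id: str, gap: dict, limit: int = 20) -> str:
    """Build a backwards-paginating /messages path starting at the gap.

    `gap` is an `m.timeline.gap` indicator as it appears in `chunk`.
    The function name and the default `limit` are illustrative assumptions.
    """
    token = gap["content"]["pagination_token"]
    # dir=b paginates backwards from the token, into the missing part.
    query = urlencode({"dir": "b", "from": token, "limit": limit})
    # Room IDs contain characters such as "!" and ":" that must be escaped.
    return f"/_matrix/client/v3/rooms/{quote(room_id, safe='')}/messages?{query}"
```

A client could call this when the user clicks a "try again" affordance on a gap, then splice the returned events into the timeline at that position.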

`m.timeline.gap` has a similar shape to a normal event, so it's still easy to
iterate over the `/messages` response and process it, but it has no `event_id`
itself, so it should not be mistaken for a real event in the room.

A full example of the `m.timeline.gap` indicator:

```json
{
  "type": "m.timeline.gap",
  "content": {
    "gap_start_event_id": "$12345",
    "pagination_token": "t47409-4357353_219380_26003_2265"
  }
}
```

`/messages` response example with a gap:

```json
{
  "chunk": [
    {
      "type": "m.room.message",
      "content": {
        "body": "foo"
      }
    },
    {
      "type": "m.timeline.gap",
      "content": {
        "gap_start_event_id": "$12345",
        "pagination_token": "t47409-4357353_219380_26003_2265"
      }
    },
    {
      "type": "m.room.message",
      "content": {
        "body": "baz"
      }
    }
  ]
}
```
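
To illustrate how a client might process such a response, here is a minimal sketch that separates real events from gap indicators while remembering where each gap sits. The helper names `is_gap` and `split_chunk` and the `before_index` field are hypothetical, not part of this proposal.

```python
from typing import Any

GAP_TYPE = "m.timeline.gap"  # identifier as written in this proposal


def is_gap(item: dict[str, Any]) -> bool:
    """A gap indicator has the gap type and, unlike a real event, no event_id."""
    return item.get("type") == GAP_TYPE and "event_id" not in item


def split_chunk(chunk: list[dict[str, Any]]):
    """Separate real events from gap indicators, preserving order.

    Returns (events, gaps). Each gap records the index in `events` that it
    precedes, plus the two fields from the indicator's content.
    """
    events = []
    gaps = []
    for item in chunk:
        if is_gap(item):
            content = item.get("content", {})
            gaps.append({
                "before_index": len(events),  # hypothetical bookkeeping field
                "gap_start_event_id": content.get("gap_start_event_id"),
                "pagination_token": content.get("pagination_token"),
            })
        else:
            events.append(item)
    return events, gaps
```

Keeping the gaps in a separate list means existing event-rendering code never sees an item without an `event_id`, while the timeline UI can still place a retry marker at each recorded position.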


## Potential issues

Lots of gaps/extremities are generated when a spam attack occurs and federation
falls behind. If clients start showing gaps with retry links, we might just be
exposing the spam more.


## Alternatives

As an alternative, we can continue to do nothing as we do today and not worry
about the occasional missing events. People seem not to notice any missing
messages anyway but they do probably see our slow `/messages` pagination.



## Security considerations

Only your own homeserver controls whether an `m.timeline.gap` indicator is added to the
`/messages` response, and it isn't an event in the room, so there shouldn't be any weird
edge case where the gap is trying to get you to fetch spam or something.


## Unstable prefix

The `m.timeline.gap` indicator can be used in the `org.matrix.msc3871` room version.
