Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Faster joins: fix rejected events becoming un-rejected during resync #13413

Merged
merged 3 commits into from
Aug 1, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/13413.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Faster room joins: fix a bug which caused rejected events to become un-rejected during state syncing.
29 changes: 26 additions & 3 deletions synapse/handlers/federation_event.py
Original file line number Diff line number Diff line change
Expand Up @@ -581,6 +581,13 @@ async def update_state_for_partial_state_event(
event.event_id,
)
return

# since the state at this event has changed, we should now re-evaluate
# whether it should have been rejected. We must already have all of the
# auth events (from last time we went round this path), so there is no
# need to pass the origin.
await self._check_event_auth(None, event, context)

Comment on lines +584 to +590
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we now reject previously unrejected events? If so, we might want to update the rejections table or add a todo to do so.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah; I'm working on that currently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this is #13006)

await self._store.update_state_for_partial_state_event(event, context)
self._state_storage_controller.notify_event_un_partial_stated(
event.event_id
Expand Down Expand Up @@ -1624,13 +1631,15 @@ async def prep(event: EventBase) -> None:
)

async def _check_event_auth(
self, origin: str, event: EventBase, context: EventContext
self, origin: Optional[str], event: EventBase, context: EventContext
) -> None:
"""
Checks whether an event should be rejected (for failing auth checks).

Args:
origin: The host the event originates from.
origin: The host the event originates from. This is used to fetch
any missing auth events. It can be set to None, but only if we are
sure that we already have all the auth events.
event: The event itself.
context:
The event context.
Expand Down Expand Up @@ -1876,7 +1885,7 @@ async def _check_for_soft_fail(
event.internal_metadata.soft_failed = True

async def _load_or_fetch_auth_events_for_event(
self, destination: str, event: EventBase
self, destination: Optional[str], event: EventBase
) -> Collection[EventBase]:
"""Fetch this event's auth_events, from database or remote

Expand All @@ -1892,12 +1901,19 @@ async def _load_or_fetch_auth_events_for_event(
Args:
destination: where to send the /event_auth request. Typically the server
that sent us `event` in the first place.

If this is None, no attempt is made to load any missing auth events:
rather, an AssertionError is raised if there are any missing events.

event: the event whose auth_events we want

Returns:
all of the events listed in `event.auth_events_ids`, after deduplication

Raises:
AssertionError if some auth events were missing and no `destination` was
supplied.

AuthError if we were unable to fetch the auth_events for any reason.
"""
event_auth_event_ids = set(event.auth_event_ids())
Expand All @@ -1909,6 +1925,13 @@ async def _load_or_fetch_auth_events_for_event(
)
if not missing_auth_event_ids:
return event_auth_events.values()
if destination is None:
# this shouldn't happen: destination must be set unless we know we have already
# persisted the auth events.
raise AssertionError(
"_load_or_fetch_auth_events_for_event() called with no destination for "
"an event with missing auth_events"
)

logger.info(
"Event %s refers to unknown auth events %s: fetching auth chain",
Expand Down
8 changes: 5 additions & 3 deletions synapse/storage/databases/main/state.py
Original file line number Diff line number Diff line change
Expand Up @@ -419,13 +419,15 @@ def _update_state_for_partial_state_event_txn(
# anything that was rejected should have the same state as its
# predecessor.
if context.rejected:
assert context.state_group == context.state_group_before_event
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote this in the git commit, but to reiterate, this was completely bogus. The value of EventContext._state_group on a rejected event is actually a dummy state group which includes the rejected event (since the state group is created before we know if the event will be rejected or not).

EventContext.state_group is a getter which has an assertion to check you aren't shooting yourself in the foot, but the right thing to do here is not to access it for rejected events.

state_group = context.state_group_before_event
else:
state_group = context.state_group

self.db_pool.simple_update_txn(
txn,
table="event_to_state_groups",
keyvalues={"event_id": event.event_id},
updatevalues={"state_group": context.state_group},
updatevalues={"state_group": state_group},
)

self.db_pool.simple_delete_one_txn(
Expand All @@ -440,7 +442,7 @@ def _update_state_for_partial_state_event_txn(
txn.call_after(
self._get_state_group_for_event.prefill,
(event.event_id,),
context.state_group,
state_group,
)


Expand Down