Skip to content

Commit

Permalink
Fix infinite loop in partial-state resync (matrix-org#13353)
Browse files Browse the repository at this point in the history
Make sure that we only pull out events from the db once they have no
prev-events with partial state.
  • Loading branch information
richvdh authored and azmeuk committed Aug 8, 2022
1 parent 797e898 commit bc309be
Show file tree
Hide file tree
Showing 3 changed files with 27 additions and 8 deletions.
1 change: 1 addition & 0 deletions changelog.d/13353.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix a bug in the experimental faster-room-joins support which could cause it to get stuck in an infinite loop.
14 changes: 7 additions & 7 deletions synapse/handlers/federation_event.py
Original file line number Diff line number Diff line change
Expand Up @@ -569,15 +569,15 @@ async def update_state_for_partial_state_event(

if context is None or context.partial_state:
# this can happen if some or all of the event's prev_events still have
# partial state - ie, an event has an earlier stream_ordering than one
# or more of its prev_events, so we de-partial-state it before its
# prev_events.
# partial state. We were careful to only pick events from the db without
# partial-state prev events, so that implies that a prev event has
# been persisted (with partial state) since we did the query.
#
# TODO(faster_joins): we probably need to be more intelligent, and
# exclude partial-state prev_events from consideration
# https://github.com/matrix-org/synapse/issues/13001
# So, let's just ignore `event` for now; when we re-run the db query
# we should instead get its partial-state prev event, which we will
# de-partial-state, and then come back to event.
logger.warning(
"%s still has partial state: can't de-partial-state it yet",
"%s still has prev_events with partial state: can't de-partial-state it yet",
event.event_id,
)
return
Expand Down
20 changes: 19 additions & 1 deletion synapse/storage/databases/main/events_worker.py
Original file line number Diff line number Diff line change
Expand Up @@ -2110,11 +2110,29 @@ async def get_partial_state_events_batch(self, room_id: str) -> List[str]:
def _get_partial_state_events_batch_txn(
txn: LoggingTransaction, room_id: str
) -> List[str]:
# we want to work through the events from oldest to newest, so
# we only want events whose prev_events do *not* have partial state - hence
# the 'NOT EXISTS' clause in the below.
#
# This is necessary because ordering by stream ordering isn't quite enough
# to ensure that we work from oldest to newest event (in particular,
# if an event is initially persisted as an outlier and later de-outliered,
# it can end up with a lower stream_ordering than its prev_events).
#
# Typically this means we'll only return one event per batch, but that's
# hard to do much about.
#
# See also: https://github.com/matrix-org/synapse/issues/13001
txn.execute(
"""
SELECT event_id FROM partial_state_events AS pse
JOIN events USING (event_id)
WHERE pse.room_id = ?
WHERE pse.room_id = ? AND
NOT EXISTS(
SELECT 1 FROM event_edges AS ee
JOIN partial_state_events AS prev_pse ON (prev_pse.event_id=ee.prev_event_id)
WHERE ee.event_id=pse.event_id
)
ORDER BY events.stream_ordering
LIMIT 100
""",
Expand Down

0 comments on commit bc309be

Please sign in to comment.