Race condition with replication means `/messages` backfill lacks read-after-write consistency between workers #14211

MadLittleMods · 2022-10-17T20:28:50Z

We have have a problem where because events are persisted in a queue in a client_reader worker, there is no guarantee that they are available to read on other workers. So when we fire off a backfill request from /messages, those backfilled messages aren't necessarily available to paginate with after the backfill completes (even on the worker that put them in the persister queue).

CI failure: https://github.com/matrix-org/synapse/actions/runs/3182998161/jobs/5189731097#step:6:15343 (from discussion). This specific CI flake was addressed in matrix-org/complement#492

WORKERS=1 POSTGRES=1 COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestJumpToDateEndpoint/parallel/federation/can_paginate_after_getting_remote_event_from_timestamp_to_event_endpoint

Here is what happens:

serverB has event1 stored as an outlier from previous requests (specifically from MSC3030 jump to date pulling in a missing prev_event after backfilling)
Client on serverB calls /messages?dir=b
serverB:client_reader1 accepts the request and drives things
serverB:client_reader1 has some backward extremities in range and requests /backfill from serverA
serverB:client_reader1 processes the events from backfill including event1 and puts them in the _event_persist_queue
serverB:master picks up the events from the _event_persist_queue and persists them to the database, de-outliers event1 and invalidates its own cache and sends them over replication
serverB:client_reader1 starts assembling the /messages response and gets event1 out of the stale cache still as an outlier
serverB:client_reader1 responds to the /messages request without event1 because outliers are filtered out
serverB:client_reader1 finally gets the replication data and invalidates its own cache for event1 (too late, we already got the events from the stale cache and responded)

In a nutshell, we've written the test expecting "read-after-write consistency" but we don't have that.

-- #13185 (comment)

It's exactly this but it really sucks that calling /messages doesn't include events we just backfilled for that request. This is a general problem with Synapse though, see issues labeled with Z-Read-After-Write A lack of read-after-write consistency, usually due to cache invalidation races with workers . In this case, it's all within the same /messages request so it's a little more insidious.

Having this be possible makes it even more of a reason that we should indicate gaps in /messages, MSC3871

The text was updated successfully, but these errors were encountered:

reivilibre · 2022-12-12T21:21:55Z

It seems like we need some way for the event persister (stage 6) to signal to the client reader that the events it put in the queue have been persisted (or is the problem that we don't invalidate the outlier state of event1?).

The /messages request can block until the appropriate events have been persisted (and any relevant caches should be invalidated at this point, too).

Since this is all happening within a single /messages request it seems like it should be within our capabilities to fix (and in my opinion we really should).

Potential suggestion without too much investigation: Perhaps it may be useful to snoop on the events stream as the signalling mechanism?

MadLittleMods added A-Workers Problems related to running Synapse in Worker Mode (or replication) A-Testing Issues related to testing in complement, synapse, etc Z-Read-After-Write A lack of read-after-write consistency, usually due to cache invalidation races with workers labels Oct 17, 2022

This was referenced Oct 17, 2022

Race condition where a second join races with cache invalidation over replication and erroneously rejects joining Restricted Room #13185

Open

De-flake MSC3030 backfill test with Synapse worker replication matrix-org/complement#492

Merged

DMRobertson added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label Oct 18, 2022

MatMaul added S-Tolerable Minor significance, cosmetic issues, low or no impact to users. O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Oct 20, 2022

matrixbot mentioned this issue Dec 21, 2023

Race condition with replication means /messages backfill lacks read-after-write consistency between workers element-hq/synapse#14211

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race condition with replication means `/messages` backfill lacks read-after-write consistency between workers #14211

Race condition with replication means `/messages` backfill lacks read-after-write consistency between workers #14211

MadLittleMods commented Oct 17, 2022 •

edited

Loading

reivilibre commented Dec 12, 2022

Race condition with replication means /messages backfill lacks read-after-write consistency between workers #14211

Race condition with replication means /messages backfill lacks read-after-write consistency between workers #14211

Comments

MadLittleMods commented Oct 17, 2022 • edited Loading

reivilibre commented Dec 12, 2022

Race condition with replication means `/messages` backfill lacks read-after-write consistency between workers #14211

Race condition with replication means `/messages` backfill lacks read-after-write consistency between workers #14211

MadLittleMods commented Oct 17, 2022 •

edited

Loading