-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalidate the global cache after a redaction #296
Conversation
It is not the creation content (i.e. the content of the m.room.create event).
63c56da
to
c1cc0eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very sane.
WHERE (event_type IN ('m.room.name', 'm.room.avatar', 'm.room.canonical_alias') AND state_key = '') | ||
OR (event_type = 'm.room.member' AND membership IN ('join', '_join', 'invite', '_invite')) | ||
ORDER BY event_nid ASC | ||
;`, metadata.RoomID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does this take on m.org?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some testing on Matrix HQ---let me rerun and get you some numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to notes this was ~200ms for Matrix HQ. Running it now is ~270ms.
postgres=> \set room_id !OGEhHVWSdvArJzumhm:matrix.org \set snapshot_id 682264 \\
postgres=>
postgres=> EXPLAIN ANALYZE
postgres-> WITH snapshot(events, membership_events) AS (
postgres(> SELECT events, membership_events
postgres(> FROM syncv3_snapshots
postgres(> JOIN syncv3_rooms ON snapshot_id = current_snapshot_id
postgres(> WHERE syncv3_rooms.room_id = :'room_id'
postgres(> )
postgres-> SELECT event_id, event_type, state_key, event, membership
postgres-> FROM syncv3_events JOIN snapshot ON (
postgres(> event_nid = ANY (ARRAY_CAT(events, membership_events))
postgres(> )
postgres-> WHERE (event_type IN ('m.room.name', 'm.room.avatar', 'm.room.canonical_alias') AND state_key = '')
postgres-> OR (event_type = 'm.room.member' AND membership IN ('join', '_join', 'invite', '_invite'))
postgres-> ORDER BY event_nid ASC
postgres-> ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
Sort (cost=24.28..24.28 rows=2 width=501) (actual time=246.204..258.618 rows=48046 loops=1)
Sort Key: syncv3_events.event_nid
Sort Method: external merge Disk: 19976kB
-> Nested Loop (cost=1.15..24.27 rows=2 width=501) (actual time=17.091..200.713 rows=48046 loops=1)
-> Nested Loop (cost=0.71..16.75 rows=1 width=271) (actual time=0.032..0.034 rows=1 loops=1)
-> Index Scan using syncv3_rooms_pkey on syncv3_rooms (cost=0.29..8.30 rows=1 width=8) (actual time=0.020..0.021 rows=1 loops
=1)
Index Cond: (room_id = '!OGEhHVWSdvArJzumhm:matrix.org'::text)
-> Index Scan using syncv3_snapshots_snapshot_id_room_id_key on syncv3_snapshots (cost=0.42..8.44 rows=1 width=279) (actual t
ime=0.010..0.010 rows=1 loops=1)
Index Cond: (snapshot_id = syncv3_rooms.current_snapshot_id)
-> Index Scan using syncv3_events_pkey on syncv3_events (cost=0.44..7.50 rows=2 width=501) (actual time=15.542..192.561 rows=48046
loops=1)
Index Cond: (event_nid = ANY (array_cat(syncv3_snapshots.events, syncv3_snapshots.membership_events)))
Filter: (((event_type = ANY ('{m.room.name,m.room.avatar,m.room.canonical_alias}'::text[])) AND (state_key = ''::text)) OR ((ev
ent_type = 'm.room.member'::text) AND (membership = ANY ('{join,_join,invite,_invite}'::text[]))))
Rows Removed by Filter: 53701
Planning Time: 2.699 ms
Execution Time: 268.729 ms
(15 rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eyeballing it, the ORDER BY adds ~70ms, which surprised me! (I want to process the Nids in ascending order to pick out the heroes.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it might be quicker to do the sorting in Go? Suggest this is good enough to leave as-is for now though.
(Also I'm surprised that we use the 6 newest heroes, not the 6 oldest)
} | ||
|
||
logger.Warn().Any("before", metadata).Msgf("DMR: invalidate GLOBALLLLLL room %s", roomID) | ||
err := c.store.ResetMetadataState(metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably needs to be written back to c.roomIDToMetadata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here metadata
is a pointer and I'm updating *metadata
in-place. We have a lock on roomIDToMetadataMu in the caller so this might be safe. I wasn't completely sure though, see e.g.
Lines 311 to 312 in 1b14ac3
// TODO: could be brittle. Might be better to just create a new RoomMetadata from scratch | |
// and copy over the fields we don't want to reload. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From discussion elsewhere:
it should be safe so long as you're doing that from the subscriber goroutine, and not per-conn goroutines
return fmt.Errorf("ResetMetadataState[%s]: %w", metadata.RoomID, err) | ||
} | ||
|
||
heroMemberships := circularSlice[*Event]{max: 6} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: when populating the heroes I only care about the 6 most recent heroes. However to get accurate join and invite counts we effectively need to pull all the members out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're only worrying about redactions then there aren't gonna be any count changes. But as #294 says there's the more general problem of "our state is completely out of date".
Happens a lot of the time, e.g. if we know that a non-proxy user has joined the room.
SELECT COUNT(*) | ||
FROM syncv3_events | ||
JOIN syncv3_snapshots ON event_nid = ANY (ARRAY_CAT(events, membership_events)) | ||
WHERE snapshot_id = $1 AND event_id = ANY($2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity checked the query plan as follows:
\set room_id !OGEhHVWSdvArJzumhm:matrix.org \\
\set snapshotID 756886 \\
\set event1 $liLBpCuSUVPqPVsff3jCFT9VN9dW551TiDFx5QE8MmI \\
\set event2 $ARQ3jrwiMsX1pLeDIjvOW_ga4CucgAnghrzHWwlULS0 \\
EXPLAIN ANALYZE SELECT COUNT(*)
FROM syncv3_events
JOIN syncv3_snapshots ON event_nid = ANY (ARRAY_CAT(events, membership_events))
WHERE snapshot_id = :snapshotID AND event_id = ANY (ARRAY[:'event1', :'event2'])
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
---------------------------
Aggregate (cost=25.64..25.65 rows=1 width=8) (actual time=3.044..3.045 rows=1 loops=1)
-> Nested Loop (cost=0.99..25.64 rows=1 width=0) (actual time=1.495..3.040 rows=2 loops=1)
Join Filter: (syncv3_events.event_nid = ANY (array_cat(syncv3_snapshots.events, syncv3_snapshots.membership_events)))
-> Index Scan using syncv3_snapshots_snapshot_id_room_id_key on syncv3_snapshots (cost=0.42..8.44 rows=1 width=271) (actual time=0.
024..0.025 rows=1 loops=1)
Index Cond: (snapshot_id = 756886)
-> Index Scan using syncv3_events_event_id_key on syncv3_events (cost=0.56..17.15 rows=2 width=8) (actual time=0.024..0.038 rows=2
loops=1)
Index Cond: (event_id = ANY ('{$liLBpCuSUVPqPVsff3jCFT9VN9dW551TiDFx5QE8MmI,$ARQ3jrwiMsX1pLeDIjvOW_ga4CucgAnghrzHWwlULS0}'::tex
t[]))
Planning Time: 0.577 ms
Execution Time: 3.073 ms
(9 rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent
Part of #294.
TODO: