-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Filter out auth chain queries that don't exist #16552
Changes from 5 commits
71d8645
98f253e
76e2aca
5791fa2
08c27f0
21c44fc
94e22c2
3eaeaff
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Reduce a little database load while processing state auth chains. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -299,9 +299,13 @@ def _get_auth_chain_ids_using_cover_index_txn( | |||||
) | ||||||
|
||||||
# Add the initial set of chains, excluding the sequence corresponding to | ||||||
# initial event. | ||||||
# initial event. Ultimately, the chains dict will be what is pulled from the | ||||||
# database, there is a chance that a sequence number here will end up being a | ||||||
# '0', which doesn't exist. Don't bother sending that to the database query. | ||||||
for chain_id, seq_no in event_chains.items(): | ||||||
chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0)) | ||||||
max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we understand when max_sequence_result is 0? I think I've convinced myself that sequence numbers are strictly positive, and chains is a map to sequence numbers. Therefore we would need to have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess this corresponds to a brand-new chain that isn't the target of some other chain. Which sounds like the kind of thing that only happens at the start of a room? (cc @erikjohnston: is any of this sane?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is basically handling the case where an initial event is the first thing in the chain (and nothing else references it). This is actually pretty common, as we start a new chain whenever we see a new type / state key pair. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd think I'd replace this change with something like: for chain_id, seq_no in event_chains.items():
# Check if the initial event is the first item in the chain. If so, then
# there is nothing new to add from this chain.
if seq_no == 1:
continue
chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0)) Maybe? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if that is clearer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Works for me There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replaced logic in 94e22c2 Would you like me to lose that comment above the condition? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks! Yeah, I think that comment is now redundant There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Removed in 3eaeaff |
||||||
if max_sequence_result > 0: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if comparing to != 0 would be clearer?
Suggested change
|
||||||
chains[chain_id] = max_sequence_result | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With your change, is it possible that |
||||||
|
||||||
# Now for each chain we figure out the maximum sequence number reachable | ||||||
# from *any* event ID. Events with a sequence less than that are in the | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that chain IDs are never 0. They come from a DB-backed sequence:
synapse/synapse/storage/databases/main/events_worker.py
Lines 293 to 300 in 12ca87f
synapse/synapse/storage/schema/main/delta/59/04_event_auth_chains.sql.postgres
Line 16 in 25f43fa
which per https://www.postgresql.org/docs/16/sql-createsequence.html#id-1.9.3.81.6 should start at 1 and increase by 1.
Therefore querying for events in chain 0 will never return anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, your point is that the sequence_number in the chains is never 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like these are also seeded at 1 and grow by 1:
synapse/synapse/storage/databases/main/events.py
Lines 941 to 950 in e9069c9