-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sliding sync: various fixups to the background update #17652
Conversation
@@ -1978,7 +1978,9 @@ def _get_sliding_sync_insert_values_from_state_map( | |||
elif state_key == (EventTypes.Name, ""): | |||
room_name = event.content.get(EventContentFields.ROOM_NAME) | |||
# Scrutinize JSON values | |||
if room_name is None or isinstance(room_name, str): | |||
if room_name is None or ( | |||
isinstance(room_name, str) and "\0" not in room_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care to check this? As far as I can tell, it's valid JSON. Is this a Matrix spec thing?
Perhaps it's because we have to mix with a system that is sensitive to null-terminated strings (C strings)? Doing some quick searching, it seems like Postgres does not allow null bytes in TEXT
fields but it is allowed in SQLite.
The only other place we do this is in
synapse/synapse/storage/databases/main/stats.py
Lines 268 to 288 in 391c4f8
# Ensure that the values to update are valid, they should be strings and | |
# not contain any null bytes. | |
# | |
# Invalid data gets overwritten with null. | |
# | |
# Note that a missing value should not be overwritten (it keeps the | |
# previous value). | |
sentinel = object() | |
for col in ( | |
"join_rules", | |
"history_visibility", | |
"encryption", | |
"name", | |
"topic", | |
"avatar", | |
"canonical_alias", | |
"guest_access", | |
"room_type", | |
): | |
field = fields.get(col, sentinel) | |
if field is not sentinel and (not isinstance(field, str) or "\0" in field): |
Why aren't we checking other content values like we do in the stats code?
Why aren't we checking this in the events
code where we also insert these state values into the database? Do we disallow this with some validation somewhere to prevent this with new data?
We should explain the reason why we're doing this. I assume it's the Postgres reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh! Sorry yeah, postgres TEXT
fields can't have \0
bytes in them (nyargghghgh why). Will update with comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why aren't we checking other content values like we do in the stats code?
^
Why aren't we checking this in the
events
code where we also insert these state values into the database? Do we disallow this with some validation somewhere to prevent this with new data?
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, I forgot that room_type
is also a thing. The rest I think are fine, e.g. tombstone room ID should be a valid room ID (but we may as well check that too).
AND (c.membership != ? OR c.user_id != e.sender) | ||
ORDER BY c.room_id ASC, c.user_id ASC | ||
LIMIT ? | ||
""", | ||
(last_room_id, last_user_id, batch_size), | ||
(last_room_id, last_user_id, Membership.LEAVE, batch_size), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to exclude leave memberships. It's just extra things to think about.
We want the table to be complete in the end so that means we will need to add another background update to fill in the leave
events. If we're going to split the background update, we should just do that in this PR so we don't need to remove those assertions from the tests that will probably get lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit in tow minds about this. The problem is that this is a surprisingly huge amount of data that will just never be read. On the other hand, it feels inconsistent to not port them over but to keep future old rows.
I wonder if the right thing to do here maybe is to skip these rows, but add a background job that clears out old left rooms from the table? Which is possible since we can response with M_UNKNOWN_POS
if we get a sliding sync request that has a pos from before we purged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to accept the forever background update to clean-up leaves and keep the size of the database table down, that can work 👍
I assume we want some grace period for left rooms (a day)? That way people don't immediately get a M_UNKNOWN_POS
a soon as they leave a room and allows other clients to catch-up gracefully if the room is left on another client.
Is it really worth this complexity though? We're not storing every membership ever, just the latest membership of a given user for a given room.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that we're not going to add a include_leave
option like Sync v2 has?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we want some grace period for left rooms (a day)?
Yeah, probably something like a week or something, maybe longer.
Is it really worth this complexity though? We're not storing every membership ever, just the latest membership of a given user fro a given room.
It's kinda sucky to keep around things forever. It's fine when it's not causing issues, but it'll take a really long time for the background updates to run for data that we currently don't use.
Are we sure that we're not going to add a
include_leave
option like Sync v2 has?
That's a fair question, I guess it would be good to know if any clients actually use that option.
I guess a potential half-way house is for us to not port over the metadata for left rooms for now? Which also seems wrong but will be a lot quicker and have a chance to actually complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some discussion in an internal room (light discussion with people on both sides)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove this patch and open a separate PR are you happy with the rest? We probably want the actual bug fixes to go into the RC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 🙂 ("Ignore leave events for bg updates" moved to another PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c.f. #17699
e398290
to
1679ba0
Compare
Co-authored-by: Eric Eastwood <[email protected]>
Co-authored-by: Eric Eastwood <[email protected]>
a1d3838
to
06173a3
Compare
Follow-up to #17641, #17634, #17631 and #17632 to fix-up #17512
A few things are going on here.
forgotten
state for those memberships, but I don't actually think we need to. If a row already exists for that room/user then it was inserted as a new one by the event persist path, in which case it is by definition not forgotten, so we don't need to fix that path up.Ignore leave events for bg updatesReviewable commit-by-commit.