-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Conversation
2bb6b70
to
b43fc5e
Compare
synapse/replication/tcp/handler.py
Outdated
# If the position token matches our current token then we're up to date | ||
# and there's nothing to do. Otherwise, fetch all updates between then | ||
# and now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# If the position token matches our current token then we're up to date | |
# and there's nothing to do. Otherwise, fetch all updates between then | |
# and now. | |
# If the incoming position is at least our current position then we're | |
# up to date and there's nothing to do. Otherwise, fetch all updates | |
# between then and now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example: I think the stream is at position 10.
I receive POSITION 4 8
.
Uh oh, I might have missed stream entries 5, 6, 7 and 8. Need to rescan for them.
(Here prev < next < current.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no. POSITION 4 8
means that the new token for that instance should be set to 8, and nothing has happened between 4 and 8 from that instance. (The reason for this is that normally we'd just get a stream of RDATA
from the instance, which implicitly bumps the token. If for some reason the token has advanced without any updates to send, it sends a POSITION
, hence why it gives the range where things have not happened).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# If the position token matches our current token then we're up to date | |
# and there's nothing to do. Otherwise, fetch all updates between then | |
# and now. | |
# If the incoming previous position is less than our current position | |
# then we're up to date and there's nothing to do. Otherwise, fetch | |
# all updates between then and now. |
Maybe?
synapse/replication/tcp/handler.py
Outdated
# Note: We have to check that `current_token` is within the range, to | ||
# handle the case where the stream gets "reset" (e.g. for `caches` and | ||
# `typing` after the writer's restart). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Note: We have to check that `current_token` is within the range, to | |
# handle the case where the stream gets "reset" (e.g. for `caches` and | |
# `typing` after the writer's restart). | |
# Note: We also have to check that `current_token` is at least the | |
# previous position, to handle the case where the stream gets "reset" | |
# (e.g. for `caches` and `typing` after the writer's restart). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example: I think the stream is at position 10.
I receive POSITION 12 15
.
Uh oh, I might have missed stream entries 11, 12, 13, 14, and 15. Need to rescan for them.
(Here current < prev < next.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh oh, I might have missed stream entries 11, 12, 13, 14, and 15. Need to rescan for them.
You might have missed entries 11 and 12.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does POSITION
need some better documentation somewhere (either https://matrix-org.github.io/synapse/develop/development/synapse_architecture/streams.html or maybe in the code?) describing how it works? I think I was also under the impression that it told you were it had updates, not where it jumped to without updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://matrix-org.github.io/synapse/develop/tcp_replication.html#position-s probably needs to be updated too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not blocking this PR, but would appreciate a follow-up or at least an issue.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#16560 since its a Friday afternoon and I'm liable to forget to follow up otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Note: We have to check that `current_token` is within the range, to | |
# handle the case where the stream gets "reset" (e.g. for `caches` and | |
# `typing` after the writer's restart). | |
# Note: We also have to check that `current_token` is at least the | |
# new position, to handle the case where the stream gets "reset" | |
# (e.g. for `caches` and `typing` after the writer's restart). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is my understanding correct? LGTM if so. Wording tweaks are just suggestions, ignore at your discretion.
# Note: We have to check that `current_token` is within the range, to | ||
# handle the case where the stream gets "reset" (e.g. for `caches` and | ||
# `typing` after the writer's restart). | ||
missing_updates = not (cmd.prev_token <= current_token <= cmd.new_token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check: are the inequalities correct here? E.g. should one of them be <
instead of <=
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point that it looks weird. But I think they're right as we're comparing "upper bounds"
Co-authored-by: Patrick Cloke <[email protected]>
The
POSITION
command includes two tokens, the "prev" and "new" tokens. The "new" token is what we should set the current token for the instance to be, and the "prev" token was where it was previously (i.e. the instance produced no rows between the "prev" and "new" tokens).This means that we can safely relax the condition to only check for missing updates if the current token for the writer is strictly less than the "prev" token.
A common case is receiving something like
POSITIONS event_persister1 events 15 16
, which indicates the writer advanced their events stream token from 15 to 16 (usually as a response to another instance persisting an event with stream ID 16). Usually, the other instances would already have inferred thatevent_persister1
must have advanced their token, so thecurrent_token
would already be set to 16, thus triggering a lookup for missing updates between tokens 16 (cmd.new_token
) and 16 (current_token
), which is silly.We also move the logging back into the loop so that we don't log in the common case.