This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Synapse instances sometimes run into problems with the entire incoming federation process because of a single broken room.
For example, this happened on our ru-matrix.org homeserver: because of one broken event in a room (the problem is described in #10589), incoming federation was very slow for years, with delays of many hours for most incoming messages from popular homeservers. E2EE (the key exchange timed out most of the time), calls, and everything else that requires robust federation were completely broken too! The admins also spent a lot of time tracking down the source of the problem, because there were no ERROR-level logs (#10597).
This is only one example; there may be many different situations in which one room becomes broken and this breaks federation for all other rooms too!
To prevent this, it would be good to implement a workaround in Synapse (and maybe in the spec too) that skips the broken room and continues syncing federated data for the other, non-broken rooms.
And as of v1.38.0 we do the majority of event processing in the background to ensure that the /send request returns quickly (cf. #10284).
I'm not that surprised there is a bug somewhere in the logic, but we do already have the infrastructure in place to stop one broken room from breaking inbound federation. I'm going to close this for now and try to track the instances where this is not the case as separate bugs, if that is OK, @MurzNN?
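To illustrate the general idea discussed above (accept events quickly, process them in the background, and keep a failure in one room from stalling the others), here is a minimal sketch using one queue and one worker per room. This is not Synapse's actual implementation: `PerRoomProcessor`, `submit`, `drain`, and the `handler` callback are hypothetical names invented for this example.

```python
import asyncio
from collections import defaultdict


class PerRoomProcessor:
    """Hypothetical sketch: one queue and one background worker per room."""

    def __init__(self, handler):
        self._handler = handler          # async callable(room_id, event)
        self._queues = {}                # room_id -> asyncio.Queue
        self._workers = {}               # room_id -> asyncio.Task
        self.failed = defaultdict(list)  # room_id -> [(event, error text)]

    def submit(self, room_id, event):
        """Enqueue the event and return at once, without processing it."""
        if room_id not in self._queues:
            self._queues[room_id] = asyncio.Queue()
            self._workers[room_id] = asyncio.create_task(self._run(room_id))
        self._queues[room_id].put_nowait(event)

    async def _run(self, room_id):
        queue = self._queues[room_id]
        while True:
            event = await queue.get()
            try:
                await self._handler(room_id, event)
            except Exception as exc:
                # Record the failure and move on, so a bad event only
                # affects its own room's queue.
                self.failed[room_id].append((event, repr(exc)))
            finally:
                queue.task_done()

    async def drain(self):
        """Wait until every queue is empty, then stop the workers."""
        await asyncio.gather(*(q.join() for q in self._queues.values()))
        for worker in self._workers.values():
            worker.cancel()
        await asyncio.gather(*self._workers.values(), return_exceptions=True)
```

In this sketch, a handler that raises for one room only adds an entry to `failed` for that room; events submitted for other rooms are still processed by their own workers.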