Initial and incremental syncs with presence enabled eat up RAM until an OOM hits #13901
What is Synapse doing before it is killed? Can you share the last five minutes of logs prior to it getting OOMed? To best advise you we'd need to see metrics to understand where Synapse is allocating RAM and what Synapse is doing. Are you able to set up Prometheus and Grafana? See https://matrix-org.github.io/synapse/latest/metrics-howto.html
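For reference, exposing the metrics endpoint for Prometheus is a small `homeserver.yaml` change (a sketch; the port and bind address below are placeholder choices, not the only valid ones):

```yaml
# homeserver.yaml -- sketch; port and bind address are placeholders
enable_metrics: true
listeners:
  - port: 9000
    type: metrics
    bind_addresses: ['127.0.0.1']
```

Prometheus can then scrape `/_synapse/metrics` on that port; the metrics-howto page linked above covers the Grafana side.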
Here is the log from when everything was fine through to when the client connected.
Are you able to change the logging config to output DEBUG logs at the root level and include those?
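For completeness, the root log level lives in Synapse's separate log config file, not `homeserver.yaml`. A sketch (the handler names must match whatever your existing log config already defines; `file` and `console` here are assumptions):

```yaml
# log.yaml -- sketch; handler names follow your existing log config
root:
  level: DEBUG
  handlers: [file, console]
```

Restart Synapse after the change for it to take effect.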
The only request which is received but not completed (…).

This looks like an initial sync (…). How many rooms is that user in?

You could try turning down cache factors in config. This would make Synapse slower and use more database IO, but would decrease its memory usage. Other issues: #12158, #12971, #3720.

I would also maybe try turning presence off to see if that helps: it's not well-used and known to be a bit buggy.
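To illustrate the cache-tuning suggestion, both the per-event cache and the global cache factor can be set in `homeserver.yaml` (a sketch; the values below are arbitrary examples, not recommendations):

```yaml
# homeserver.yaml -- example values only
event_cache_size: 10K   # size of the event cache
caches:
  global_factor: 0.5    # scale all caches down relative to the defaults
```

Lower values trade memory for extra database IO, as noted above.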
(matrix-org/matrix-spec-proposals#3575 is the long-term answer to "sync is slow".)
8 rooms where communication is rare; at most 550, 300, and 200 people in the biggest ones. The log is from the Element web client. Unfortunately, after the problem appeared, I cleared the browser cache. However, similar issues occur with the Element mobile client, though some fresh messages manage to reach it before the server crashes for the same reason. It doesn't matter which device you connect from; the crash is still very fast, within about a minute.
I had set the value to 10K myself. Reducing it to 1K does not solve the problem.
The problem is not slow synchronization, but that one connected user can completely paralyze the server! While that user is connected, other users cannot communicate normally.
1.68 :(
In my case, the memory leak happens very quickly, within a minute of one particular user connecting. As long as this user has a client open, the server keeps crashing. You can see how fast it falls over in the video on YouTube.
@fsa We've experienced something similar, although we have many more rooms; they're essentially all DMs. That makes it more difficult to track down, but one of my theories was that it could be specific users. If you can't turn presence off for whatever reason, can you at least give your version history? Did it start after a recent upgrade, or did it just happen randomly? I've been banging my head against a wall on this for close to two months; we can't have presence on without it causing the service to fail within minutes, and this seems to be the closest match on here to what we've been seeing.
It happened suddenly. There is a slight chance that there was an update shortly before, but I don't remember one. I discovered that something was wrong with the server during the day on Monday, September 26, while I was out walking with my child: I could not write to another contact from my server, even though the day before I had communicated freely with others. You would need to check when releases appeared in the Ubuntu repository. When I came home, I confirmed that something was happening with the server. After that, I moved the server and its configuration to another, similar machine in order to protect the other services I had there. After the move, the picture was unchanged: as long as I don't connect to the server, others can communicate through it.
Agreed. But initial syncs are slow because they are expensive, and that expensiveness may explain why you see OOMs.
I'm not sure what this means. Are you saying that you still see the problem after upgrading to 1.68?
To reiterate: can you turn off presence and let us know if it allows you to join without OOM-killing your server?
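For anyone following along, disabling presence is a one-line `homeserver.yaml` change (this follows Synapse's documented config structure):

```yaml
# homeserver.yaml
presence:
  enabled: false
```

Restart Synapse after changing it.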
Memory consumption stopped. I was able to log in with my account and the server is working fine. |
The title has been corrected incorrectly. Even after the client's initial synchronization, the option mentioned above cannot be re-enabled: any user connection causes memory consumption while it is enabled.
When you say (…), do you mean that reverting to (…) and restarting Synapse causes the same OOM problem as before once a client connects?
Right. The memory consumption happened even when connecting from a mobile client that had already synced. There, it was even possible to receive some of the messages before the server was killed.
I suggest you permanently leave (…) until we can find a proper fix for this issue.
Yes, I'll have to use this workaround; otherwise my server will not work properly.
Description
Synapse eats all RAM (8 GB) and swap until the OOM killer hits.
Steps to reproduce
There are 3 accounts on the server. This happens when one particular user connects (my account, an administrator). That account had not joined any other rooms with a large number of users before the problem occurred.
Homeserver
tavda.org
Synapse Version
{"server_version":"1.67.0","python_version":"3.10.6"}
Installation Method
Debian packages from packages.matrix.org
Platform
Ubuntu 22.04 LTS
Relevant log output
Anything else that would be useful to know?
During normal server operation, `top` shows 7.5 GB of free memory, and the other two users can communicate with each other.