This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synchrotron causes increased CPU usage requesting non-existent users' devices #11928

Closed

deepbluev7 opened this issue Feb 7, 2022 · 23 comments

@deepbluev7
Contributor

deepbluev7 commented Feb 7, 2022

Description

My synapse shows increased load on its sync worker:

[screenshot]

[screenshot]

In the logs it repeats the following:

master.log:

2022-02-07 15:39:45,021 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @kenji_schram:escapethematrix.ddns.net: 403: Forbidden
2022-02-07 15:39:45,050 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600367} [ethnobar.xyz] Got response headers: 403 Forbidden
2022-02-07 15:39:45,051 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600367} [ethnobar.xyz] Request failed: GET matrix://ethnobar.xyz/_matrix/federation/v1/user/devices/%40barkeep%3Aethnobar.xyz: HttpResponseException('403: Forbidden')
2022-02-07 15:39:45,051 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @barkeep:ethnobar.xyz: 403: Forbidden
2022-02-07 15:39:45,226 - synapse.access.http.9893 - 427 - INFO - POST-3202731 - 127.0.0.1 - 9893 - {None} Processed request: 0.000sec/0.000sec (0.000sec, 0.000sec) (0.000sec/0.000sec/0) 2B 200 "POST /_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev HTTP/1.1" "Synapse/1.52.0rc1" [0 dbevts]
2022-02-07 15:39:45,408 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600372} [synapse.solarinet.org] Got response headers: 403 Forbidden
2022-02-07 15:39:45,409 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600372} [synapse.solarinet.org] Request failed: GET matrix://synapse.solarinet.org/_matrix/federation/v1/user/devices/%40uhjxlm%3Asynapse.solarinet.org: HttpResponseException('403: Forbidden')
2022-02-07 15:39:45,409 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @uhjxlm:synapse.solarinet.org: 403: Forbidden
2022-02-07 15:39:45,438 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600373} [matrix.tilos.hu] Got response headers: 403 Forbidden
2022-02-07 15:39:45,438 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600373} [matrix.tilos.hu] Request failed: GET matrix://matrix.tilos.hu/_matrix/federation/v1/user/devices/%40sdd-%3Amatrix.tilos.hu: HttpResponseException('403: Forbidden')
2022-02-07 15:39:45,438 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @sdd-:matrix.tilos.hu: 403: Forbidden
2022-02-07 15:39:45,440 - synapse.replication.http.federation - 176 - INFO - POST-3202732 - Got 'm.presence' edu from nul.ie
2022-02-07 15:39:45,441 - synapse.access.http.9893 - 427 - INFO - POST-3202732 - 127.0.0.1 - 9893 - {None} Processed request: 0.001sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 4B 200 "POST /_synapse/replication/fed_send_edu/m.presence/XaiPwpDeHh HTTP/1.1" "Synapse/1.52.0rc1" [0 dbevts]
2022-02-07 15:39:45,472 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600392} [matrix.ee.ethz.ch] Got response headers: 403 Forbidden
2022-02-07 15:39:45,472 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600392} [matrix.ee.ethz.ch] Request failed: GET matrix://matrix.ee.ethz.ch/_matrix/federation/v1/user/devices/%40stroth%3Amatrix.ee.ethz.ch: HttpResponseException('403: Forbidden')
2022-02-07 15:39:45,472 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @stroth:matrix.ee.ethz.ch: 403: Forbidden
2022-02-07 15:39:45,505 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600399} [chat.alabasta.net] Got response headers: 400 Bad Request
2022-02-07 15:39:45,506 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600399} [chat.alabasta.net] Request failed: GET matrix://chat.alabasta.net/_matrix/federation/v1/user/devices/%40florent.poupard%3Achat.alabasta.net: HttpResponseException('400: Bad Request')
2022-02-07 15:39:45,506 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @florent.poupard:chat.alabasta.net: 400: Bad Request
2022-02-07 15:39:45,548 - synapse.replication.http.federation - 176 - INFO - POST-3202733 - Got 'm.typing' edu from synod.im
2022-02-07 15:39:45,553 - synapse.handlers.typing - 364 - INFO - POST-3202733 - Got typing update from @versoreo:synod.im: {'room_id': '!YTvKGNlinIzlkMTVRl:matrix.org', 'typing': True, 'user_id': '@versoreo:synod.im'}
2022-02-07 15:39:45,553 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-3884196 - Streaming: typing -> 137110
2022-02-07 15:39:45,555 - synapse.access.http.9893 - 427 - INFO - POST-3202733 - 127.0.0.1 - 9893 - {None} Processed request: 0.006sec/0.000sec (0.005sec, 0.000sec) (0.000sec/0.000sec/0) 4B 200 "POST /_synapse/replication/fed_send_edu/m.typing/WWJgGIOCVP HTTP/1.1" "Synapse/1.52.0rc1" [0 dbevts]
2022-02-07 15:39:45,631 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600409} [citadel7.org] Got response headers: 403 Forbidden
2022-02-07 15:39:45,631 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600409} [citadel7.org] Request failed: GET matrix://citadel7.org/_matrix/federation/v1/user/devices/%40ektgrh%3Acitadel7.org: HttpResponseException('403: Forbidden')
2022-02-07 15:39:45,631 - synapse.handlers.device - 883 - WARNING - _maybe_retry_device_resync-16597 - Failed to handle device list update for @ektgrh:citadel7.org: 403: Forbidden
2022-02-07 15:39:45,649 - synapse.access.http.9893 - 427 - INFO - POST-3202734 - 127.0.0.1 - 9893 - {None} Processed request: 0.000sec/0.001sec (0.000sec, 0.000sec) (0.000sec/0.000sec/0) 2B 200 "POST /_synapse/replication/presence_set_state/%40echo%3Aneko.dev HTTP/1.1" "Synapse/1.52.0rc1" [0 dbevts]
2022-02-07 15:39:45,961 - synapse.http.matrixfederationclient - 607 - INFO - _maybe_retry_device_resync-16597 - {GET-O-14600432} [remerge.net] Got response headers: 403 Forbidden
2022-02-07 15:39:45,962 - synapse.http.matrixfederationclient - 685 - WARNING - _maybe_retry_device_resync-16597 - {GET-O-14600432} [remerge.net] Request failed: GET matrix://remerge.net/_matrix/federation/v1/user/devices/%40legogris%3Aremerge.net: HttpResponseException('403: Forbidden')

Synchrotron logs just show normal sync traffic:

2022-02-07 15:42:39,219 - synapse.http.client - 446 - INFO - GET-2346862 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:39,721 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-17567 - Dropped 22 items from caches
2022-02-07 15:42:39,849 - synapse.access.http.8091 - 427 - INFO - GET-2346861 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 0.698sec/0.001sec (0.143sec, 0.016sec) (0.098sec/0.459sec/633) 666B 200 "GET /_matrix/client/r0/sync?timeout=30000&since=s9486185_161372649_137213_13859288_670672_347_275576_136955531_31&filter=10&set_presence=offline HTTP/1.0" "mautrix-telegram/0.11.0+dev.unknown mautrix-python/0.14.2 aiohttp/3.7.4 Python/3.9.9" [0 dbevts]
2022-02-07 15:42:39,849 - synapse.access.http.8091 - 427 - INFO - GET-2346862 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 0.632sec/0.001sec (0.025sec, 0.000sec) (0.000sec/0.000sec/1) 667B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486185_161372649_137213_13859288_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:39,894 - synapse.http.client - 446 - INFO - GET-2346864 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:42,065 - synapse.access.http.8091 - 427 - INFO - GET-2346864 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 2.173sec/0.000sec (0.002sec, 0.000sec) (0.000sec/0.001sec/1) 489B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486185_161372649_137218_13859288_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:42,108 - synapse.http.client - 446 - INFO - GET-2346865 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:42,453 - synapse.access.http.8091 - 427 - INFO - GET-2346815 - 127.0.0.1 - 8091 - {@dimensionbot:neko.dev} Processed request: 10.004sec/0.001sec (0.002sec, 0.000sec) (0.000sec/0.000sec/1) 231B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9486096_161372576_137204_13859269_670672_347_275576_136955531_31 HTTP/1.0" "-" [0 dbevts]
2022-02-07 15:42:42,460 - synapse.http.client - 446 - INFO - GET-2346866 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40dimensionbot%3Aneko.dev: 200
2022-02-07 15:42:42,464 - synapse.access.http.8091 - 427 - INFO - GET-2346866 - 127.0.0.1 - 8091 - {@dimensionbot:neko.dev} Processed request: 0.007sec/0.000sec (0.003sec, 0.000sec) (0.001sec/0.000sec/1) 391B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9486116_161372585_137210_13859279_670672_347_275576_136955531_31 HTTP/1.0" "-" [0 dbevts]
2022-02-07 15:42:42,471 - synapse.http.client - 446 - INFO - GET-2346867 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40dimensionbot%3Aneko.dev: 200
2022-02-07 15:42:42,628 - synapse.access.http.8091 - 427 - INFO - GET-2346863 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 2.775sec/0.001sec (0.385sec, 0.045sec) (0.205sec/0.976sec/1315) 1263B 200 "GET /_matrix/client/r0/sync?timeout=30000&since=s9486185_161372649_137218_13859288_670672_347_275576_136955531_31&filter=10&set_presence=offline HTTP/1.0" "mautrix-telegram/0.11.0+dev.unknown mautrix-python/0.14.2 aiohttp/3.7.4 Python/3.9.9" [2 dbevts]
2022-02-07 15:42:42,628 - synapse.access.http.8091 - 427 - INFO - GET-2346865 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 0.522sec/0.001sec (0.043sec, 0.003sec) (0.003sec/0.016sec/7) 2340B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486185_161372650_137218_13859288_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:42,695 - synapse.http.client - 446 - INFO - GET-2346869 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:43,739 - synapse.access.http.8091 - 427 - INFO - GET-2346842 - ::1 - 8091 - {@abuse-management:neko.dev} Processed request: 10.004sec/0.000sec (0.002sec, 0.000sec) (0.000sec/0.002sec/2) 231B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9486096_161372576_137204_13859269_670672_347_275576_136955531_31 HTTP/1.0" "-" [0 dbevts]
2022-02-07 15:42:43,745 - synapse.http.client - 446 - INFO - GET-2346870 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40abuse-management%3Aneko.dev: 200
2022-02-07 15:42:43,750 - synapse.access.http.8091 - 427 - INFO - GET-2346870 - 127.0.0.1 - 8091 - {@abuse-management:neko.dev} Processed request: 0.008sec/0.000sec (0.002sec, 0.000sec) (0.001sec/0.001sec/2) 807B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9486180_161372586_137210_13859284_670672_347_275576_136955531_31 HTTP/1.0" "-" [0 dbevts]
2022-02-07 15:42:43,756 - synapse.http.client - 446 - INFO - GET-2346871 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40abuse-management%3Aneko.dev: 200
2022-02-07 15:42:43,878 - synapse.access.http.8091 - 427 - INFO - GET-2346869 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.185sec/0.001sec (0.039sec, 0.003sec) (0.002sec/0.016sec/6) 1358B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486186_161372651_137219_13859291_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:43,941 - synapse.http.client - 446 - INFO - GET-2346872 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:45,037 - synapse.access.http.8091 - 427 - INFO - GET-2346872 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.099sec/0.000sec (0.034sec, 0.002sec) (0.002sec/0.017sec/6) 1476B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486187_161372652_137219_13859291_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:45,102 - synapse.http.client - 446 - INFO - GET-2346873 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
2022-02-07 15:42:45,788 - synapse.access.http.8091 - 427 - INFO - GET-2346873 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 0.689sec/0.001sec (0.033sec, 0.003sec) (0.003sec/0.016sec/6) 1548B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9486188_161372655_137219_13859291_670672_347_275576_136955531_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
2022-02-07 15:42:45,853 - synapse.http.client - 446 - INFO - GET-2346874 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200

Steps to reproduce

I have no idea, but it might be related to the spam-attack servers getting deleted?

Version information

  • Homeserver: neko.dev

If not matrix.org:

  • Version: 1.52.0_rc1

  • Install method: ebuild

  • Platform: Gentoo
@deepbluev7
Contributor Author

Since restarting might fix it, I will just let the worker keep on running for now and see if it recovers on its own.

@deepbluev7
Contributor Author

Ah, I guess this is a dupe of #8983, #9603, or #10218?

@deepbluev7
Contributor Author

Might be related to presence?

[screenshot]

@deepbluev7
Contributor Author

deepbluev7 commented Feb 8, 2022

I have now deleted those users from the device_lists_remote_resync table and disabled double puppeting for the Telegram bridge, but the usage didn't change... And it's not like that worker is seeing much traffic:

ET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9503886_161511578_156_13876120_671217_347_276844_136969026_31 HTTP/1.0" "-" [0 dbevts]
@400000006202796616e57c9a 2022-02-08 14:08:01,384 - synapse.http.client - 446 - INFO - GET-1140 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40dimensionbot%3Aneko.dev: 200
@40000000620279662893b727 2022-02-08 14:08:01,680 - synapse.access.http.8091 - 427 - INFO - GET-1136 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 2.623sec/0.000sec (0.268sec, 0.034sec) (0.127sec/0.859sec/1261) 11979B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503886_161511591_156_13876127_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279662d51e8a8 2022-02-08 14:08:01,760 - synapse.http.client - 446 - INFO - GET-1141 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279680477a933 2022-02-08 14:08:03,074 - synapse.access.http.8091 - 427 - INFO - GET-1141 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.316sec/0.000sec (0.274sec, 0.044sec) (0.163sec/0.877sec/1268) 2902B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503886_161511675_156_13876128_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202796807c4a663 2022-02-08 14:08:03,130 - synapse.http.client - 446 - INFO - GET-1142 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279681fedfa76 2022-02-08 14:08:03,535 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-27 - Dropped 0 items from caches
@40000000620279690b8b7925 2022-02-08 14:08:04,193 - synapse.access.http.8091 - 427 - INFO - GET-1142 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.066sec/0.000sec (0.140sec, 0.016sec) (0.074sec/0.408sec/638) 907B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503886_161511676_156_13876134_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [1 dbevts]
@40000000620279690e416eec 2022-02-08 14:08:04,239 - synapse.http.client - 446 - INFO - GET-1143 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@4000000062027969190b2c6f 2022-02-08 14:08:04,420 - synapse.access.http.9106 - 427 - INFO - GET-1144 - ::1 - 9106 - {None} Processed request: 0.019sec/-0.000sec (0.019sec, 0.000sec) (0.000sec/0.000sec/0) 248855B 200 "GET /_synapse/metrics HTTP/1.1" "Prometheus/2.26.1" [0 dbevts]
@400000006202796d2da1257f 2022-02-08 14:08:08,765 - synapse.access.http.8091 - 427 - INFO - GET-1143 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 4.529sec/0.000sec (0.265sec, 0.038sec) (0.124sec/0.839sec/1267) 378B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511676_156_13876134_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202796d301b8ba8 2022-02-08 14:08:08,807 - synapse.http.client - 446 - INFO - GET-1145 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@400000006202796f261ad6fd 2022-02-08 14:08:10,639 - synapse.access.http.8091 - 427 - INFO - GET-1145 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.834sec/0.000sec (0.272sec, 0.037sec) (0.140sec/0.814sec/1267) 384B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511677_156_13876134_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202796f288fd334 2022-02-08 14:08:10,680 - synapse.http.client - 446 - INFO - GET-1146 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279700d2f140c 2022-02-08 14:08:11,221 - synapse.access.http.8091 - 427 - INFO - GET-1138 - 127.0.0.1 - 8091 - {@abuse-management:neko.dev} Processed request: 10.003sec/0.000sec (0.001sec, 0.001sec) (0.000sec/0.000sec/1) 228B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9503886_161511676_156_13876131_671217_347_276844_136969026_31 HTTP/1.0" "-" [0 dbevts]
@40000000620279700db4e194 2022-02-08 14:08:11,229 - synapse.http.client - 446 - INFO - GET-1147 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40abuse-management%3Aneko.dev: 200
@400000006202797016feed5d 2022-02-08 14:08:11,385 - synapse.access.http.8091 - 427 - INFO - GET-1140 - 127.0.0.1 - 8091 - {@dimensionbot:neko.dev} Processed request: 10.003sec/0.000sec (0.000sec, 0.000sec) (0.000sec/0.000sec/1) 228B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9503886_161511676_156_13876131_671217_347_276844_136969026_31 HTTP/1.0" "-" [0 dbevts]
@4000000062027970174d0cb8 2022-02-08 14:08:11,390 - synapse.http.client - 446 - INFO - GET-1148 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40dimensionbot%3Aneko.dev: 200
@400000006202797103696efb 2022-02-08 14:08:12,057 - synapse.access.http.8091 - 427 - INFO - GET-1146 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.379sec/0.000sec (0.141sec, 0.021sec) (0.067sec/0.418sec/630) 752B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511678_156_13876134_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@4000000062027971061bab3c 2022-02-08 14:08:12,102 - synapse.http.client - 446 - INFO - GET-1149 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@400000006202797209eefd72 2022-02-08 14:08:13,166 - synapse.access.http.8091 - 427 - INFO - GET-1149 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.067sec/0.000sec (0.135sec, 0.016sec) (0.067sec/0.401sec/635) 733B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511678_156_13876135_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279720c88752f 2022-02-08 14:08:13,210 - synapse.http.client - 446 - INFO - GET-1150 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279720db1c33d 2022-02-08 14:08:13,229 - synapse.metrics._gc - 118 - INFO - sentinel - Collecting gc 1
@400000006202797236164336 2022-02-08 14:08:13,907 - synapse.access.http.8091 - 427 - INFO - GET-1150 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 0.699sec/0.000sec (0.141sec, 0.016sec) (0.066sec/0.494sec/635) 1762B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511678_156_13876136_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202797239055bb6 2022-02-08 14:08:13,956 - synapse.http.client - 446 - INFO - GET-1151 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279740b674979 2022-02-08 14:08:15,191 - synapse.access.http.8091 - 427 - INFO - GET-1151 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.237sec/0.000sec (0.269sec, 0.037sec) (0.132sec/0.830sec/1267) 523B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511678_156_13876139_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279740de03b21 2022-02-08 14:08:15,232 - synapse.http.client - 446 - INFO - GET-1152 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279751c5f7453 2022-02-08 14:08:16,475 - synapse.access.http.8091 - 427 - INFO - GET-1152 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.245sec/0.000sec (0.277sec, 0.030sec) (0.125sec/0.858sec/1267) 648B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511680_156_13876139_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279751ee8d8ba 2022-02-08 14:08:16,518 - synapse.http.client - 446 - INFO - GET-1153 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@40000000620279762b859b18 2022-02-08 14:08:17,730 - synapse.access.http.8091 - 427 - INFO - GET-1153 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.214sec/0.000sec (0.252sec, 0.037sec) (0.118sec/0.842sec/1267) 628B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511681_156_13876139_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279762e00c89d 2022-02-08 14:08:17,771 - synapse.http.client - 446 - INFO - GET-1154 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@400000006202797800ac38a0 2022-02-08 14:08:19,011 - synapse.access.http.8091 - 427 - INFO - GET-1154 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.241sec/0.000sec (0.260sec, 0.040sec) (0.157sec/0.820sec/1267) 478B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511684_156_13876139_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@40000000620279780328b350 2022-02-08 14:08:19,052 - synapse.http.client - 446 - INFO - GET-1155 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@400000006202797819207f08 2022-02-08 14:08:19,421 - synapse.access.http.9106 - 427 - INFO - GET-1156 - ::1 - 9106 - {None} Processed request: 0.019sec/-0.000sec (0.019sec, 0.000sec) (0.000sec/0.000sec/0) 248907B 200 "GET /_synapse/metrics HTTP/1.1" "Prometheus/2.26.1" [0 dbevts]
@400000006202797912274055 2022-02-08 14:08:20,304 - synapse.access.http.8091 - 427 - INFO - GET-1155 - 127.0.0.1 - 8091 - {@deepbluev7:neko.dev} Processed request: 1.253sec/0.000sec (0.284sec, 0.027sec) (0.131sec/0.835sec/1268) 1130B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511685_156_13876139_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202797914cc2bac 2022-02-08 14:08:20,348 - synapse.http.client - 446 - INFO - GET-1157 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200
@400000006202797a0dd9e157 2022-02-08 14:08:21,232 - synapse.access.http.8091 - 427 - INFO - GET-1147 - 127.0.0.1 - 8091 - {@abuse-management:neko.dev} Processed request: 10.004sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.001sec/2) 228B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9503886_161511676_156_13876131_671217_347_276844_136969026_31 HTTP/1.0" "-" [0 dbevts]
@400000006202797a0e2b3c6f 2022-02-08 14:08:21,237 - synapse.http.client - 446 - INFO - GET-1158 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40abuse-management%3Aneko.dev: 200
@400000006202797a176a613b 2022-02-08 14:08:21,392 - synapse.access.http.8091 - 427 - INFO - GET-1148 - ::1 - 8091 - {@dimensionbot:neko.dev} Processed request: 10.004sec/0.000sec (0.000sec, 0.000sec) (0.000sec/0.000sec/1) 228B 200 "GET /_matrix/client/r0/sync?full_state=false&timeout=10000&since=s9503886_161511676_156_13876131_671217_347_276844_136969026_31 HTTP/1.0" "-" [0 dbevts]
@400000006202797a17bead8c 2022-02-08 14:08:21,398 - synapse.http.client - 446 - INFO - GET-1159 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40dimensionbot%3Aneko.dev: 200
@400000006202797c01a49c64 2022-02-08 14:08:23,027 - synapse.access.http.8091 - 427 - INFO - GET-1157 - ::1 - 8091 - {@deepbluev7:neko.dev} Processed request: 2.680sec/0.000sec (0.264sec, 0.041sec) (0.127sec/0.817sec/1262) 385B 200 "GET /_matrix/client/r0/sync?set_presence=online&since=s9503887_161511688_156_13876140_671217_347_276844_136969026_31&timeout=30000 HTTP/1.0" "mtxclient v0.6.1" [0 dbevts]
@400000006202797c05ae023d 2022-02-08 14:08:23,095 - synapse.http.client - 446 - INFO - GET-1160 - Received response to POST http://127.0.0.1:9893/_synapse/replication/presence_set_state/%40deepbluev7%3Aneko.dev: 200

@babolivier
Contributor

Could you try to profile the synchrotron using py-spy? It would help figure out where the CPU is spending its time.
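
For reference, a typical invocation might look something like this (a sketch, assuming py-spy is installed and <PID> is the synchrotron's process ID; the duration and output path are arbitrary):

py-spy record --pid <PID> --duration 60 -o synchrotron-profile.svg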

Alternatively, you could try removing the users that are causing issues from the device_lists_remote_resync database table and see if that fixes things (Synapse doesn't cache queries against this table, so you don't need to restart the synchrotron after updating it).
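
For example, in psql (a sketch; the user ID is one of the failing ones from the logs above):

DELETE FROM device_lists_remote_resync WHERE user_id = '@barkeep:ethnobar.xyz';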

@babolivier babolivier added the X-Needs-Info This issue is blocked awaiting information from the reporter label Feb 8, 2022
@deepbluev7
Contributor Author

I actually already removed those users. It seems that only a specific client of mine causes the issues because it sets the presence in the sync call. I'm still digging into why, though.

@deepbluev7
Contributor Author

[screenshot: py-spy flame graph]

Pretty much matches my Grafana.

@deepbluev7
Contributor Author

Since I don't think any update caused it, could it be postgres switching to a different query plan?

@babolivier babolivier removed the X-Needs-Info This issue is blocked awaiting information from the reporter label Feb 9, 2022
@babolivier
Contributor

babolivier commented Feb 9, 2022

It seems that only a specific client of mine causes the issues because it sets the presence in the sync call

So when you close this client the issue disappears?

Since I don't think any update caused it, could it be postgres switching to a different query plan?

Maybe. Looking at both your Grafana screenshots and the flame graph, it looks like it's struggling on this request:

sql = """
SELECT state_key FROM room_memberships as m
INNER JOIN current_state_events as c
ON m.event_id = c.event_id
AND m.room_id = c.room_id
AND m.user_id = c.state_key
WHERE c.type = 'm.room.member' AND c.room_id = ? AND m.membership = ?
"""

So it could be worth running an EXPLAIN on it to figure out whether the issue is coming from the postgres query plan. Annoyingly, the only way to figure out which specific room it chokes on would be to turn on debug logging (I assume it isn't already on for this worker), which would require a restart.
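
For instance, in psql with the placeholders filled in (a sketch; the room ID and membership value below are illustrative):

EXPLAIN ANALYZE
SELECT state_key FROM room_memberships AS m
INNER JOIN current_state_events AS c
ON m.event_id = c.event_id
AND m.room_id = c.room_id
AND m.user_id = c.state_key
WHERE c.type = 'm.room.member'
  AND c.room_id = '!someroom:example.com'
  AND m.membership = 'join';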

@deepbluev7
Contributor Author

I'll do that later today; the CPU usage persists across restarts anyway...

So when you close this client the issue disappears?

Yes.

@babolivier
Contributor

babolivier commented Feb 9, 2022

I'll do that later today; the CPU usage persists across restarts anyway...

Oh, interesting. In that case I would suggest turning on debug logging for SQL requests for this synchrotron (see this log config doc) and hunting down the room ID (and whether there's just one problematic room or several). With debug logging on, Synapse logs the query string on one line and the arguments on another, so that's how you retrieve them. Using the room ID with EXPLAIN could help figure out whether it's the query planner causing issues.
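
A minimal sketch of the relevant bit of the log config, assuming the standard synapse.storage.SQL logger (beware: at DEBUG it logs query arguments, which can include sensitive data):

loggers:
    synapse.storage.SQL:
        level: DEBUG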

So when you close this client the issue disappears?

Yes

Just to be clear, this is the only client setting the presence in the sync request?

Also, re

I actually already removed those users.

Sorry, I completely missed that you'd mentioned it when I read your first few messages 😅

@deepbluev7
Contributor Author

Just to be clear, this is the only client setting the presence in the sync request?

No, there are a few other clients too, but not for that account; I stopped them all for debugging.

@deepbluev7
Contributor Author

Also, that client saw no code changes related to presence in the last few months, so while it could be a client-side bug, I don't think it was an issue before.

@deepbluev7
Contributor Author

deepbluev7 commented Feb 9, 2022

Without any interaction, my synapse stopped doing this before I came home... So I can't reproduce it anymore D:

[screenshot]

@deepbluev7
Contributor Author

I got a few device_list updates and a few thousand presence updates at that time and then it stopped doing that, so that probably unbricked it.

@deepbluev7
Contributor Author

This is the event creator log from around that time:

2022-02-09 13:11:07,755 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20541 - Streaming: events -> 9518585
2022-02-09 13:11:10,380 - synapse.replication.http.federation - 132 - INFO - POST-19862 - Got 1 events from federation
2022-02-09 13:11:10,388 - synapse.access.http.9894 - 427 - INFO - POST-19862 - 127.0.0.1 - 9894 - {None} Processed request: 0.008sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/ElLjoYmXTY HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:10,391 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20543 - Streaming: events -> 9518586
2022-02-09 13:11:11,724 - synapse.access.http.9102 - 427 - INFO - GET-19863 - ::1 - 9102 - {None} Processed request: 0.040sec/-0.000sec (0.040sec, 0.000sec) (0.000sec/0.000sec/0) 616327B 200 "GET /_synapse/metrics HTTP/1.1" "Prometheus/2.26.1" [0 dbevts]
2022-02-09 13:11:16,153 - synapse.replication.http.federation - 132 - INFO - POST-19864 - Got 1 events from federation
2022-02-09 13:11:16,201 - synapse.state - 554 - INFO - persist_events-13887 - Resolving state for !mjbDjyNsRXndKLkHIe:matrix.org with groups [615585, 616565, 1792494]
2022-02-09 13:11:16,341 - synapse.storage.databases.main.event_federation - 404 - INFO - persist_events-13887 - Unexpectedly found that events don't have chain IDs in room !mjbDjyNsRXndKLkHIe:matrix.org: {'$XPX2fEsjxyIZTR9-k5pdpsVwFFPwt6oBox-BweWeb3s'}
2022-02-09 13:11:17,197 - synapse.state.v2 - 533 - WARNING - persist_events-13887 - auth_event id $zTjQbxV26Zs_Jan_Km9kbq7YScafrGYhYMT0rKtJP7o for event $XPX2fEsjxyIZTR9-k5pdpsVwFFPwt6oBox-BweWeb3s is missing
2022-02-09 13:11:17,840 - synapse.storage.databases.main.event_push_actions - 608 - INFO - event_push_action_stream_orderings-137 - Searching for stream ordering 1 month ago
2022-02-09 13:11:17,851 - synapse.storage.databases.main.event_push_actions - 612 - INFO - event_push_action_stream_orderings-137 - Found stream ordering 1 month ago: it's 8967457
2022-02-09 13:11:17,851 - synapse.storage.databases.main.event_push_actions - 615 - INFO - event_push_action_stream_orderings-137 - Searching for stream ordering 1 day ago
2022-02-09 13:11:17,861 - synapse.storage.databases.main.event_push_actions - 619 - INFO - event_push_action_stream_orderings-137 - Found stream ordering 1 day ago: it's 9503181
2022-02-09 13:11:17,921 - synapse.replication.http.federation - 132 - INFO - POST-19865 - Got 1 events from federation
2022-02-09 13:11:17,927 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-2759 - Dropped 10 items from caches
2022-02-09 13:11:18,130 - synapse.replication.http.federation - 132 - INFO - POST-19866 - Got 1 events from federation
2022-02-09 13:11:18,143 - synapse.access.http.9894 - 427 - INFO - POST-19866 - 127.0.0.1 - 9894 - {None} Processed request: 0.013sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/UkksBLMmgR HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:18,342 - synapse.state - 554 - INFO - persist_events-13888 - Resolving state for !OGEhHVWSdvArJzumhm:matrix.org with groups [1792171, 1792205, 1792173, 1792495, 1791632, 1791156, 1791001, 1792155, 1792381, 1791198]
2022-02-09 13:11:20,312 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20545 - Streaming: caches -> 6088463
2022-02-09 13:11:20,342 - synapse.access.http.9894 - 427 - INFO - POST-19864 - 127.0.0.1 - 9894 - {None} Processed request: 4.159sec/0.030sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/gUkONjXClS HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:20,427 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20545 - Streaming: events -> 9518587
2022-02-09 13:11:21,343 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20545 - Streaming: caches -> 6088463
2022-02-09 13:11:21,907 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20545 - Streaming: events -> 9518588
2022-02-09 13:11:22,570 - synapse.replication.http.federation - 132 - INFO - POST-19867 - Got 1 events from federation
2022-02-09 13:11:22,624 - synapse.access.http.9894 - 427 - INFO - POST-19867 - 127.0.0.1 - 9894 - {None} Processed request: 0.054sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/RRCGnEollD HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:22,628 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20546 - Streaming: events -> 9518589
2022-02-09 13:11:22,706 - synapse.metrics._gc - 118 - INFO - sentinel - Collecting gc 1
2022-02-09 13:11:23,828 - synapse.replication.http.federation - 132 - INFO - POST-19868 - Got 1 events from federation
2022-02-09 13:11:23,910 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20547 - Streaming: caches -> 6088464
2022-02-09 13:11:23,912 - synapse.access.http.9894 - 427 - INFO - POST-19868 - 127.0.0.1 - 9894 - {None} Processed request: 0.083sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/sDQkKquhGU HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:23,916 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20548 - Streaming: events -> 9518590
2022-02-09 13:11:25,256 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20549 - Streaming: caches -> 6088465
2022-02-09 13:11:25,258 - synapse.access.http.9894 - 427 - INFO - POST-19865 - 127.0.0.1 - 9894 - {None} Processed request: 7.337sec/0.001sec (0.000sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/WFDoqaxhbh HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:25,263 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20550 - Streaming: events -> 9518591
2022-02-09 13:11:26,628 - synapse.replication.http.federation - 132 - INFO - POST-19869 - Got 1 events from federation
2022-02-09 13:11:26,638 - synapse.access.http.9894 - 427 - INFO - POST-19869 - 127.0.0.1 - 9894 - {None} Processed request: 0.010sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/mPjUawVyOR HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:26,643 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20551 - Streaming: events -> 9518592
2022-02-09 13:11:26,728 - synapse.access.http.9102 - 427 - INFO - GET-19870 - ::1 - 9102 - {None} Processed request: 0.044sec/-0.000sec (0.044sec, 0.000sec) (0.000sec/0.000sec/0) 616733B 200 "GET /_synapse/metrics HTTP/1.1" "Prometheus/2.26.1" [0 dbevts]
2022-02-09 13:11:28,499 - synapse.replication.http.federation - 132 - INFO - POST-19871 - Got 1 events from federation
2022-02-09 13:11:28,511 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20552 - Streaming: caches -> 6088466
2022-02-09 13:11:28,513 - synapse.access.http.9894 - 427 - INFO - POST-19871 - 127.0.0.1 - 9894 - {None} Processed request: 0.013sec/0.001sec (0.002sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/nSWKDmjPLD HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:28,518 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20553 - Streaming: events -> 9518593
2022-02-09 13:11:30,325 - synapse.replication.http.federation - 132 - INFO - POST-19872 - Got 1 events from federation
2022-02-09 13:11:30,337 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20554 - Streaming: caches -> 6088467
2022-02-09 13:11:30,338 - synapse.access.http.9894 - 427 - INFO - POST-19872 - 127.0.0.1 - 9894 - {None} Processed request: 0.013sec/0.001sec (0.002sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/zKUndgXmqR HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:30,344 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20555 - Streaming: events -> 9518594
2022-02-09 13:11:34,001 - synapse.replication.http.federation - 132 - INFO - POST-19873 - Got 1 events from federation
2022-02-09 13:11:34,011 - synapse.access.http.9894 - 427 - INFO - POST-19873 - 127.0.0.1 - 9894 - {None} Processed request: 0.010sec/0.000sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 25B 200 "POST /_synapse/replication/fed_send_events/SEgGAmzjga HTTP/1.1" "Synapse/1.52.0" [0 dbevts]
2022-02-09 13:11:34,015 - synapse.replication.tcp.resource - 191 - INFO - replication_notifier-20557 - Streaming: events -> 9518595
2022-02-09 13:11:34,258 - synapse.replication.http.federation - 132 - INFO - POST-19874 - Got 1 events from federation

I think one event over federation caused this to fix itself, and it seems it was the event creator that fixed it:

[screenshot]

Could it have been the missing chain IDs?

@deepbluev7
Contributor Author

So it started happening again all of a sudden, and I think I have a useful logfile, but as it contains my whole room list, I would prefer not to post it publicly. Feel free to DM me for the file; I also sent it to @babolivier already. But the gist of it is that it generates a room entry for pretty much my whole room list (and also fetches all of their users). My assumption is that maybe a state reset happened in one of the rooms, which confused the sync worker and caused it to fetch more rooms than necessary.

It is basically entering

async def handle_room_entries(room_entry: "RoomSyncResultBuilder") -> None:

with 600 rooms, for every incremental sync. So generating the response takes almost a second and uses quite a bit of CPU. And as I get presence updates multiple times a second, this shows up in my CPU graphs.

@erikjohnston
Member

FTR, #11974 may help with _generate_sync_entry_for_device_list, which one of your graphs was showing using lots of CPU.

@deepbluev7
Contributor Author

So your patch clearly does reduce CPU usage for _generate_sync_entry_for_device_list, but I don't think it fixes the root cause that increased my sync CPU load:

[screenshot]

Maybe I just hit a magic threshold when joining a few more rooms that makes postgres go crazy?

@deepbluev7
Contributor Author

Maybe this is actually the root cause; that looks like a lot of evictions:

[screenshot]

@deepbluev7
Contributor Author

deepbluev7 commented Feb 15, 2022

So I just reverted a8da046 from #11878, and that fixes my CPU usage. I guess that PR breaks something on my synapse?

[screenshot]

[screenshot]

[screenshot]

("patch applied" means a reversed patch of a8da046)
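
For reference, reversing a single commit can be done with something like this (a sketch; assumes you run Synapse from a git checkout and restart the workers afterwards):

git revert --no-edit a8da046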

@deepbluev7
Contributor Author

Seems like I was wrong; it's back. Seems like I need 25x the user cache size instead of the usual 2x I use:

[screenshot]

@deepbluev7
Contributor Author

I guess I should close this, since it got fixed by just increasing the size of that specific cache. Having auto-adapting caches in the future would be great though :3
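
For anyone landing here later: per-cache sizing lives under caches in homeserver.yaml. A minimal sketch, assuming the affected cache were get_users_in_room (the thread doesn't name the exact cache, so that key is illustrative):

caches:
  global_factor: 2.0
  per_cache_factors:
    get_users_in_room: 25.0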
