Skip to content

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Oct 9, 2025

Move unique snowflake homeserver background tasks to start_background_tasks (the standard pattern for this kind of thing)

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

Comment on lines +647 to +648
self.get_common_usage_metrics_manager().setup()
start_phone_stats_home(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to HomeServer.start_background_tasks() (base class for main and worker instances) as these were previously part of base.start() which also applies main and worker instances.

"""
return await self._collect()

async def setup(self) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed async as nothing was using await here and fits the pattern better for start_background_tasks

@MadLittleMods MadLittleMods changed the title Move snowflake homeserver background tasks to start_background_tasks Move unique snowflake homeserver background tasks to start_background_tasks Oct 9, 2025
Comment on lines -113 to -118
# XXX some of this is redundant. poking things into the config shouldn't
# work, and in any case it's not obvious what we expect to happen when
# we advance the reactor.
self.hs.config.server.max_mau_value = 0
#
# The `start_phone_stats_home()` looping call will cause us to run
# `reap_monthly_active_users` after the time has advanced
self.reactor.advance(FORTY_DAYS)
self.hs.config.server.max_mau_value = 5
Copy link
Contributor Author

@MadLittleMods MadLittleMods Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX: Just to call it out from the rest of test change noise, this is a non StreamTestCase change!

As this comment was hinting at, this was pretty strange.

Since we moved start_phone_stats_home() in this PR, it actually gets run in tests now. This causes the normal loop reap_monthly_active_users() to happen.

This caused this test to fail because setting self.hs.config.server.max_mau_value = 0 and then advancing time causes the looping call to call reap_monthly_active_users() again and remove all users (including reserved ones). This is because get_registered_reserved_users() will clamp to the configured max_mau_value. Note: I also think is pretty strange and tenuous behavior but that's just how things work with MAU and I'm not about to refactor everything there.

```shell
$ SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.replication.tcp.streams.test_federation.FederationStreamTestCase.test_catchup
tests.replication.tcp.streams.test_federation
  FederationStreamTestCase
    test_catchup ...                                                     [FAIL]

===============================================================================
[FAIL]
Traceback (most recent call last):
  File "synapse/tests/replication/tcp/streams/test_federation.py", line 58, in test_catchup
    request = self.handle_http_replication_attempt()
  File "synapse/tests/replication/_base.py", line 217, in handle_http_replication_attempt
    self.assertEqual(
  File "virtualenvs/matrix-synapse-xCtC9ulO-py3.13/lib/python3.13/site-packages/twisted/trial/_synctest.py", line 444, in assertEqual
    super().assertEqual(first, second, msg)
  File "/usr/lib/python3.13/unittest/case.py", line 907, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/usr/lib/python3.13/unittest/case.py", line 900, in _baseAssertEqual
    raise self.failureException(msg)
twisted.trial.unittest.FailTest: 2 != 1 : Expected to handle exactly one HTTP replication request but saw 2 - requests=[<SynapseRequest at 0x7fb6f53512b0 method='GET' uri='/_synapse/replication/get_repl_stream_updates/federation/NjbpGwVKHD?from_token=0&upto_token=1' clientproto='HTTP/1.1' site='test'>, <SynapseRequest at 0x7fb6f533d450 method='GET' uri='/_synapse/replication/get_repl_stream_updates/federation/juIpgKTVlj?from_token=0&upto_token=1' clientproto='HTTP/1.1' site='test'>]

tests.replication.tcp.streams.test_federation.FederationStreamTestCase.test_catchup
-------------------------------------------------------------------------------
Ran 1 tests in 6.233s

FAILED (failures=1)
```
Comment on lines +49 to +54
received_account_data_rows = [
row
for row in self.test_handler.received_rdata_rows
if row[0] == AccountDataStream.NAME
]
self.assertEqual([], received_account_data_rows)
Copy link
Contributor Author

@MadLittleMods MadLittleMods Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to filter the rdata rows by relevant type. We already had this pattern for some tests. See

received_rows = [
upd
for upd in self.test_handler.received_rdata_rows
if upd[0] == ThreadSubscriptionsStream.NAME
]

Not all of these tests were failing but some tests were failing because we had some caches rows flying around because we moved start_phone_stats_home() to a place that also runs in the tests now. ex. ('caches', 2, CachesStream.CachesStreamRow(cache_func='user_last_seen_monthly_active', keys=None, invalidation_ts=0))

While we don't care about phone home stats in tests, it's at-least a good sign that our homeserver is more closely running to how it would in the real-life case.

Comment on lines +47 to 50
# FIXME: This seems odd, why aren't we calling `self.replicate()` here? but also
# doing so, causes other assumptions to fail (multiple HTTP replication attempts
# are made).
self.reactor.advance(0)
Copy link
Contributor Author

@MadLittleMods MadLittleMods Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place where I can't figure out exactly why we have to self.reactor.advance(0) here.

Partly because I'm not sure how exactly we expect replication to work.

Opting to just leave it as-is until we can figure it out ⏩

Comment on lines -38 to -40
def _build_replication_data_handler(self) -> Mock:
self.mock_handler = Mock(wraps=super()._build_replication_data_handler())
return self.mock_handler
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored some of these tests to use the new BaseStreamTestCase pattern to inspect self.test_handler.received_rdata_rows instead of using self.mock_handler


typing._push_update(member=RoomMember(ROOM_ID, USER_ID), typing=True)

self.reactor.advance(0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, these self.reactor.advance(0) and self.pump(0.1) calls just needed to be replaced with self.replicate() which is more clear what's happening as well.

@MadLittleMods MadLittleMods marked this pull request as ready for review October 10, 2025 23:30
@MadLittleMods MadLittleMods requested a review from a team as a code owner October 10, 2025 23:30
Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like lots of thought has gone into it as always, but can't disagree with anything

@MadLittleMods MadLittleMods merged commit d2c582e into develop Oct 13, 2025
76 of 78 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/move-homeserver-background-tasks branch October 13, 2025 15:19
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @reivilibre 🐡

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants