Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

mypy plugin to check @cached return types #14911

Merged
merged 44 commits into from
Oct 2, 2023
Merged

Conversation

DMRobertson
Copy link
Contributor

@DMRobertson DMRobertson commented Jan 25, 2023

The things that are cached should be immutable to avoid callers modifying cached values (which has lead to bugs in the past). This updates our mypy plug-in to enforce this via the type system.

Complements #13755.

Errors which have been fixed

The main obstruction seems to be that we have lots of Dict[str, Any] or Dict[str, str] in caches.

Sample:

+ mypy
synapse/handlers/read_marker.py:44: error: function get_account_data_for_room_and_type of AccountDataWorkerStore is @cached, but has mutable return value Union[dict[str, Any], None]  [synapse-@cached-mutable]
synapse/appservice/__init__.py:187: error: function get_local_users_in_room of RoomMemberWorkerStore is @cached, but has mutable return value list[str]  [synapse-@cached-mutable]
synapse/appservice/__init__.py:248: error: function get_aliases_for_room of DirectoryWorkerStore is @cached, but has mutable return value list[str]  [synapse-@cached-mutable]
synapse/app/phone_stats_home.py:203: error: function get_monthly_active_count_by_service of MonthlyActiveUsersWorkerStore is @cached, but has mutable return value dict[str, int]  [synapse-@cached-mutable]
synapse/api/filtering.py:170: error: function get_user_filter of FilteringWorkerStore is @cached, but has mutable return value dict[str, Any]  [synapse-@cached-mutable]
synapse/storage/databases/main/receipts.py:204: error: function _get_receipts_for_user_with_orderings of ReceiptsWorkerStore is @cached, but has mutable return value dict[str, Any]  [synapse-@cached-mutable]
synapse/storage/databases/main/receipts.py:283: error: function _get_linearized_receipts_for_rooms of ReceiptsWorkerStore is @cached, but has mutable return value dict[str, list[dict[str, Any]]]  [synapse-@cached-mutable]
synapse/storage/databases/main/receipts.py:309: error: function _get_linearized_receipts_for_room of ReceiptsWorkerStore is @cached, but has mutable return value list[dict[str, Any]]  [synapse-@cached-mutable]
synapse/storage/databases/main/events_worker.py:1618: error: function _have_seen_events_dict of EventsWorkerStore is @cached, but has mutable return value dict[str, bool]  [synapse-@cached-mutable]
synapse/storage/databases/main/events_worker.py:1668: error: function _have_seen_events_dict of EventsWorkerStore is @cached, but has mutable return value dict[str, bool]  [synapse-@cached-mutable]
synapse/storage/databases/main/signatures.py:77: error: function get_event_reference_hashes of SignatureWorkerStore is @cached, but has mutable return value dict[str, dict[str, bytes]]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:385: error: function get_invited_rooms_for_local_user of RoomMemberWorkerStore is @cached, but has mutable return value list[synapse.storage.roommember.RoomsForUser]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:723: error: function _get_rooms_for_users of RoomMemberWorkerStore is @cached, but has mutable return value dict[str, frozenset[str]]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:795: error: function get_users_in_room of RoomMemberWorkerStore is @cached, but has mutable return value list[str]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:861: error: function _get_user_ids_from_membership_event_ids of RoomMemberWorkerStore is @cached, but has mutable return value dict[str, Union[str, None]]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:969: error: function get_users_in_room of RoomMemberWorkerStore is @cached, but has mutable return value list[str]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:1020: error: function get_current_hosts_in_room of RoomMemberWorkerStore is @cached, but has mutable return value set[str]  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:1098: error: function _get_joined_hosts_cache of RoomMemberWorkerStore is @cached, but has mutable return value synapse.storage.databases.main.roommember._JoinedHostsCache (non-frozen attrs class)  [synapse-@cached-mutable]
synapse/storage/databases/main/roommember.py:1098: note: Error code "synapse-@cached-mutable" not covered by "type: ignore" comment

Some outstanding TODOs:

  • check that frozen attrs classes have immutable fields
  • work out how to handle types written in Rust

@clokep
Copy link
Member

clokep commented Jan 25, 2023

Can we return Mapping in that case? Or is your point that we need to audit a bunch of stuff?

Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You grabbed my interest. Seems reasonable but I have a couple thoughts!

scripts-dev/mypy_synapse_plugin.py Outdated Show resolved Hide resolved
scripts-dev/mypy_synapse_plugin.py Outdated Show resolved Hide resolved
@DMRobertson
Copy link
Contributor Author

(Thanks for your thoughts, Patrick. I'll need to give this and them more thought, when I get some spare time to stare at this again)

@DMRobertson
Copy link
Contributor Author

DMRobertson commented Sep 12, 2023

Can we return Mapping in that case? Or is your point that we need to audit a bunch of stuff?

Yep, I think we should return a Mapping to make it clear that the caller can't mutate dictionaries that are kept as cache values.

@@ -710,7 +710,7 @@ async def on_send_join_request(
state_event_ids: Collection[str]
servers_in_room: Optional[Collection[str]]
if caller_supports_partial_state:
summary = await self.store.get_room_summary(room_id)
summary = await self.store.get_room_summary(room_id) # type: ignore[synapse-@cached-mutable]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting to see!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrifying IMO of how we use this.

@DMRobertson
Copy link
Contributor Author

Does this slow down mypy a lot?

Testing with:

git rev-parse --short HEAD
rm -r .mypy_cache/
poetry install --all-extras --sync
mypy --version
time mypy

On this branch:

$ git rev-parse --short HEAD
51a1a5fc4
$ mypy --version
mypy 1.5.1 (compiled: yes)
$ time mypy
Success: no issues found in 851 source files

real	0m36.189s
user	0m34.196s
sys	0m1.093s

On develop:

$ git rev-parse --short HEAD
7ec0a141b
$ mypy --version
mypy 1.5.1 (compiled: yes)
$ time mypy
Success: no issues found in 851 source files

real	0m33.438s
user	0m31.706s
sys	0m0.934s

So ~3s slower, say 10% overhead. I think I'd consider that good enough as-is.

@clokep
Copy link
Member

clokep commented Sep 20, 2023

Yeah that sounds reasonable to me.

@clokep clokep marked this pull request as ready for review September 21, 2023 11:44
@clokep clokep requested a review from a team as a code owner September 21, 2023 11:44
@clokep
Copy link
Member

clokep commented Sep 21, 2023

I think this probably needs a review from not @DMRobertson and not myself since we collaborated on this.

Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks good!

synapse/federation/federation_server.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/relations.py Show resolved Hide resolved
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! 🎉

scripts-dev/mypy_synapse_plugin.py Show resolved Hide resolved
Comment on lines 61 to 63
"""
Get the "final" return type of a callable which might return returned an Awaitable/Deferred.
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: If we have function returning e.g. Deferred[Deferred[T]], it looks like this function doesn't loop and extract the inner T. But we shouldn't be doing that in the first place.

(Would such a type even be legitimate?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Would such a type even be legitimate?)

I think it wouldn't make sense? Please don't do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's even really doable since a Deferred returning a Deferred will wait on the inner Deferred? https://docs.twistedmatrix.com/en/stable/core/howto/defer.html#chaining-deferreds

Anyway, that interaction always confuses me and I'm willing to not support it.

Comment on lines -96 to +99
# its not the end of the world.
# it's not the end of the world.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

Comment on lines +159 to +161
"""Asserts that the signature of a method returns a value which can be cached.

Makes no changes to the provided method signature.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a sentence explaining why there is a wrapper? It looks like the point is mainly to check that we have a proper CallableType(?)

OTOH if it's just "to have smaller functions" then let's leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH if it's just "to have smaller functions" then let's leave it as is.

I almost inlined it, but it seemed nice to keep them separate.

scripts-dev/mypy_synapse_plugin.py Outdated Show resolved Hide resolved
Co-authored-by: David Robertson <[email protected]>
@clokep clokep enabled auto-merge (squash) October 2, 2023 12:36
@clokep clokep merged commit 1026776 into develop Oct 2, 2023
38 checks passed
@clokep clokep deleted the dmr/mypy-check-at-cached branch October 2, 2023 14:22
@clokep
Copy link
Member

clokep commented Oct 2, 2023

Thanks so much for the help @erikjohnston and @DMRobertson! Happy to see this merged! 🎉

Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Oct 30, 2023
No significant changes since 1.94.0rc1.

- Render plain, CSS, CSV, JSON and common image formats in the browser (inline) when requested through the /download endpoint. ([\matrix-org#15988](matrix-org#15988))
- Add experimental support for [MSC4028](matrix-org/matrix-spec-proposals#4028) to push all encrypted events to clients. ([\matrix-org#16361](matrix-org#16361))
- Minor performance improvement when sending presence to federated servers. ([\matrix-org#16385](matrix-org#16385))
- Minor performance improvement by caching server ACL checking. ([\matrix-org#16360](matrix-org#16360))

- Add developer documentation concerning gradual schema migrations with column alterations. ([\matrix-org#15691](matrix-org#15691))
- Improve documentation of the user directory search algorithm. ([\matrix-org#16320](matrix-org#16320))
- Fix rendering of user admin API documentation around deactivation. This was broken in Synapse 1.91.0. ([\matrix-org#16355](matrix-org#16355))
- Update documentation around message retention policies. ([\matrix-org#16382](matrix-org#16382))
- Add note to `federation_domain_whitelist` config option to clarify its usage. ([\matrix-org#16416](matrix-org#16416))
- Improve legacy release notes. ([\matrix-org#16418](matrix-org#16418))

- Remove Python version from `/_synapse/admin/v1/server_version`. ([\matrix-org#16380](matrix-org#16380))

- Avoid running CI steps when the files they check have not been changed. ([\matrix-org#14745](matrix-org#14745), [\matrix-org#16387](matrix-org#16387))
- Improve type hints. ([\matrix-org#14911](matrix-org#14911), [\matrix-org#16350](matrix-org#16350), [\matrix-org#16356](matrix-org#16356), [\matrix-org#16395](matrix-org#16395))
- Added support for pydantic v2 in addition to pydantic v1. Contributed by Maxwell G (@gotmax23). ([\matrix-org#16332](matrix-org#16332))
- Get CI to check PRs have been signed-off. ([\matrix-org#16348](matrix-org#16348))
- Add missing licence header. ([\matrix-org#16359](matrix-org#16359))
- Improve type hints, and bump types-psycopg2 from 2.9.21.11 to 2.9.21.14. ([\matrix-org#16381](matrix-org#16381))
- Improve comments in `StateGroupBackgroundUpdateStore`. ([\matrix-org#16383](matrix-org#16383))
- Update maturin configuration. ([\matrix-org#16394](matrix-org#16394))
- Downgrade replication stream time out error log lines to warning. ([\matrix-org#16401](matrix-org#16401))

* Bump actions/checkout from 3 to 4. ([\matrix-org#16250](matrix-org#16250))
* Bump cryptography from 41.0.3 to 41.0.4. ([\matrix-org#16362](matrix-org#16362))
* Bump dawidd6/action-download-artifact from 2.27.0 to 2.28.0. ([\matrix-org#16374](matrix-org#16374))
* Bump docker/setup-buildx-action from 2 to 3. ([\matrix-org#16375](matrix-org#16375))
* Bump gitpython from 3.1.35 to 3.1.37. ([\matrix-org#16376](matrix-org#16376))
* Bump msgpack from 1.0.5 to 1.0.6. ([\matrix-org#16377](matrix-org#16377))
* Bump msgpack from 1.0.6 to 1.0.7. ([\matrix-org#16412](matrix-org#16412))
* Bump phonenumbers from 8.13.19 to 8.13.22. ([\matrix-org#16413](matrix-org#16413))
* Bump psycopg2 from 2.9.7 to 2.9.8. ([\matrix-org#16409](matrix-org#16409))
* Bump pydantic from 2.3.0 to 2.4.2. ([\matrix-org#16410](matrix-org#16410))
* Bump regex from 1.9.5 to 1.9.6. ([\matrix-org#16408](matrix-org#16408))
* Bump sentry-sdk from 1.30.0 to 1.31.0. ([\matrix-org#16378](matrix-org#16378))
* Bump types-netaddr from 0.8.0.9 to 0.9.0.1. ([\matrix-org#16411](matrix-org#16411))
* Bump types-psycopg2 from 2.9.21.11 to 2.9.21.14. ([\matrix-org#16381](matrix-org#16381))
* Bump urllib3 from 1.26.15 to 1.26.17. ([\matrix-org#16422](matrix-org#16422))

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE8SRSDO7gYkSP4chELS76LzL74EcFAmUlINwACgkQLS76LzL7
# 4EdvExAAgjk6+/Fu45MRG7u5kFmFzoZWLOPD10XROANaSeqW1l/pBhFh+XvwR4TZ
# l/FdkSfS9YpHnw3aof13TclLu6IVWDM+vqYFuY2HSY/yzbcGvJFHqr26kOccpTTd
# 2r9m/AkguyHEBECDW8qJLXb8M7dqNa2SydTBu1+IrKfj6nq+fRxVyQhyAJXrI1Ta
# Dnz0XJ4TcwTrMPVk4MYrAcYjID6IV89dtp7ttH4DwXKDeSjMtxM/46EIg4u+VXDz
# fzK25JHVFYJA5+/rOn/RslmxjJHQfEIEB6NYxQwLeMeZuGSZooTebKn1odwogvhI
# Srtfsytum+twgSHD1s+7KldM+EjTiu7ouKi8VcfOlFuLnuBiROEc5WUljcL5K63F
# kVx2bXGU/eNkPp6ntNhYfgswx+yk2rXFqkTjz+xZQIZcOBqehHBDy8VhtwlRkTUw
# bzocdKkLMA4nfSlq5fFOAErMqJKsPS8aN9yYPShqEUiSUOKle8eHfA1cTXJuK0MS
# K2/YcDDZmJBrwVADyNDk5GKaDx39rR752OSuJb57Sp/edwUg6+H1I6lIN6YTeoJw
# FzJwGMzuMCktOQRW2enxQiA6RZjXFCwvD1LoWMjyO4YTXQwXxNCXsb0kLKUqfwsy
# qMGphWEl3rdzVSuFapNAgOLF0RfFNYZdhQnk+3fNEwxumxoqgho=
# =hx8G
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue Oct 10 11:01:00 2023 BST
# gpg:                using RSA key F124520CEEE062448FE1C8442D2EFA2F32FBE047
# gpg: Can't check signature: No public key

# Conflicts:
#	.github/workflows/docker.yml
#	.github/workflows/docs-pr-netlify.yaml
#	.github/workflows/docs-pr.yaml
#	.github/workflows/docs.yaml
#	.github/workflows/latest_deps.yml
#	.github/workflows/poetry_lockfile.yaml
#	.github/workflows/push_complement_image.yml
#	.github/workflows/release-artifacts.yml
#	.github/workflows/tests.yml
#	.github/workflows/twisted_trunk.yml
#	poetry.lock
#	rust/src/push/base_rules.rs
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants