-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Describe which rate limiter was hit in logs #16135
Conversation
Make it a dataclass, with a free function for construction
so that the latter has a key to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable overall.
synapse/config/ratelimiting.py
Outdated
self.per_second = config.get("per_second", defaults["per_second"]) | ||
self.burst_count = int(config.get("burst_count", defaults["burst_count"])) | ||
|
||
def parse_rate_limit( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also use a class-method, which is what we usually do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was my first instinct---though I shied away from it because it's mildly irritating to type classmethods correctly in an inheritance-aware way. (typing.Self
is the one true way, but that's in Python 3.11 and I didn't want to fiddle with typing_extensions etc).
I'll quickly make this a class method and just -> "RatelimitSettings"
, which is good enough.
@property | ||
def debug_context(self) -> Optional[str]: | ||
return self.limiter_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug_context
could probably just be a class-property too and __init__
could set self.debug_context = limiter_name
. I don't feel strongly either way though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong feelings here either. Suggest we leave it as-is in the interests of landing this sooner rather than later.
…-in-err In particular this conflicted with #15840
Would you mind taking another look? I had to merge in #16136, but I think I did it properly. (Let's see if the tests pass, I guess!) |
Test failure was a known flake. |
self, | ||
store: DataStore, | ||
clock: Clock, | ||
cfg: RatelimitSettings, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumnerevans pointed out to me that the docstring needs updating for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #16255.
- Add configuration setting for CAS protocol version. Contributed by Aurélien Grimpard. ([\#15816](#15816)) - Suppress notifications from message edits per [MSC3958](matrix-org/matrix-spec-proposals#3958). ([\#16113](#16113)) - Return a `Retry-After` with `M_LIMIT_EXCEEDED` error responses. ([\#16136](#16136)) - Add `last_seen_ts` to the [admin users API](https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html). ([\#16218](#16218)) - Improve resource usage when sending data to a large number of remote hosts that are marked as "down". ([\#16223](#16223)) - Fix IPv6-related bugs on SMTP settings, adding groundwork to fix similar issues. Contributed by @evilham and @telmich (ungleich.ch). ([\#16155](#16155)) - Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `include_all_networks` as a string. ([\#16185](#16185)) - Fix inaccurate error message while attempting to ban or unban a user with the same or higher PL by spliting the conditional statements. Contributed by @leviosacz. ([\#16205](#16205)) - Fix a rare bug that broke looping calls, which could lead to e.g. linearly increasing memory usage. Introduced in v1.90.0. ([\#16210](#16210)) - Fix a long-standing bug where uploading images would fail if we could not generate thumbnails for them. ([\#16211](#16211)) - Fix a long-standing bug where we did not correctly back off from servers that had "gone" if they returned 4xx series error codes. ([\#16221](#16221)) - Update links to the [matrix.org blog](https://matrix.org/blog/). ([\#16008](#16008)) - Document which [admin APIs](https://matrix-org.github.io/synapse/latest/usage/administration/admin_api/index.html) are disabled when experimental [MSC3861](matrix-org/matrix-spec-proposals#3861) support is enabled. ([\#16168](#16168)) - Document [`exclude_rooms_from_sync`](https://matrix-org.github.io/synapse/v1.92/usage/configuration/config_documentation.html#exclude_rooms_from_sync) configuration option. ([\#16178](#16178)) - Prepare unit tests for Python 3.12. ([\#16099](#16099)) - Fix nightly CI jobs. ([\#16121](#16121), [\#16213](#16213)) - Describe which rate limiter was hit in logs. ([\#16135](#16135)) - Simplify presence code when using workers. ([\#16170](#16170)) - Track per-device information in the presence code. ([\#16171](#16171), [\#16172](#16172)) - Stop using the `event_txn_id` table. ([\#16175](#16175)) - Use `AsyncMock` instead of custom code. ([\#16179](#16179), [\#16180](#16180)) - Improve error reporting of invalid data passed to `/_matrix/key/v2/query`. ([\#16183](#16183)) - Task scheduler: add replication notify for new task to launch ASAP. ([\#16184](#16184)) - Improve type hints. ([\#16186](#16186), [\#16188](#16188), [\#16201](#16201)) - Bump black version to 23.7.0. ([\#16187](#16187)) - Log the details of background update failures. ([\#16212](#16212)) - Cache device resync requests over replication. ([\#16241](#16241)) * Bump anyhow from 1.0.72 to 1.0.75. ([\#16141](#16141)) * Bump furo from 2023.7.26 to 2023.8.19. ([\#16238](#16238)) * Bump phonenumbers from 8.13.18 to 8.13.19. ([\#16237](#16237)) * Bump psycopg2 from 2.9.6 to 2.9.7. ([\#16196](#16196)) * Bump regex from 1.9.3 to 1.9.4. ([\#16195](#16195)) * Bump ruff from 0.0.277 to 0.0.286. ([\#16198](#16198)) * Bump sentry-sdk from 1.29.2 to 1.30.0. ([\#16236](#16236)) * Bump serde from 1.0.184 to 1.0.188. ([\#16194](#16194)) * Bump serde_json from 1.0.104 to 1.0.105. ([\#16140](#16140)) * Bump types-psycopg2 from 2.9.21.10 to 2.9.21.11. ([\#16200](#16200)) * Bump types-pyyaml from 6.0.12.10 to 6.0.12.11. ([\#16199](#16199))
We're already telling the client exactly how long they have to wait. How is telling them which of the limits they ran into worse? I'd like to return this info to the clients, as it's more accessible than the logs. Would you be open to a PR for changing that behaviour? If necessary, behind a config flag? |
I do not believe we'd accept a PR to change this behavior. There's two reasons:
|
I've threatened to do this before, and found the need today while trying to run a bunch of tests locally.
I originally punted telling this to the clients. But it might be a bad idea to tell bad actors which rate limit they've tripped. Instead, err towards caution and only log this info.
Commitwise reviewable.