Skip to content
This repository has been archived by the owner on Sep 10, 2024. It is now read-only.

Rate-limit password-based logins and registrations #2541

Closed
sandhose opened this issue Mar 22, 2024 · 4 comments · Fixed by #3093
Closed

Rate-limit password-based logins and registrations #2541

sandhose opened this issue Mar 22, 2024 · 4 comments · Fixed by #3093
Labels
A-Local-Password Related to the local password database T-Enhancement New feature of request

Comments

@sandhose
Copy link
Member

Questions to answer:

  • what algorithm we should use?
  • should the rate limits be shared with all the MAS instances (e.g. Redis-based rate limits?) or can be instance-local?
  • should they be scoped to IPs, IP ranges, global?
  • what good defaults are?
@sandhose sandhose added T-Enhancement New feature of request A-Local-Password Related to the local password database labels Mar 22, 2024
@H-Shay
Copy link

H-Shay commented Jul 22, 2024

Rate limiting isn't my area of expertise but my first instinct is to look at what Synapse currently does and use that as a baseline:

It looks like the algorithm used is "leaky bucket as a meter" (https://github.com/element-hq/synapse/blob/d221512498f7e3267916a289dd2ef4f3e00728e8/synapse/api/ratelimiting.py#L40)

As for keys, synapse rate limits against several keys: ip address, account, and failed attempts. The rate limits are configurable and the defaults are:

address:
    per_second: 0.15
    burst_count: 5
account:
    per_second: 0.18
    burst_count: 4
failed_attempts:
    per_second: 0.19
    burst_count: 7

These limits are in-process/non-persistent, and from what I can glean from the code the rate limits appear to be applied per worker, but I would just double-check in the backend room that I am not mistaken about that.
As for config options, the ones listed above are the only ones I see related to rate limiting logins - I also don't see any options related to ignoring ip ranges for login (although there are ip block/allowlists for federation and url previews), I wonder if traditionally that has been handled by the load balancer or something similar. I'd also definitely ask in the security lobby if they have any insight on this issue - I am not sure how much the MAS usecase differs from Synapse, but I suspect they might have more insight/thoughs on this.

@davidegirardi
Copy link

I'd take some inspiration from https://owasp.org/www-community/controls/Blocking_Brute_Force_Attacks and then here are some suggestions:

  • When the application keeps state (number of logins attempts per IP address for example) make sure keeping that state does not cost more to you than the attacker. Otherwise it opens itself up for denial of service attacks.
  • Consider implementing an incremental timeout between attempts, for example by doubling a specific value every time. After the first failed login, you have to wait 1 second, then 2, then 4, 8, 16, etc. This will be a minor inconvenience for legitimate users but will hinder automated attacks.
  • The process above should not lock resources or consume synchronous CPU time.
  • A CAPTCHA could also be an alternative.
  • Make sure all the limiting is implemented server-side.

@sandhose
Copy link
Member Author

sandhose commented Jul 25, 2024

Thanks both for the input! I had a first shot at it in #3013

I'm using the governor crate, which uses the Generic Cell Rate Algorithm.
With that PR, the quotas are hard-coded (I'll introduce configuration in another PR), and I only did logins for now.

For logins, I'm keeping two rate limiting buckets:

  • one per IP address, which has 3 attempts per minute (equivalent to per_second=0.05 burst_count=3)
  • one per user account, which as 1800 attempts per hour (equivalent to per_second=0.5 burst_count=1800)

(…writing this, it feels obvious that I need to tweak those numbers 😅)

The state is in-process, checked server-side, and shouldn't cost much, as governor looks quite efficient

I don't distinguish failed attempts, because the library doesn't let me 'test' the rate limiter without consuming a cell

@davidegirardi
Copy link

Unless those limits just count the failed attempts, I'd relax the one per IP address quite a bit, for example if a company has just one public IP for lots of employees, or a library or a school. per_second=10 burst_count=50.

For the second one, I'd go much more aggressive instead: per_second=0.2, bust_count=3.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Local-Password Related to the local password database T-Enhancement New feature of request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants