-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rate-limiters to ServerPolicy #3305
base: main
Are you sure you want to change the base?
Conversation
0536b1f
to
8636f8e
Compare
This adds the local_rate_limit module to the server-policy crate, that `ServerPolicy` uses for its new `local_rate_limit` field, containing three optional rate-limiters: total, identity, overrides (this one is really a vector of limiters, one per configured override). I tried putting that under `Protocol` instead, but the `PartialEq` requirement made it very hard to follow. `Server` OTOH doesn't really require that trait, so I was able to remove it and accommodate the limiters. I made sure to avoid pulling the dashmap dependency in `governor`; I haven't checked yet the necessity of the "jitter" and "quanta" features. This temporarily overrides linkerd2-proxy-api dependency to pick changes from linkerd/linkerd2-proxy-api#388
8636f8e
to
b580e65
Compare
#[derive(Debug, Default)] | ||
pub struct HttpLocalRateLimit { | ||
pub total: Option<RateLimiter<NotKeyed, InMemoryState, DefaultClock>>, | ||
pub identity: Option<RateLimiter<String, DefaultKeyedStateStore<String>, DefaultClock>>, | ||
pub overrides: Vec<HttpLocalRateLimitOverride>, | ||
} | ||
|
||
#[derive(Debug)] | ||
pub struct HttpLocalRateLimitOverride { | ||
pub ids: Vec<String>, | ||
pub rate_limit: RateLimiter<Vec<String>, DefaultKeyedStateStore<Vec<String>>, DefaultClock>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think we can maintain a narrower API--no need to expose all of the governor types
- It would be preferable to use the Identity types directly here rather than require String round-tripping (allocations).
- I think the override rate limiter is misconfigured in your version--it should not be keyed on Vec.
- Furthermore, it seems preferable to store the overrides for efficient lookup (i.e. in a HashMap).
#[derive(Debug, Default)] | |
pub struct HttpLocalRateLimit { | |
pub total: Option<RateLimiter<NotKeyed, InMemoryState, DefaultClock>>, | |
pub identity: Option<RateLimiter<String, DefaultKeyedStateStore<String>, DefaultClock>>, | |
pub overrides: Vec<HttpLocalRateLimitOverride>, | |
} | |
#[derive(Debug)] | |
pub struct HttpLocalRateLimitOverride { | |
pub ids: Vec<String>, | |
pub rate_limit: RateLimiter<Vec<String>, DefaultKeyedStateStore<Vec<String>>, DefaultClock>, | |
} | |
#[derive(Debug, Default)] | |
pub struct LocalRateLimit { | |
total: Option<RateLimit<InMemoryState>>, | |
per_identity: Option<RateLimit<DefaultKeyedStateStore<Option<Id>>>>, | |
overrides: HashMap<Id, Arc<RateLimit<InMemoryState>>>, | |
} | |
#[derive(Debug)] | |
struct RateLimit<S: StateStore> { | |
rps: NonZeroU32, | |
limiter: RateLimiter<S::Key, S, DefaultClock>, | |
} | |
#[derive(Debug, thiserror::Error, PartialEq, Eq)] | |
pub enum RateLimitError { | |
#[error("total rate limit exceeded: {0}rps")] | |
Total(NonZeroU32), | |
#[error("per-identity rate limit exceeded: {0}rps")] | |
PerIdentity(NonZeroU32), | |
#[error("override rate limit exceeded: {0}rps")] | |
Override(NonZeroU32), | |
} |
impl LocalRateLimit {
pub fn check(&self, id: Option<&Id>) -> Result<(), RateLimitError> {
if let Some(lib) = &self.total {
if lim.limiter.check().is_err() {
return Err(RateLimitError::Total(lim.rps));
}
}
if let Some(id) = id {
if let Some(lim) = self.overrides.get(id) {
if lim.limiter.check().is_err() {
return Err(RateLimitError::Override(lim.rps));
}
return Ok(());
}
}
if let Some(lim) = &self.per_identity {
if limiter.check_key(&lim.id.cloned()).is_err() {
return Err(RateLimitError::PerIdentity(lim.rps));
}
}
Ok(())
}
}
The server-policy crate can hide all of the specifics about rate limiting and let callers have a very simple interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the override rate limiter is misconfigured in your version--it should not be keyed on Vec.
As per linkerd/linkerd2#13231, an HTTPLocalRateLimitPolicy
would look something like this:
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
namespace: emojivoto
name: web-rl
spec:
targetRef:
group: policy.linkerd.io
kind: Server
name: web-http
total:
requestsPerSecond: 100
identity:
requestsPerSecond: 20
overrides:
- requestsPerSecond: 10
clientRefs:
- kind: ServiceAccount
namespace: emojivoto
name: default
Under overrides, I implemented the clientRefs
as an array, which explains the types I was hinting at (I may have made that an array after our conversations, sorry about that). It's definitely simpler to have it be just one object though like in your suggestion, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The override representation in the YAML doesn't dictate how we store them at runtime.
Also note that I interpret the overrides to support "pooled" limits. For example:
- requestsPerSecond: 10
clientRefs:
- kind: ServiceAccount
namespace: emojivoto
name: web
clientRefs:
- kind: ServiceAccount
namespace: emojivoto
name: default
In this situation I would expect these two accounts to share a rate limit of 10RPS. This is distinct from the configuration in which they have distinct but identical limits:
- requestsPerSecond: 10
clientRefs:
- kind: ServiceAccount
namespace: emojivoto
name: web
- requestsPerSecond: 10
clientRefs:
- kind: ServiceAccount
namespace: emojivoto
name: default
I imagine that we could build a map of overrides from the existing spec:
let overrides = proto
.overrides
.into_iter()
.flat_map(|ovr| {
let Some(lim) = ovr.limit else {
return vec![];
};
let Some(rps) = NonZeroU32::new(lim.requests_per_second) else {
return vec![];
};
let limiter = RateLimiter::direct(Quota::per_second(rps));
let limit = Arc::new(RateLimit { rps, limiter });
ovr.clients
.into_iter()
.flat_map(|cl| {
cl.identities
.into_iter()
.filter_map(|id| id.name.parse::<Id>().ok())
})
.map(move |id| (id, limit.clone()))
.collect::<Vec<(Id, Arc<RateLimit<InMemoryState>>)>>()
})
.collect::<HashMap<Id, Arc<RateLimit<InMemoryState>>>>();
This supports shared limits and efficient map-based lookups.
let total = proto.total.map(|lim| { | ||
let quota = Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | ||
RateLimiter::direct(quota) | ||
}); | ||
|
||
let identity = proto.identity.map(|lim| { | ||
let quota = Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | ||
RateLimiter::keyed(quota) | ||
}); | ||
|
||
let overrides = proto | ||
.overrides | ||
.into_iter() | ||
.flat_map(|ovr| { | ||
ovr.limit.map(|lim| { | ||
let ids = ovr | ||
.clients | ||
.into_iter() | ||
.flat_map(|cl| cl.identities.into_iter().map(|id| id.name)) | ||
.collect(); | ||
let quota = | ||
Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | ||
let rate_limit = RateLimiter::keyed(quota); | ||
HttpLocalRateLimitOverride { ids, rate_limit } | ||
}) | ||
}) | ||
.collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we cannot call unwrap
(or expect) when handling input from the control plane. We need to make sure the None-ness propagates to cause the item to be ignored:
let total = proto.total.map(|lim| { | |
let quota = Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | |
RateLimiter::direct(quota) | |
}); | |
let identity = proto.identity.map(|lim| { | |
let quota = Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | |
RateLimiter::keyed(quota) | |
}); | |
let overrides = proto | |
.overrides | |
.into_iter() | |
.flat_map(|ovr| { | |
ovr.limit.map(|lim| { | |
let ids = ovr | |
.clients | |
.into_iter() | |
.flat_map(|cl| cl.identities.into_iter().map(|id| id.name)) | |
.collect(); | |
let quota = | |
Quota::per_second(NonZeroU32::new(lim.requests_per_second).unwrap()); | |
let rate_limit = RateLimiter::keyed(quota); | |
HttpLocalRateLimitOverride { ids, rate_limit } | |
}) | |
}) | |
.collect(); | |
let total = proto.total.and_then(|lim| { | |
let rps = NonZeroU32::new(lim.requests_per_second)?; | |
let limiter = RateLimiter::direct(Quota::per_second(rps)); | |
Some(RateLimit { rps, limiter }) | |
}); | |
let per_identity = proto.identity.and_then(|lim| { | |
let rps = NonZeroU32::new(lim.requests_per_second)?; | |
let limiter = RateLimiter::hashmap(Quota::per_second(rps)); | |
Some(RateLimit { rps, limiter }) | |
}); | |
let overrides = proto | |
.overrides | |
.into_iter() | |
.flat_map(|ovr| { | |
let Some(lim) = ovr.limit else { | |
return vec![]; | |
}; | |
let Some(rps) = NonZeroU32::new(lim.requests_per_second) else { | |
return vec![]; | |
}; | |
let limiter = RateLimiter::direct(Quota::per_second(rps)); | |
let limit = Arc::new(RateLimit { rps, limiter }); | |
ovr.clients | |
.into_iter() | |
.flat_map(|cl| { | |
cl.identities | |
.into_iter() | |
.filter_map(|id| id.name.parse::<Id>().ok()) | |
}) | |
.map(move |id| (id, limit.clone())) | |
.collect::<Vec<(Id, Arc<RateLimit<InMemoryState>>)>>() | |
}) | |
.collect::<HashMap<Id, Arc<RateLimit<InMemoryState>>>>(); | |
This adds the local_rate_limit module to the server-policy crate, that
ServerPolicy
uses for its newlocal_rate_limit
field, containingthree optional rate-limiters: total, identity, overrides (this one is
really a vector of limiters, one per configured override).
I tried putting that under
Protocol
instead, but thePartialEq
requirement made it very hard to follow.
Server
OTOH doesn't reallyrequire that trait, so I was able to remove it and accommodate the
limiters.
I made sure to avoid pulling the dashmap dependency in
governor
; Ihaven't checked yet the necessity of the "jitter" and "quanta" features.
This temporarily overrides linkerd2-proxy-api dependency to pick changes
from linkerd/linkerd2-proxy-api#388