feat: Support aggregation options #961

Merged: 10 commits, Apr 10, 2024

@JorTurFer (Member) commented Apr 1, 2024

Until this PR, we only supported scaling based on "pending" requests. This works in high-load scenarios because it directly measures the number of requests at a given moment, but it doesn't work for other scenarios because the scaler doesn't aggregate the requests in any way.

With this PR, the add-on supports 2 scaling options: one based on concurrency (the current approach, just renamed) and one based on the request rate over a given time window (the reasons for the renaming are explained below).

These options are configured through a new section:

spec:
  scalingMetric: # requestRate and concurrency are mutually exclusive
    requestRate:
      granularity: 1s
      targetValue: 100
      window: 1m
    concurrency:
      targetValue: 100
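
To make the requestRate fields concrete, here is a minimal Python sketch of a bucketed sliding window. It assumes that `window` is the period over which requests are aggregated and that `granularity` is the size of each bucket inside that period. The class name `RequestRateWindow` and its methods are hypothetical and are not taken from the add-on's code; the way the add-on actually computes the rate may differ.

```python
from collections import deque


class RequestRateWindow:
    """Hypothetical bucketed sliding window (not the add-on's implementation)."""

    def __init__(self, window_s=60.0, granularity_s=1.0):
        if granularity_s <= 0 or window_s < granularity_s:
            raise ValueError("need 0 < granularity <= window")
        self.window_s = window_s
        self.granularity_s = granularity_s
        # window: 1m with granularity: 1s gives 60 buckets
        self.n_buckets = max(1, round(window_s / granularity_s))
        self._buckets = deque()  # (bucket_index, count) pairs, oldest first

    def _evict(self, now_s):
        # Drop buckets that have fallen out of the window.
        oldest_allowed = int(now_s // self.granularity_s) - self.n_buckets + 1
        while self._buckets and self._buckets[0][0] < oldest_allowed:
            self._buckets.popleft()

    def record(self, now_s, count=1):
        # Assumes timestamps arrive in non-decreasing order.
        idx = int(now_s // self.granularity_s)
        if self._buckets and self._buckets[-1][0] == idx:
            self._buckets[-1] = (idx, self._buckets[-1][1] + count)
        else:
            self._buckets.append((idx, count))
        self._evict(now_s)

    def rate(self, now_s):
        # Average requests per second over the whole window.
        self._evict(now_s)
        return sum(count for _, count in self._buckets) / self.window_s


window = RequestRateWindow(window_s=60.0, granularity_s=1.0)
for second in range(60):
    window.record(now_s=float(second), count=100)

steady_rate = window.rate(now_s=59.5)     # full window of traffic
half_idle_rate = window.rate(now_s=89.5)  # 30 s later with no new requests
```

In this sketch a smaller `granularity` makes old traffic age out in finer steps, at the cost of keeping more buckets in memory.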

The idea behind this change is to allow (in the future) scaling on both metrics together: RPS is a good metric for regular scaling because the time window smooths it out, whereas peaks are better handled by concurrency. With that future change in mind, it makes sense to move the metric configuration into a nested section instead of sharing the targetPendingRequests key (which also doesn't match the new naming).

While working on this PR I found some small changes worth making (such as updating the release process to include the docs update, and unifying the loggers for better observability). I've included them in this PR so they aren't forgotten (or because they help me directly, like the logger change).
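
The difference between the two metrics can be illustrated with a rough Python sketch. It assumes a 60-second window, a steady 50 requests per second, a one-second burst of 1000 requests, and a 100 ms backend latency; all of these numbers and both function names are made up for illustration. Concurrency is estimated with Little's law (in-flight requests = arrival rate × latency), which is a simplification of what a real proxy would observe.

```python
def windowed_rate(per_second_counts, window_s):
    """Average requests per second over the last `window_s` one-second samples."""
    recent = per_second_counts[-window_s:]
    return sum(recent) / window_s


def estimated_concurrency(rate_rps, latency_s):
    """Little's law estimate of in-flight requests at a given arrival rate."""
    return rate_rps * latency_s


# Hypothetical traffic: 59 s at 50 rps, then a 1 s burst of 1000 requests.
steady = [50] * 60
with_burst = [50] * 59 + [1000]

rate_before = windowed_rate(steady, 60)
rate_during = windowed_rate(with_burst, 60)

concurrency_before = estimated_concurrency(50, 0.1)
concurrency_during = estimated_concurrency(1000, 0.1)

# How strongly each metric reacts to the burst.
rate_jump = rate_during / rate_before
concurrency_jump = concurrency_during / concurrency_before
```

Under these assumptions the windowed rate rises only modestly during the burst, while the concurrency estimate jumps by the full factor of the burst, which matches the point above about peaks.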

Why have I renamed the scaling?

Although "pending" may seem adequate, strictly speaking we are scaling based on concurrent (in-flight) requests. "Pending" can sound like requests that have not been proxied yet, but already-proxied requests are also counted for as long as the backend hasn't answered. "Concurrent" describes the real scaling behavior more accurately.
There is another reason behind this change: aligning with the current Knative scaling naming. At the moment, Knative is "the king" from an HTTP scaling point of view, so aligning our naming with theirs makes things easier for end users.
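
The naming distinction can be sketched in a few lines of Python. The `RequestTracker` class and its states are hypothetical and only model the description above: a request counts toward concurrency from the moment it is received until the backend has answered, whether or not it has already been proxied.

```python
class RequestTracker:
    """Hypothetical model of in-flight requests (not the add-on's code)."""

    def __init__(self):
        self._state = {}  # request id -> "received" or "proxied"

    def received(self, request_id):
        self._state[request_id] = "received"

    def proxied(self, request_id):
        # Forwarded to the backend, but no response yet: still in flight.
        self._state[request_id] = "proxied"

    def completed(self, request_id):
        # The backend answered, so the request is no longer in flight.
        self._state.pop(request_id, None)

    def concurrency(self):
        # What the scaler measures: every request without a response yet.
        return len(self._state)

    def not_yet_proxied(self):
        # What "pending" might wrongly suggest is being measured.
        return sum(1 for state in self._state.values() if state == "received")


tracker = RequestTracker()
for request_id in ("a", "b", "c"):
    tracker.received(request_id)
tracker.proxied("a")
tracker.proxied("b")
after_proxying = (tracker.concurrency(), tracker.not_yet_proxied())

tracker.completed("a")
after_one_response = (tracker.concurrency(), tracker.not_yet_proxied())
```

Here `concurrency()` stays at 3 after two requests are proxied and only drops once a response completes, whereas a literal reading of "pending" would report 1.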

Checklist

Fixes #882
Fixes #958

@JorTurFer JorTurFer requested a review from a team as a code owner April 1, 2024 20:37
This was referenced Apr 4, 2024
Signed-off-by: Jorge Turrado <[email protected]>
@JorTurFer JorTurFer enabled auto-merge (squash) April 10, 2024 21:14
@JorTurFer JorTurFer merged commit a4f9f39 into kedacore:main Apr 10, 2024
19 checks passed
@JorTurFer JorTurFer deleted the add-aggregations branch April 10, 2024 21:42
Linked issues that merging this pull request may close: Unify loggers; Understanding scaledownPeriod