
cron-scaler scales higher than expected #5820

Closed
dagvl opened this issue May 22, 2024 · 6 comments
Labels
bug (Something isn't working), stale (All issues that are marked as stale due to inactivity)

Comments


dagvl commented May 22, 2024

Report

When using cron triggers with a scaleDown policy of 5% of pods every 2 seconds, the deployment never scales down to the expected number of pods.

E.g. if I have a cron trigger requesting 20 pods and then edit that cron trigger to 10 pods, the deployment scales down to 11 pods instead of 10.

This is related to the scaleDown policy, because if I set a policy of 100% Pods every 2 seconds, it correctly scales down to 10 pods.

Expected Behavior

I expect the number of replicas to match the desiredReplicas in the cron trigger

Actual Behavior

I get more than 10 replicas.

Steps to Reproduce the Problem

First create a ScaledObject referencing a deployment with a cron trigger requesting 20 pods:

spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          policies:
          - periodSeconds: 2
            type: Percent
            value: 5
          - periodSeconds: 2
            type: Pods
            value: 1
          stabilizationWindowSeconds: 2
      name: scaled-object-test-hpa
    scalingModifiers: {}
  cooldownPeriod: 2
  fallback:
    failureThreshold: 3
    replicas: 1
  maxReplicaCount: 200
  minReplicaCount: 10
  pollingInterval: 30
  scaleTargetRef:
    name: scaled-object-test
  triggers:
  - metadata:
      value: "80"
    metricType: Utilization
    type: cpu
  - metadata:
      desiredReplicas: "20"
      end: 59 23 * * 6
      start: 0  0  * * 0
      timezone: Europe/Oslo
    type: cron

(Note that this also has a CPU utilization trigger, just because of the internal tooling we use to generate the ScaledObject, but that trigger is not a factor since average CPU usage is 0% in my pods [it's an idle nginx container].)

Note that the scaleDown setting allows at most max(1, pods*0.05) pods to be removed per period.
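As a rough sketch of how these two policies interact (illustrative Go, not the actual HPA controller code; selectPolicy is not set, so it defaults to Max, meaning the policy allowing the larger change wins each period):

package main

import "fmt"

// allowedScaleDown approximates how the two scaleDown policies above combine:
// with the default selectPolicy Max, the policy permitting the larger
// reduction wins each 2-second period. Integer rounding here is approximate.
func allowedScaleDown(currentReplicas int) int {
	byPercent := currentReplicas * 5 / 100 // "Percent: 5" policy
	byPods := 1                            // "Pods: 1" policy
	if byPercent > byPods {
		return byPercent
	}
	return byPods
}

func main() {
	fmt.Println(allowedScaleDown(20)) // 1: at 20 replicas, both policies allow removing only one pod per period
}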

Apply this ScaledObject and see that the deployment scales up to 20.

Then change desiredReplicas to 10 and reapply.

The deployment starts to slowly scale down, but the scale-down stops at 11 replicas instead of 10.

If you set the policy to 100% Percent and do the same thing, the scale-down ends at 10 pods as expected.

Logs from KEDA operator

No response

KEDA Version

2.14.0

Kubernetes Version

1.29

Platform

Amazon Web Services

Scaler Details

cron

Anything else?

One thought that crossed my mind, but which I can't verify, is that the HPA is scaling to within a tolerance level instead of to an exact value.

E.g. right now, I have the cron desiredReplicas set to 10, but the deployment is stuck at 11.

If I look at the HPA, I see this:

  "s1-cron-Europe-Oslo-00xx0-5923xx6" (target average value):  910m / 1

10/11 ≈ 0.91, which suggests the cron scaler is emitting the correct metric but the HPA is not reacting to it. The production case is similar:
the ScaledObject cron trigger is emitting 220 desiredReplicas, but we currently have 244. Looking at the HPA, we have:

  "s1-cron-Australia-Sydney-01xx1-010xx4" (target average value):  902m / 1

220/244 ≈ 0.902, so again we are within 10% of the target value.
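As a rough simulation of this hypothesis (illustrative Go, not the real controller code), assuming the HPA's default 10% tolerance and a usage ratio of desiredReplicas/currentReplicas, which is what the target average value of 1 above implies:

package main

import (
	"fmt"
	"math"
)

// Default HPA tolerance (--horizontal-pod-autoscaler-tolerance).
const tolerance = 0.10

func main() {
	desired := 10.0 // cron trigger desiredReplicas
	current := 20.0 // replicas before the scale-down starts
	for {
		ratio := desired / current // corresponds to the "target average value" shown by the HPA
		if math.Abs(1.0-ratio) <= tolerance {
			break // within tolerance: the HPA stops adjusting
		}
		current-- // the scaleDown policy above removes roughly one pod per period
	}
	fmt.Println(current) // 11: 10/11 ≈ 0.91 is inside the 10% band, matching this report
}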

dagvl added the bug label on May 22, 2024

JorTurFer (Member) commented May 26, 2024

Hello,
You're right, the problem here is the 10% tolerance, and currently there isn't any solution :(
I don't know if @SpiritZhou will eventually contribute this feature upstream; do you have any extra info, @SpiritZhou?

SpiritZhou (Contributor) commented

I am still working on it.


stale bot commented Jul 26, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Jul 26, 2024

stale bot commented Aug 3, 2024

This issue has been automatically closed due to inactivity.

stale bot closed this as completed on Aug 3, 2024

jpriebe commented Nov 13, 2024

We are also experiencing an issue like this. We use a cron trigger along with an SQS trigger.

The cron trigger brings up the pod count to 500 in anticipation of a daily workload and holds it there throughout the expected workload period.

The workload starts, the SQS trigger scales the pods to some higher number (say 900 pods). As the SQS queue comes under control, the scaledown starts. It never quite gets back to 500. It will land somewhere like 540 pods, which is within 10% of the cron desired value.

But that's 40 too many pods, and this has real cost implications.
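The same default 10% tolerance would explain this plateau; a quick illustrative check (hypothetical numbers from the scenario above, not actual controller code):

package main

import "fmt"

func main() {
	cronTarget, current := 500.0, 540.0
	fmt.Printf("%.3f\n", cronTarget/current) // ~0.926: inside the default 10% tolerance band
	fmt.Println(int(cronTarget / (1 - 0.10))) // 555: the highest replica count the HPA would still leave alone
}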

JorTurFer (Member) commented

I understand the issue, but KEDA relies on the HPA controller, and we can't fix this if upstream doesn't support it. I'd suggest asking about it in the upstream issue -> kubernetes/kubernetes#116984
