
StatefulSet not scaled #1940

Closed

avdhoot opened this issue Jul 7, 2021 · 11 comments

Labels: bug, stale

Comments


avdhoot commented Jul 7, 2021

Report

KEDA is not able to scale the StatefulSet even though the trigger condition is met. See the attached screenshot: the metric value reported by KEDA is beyond the trigger threshold (75), but the StatefulSet has not scaled.

[screenshot: KEDA metric value above the trigger threshold of 75]

Expected Behavior

After the metric crosses the threshold, the StatefulSet should scale.

Actual Behavior

After the metric crosses the threshold, the StatefulSet does not scale.

Steps to Reproduce the Problem

ScaledObject definition, followed by the kubectl describe output of the HPA that KEDA generated:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: fluentd-logs
  namespace: fluent
spec:
  scaleTargetRef:
    name: fluentd-logs
    kind: StatefulSet
  pollingInterval: 30
  minReplicaCount: 2
  maxReplicaCount: 10
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          policies:
          - periodSeconds: 60
            type: Pods
            value: 1
        scaleDown:
          stabilizationWindowSeconds: 900
          policies:
          - periodSeconds: 300
            type: Pods
            value: 1
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-infra.prometheus.svc.cluster.local:9090
      metricName: namespace_pod_name_container_name:container_cpu_usage_seconds_total:sum_rate
      query:  avg(namespace_pod_name_container_name:container_cpu_usage_seconds_total:sum_rate{namespace="fluent", container_name="fluentd"}) * 100
      threshold: '75'

Name:               keda-hpa-fluentd-logs
Namespace:          fluent
Labels:             app.kubernetes.io/managed-by=keda-operator
                    app.kubernetes.io/name=keda-hpa-fluentd-logs
                    app.kubernetes.io/part-of=fluentd-logs
                    app.kubernetes.io/version=2.2.0
                    scaledObjectName=fluentd-logs
Annotations:        <none>
CreationTimestamp:  Wed, 07 Jul 2021 14:44:08 +0530
Reference:          StatefulSet/fluentd-logs
Metrics:            ( current / target )
  "prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}" (target average value):  46 / 75
Min replicas:       2
Max replicas:       10
Behavior:
  Scale Up:
    Stabilization Window: 0 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 60 seconds
  Scale Down:
    Stabilization Window: 900 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 300 seconds
StatefulSet pods:   2 current / 2 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from external metric prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace="fluent", container_name="fluentd"}(&LabelSelector{MatchLabels:map[string]string{scaledObjectName: fluentd-logs,},MatchExpressions:[]LabelSelectorRequirement{},})
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:           <none>

KEDA Version

2.2.0

Kubernetes Version

1.19

Platform

Amazon Web Services

Scaler Details

prometheus

Anything else?

Not sure why the current value in the HPA is 46?

❯ kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/fluent/prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}" | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}",
      "metricLabels": null,
      "timestamp": "2021-07-07T10:52:25Z",
      "value": "93"
    }
  ]
}

avdhoot added the bug label on Jul 7, 2021
avdhoot changed the title from "staefulset not scaled" to "StatefulSet not scaled" on Jul 7, 2021
zroubalik (Member) commented:

Is this really related to StatefulSet? What happens if you use Deployment?

From the HPA output it seems that it doesn't need to scale. Are you sure that your query is correct?

coderanger (Contributor) commented:

Also, if all you want to scale on is CPU usage, normal HPA objects (or the passthrough for the HPA's cpu/memory metric scaling) might be a better fit than round-tripping through KEDA.
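
For reference, a minimal sketch of that passthrough using the built-in cpu scaler in KEDA v2 (the 75 mirrors the threshold above and is illustrative; this scaler also requires the pods to have CPU resource requests set):

  triggers:
  - type: cpu
    metadata:
      type: Utilization
      value: "75"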


avdhoot commented Jul 9, 2021

@zroubalik Sorry for the confusion. I do not think it is related to StatefulSet. I can change the issue title if you want.

Looks like the HPA controller divides the current metric by the number of current replicas (ref). In this situation we currently have 2 pods, so the current metric (93) gets divided by 2, i.e. ~46. The trigger value is 75, so the HPA never thinks it should scale.
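
Working that through with the formula from the Kubernetes HPA docs and the numbers in the describe output above:

currentValuePerPod = metricValue / currentReplicas = 93 / 2 = 46.5   (displayed as 46)
desiredReplicas    = ceil[currentReplicas * (currentValuePerPod / target)]
                   = ceil[2 * (46.5 / 75)] = ceil[1.24] = 2   -> no scale-up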

If the above theory is right, I am not sure how people scale on aggregated metrics that are not related to the number of pods.

Please let me know if I am wrong; this is my attempt to understand why.

@coderanger That is plan B, but I wanted to understand why this is not working. In theory it should.

coderanger (Contributor) commented:

All metric computations can be in either Value or AverageValue mode. In general this is currently hard-coded per scaler based on the first use case it dealt with. There is a vague plan to make it configurable and more consistent overall, but for now just check each scaler's code to see which mode it uses (one way to do that is sketched below).
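
One quick way to check, assuming a checkout of the kedacore/keda repo with the scalers under pkg/scalers (the path and the autoscaling/v2beta2 constant names are assumptions on my part, not taken from this thread):

# find which metric target type each scaler sets on its HPA metric spec
grep -rn "AverageValueMetricType\|ValueMetricType" pkg/scalers/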

zroubalik (Member) commented:

I think that it is pretty safe to say that all scalers are using AverageValue mode. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details
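
For reference, the core formula from that page:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

With an AverageValue target on an external metric, currentMetricValue is the raw metric divided by the current replica count, which is exactly the division observed above.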

zroubalik (Member) commented:

Just for reference: #1314 enables Value mode for RabbitMQ, but I'd rather see it done on a global level, as @coderanger mentioned.


avdhoot commented Jul 9, 2021

Thanks for confirming the behavior and providing the reference. Any idea how you think it should be implemented?


avdhoot commented Jul 10, 2021

Suggestion: any thoughts on exposing MetricTargetType in the scaler metadata, like this?

  triggers:
  - type: prometheus
    metadata:
      serverAddress: 
      metricTargetType: AverageValue # default; the other option would be Value
      metricName: 
      query: 
      threshold: 


zroubalik commented Jul 12, 2021

I would love to see a generic approach for all scalers. It might even be a new field next to the metadata section, similar to the way this PR adds Fallback: https://github.com/kedacore/keda/pull/1910/files#diff-33506d72fc24194f1ac7ad0a8963c2f19a49c4a46b6d559c70f8f2b5c27d0837R110

The only thing I am concerned about is what the actual behavior would be when there are multiple triggers in one ScaledObject with mixed metric target types, for example two triggers in a ScaledObject, the first using AverageValue and the second Value (sketched below).
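
A hypothetical shape for that, with the target type as a sibling of the metadata section (field name and placement are illustrative only, not an existing KEDA 2.2 API):

  triggers:
  - type: prometheus
    metricType: AverageValue   # hypothetical trigger-level field
    metadata:
      serverAddress: http://prometheus-infra.prometheus.svc.cluster.local:9090
      query: ...
  - type: rabbitmq
    metricType: Value          # hypothetical
    metadata:
      host: ...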

So eventually we might need to set this at the ScaledObject level and apply it to all triggers?

But I am not sure about this and we need some investigation on this topic.


stale bot commented Oct 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Oct 13, 2021

stale bot commented Oct 20, 2021

This issue has been automatically closed due to inactivity.

stale bot closed this as completed on Oct 20, 2021