
How do I gain access to the cluster autoscaler on GKE? #966

Closed
chrissound opened this issue Jun 14, 2018 · 20 comments
Labels
area/cluster-autoscaler, area/provider/gcp

Comments

@chrissound

I am looking to modify some of the auto scaling options, but this does not seem to be possible on GKE?

It's not clear where to set the 'flags' mentioned in the FAQ, or even which component these command-line flags should be passed to.

Similar issue is brought up here:
https://stackoverflow.com/questions/48963625/where-to-config-the-kubernetes-cluster-autoscaler-on-google-cloud

@aleksandra-malinowska added the area/cluster-autoscaler and area/provider/gcp labels on Jun 14, 2018
@aleksandra-malinowska
Contributor

You're correct. On GKE, Cluster Autoscaler is always configured automatically. If you run your own cluster on GCE and have access to the master machine, you can change them in the Cluster Autoscaler pod's manifest.
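
For anyone who does run their own CA on GCE, here is a minimal sketch of what such an edit might look like; the manifest path, image tag, and flag values are assumptions and will vary with how the cluster was brought up:

# Hypothetical fragment of the Cluster Autoscaler static pod manifest on a
# self-managed GCE master, e.g. /etc/kubernetes/manifests/cluster-autoscaler.manifest
# (path is an assumption). The kubelet restarts the pod when this file changes.
spec:
  containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.3  # example tag
    command:
    - ./cluster-autoscaler
    - --cloud-provider=gce
    - --scale-down-unneeded-time=5m         # example: shorten the 10m default
    - --skip-nodes-with-local-storage=false # example of another commonly tuned flag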

@chrissound
Author

Thanks!

@joshwand

joshwand commented Dec 7, 2018

It would be good to be able to do some configuration of CA on GKE. As in the referenced issue, I'd like to reduce --scale-down-unneeded-time so as not to waste money on 10 minutes of unneeded capacity.

@glapark

glapark commented Jan 7, 2019

I wonder if emptying a node completely helps the autoscaler to quickly remove it. For interactive analytic applications, the default value of 10 minutes for --scale-down-unneeded-time seems too large.

@aleksandra-malinowska
Contributor

> I wonder if emptying a node completely helps the autoscaler to quickly remove it. For interactive analytic applications, the default value of 10 minutes for --scale-down-unneeded-time seems too large.

It helps by eliminating drain time, and also increases throughput by allowing bulk deletes.

As for the default 10-minute wait, it's a compromise of sorts: we don't want users to end up waiting for nodes to be added back because we removed them too quickly between jobs. That being said, we haven't revised this value in a while, so if you have any feedback regarding this behavior, especially production experience with it, please let us know.

@glapark

glapark commented Jan 7, 2019

Thanks for the reply. At the moment, we are still implementing a new service and don't have any production-level experience with it yet (but will publish the result when it is ready).

@joshwand

joshwand commented Jan 7, 2019

We spin up expensive high-memory instances on demand as slaves for our integration tests. The load is intermittent, so that extra 10 minutes for 10-30 instances, multiple times a day, gets quite expensive.

@glapark

glapark commented Jun 28, 2019

I wonder if there is any update on the default value of --scale-down-unneeded-time. I think the default of 10 minutes is fine, but I hope GKE will let users change the value for their own clusters; anyone who sets --scale-down-unneeded-time to a new value should understand what that change actually means.

In our case, we would like to implement autoscaling logic for an analytics system based on Apache Hive, and to remove nodes as soon as possible once that logic decides to retire them.

@Luke-Vear

It would be nice to be able to configure things like skip-nodes-with-system-pods or skip-nodes-with-local-storage; there's tons of config that we can't touch.

@MaciekPytel
Contributor

You can now choose a predefined config for more aggressive scale-down: https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler#autoscaling_profiles (this doesn't help with the flags you listed, but it is what was requested in the comments above).
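
For reference, switching profiles is a single cluster update; the cluster name and zone below are placeholders, and on older gcloud releases the flag may only be available under gcloud beta:

# Switch an existing GKE cluster to the more aggressive scale-down profile.
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --autoscaling-profile optimize-utilization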

@seeruk

seeruk commented Jul 15, 2020

I've been using the optimize-utilization profile, and unfortunately, as you've said, it doesn't solve this issue. Linkerd, like Istio, creates an emptyDir volume on every pod that has a sidecar injected, which blocks the cluster autoscaler from scaling down nodes for pretty much every application we have in the cluster.

The current workaround I've had to resort to is this: #3322

The other solution I've been considering, so we don't have to maintain a fork of the autoscaler, is building some kind of admission controller that adds the safe-to-evict annotation to every pod unless an annotation (unsafe-to-evict?) is present, since in the cluster I'm working with, local storage should be an extremely exceptional scenario. Using PDBs for the kube-system pods is fine; I'd rather know that those pods are being migrated gracefully.

Being able to just configure the GKE autoscaler would completely solve this though. Perhaps configuration could be exposed in a ConfigMap instead, allowing the solution to be more platform agnostic.
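
For anyone else hitting this, a minimal sketch of the annotation in question on a pod template; the Deployment name and image are placeholders:

# Tell Cluster Autoscaler this pod may be evicted during scale-down even though
# it mounts an emptyDir (e.g. a Linkerd/Istio sidecar volume).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # placeholder
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
      - name: app
        image: nginx:1.25      # placeholder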

@adinhodovic

adinhodovic commented Aug 22, 2020

Also, I'd like to be able to monitor the cluster autoscaler using Prometheus.
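
This only works if you run cluster-autoscaler yourself (the GKE-managed one runs on Google's control plane, so there is nothing to scrape), but for a self-hosted CA a minimal Prometheus scrape sketch might look like this; the namespace, pod label, and metrics port (the default --address=:8085) are assumptions:

# prometheus.yml fragment: scrape a self-hosted cluster-autoscaler's /metrics endpoint.
scrape_configs:
  - job_name: cluster-autoscaler
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [kube-system]              # adjust to wherever your CA pod runs
    relabel_configs:
      # keep only pods labelled app=cluster-autoscaler (label name is an assumption)
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: cluster-autoscaler
        action: keep
      # point the scrape at the metrics port (default --address=:8085)
      - source_labels: [__meta_kubernetes_pod_ip]
        replacement: "$1:8085"
        target_label: __address__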

@danielyaa5

Not sure why this is closed; I guess Google doesn't prioritize features that save people money.

@zenyui

zenyui commented Jan 6, 2022

@seeruk Curious, did you solve this? Did you end up writing that admission controller? Funny, I was thinking of writing the same thing.

@seeruk

seeruk commented Jan 7, 2022

Yeah, it was a really simple one in the end and it's still working to this day! Unfortunately it's closed source currently.

@MaciekPytel
Contributor

In 1.22+, GKE no longer blocks scale-down on pods with local storage (https://cloud.google.com/kubernetes-engine/docs/release-notes#October_27_2021), so an admission controller may no longer be needed.

@zenyui

zenyui commented Jan 11, 2022

I just open-sourced our pod labeler in case it's useful for anyone. You can use it to add the safe-to-evict annotation. @seeruk let me know your thoughts!

https://github.com/troop-dev/k8s-pod-labeler

@vadasambar
Member

I use a custom cluster-autoscaler on GKE to test my PRs.

If anyone's interested, I wrote a blog post on how to deploy your own cluster-autoscaler on GKE: https://vadasambar.com/post/kubernetes/how-to-deploy-custom-ca-on-gcp/

If you don't want to jump to the blogpost, here's a summarized version:

  1. Enable Workload Identity for the GKE cluster
  2. Deploy your cluster-autoscaler helm chart in a non-kube-system namespace
helm install custom-ca autoscaler/cluster-autoscaler \
--set "autoscalingGroupsnamePrefix[0].name=gke-cluster-1,autoscalingGroupsnamePrefix[0].maxSize=10,autoscalingGroupsnamePrefix[0].minSize=1" \
--set autoDiscovery.clusterName=cluster-1 \
--set "rbac.serviceAccount.annotations.iam\.gke\.io\/gcp-service-account=cluster-autoscaler@my-project-123456.iam.gserviceaccount.com" \
--set cloudProvider=gce \
--version=9.25.0 \
--namespace=default
  3. Create a ResourceQuota for the system-cluster-critical PriorityClass in your namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: gcp-critical-pods
  namespace: default
spec:
  hard:
    pods: 2 # 2 because we need it only for cluster-autoscaler
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - system-cluster-critical # cluster-autoscaler priority class

  4. Create a GCP service account and grant it the role you want
  5. Bind the Kubernetes ServiceAccount to the GCP IAM service account
  6. Annotate your Kubernetes ServiceAccount so cluster-autoscaler can use Workload Identity federation (a rough sketch of steps 4-6 is below)
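
A rough sketch of the gcloud/kubectl side of steps 4-6; the project, role, and the ServiceAccount name created by the chart are assumptions and will differ per setup:

# 4. Create the GCP service account and grant it a role that can manage instance groups
#    (roles/compute.instanceAdmin is just an example; pick what fits your setup).
gcloud iam service-accounts create cluster-autoscaler --project my-project-123456
gcloud projects add-iam-policy-binding my-project-123456 \
  --member "serviceAccount:cluster-autoscaler@my-project-123456.iam.gserviceaccount.com" \
  --role roles/compute.instanceAdmin

# 5. Allow the Kubernetes ServiceAccount to impersonate the GCP service account
#    (replace default/custom-ca-gce-cluster-autoscaler with the ServiceAccount the chart actually creates).
gcloud iam service-accounts add-iam-policy-binding \
  cluster-autoscaler@my-project-123456.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project-123456.svc.id.goog[default/custom-ca-gce-cluster-autoscaler]"

# 6. Annotate the Kubernetes ServiceAccount (the helm --set above already adds this
#    annotation; shown here only for completeness).
kubectl annotate serviceaccount custom-ca-gce-cluster-autoscaler --namespace default \
  iam.gke.io/gcp-service-account=cluster-autoscaler@my-project-123456.iam.gserviceaccount.com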

@Nickmman

@vadasambar Do your cluster-autoscaler logs show that it manages only one instance group, and that the others should not be processed by cluster autoscaler (no node group config)?

@vadasambar
Member

> @vadasambar Do your cluster-autoscaler logs show that it manages only one instance group, and that the others should not be processed by cluster autoscaler (no node group config)?

@Nickmman, not sure if this answers your question, but I have used the custom cluster-autoscaler (multiple times) to manage only one instance group and it works fine for me. If you check this comment, you will see I have two instance-group-backed node pools called default-pool and pool-1, but I use cluster-autoscaler to manage only pool-1 (you can see the gke-cluster-1-pool-1 value in the flags in the screenshot).

yaroslava-serdiuk pushed a commit to yaroslava-serdiuk/autoscaler that referenced this issue Feb 22, 2024