
Slowness due to client-side throttling in v0.32.0 #2582

Closed
Ubiquitine opened this issue Mar 4, 2024 · 8 comments
Labels
bug Something isn't working question Further information is requested

Comments

@Ubiquitine

Ubiquitine commented Mar 4, 2024




Describe the bug
After upgrading to v0.32.0, k9s starts responding very slowly when switching resources after a few minutes of running.

To Reproduce
Steps to reproduce the behavior:

  1. Launch k9s in terminal
  2. Switch between resources/namespaces
  3. After several minutes, switching between resources becomes slow, taking up to 10-20 seconds, during which k9s is unresponsive

Historical Documents
The INFO log shows lines like this during the issue:
I0304 18:39:43.076819 114467 request.go:697] Waited for 18.03660912s due to client-side throttling, not priority and fairness, request: GET:https://API_URL/apis/RESOURCES_PATHS

Expected behavior
Switching namespaces and resources should not take more than a couple of seconds (at least it didn't in previous versions).

Versions (please complete the following information):

  • OS: Linux
  • K9s: 0.32.0
  • K8s: v1.24.17-eks-5e0fdde

Additional context
I also tried cleaning the ~/.kube/cache directory, but the issue keeps coming back.
To be fair, I have 120+ namespaces with deployments and jobs in each, but I suspect something changed in the client behavior in 0.32.0 that introduced this throttling and causes the huge delays.
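
For context on where these waits come from: the "Waited for ... due to client-side throttling" message is emitted by client-go's request.go when its client-side token-bucket rate limiter delays a request; when a rest.Config leaves QPS/Burst at zero, client-go falls back to QPS=5 and Burst=10. Below is a minimal sketch of how that limiter is configured, assuming a plain kubeconfig-based client; the loading path and the raised QPS/Burst numbers are illustrative, not k9s's actual code.

// Minimal sketch, not k9s's actual code: where client-go's client-side
// rate limiter lives. With QPS/Burst left at zero, client-go defaults to
// QPS=5 and Burst=10, and any request past the burst is delayed and logged
// by request.go as "Waited for ... due to client-side throttling".
package main

import (
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Illustrative kubeconfig loading (~/.kube/config).
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }

    // Raising these removes the client-side waits; the numbers are example
    // values, not a recommendation from the k9s maintainers.
    cfg.QPS = 50
    cfg.Burst = 100

    client, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    _ = client // build views, informers, etc. from here
}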

@derailed
Owner

derailed commented Mar 4, 2024

@Ubiquitine Thank you for this report! Yikes, not what I was expecting after a perf improvement pass ;(
Can you add more specifics here in terms of which cluster env (gke, azk, ...) and which resource/namespace suddenly caused the lags? What is your refresh rate set to? Also, how many resources are we expecting when the lags occurred, i.e. nsX/resY? Including logs would help us zero in on this as well. Tx!
NOTE: I had tested with 10k namespaces/pods and did not see any issues...

@derailed derailed added bug Something isn't working question Further information is requested labels Mar 4, 2024
@Ubiquitine
Author

Ubiquitine commented Mar 5, 2024

Hi, more info:
Cluster: EKS v1.24.17-eks-5e0fdde.
21 nodes, 123 namespaces. In most namespaces there are 2-4 pods.
The issue shows up when I try to switch to the pods of some namespace by typing in the console, e.g. pods <namespace>. But pretty much every resource I try to view takes several seconds to display.
This is what I get in the INFO log during the event (see the sketch after the excerpt):

I0305 10:39:51.001098  118588 request.go:697] Waited for 17.867920295s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/ecr.aws.crossplane.io/v1alpha1?timeout=32s
I0305 10:40:01.001092  118588 request.go:697] Waited for 7.86665376s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/monitoring.coreos.com/v1alpha1?timeout=32s
I0305 10:40:11.199175  118588 request.go:697] Waited for 18.064888002s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/keycloak.org/v1alpha1?timeout=32s
......
......
I0305 10:46:48.582541  118588 request.go:697] Waited for 15.250904406s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/acme.cert-manager.io/v1?timeout=32s
I0305 10:46:58.780758  118588 request.go:697] Waited for 5.463363034s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/discovery.k8s.io/v1beta1?timeout=32s
I0305 10:47:08.781924  118588 request.go:697] Waited for 15.464305047s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/route53.aws.crossplane.io/v1alpha1?timeout=32s
I0305 10:47:18.980168  118588 request.go:697] Waited for 5.666558067s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/autoscaling/v2beta1?timeout=32s
......
......
I0305 11:00:56.732437  118778 request.go:697] Waited for 19.045184644s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/vpcresources.k8s.aws/v1alpha1?timeout=32s
I0305 11:01:06.930518  118778 request.go:697] Waited for 9.261881093s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/redshift.aws.crossplane.io/v1alpha1?timeout=32s
I0305 11:01:18.730249  118778 request.go:697] Waited for 1.068109485s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/events.k8s.io/v1beta1?timeout=32s
I0305 11:01:28.730454  118778 request.go:697] Waited for 11.068214575s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/vpcresources.k8s.aws/v1beta1?timeout=32s
......
......
I0305 11:01:38.929292  118778 request.go:697] Waited for 1.263211789s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/apps/v1?timeout=32s
I0305 11:01:48.929449  118778 request.go:697] Waited for 11.263007094s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/notification.aws.crossplane.io/v1alpha1?timeout=32s
I0305 11:01:59.128108  118778 request.go:697] Waited for 1.221652111s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/authentication.k8s.io/v1?timeout=32s
I0305 11:02:09.328875  118778 request.go:697] Waited for 11.422229987s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/identity.aws.crossplane.io/v1beta1?timeout=32s
I0305 11:02:19.528098  118778 request.go:697] Waited for 1.576175443s due to client-side throttling, not priority and fairness, request: GET:https://MASKED.gr7.us-east-1.eks.amazonaws.com/apis/events.k8s.io/v1beta1?timeout=32s
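
The throttled URLs above are all API group/version discovery endpoints: legacy discovery issues one GET per group/version, so a cluster carrying many CRDs (Crossplane, cert-manager, Keycloak, ...) fans out into well over a hundred requests per pass and quickly drains the default client-side budget. Here is a minimal sketch, assuming a plain kubeconfig-based client (not k9s's actual code), that counts how many such endpoints a full discovery pass would have to fetch:

// Minimal sketch, not k9s's actual code: count the group/version discovery
// endpoints on the cluster. ServerGroups itself only fetches the group list;
// resolving the resources of every group/version afterwards issues one GET
// per entry, which is exactly the kind of request being throttled above.
package main

import (
    "fmt"

    "k8s.io/client-go/discovery"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Illustrative kubeconfig loading (~/.kube/config).
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    dc, err := discovery.NewDiscoveryClientForConfig(cfg)
    if err != nil {
        panic(err)
    }

    groups, err := dc.ServerGroups()
    if err != nil {
        panic(err)
    }
    total := 0
    for _, g := range groups.Groups {
        total += len(g.Versions)
    }
    fmt.Printf("a full discovery pass would fetch %d group/version endpoints\n", total)
}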

This is my config.yaml

k9s:
  screenDumpDir: /tmp/k9s-screens
  refreshRate: 2
  maxConnRetry: 5
  readOnly: false
  noExitOnCtrlC: false
  ui:
    enableMouse: false
    headless: false
    logoless: true
    crumbsless: true
    noIcons: true
    skin: transparent
    reactive: false
  skipLatestRevCheck: false
  disablePodCounting: false
  shellPod:
    image: busybox:1.35.0
    namespace: default
    limits:
      cpu: 100m
      memory: 100Mi
  imageScans:
    enable: false
    exclusions:
      namespaces: []
      labels: {}
  liveViewAutoRefresh: false
  logger:
    tail: 100
    buffer: 5000
    sinceSeconds: 300
    fullScreen: false
    textWrap: false
    showTime: false
  thresholds:
    cpu:
      critical: 90
      warn: 70
    memory:
      critical: 90
      warn: 70

UPD: more logs

@seanmuth

seanmuth commented Mar 5, 2024

Also experiencing pretty significant lag/slowness on 0.32.1.

My k9s.log doesn't show any client-side throttling errors, but I'm wondering if I'm looking at the right log?

Happy to attach logs/configs to help debug.

Versions (please complete the following information):

OS: macOS 13.4
K9s: 0.32.1
kubectl: 1.24.6
K8s: multiple, seen on v1.26.13-gke.1052000, v1.25.16-eks-77b1e4e, etc

I've seen this slowness across the board. I work at a SaaS company and we have clusters in all three major clouds, with namespaces in the range of 20-200+ and pod counts from 200-5000+.

config.yaml

k9s:
  liveViewAutoRefresh: false
  refreshRate: 2
  maxConnRetry: 5
  ui:
    skin: nightfox-astro
  # enableMouse: false
  # enableImageScan: false
  # headless: false
  # logoless: false
  # crumbsless: false
  readOnly: false
  noExitOnCtrlC: false
  # noIcons: false
  shellPod:
    image: busybox:1.35.0
    namespace: default
    limits:
      cpu: 100m
      memory: 100Mi
  skipLatestRevCheck: false
  logger:
    tail: 200
    buffer: 5000
    sinceSeconds: 60
    textWrap: false
    showTime: false
  thresholds:
    cpu:
      critical: 90
      warn: 70
    memory:
      critical: 90
      warn: 70
  screenDumpDir: /Users/seanmuth/code/k9s/screendumps
  disablePodCounting: false

views.yaml

# $XDG_CONFIG_HOME/k9s/views.yaml
views:
  # alters the pod view column layout
  v1/pods:
    sortColumn: NAME:desc
    columns:
      - NAMESPACE
      - NAME
      - STATUS
      - READY
      - RESTARTS
      - CPU
      - MEM
      - IP
      - NODE
      - AGE

E: add version info

@mythbai

mythbai commented Mar 5, 2024

Just upgraded from 0.2x to 0.32.1 today, and everything is extremely slow. I work for a telco company remotely through VPN. Installed k9s in Windows 10 Enterprise WSL. There are no relevant lines in the k9s.* log files.

derailed added a commit that referenced this issue Mar 5, 2024
derailed added a commit that referenced this issue Mar 6, 2024
@derailed
Owner

derailed commented Mar 6, 2024

Let's see if we're happier on v0.32.2?

@tasszz2k

tasszz2k commented Mar 6, 2024

I'm seeing the same issue.

@Ubiquitine
Author

Ubiquitine commented Mar 6, 2024

I can confirm that the issue seems to be fixed in v0.32.2, at least in my case.

@Mr-DeWitt

Same, using version 0.32.4
