Pass registry-burst and registry-qps argument to kubelet #1495

Closed
Jell opened this issue Apr 16, 2021 · 13 comments · Fixed by #1532, #1541 or #1527
Labels: area/kubernetes (K8s including EKS, EKS-A, and including VMW), type/enhancement (New feature or request)

Comments


Jell commented Apr 16, 2021

When bursts of traffic launch many pods on a new node, we hit a pull limit error caused by the default registry-qps of 5, which leaves us unable to handle our production load.

What I'd like:
Being able to bump the registry-qps limit to fit our needs

Any alternatives you've considered:
Really not much we can do: our registry can handle the load, but kubelet won't let us query it that fast.

This is a similar request to #1447
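
For readers following along, here is a rough sketch of how a QPS limit and a burst size interact. It assumes the limiter behaves like a standard token bucket that rejects requests when empty; the `TokenBucket` class and `accepted_pulls` helper are hypothetical names made up for illustration and are not kubelet code.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket: refills at `qps` tokens per second, holds at most `burst`."""

    qps: float
    burst: int
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self) -> None:
        # Assumption: the bucket starts full, so an initial burst is allowed.
        self.tokens = float(self.burst)

    def try_accept(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # no token available: the request is rejected


def accepted_pulls(qps: float, burst: int, requests: int, window_s: float) -> int:
    """Count how many of `requests` evenly spaced pulls within `window_s` seconds are admitted."""
    bucket = TokenBucket(qps=qps, burst=burst)
    step = window_s / requests
    return sum(bucket.try_accept(i * step) for i in range(requests))


if __name__ == "__main__":
    # 60 image pulls arriving at the same instant on a fresh node.
    print(accepted_pulls(qps=5, burst=10, requests=60, window_s=0.0))    # 10
    print(accepted_pulls(qps=50, burst=100, requests=60, window_s=0.0))  # 60
```

Under these assumptions, a node that suddenly needs 60 pulls gets only the first 10 through with a QPS of 5 and a burst of 10, and the rest fail until tokens refill at 5 per second.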

jhaynes added the area/kubernetes, status/needs-triage, and type/enhancement labels Apr 16, 2021
Contributor

jhaynes commented Apr 16, 2021

Hi @Jell, and thanks for filing this issue. Do you have a sense of how high you'd like to set this? I'm wondering if a higher default makes sense or if we should just expose this as a setting.

jhaynes added this to the next milestone Apr 16, 2021
jhaynes added the priority/p1 label and removed the status/needs-triage label Apr 16, 2021
Author

Jell commented Apr 16, 2021

I'm not exactly sure how high we would want it, we'd have to experiment.

But I would suspect probably at least a factor of 10, so something like 50 QPS to begin with. I think it's the kind of setting we'd probably appreciate being able to tweak though given it depends on the kind of registry and workload being deployed?

Ideally I would be able to set that setting myself, then trigger a heavy load and raise the QPS until it handles the load, but no higher.

Author

Jell commented Apr 16, 2021

thanks for the quick reply and for looking into it btw! :) @jhaynes

@gthao313
Member

Hi @Jell. Thanks for your reply. I'll be taking over this issue. If you have updates or thoughts, please leave a comment. Thanks :)

Author

Jell commented Apr 17, 2021

alright, will do, thanks @gthao313!

Author

Jell commented Apr 22, 2021

Alright, so we ended up testing with a custom build of Bottlerocket. These are the parameters we would want to be able to set ourselves; they worked for our use case (we reached our needed throughput with them):

registryPullQPS: 50
registryBurst: 100

eventRecordQPS: 50
eventBurst: 100

kubeAPIQPS: 50
kubeAPIBurst: 100

serializeImagePulls: false

gthao313 added the status/needs-proposal label Apr 23, 2021
Member

gthao313 commented Apr 23, 2021

Hi @Jell. Thanks for your update. After investigating the two arguments --registry-burst and --registry-qps, we plan to expose them as settings that you can specify according to your use case. We would not provide default values for them, since kubelet already defaults --registry-qps to 5 and --registry-burst to 10. Do you have any feedback on this plan? Also, just to confirm, are --registry-burst and --registry-qps the only two arguments you want us to provide? I noticed a couple of other arguments in the thread, such as --event-qps and --event-burst. Thanks! :)

Author

Jell commented Apr 23, 2021

@gthao313 thanks! We would actually like to be able to set all the arguments I listed above, so in the CLI that would correspond to the following arguments:

--event-burst
--event-qps
--kube-api-burst
--kube-api-qps
--registry-burst
--registry-qps
--serialize-image-pulls
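
As a small illustration of the correspondence described here, the sketch below pairs each configuration field from the earlier comment with the CLI flag listed above and renders the tested values as arguments. The `FIELD_TO_FLAG` table and `kubelet_args` helper are hypothetical names for illustration only, and the `--flag=value` rendering is an assumption about how one might pass them.

```python
# Config field names come from the values posted earlier in this thread;
# flag names come from the list in this comment.
FIELD_TO_FLAG = {
    "registryPullQPS": "--registry-qps",
    "registryBurst": "--registry-burst",
    "eventRecordQPS": "--event-qps",
    "eventBurst": "--event-burst",
    "kubeAPIQPS": "--kube-api-qps",
    "kubeAPIBurst": "--kube-api-burst",
    "serializeImagePulls": "--serialize-image-pulls",
}


def kubelet_args(config: dict) -> list[str]:
    """Render config fields as `--flag=value` strings (hypothetical helper)."""
    args = []
    for field, value in config.items():
        flag = FIELD_TO_FLAG[field]  # raises KeyError for fields not in the table
        if isinstance(value, bool):
            value = str(value).lower()  # booleans are written as true/false
        args.append(f"{flag}={value}")
    return args


if __name__ == "__main__":
    # The values reported as working in the earlier comment.
    tested = {
        "registryPullQPS": 50,
        "registryBurst": 100,
        "eventRecordQPS": 50,
        "eventBurst": 100,
        "kubeAPIQPS": 50,
        "kubeAPIBurst": 100,
        "serializeImagePulls": False,
    }
    print(" ".join(kubelet_args(tested)))
```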

Author

Jell commented Apr 23, 2021

Also, yes, I think it's a good idea to keep the defaults for those 7 arguments and just provide the means of tweaking them for improved performance. Although with regard to serialize-image-pulls, this could potentially be changed to false, as the documentation seems to indicate that it's only an issue with older Docker versions and AUFS, which in Bottlerocket's case should not be an issue?

@gthao313
Member

Sure thing! We'll support the arguments you mentioned above. If you have any questions or updates, please let me know.

Author

Jell commented Apr 23, 2021

awesome! thank you so much @gthao313 ❤️

Member

gthao313 commented May 3, 2021

Hi @Jell. Thanks for your patience. We have exposed event-burst, event-qps, kube-api-burst, kube-api-qps, registry-burst, and registry-qps as settings. You will see them in the next release of Bottlerocket, v1.1.0. As for serialize-image-pulls, we decided not to expose it as a setting, but Bottlerocket already defaults it to false. You can see that default in kube-config. If you have any questions or new issues, please feel free to reach out to the Bottlerocket team. Thank you! ❤️

Author

Jell commented May 4, 2021

awesome sauce! Looking forward to the next release! :) Thanks a lot for the very quick handling of the issue!

bcressey removed the status/in-progress label Nov 11, 2021