@pacoxu
Member

@pacoxu pacoxu commented Mar 20, 2023

This blog post discusses ways to speed up Pod startup from the kubelet side, including:

  1. enabling parallel image pulls, with a node-level limit on the number of parallel pulls (limit added in v1.27)
  2. increasing the kubelet's default API QPS limits (bumped in v1.27)
  3. using Evented PLEG (moving to beta in v1.27)
  4. Pod resource limits (the CPU limit is the most likely to matter; in addition, we changed the memory throttling factor for cgroup v2, which may be a factor if a Pod uses a lot of memory during startup)

It also covers the metrics and logs related to the Pod startup SLO/SLI.

Other factors that may affect Pod startup include the container runtime, disk speed, and the CPU and memory resources on the node.
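
As a rough illustration of items 1 to 3 above, the sketch below assembles a kubelet configuration fragment in Python and prints it as JSON. The field names (`serializeImagePulls`, `maxParallelImagePulls`, `kubeAPIQPS`, `kubeAPIBurst`) and the `EventedPLEG` feature gate are my reading of the `kubelet.config.k8s.io/v1beta1` API and are not spelled out in this description, so please check them against the kubelet version you run. The numeric values are placeholders rather than recommendations, and the helper name `build_kubelet_tuning` is hypothetical.

```python
import json


def build_kubelet_tuning(max_parallel_pulls=5, api_qps=50, api_burst=100,
                         evented_pleg=True):
    """Return a KubeletConfiguration fragment as a plain dict.

    Field names are assumed from the kubelet.config.k8s.io/v1beta1 API;
    all values are illustrative placeholders, not recommendations.
    """
    # Local sanity check only; the kubelet performs its own validation.
    if max_parallel_pulls < 1:
        raise ValueError("max_parallel_pulls must be at least 1")
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # Item 1: pull images in parallel, capped per node.
        "serializeImagePulls": False,
        "maxParallelImagePulls": max_parallel_pulls,
        # Item 2: client-side rate limits for kubelet -> API server traffic.
        "kubeAPIQPS": api_qps,
        "kubeAPIBurst": api_burst,
        # Item 3: opt in to Evented PLEG through its feature gate.
        "featureGates": {"EventedPLEG": evented_pleg},
    }


if __name__ == "__main__":
    print(json.dumps(build_kubelet_tuning(), indent=2))
```

Evented PLEG also depends on support in the container runtime, so it is worth confirming that before turning the gate on. Item 4 is set per Pod rather than in the kubelet configuration, which is why it does not appear here.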

/cc @ruiwen-zhao @SergeyKanzhelev @harche @wojtek-t

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. area/blog Issues or PRs related to the Kubernetes Blog subproject language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Mar 20, 2023
@netlify

netlify bot commented Mar 20, 2023

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 612c222
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/644b2993ca144a000852aafa
😎 Deploy Preview https://deploy-preview-40156--kubernetes-io-main-staging.netlify.app

**Authors**: Paco Xu (DaoCloud), Sergey Kanzhelev, Ruiwen Zhao (Google)

How can pod start-up time be accelerated on nodes in large clusters? This is a common issue that cluster administrators may face.
Contributor

nit: this issue is not specific to large clusters, right? Even in a one-node cluster, the techniques in this post can be used to speed up Pod startup.

Contributor

Ah, I see. The QPS setting is mostly relevant to large clusters.

Member Author

You are most likely to run into slow Pod startup on a large node or in a large cluster.

@ruiwen-zhao
Contributor

Thanks for the effort!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 21, 2023
Contributor

@sftim sftim left a comment

Thanks. We'll need changes before we can publish this.

/hold
pending assignment of a publication date

---
layout: blog
title: "Recent developments in kubelet to speed up Pod startup"
date: 2023-03-21T16:00:00+0000
Contributor

Is this a post-release blog article? “Recent” suggests these have already shipped, but v1.27 is not released.

Member Author

@pacoxu pacoxu Mar 22, 2023

Some of the features mentioned here will only be available once v1.27 is released.
Should I retarget this PR to dev-1.27? And should I change the date to the v1.27 release date, or keep it as is?

Member Author

It seems I should update the date here once all the content has been approved.


To identify the cause of slow pod startup, analyzing metrics and logs can be helpful. Other factors that may impact pod startup include container runtime, disk speed, CPU and memory resources on the node.
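
To make the metrics side of this concrete, here is a minimal sketch that estimates a latency quantile from a Prometheus histogram in text exposition format, such as the output of the kubelet's `/metrics` endpoint. The metric name `kubelet_pod_start_sli_duration_seconds` is an assumption on my part and is not named in the paragraph above, the sample data is made up, and the helper names `parse_buckets` and `estimate_quantile` are hypothetical. The interpolation is the usual linear-within-bucket approximation, so the result is only an estimate.

```python
import math
import re

# Matches lines such as: some_metric_bucket{le="0.5"} 12
BUCKET_RE = re.compile(r'^(?P<name>\w+)_bucket\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LE_RE = re.compile(r'le="([^"]+)"')

# Made-up sample in Prometheus text exposition format.
SAMPLE = """\
# TYPE kubelet_pod_start_sli_duration_seconds histogram
kubelet_pod_start_sli_duration_seconds_bucket{le="0.5"} 0
kubelet_pod_start_sli_duration_seconds_bucket{le="1"} 10
kubelet_pod_start_sli_duration_seconds_bucket{le="2"} 90
kubelet_pod_start_sli_duration_seconds_bucket{le="+Inf"} 100
"""


def parse_buckets(text, metric):
    """Return sorted (upper_bound, cumulative_count) pairs for one histogram."""
    buckets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = BUCKET_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        le = LE_RE.search(match.group("labels"))
        if le is None:
            continue
        bound = math.inf if le.group(1) == "+Inf" else float(le.group(1))
        buckets.append((bound, float(match.group("value"))))
    return sorted(buckets)


def estimate_quantile(buckets, q):
    """Estimate quantile q (0..1) by linear interpolation inside a bucket."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][1] == 0:
        return None
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The open-ended bucket has no upper edge; report its lower edge.
                return prev_bound
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    data = parse_buckets(SAMPLE, "kubelet_pod_start_sli_duration_seconds")
    print("estimated p50 (seconds):", estimate_quantile(data, 0.5))
```

Comparing such an estimate before and after a configuration change is one way to check whether the change actually helped on a given node.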

SIG Node is responsible for ensuring fast Pod startup times, while addressing issues in large clusters falls under the purview of SIG Scalability as well.
Contributor

This doesn't seem to link to the previous text. Is it needed?

Member Author

I wanted readers to know which SIGs care most about Pod startup speed.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 21, 2023
@pacoxu pacoxu force-pushed the speedup-pod-startup branch from d379a9e to 6097f7c Compare March 22, 2023 02:05
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 22, 2023
@pacoxu pacoxu force-pushed the speedup-pod-startup branch from 6097f7c to 8372df4 Compare March 22, 2023 03:38
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 22, 2023
@pacoxu
Member Author

pacoxu commented Apr 17, 2023

> Hello, the publication date for this article is 15-05-2023 (May 15).
>
> Thank you!

Nice. Then I can follow up as soon as possible after KubeCon.

Member

@SergeyKanzhelev SergeyKanzhelev left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 21, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 4594c6aaf44168f8aa81e5bd94f6f5dbdc5eb340

@pacoxu pacoxu force-pushed the speedup-pod-startup branch from dfefc58 to 32fe540 Compare April 24, 2023 06:43
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2023
@pacoxu
Member Author

pacoxu commented Apr 24, 2023

> Hello, the publication date for this article is 15-05-2023 (May 15).

The date has been updated.

@stmcginnis
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 25, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: e666d6edfd23daa9ba454d47c9f7aea53491433e

@tengqm
Contributor

tengqm commented Apr 25, 2023

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Apr 25, 2023
@pacoxu pacoxu force-pushed the speedup-pod-startup branch from 32fe540 to 612c222 Compare April 28, 2023 02:04
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 28, 2023
@k8s-ci-robot k8s-ci-robot requested a review from stmcginnis April 28, 2023 02:04
@stmcginnis
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 28, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2b74fc9e394ad28a8a06c37b566d1449088edd3b

@tengqm
Contributor

tengqm commented Apr 28, 2023

/hold cancel
/approve

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 28, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SergeyKanzhelev, tengqm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 28, 2023
@k8s-ci-robot k8s-ci-robot merged commit 71427bc into kubernetes:main Apr 28, 2023
DonatoHorn pushed a commit to DonatoHorn/website that referenced this pull request Jun 25, 2023
…s#40156)

* add blog for how to speed up pod startup from kubelet side

* rename blog to recent devs in kubelet to speed up pod startup and update according to comments

* add pod resource limit related things that may be related to pod startup

* add SELinux Relabeling with Mount Options feature

* update per sftim's comment
@pacoxu pacoxu deleted the speedup-pod-startup branch June 26, 2023 05:46
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/blog Issues or PRs related to the Kubernetes Blog subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

Status: Published

8 participants