OCPBUGS-63213: Add priority field to prevent early shutdown #1915

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from CoreyCook8:patch-1
Oct 3, 2025

@CoreyCook8
Contributor

@CoreyCook8 CoreyCook8 commented Aug 29, 2025

Based on the issue described in kubernetes/kubernetes#133442.

The kubelet currently ignores priorityClassName for static pod files, so setting it has no effect on the graceful shutdown order, and the static pods start being killed as soon as shutdown begins.

To prevent this, we must set priority explicitly.
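
A minimal sketch of what this looks like in a static pod manifest. The pod name, namespace, and image are placeholders, and the priorityClassName value is an assumption (system-node-critical is the built-in class whose value is 2000001000); only the priority line itself comes from this PR.

```yaml
# Hypothetical static pod manifest, not the operator's real pod.yaml.
apiVersion: v1
kind: Pod
metadata:
  name: example-static-pod        # placeholder name
  namespace: kube-system          # placeholder namespace
spec:
  hostNetwork: true
  # Normally resolved to a number by the API server's Priority admission
  # plugin, which a static pod file read from disk never passes through.
  priorityClassName: system-node-critical
  # Set explicitly so the kubelet has a numeric priority to work with.
  # 2000001000 is the built-in value of system-node-critical.
  priority: 2000001000
  containers:
    - name: example
      image: registry.example.com/example:latest   # placeholder image
```

When both fields are present, the number should match the class's value. As far as I understand, the mirror pod that the kubelet creates for a static pod still goes through API admission, and the Priority admission plugin rejects a pod whose explicit priority differs from the value it computes from the class name.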

Summary by CodeRabbit

  • Chores
    • Increased the kube-apiserver Pod’s scheduling priority to improve reliability under node pressure and ensure critical control-plane availability.
    • No functional changes to existing settings; behavior remains the same aside from higher scheduling precedence.

@coderabbitai

coderabbitai Bot commented Aug 29, 2025

Walkthrough

Adds a numeric priority field to the kube-apiserver Pod manifest to set its scheduling priority. No other fields or control flow are changed.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| kube-apiserver Pod manifest: bindata/assets/kube-apiserver/pod.yaml | Inserted priority: 2000001000 after hostNetwork, retaining existing priorityClassName and other fields. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

I nudge the clouds with higher might,
A pod now hops to preemptive height—
Priority set, my whiskers twitch,
Schedules align with a subtle switch.
In kube fields blue, I thump with glee,
The API soars—preferred for me! 🐇✨

@openshift-ci openshift-ci Bot requested review from benluddy and p0lyn0mial August 29, 2025 17:21
@openshift-ci openshift-ci Bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 29, 2025
@openshift-ci
Contributor

openshift-ci Bot commented Aug 29, 2025

Hi @CoreyCook8. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
bindata/assets/kube-apiserver/pod.yaml (2)

291-291: Add a short inline comment for future maintainers

Explain why the numeric priority is set explicitly.

   hostNetwork: true
+  # Explicit numeric priority because kubelet ignores priorityClassName for static pods in static pod files.
+  # Ref: https://github.com/kubernetes/kubernetes/issues/133442
   priority: 2000001000

289-295: Ensure shutdown ordering is consistent across control-plane static pods

Verify priorities for other control-plane static pods (scheduler, controller-manager, etcd) meet the intended shutdown order relative to kube-apiserver.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0bec046 and e7fc938.

📒 Files selected for processing (1)
  • bindata/assets/kube-apiserver/pod.yaml (1 hunks)
🔇 Additional comments (1)
bindata/assets/kube-apiserver/pod.yaml (1)

291-293: Approve numeric priority for static pod ordering: priority and priorityClassName fields are present in bindata/assets/kube-apiserver/pod.yaml (lines 291–292).

@CoreyCook8
Contributor Author

👋 @p0lyn0mial @benluddy Could I get a review on this and an ok-to-test? 🙏

readOnlyRootFilesystem: true
terminationGracePeriodSeconds: {{.GracefulTerminationDuration}}
hostNetwork: true
priority: 2000001000
Contributor

Perhaps for HA it doesn’t matter, but for SNO (single-node OpenShift) it might. It would be helpful to shut down KAS (kube-apiserver) last, so that the remaining workloads can shut down gracefully before it does.

Unless I am missing something. WDYT?

@p0lyn0mial
Contributor

/ok-to-test

@openshift-ci openshift-ci Bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 1, 2025
@p0lyn0mial
Contributor

@CoreyCook8 @benluddy once we are convinced about this PR, perhaps we should add the priority to the other control-plane components that run as static pods.

@openshift-ci
Contributor

openshift-ci Bot commented Oct 1, 2025

@CoreyCook8: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-operator-single-node | e7fc938 | link | false | /test e2e-gcp-operator-single-node |
| ci/prow/e2e-aws-operator-disruptive-single-node | e7fc938 | link | false | /test e2e-aws-operator-disruptive-single-node |
| ci/prow/e2e-aws-ovn-single-node | e7fc938 | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/okd-scos-e2e-aws-ovn | e7fc938 | link | false | /test okd-scos-e2e-aws-ovn |

@benluddy
Contributor

benluddy commented Oct 1, 2025

/lgtm

Thanks for opening this PR and calling out this issue!

There's a concise summary here for the record: kubernetes/kubernetes#133535 (comment). It's the numeric priority that Kubelet uses to determine shutdown priority, and the priority admission plugin is responsible for setting a numeric priority based on a priority class name. Static pods aren't created through API admission, so they don't get the numeric priority.

I don't know if we absolutely need kube-apiserver to be shut down later on an HA topology, since other components on the same node should continue to work while their local kube-apiserver process rolls out, but it does seem good to shrink the window of time we're running with reduced redundancy, even if it ends up making node shutdown take a bit longer. Maybe we'll see smoother rollouts on single-node with this?
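
To illustrate why the numeric value matters: when graceful node shutdown is configured by pod priority, the kubelet sorts pods into buckets by spec.priority, and a pod with no numeric priority is treated as priority 0. The following is a hypothetical KubeletConfiguration fragment; the thresholds and durations are illustrative assumptions, not what OpenShift ships.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Pods are shut down in phases, lowest priority bucket first.
shutdownGracePeriodByPodPriority:
  # Regular workloads, and any pod whose spec.priority is unset.
  - priority: 0
    shutdownGracePeriodSeconds: 60
  # system-cluster-critical (2000000000) and above.
  - priority: 2000000000
    shutdownGracePeriodSeconds: 30
  # system-node-critical (2000001000): the bucket kube-apiserver lands in
  # once its manifest carries priority: 2000001000.
  - priority: 2000001000
    shutdownGracePeriodSeconds: 30
```

Under this sketch, a static pod without a numeric priority would fall into the first bucket and be terminated alongside ordinary workloads, which matches the early shutdown described in this PR.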

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Oct 1, 2025
@openshift-ci
Contributor

openshift-ci Bot commented Oct 1, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benluddy, CoreyCook8

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 1, 2025
@p0lyn0mial
Contributor

/verified by @CoreyCook8

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Oct 2, 2025
@openshift-ci-robot

@p0lyn0mial: This PR has been marked as verified by @CoreyCook8.

@benluddy
Contributor

benluddy commented Oct 2, 2025

/retitle NO-JIRA: Add priority field to prevent early shutdown

@openshift-ci openshift-ci Bot changed the title from "Add priority field to prevent early shutdown" to "NO-JIRA: Add priority field to prevent early shutdown" Oct 2, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 2, 2025
@openshift-ci-robot

@CoreyCook8: This pull request explicitly references no jira issue.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 1a57475 and 2 for PR HEAD e7fc938 in total

@openshift-merge-bot openshift-merge-bot Bot merged commit 51e09bc into openshift:main Oct 3, 2025
17 of 21 checks passed
@dinhxuanvu
Contributor

/backport release-4.20

@dinhxuanvu
Contributor

/cherry-pick release-4.20

@openshift-cherrypick-robot

@dinhxuanvu: new pull request created: #1955

@sanchezl sanchezl changed the title from "NO-JIRA: Add priority field to prevent early shutdown" to "OCPBUGS-63213: Add priority field to prevent early shutdown" Jan 28, 2026
@openshift-ci-robot

@CoreyCook8: Jira Issue OCPBUGS-63213 is in an unrecognized state (ON_QA) and will not be moved to the MODIFIED state.

@sanchezl
Contributor

/jira refresh

@openshift-ci-robot

@sanchezl: Jira Issue Verification Checks: Jira Issue OCPBUGS-63213
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-63213 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓
