Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Have default priorityClass to prevent unexpected longhorn pods eviction #6528

Closed
innobead opened this issue Aug 15, 2023 · 4 comments
Assignees
Labels
area/chart Helm chart related area/install-uninstall-upgrade Install, Uninstall or Upgrade related highlight Important feature/issue to highlight kind/feature Feature request, new feature priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal
Milestone

Comments

@innobead
Copy link
Member

Is your feature request related to a problem? Please describe (👍 if you like this request)

Have a default priority class with the highest value to ensure longhorn pods have not been evicted when low system resources by the Kube scheduler. Right now, the setting is optional.

https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/

Describe the solution you'd like

Describe alternatives you've considered

Additional context

cc @longhorn/dev

@innobead innobead added kind/feature Feature request, new feature priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos labels Aug 15, 2023
@innobead innobead added this to the v1.6.0 milestone Aug 15, 2023
@harridu
Copy link

harridu commented Aug 22, 2023

Will existing deployments be migrated to the new default priority class on the fly? I am not sure if this is possible.

@innobead
Copy link
Member Author

Will existing deployments be migrated to the new default priority class on the fly? I am not sure if this is possible.

In the current implementation, it's not possible. However, we should improve this part to have the following support to prevent any interruption for running volumes.

  • When changing any volume-related Pod configuration (ex: IM pod, share manager pod, upcoming object storage/s3gw pod) like priority class, the change will only take effect when doing volume upgrade or volume reattachement.

Extra info from @PhanLe1010

image

cc @shuo-wu @derekbit @c3y1huang

@longhorn-io-github-bot
Copy link
Collaborator

longhorn-io-github-bot commented Dec 5, 2023

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:
  1. Install Longhorn from scratch.
  2. There is a priority class named longhorn-critical (kubectl get pc)
  3. Longhorn setting priority-class has a default value longhorn-critical
  4. The priorityClass of the spec should be longhorn-critical for all longhorn pods.
  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at:
    The PR for the chart change is at:
    feat: default priority class #7262

  • Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at
    feat: default priority class longhorn-manager#2332
    feat(setting): lazy update for volume-related settings longhorn-manager#2411

  • Which areas/issues this PR might have potential impacts on?
    Area
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at
    feat(lep): default priority class #7252

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at
    doc: default priority class website#808

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@innobead innobead added area/install-uninstall-upgrade Install, Uninstall or Upgrade related area/chart Helm chart related labels Dec 20, 2023
@innobead innobead added the highlight Important feature/issue to highlight label Jan 2, 2024
@chriscchien chriscchien self-assigned this Jan 12, 2024
@chriscchien
Copy link
Contributor

Verified pass on longhorn v1.6.0-rc1 with test stpes

After install v1.6.0-rc1, can see priority class longhorn-critical in all longhorn pods

> k get priorityclass
NAME                      VALUE        GLOBAL-DEFAULT   AGE
system-node-critical      2000001000   false            8m54s
system-cluster-critical   2000000000   false            8m54s
longhorn-critical         1000000000   false            8m4s
> k get setting -n longhorn-system | grep priority
priority-class                                                    longhorn-critical                                 8m4s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/chart Helm chart related area/install-uninstall-upgrade Install, Uninstall or Upgrade related highlight Important feature/issue to highlight kind/feature Feature request, new feature priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal
Projects
Status: Closed
Development

No branches or pull requests

5 participants