Asynchronous preemption in the scheduler #4832

Open
4 tasks done
sanposhiho opened this issue Sep 7, 2024 · 28 comments
Labels
  • lead-opted-in: Denotes that an issue has been opted in to a release
  • sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
  • stage/alpha: Denotes an issue tracking an enhancement targeted for Alpha status

Comments

@sanposhiho
Member

sanposhiho commented Sep 7, 2024

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

/sig scheduling
/assign

@k8s-ci-robot added the sig/scheduling label on Sep 7, 2024
@github-project-automation moved this to Needs Triage in SIG Scheduling on Sep 7, 2024
@alculquicondor
Member

/label lead-opted-in

@k8s-ci-robot added the lead-opted-in label on Sep 17, 2024
@impact-maker

impact-maker commented Oct 1, 2024

Hello @sanposhiho @alculquicondor 👋, Enhancements team here.

Just checking in as we approach enhancements freeze at 02:00 UTC Friday 11th October 2024 / 19:00 PDT Thursday 10th October 2024.

This enhancement is targeting stage alpha for v1.32 (correct me if otherwise).

Here's where this enhancement currently stands:

  • KEP readme using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable for latest-milestone: v1.32.
  • KEP readme has up-to-date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements. (For more information on the PRR process, check here). If your production readiness review is not completed yet, please make sure to fill the production readiness questionnaire in your KEP by the PRR Freeze deadline on Thursday, October 3rd, 2024 so that the PRR team has enough time to review your KEP.

For this KEP, we would just need to update the following:

  • KEP has a production readiness review that has been completed and merged into k/enhancements.

The status of this enhancement is marked as at risk for enhancement freeze. Please keep the issue description up-to-date with appropriate stages as well.

If you anticipate missing enhancements freeze, you can file an exception request in advance. Thank you!

@impact-maker moved this to At risk for enhancements freeze in 1.32 Enhancements Tracking on Oct 1, 2024
@tjons
Contributor

tjons commented Oct 11, 2024

Hello 👋, v1.32 Enhancements team here.

Unfortunately, this enhancement did not meet requirements for enhancements freeze.

If you still wish to progress this enhancement in v1.32, please file an exception request as soon as possible, within three days. If you have any questions, you can reach out in the #release-enhancements channel on Slack and we'll be happy to help. Thanks!

@tjons
Contributor

tjons commented Oct 11, 2024

/milestone clear

@tjons moved this from At risk for enhancements freeze to Removed from Milestone in 1.32 Enhancements Tracking on Oct 11, 2024
@sanposhiho
Member Author

@alculquicondor
Looks like we just missed the deadline.
I believe we can make an exception request because we only need approval for the PRR, and @wojtek-t has already mentioned it's generally OK for alpha. What do you think?

@sanposhiho
Member Author

I filed an exception request for it. cc @wojtek-t.

@sreeram-venkitesh
Member

/stage alpha

@k8s-ci-robot added the stage/alpha label on Oct 11, 2024
@tjons moved this from Removed from Milestone to Tracked for enhancements freeze in 1.32 Enhancements Tracking on Oct 13, 2024
@tjons
Contributor

tjons commented Oct 13, 2024

With all the requirements met, this enhancement's exception is granted and this enhancement is now tracked for enhancements freeze!

@chanieljdan

Hi @sanposhiho 👋, 1.32 Release Docs Lead here.

Does the enhancement work planned for 1.32 require any new docs or modifications to existing docs?

If so, please follow the steps here to open a PR against the dev-1.32 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thursday October 24th 2024 18:00 PDT.

Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

@chanieljdan

Hi @sanposhiho 👋,

Just a reminder to open a placeholder PR against the dev-1.32 branch in the k/website repo for this (steps available here). The deadline for this is a week away, on Thursday October 24, 2024 18:00 PDT.

Thanks,

Daniel

@sanposhiho
Member Author

Created: kubernetes/website#48407

@wrkode
Member

wrkode commented Oct 24, 2024

👋 Hi there, William here from v1.32 Comms
We'd love for you to consider writing a feature blog about your enhancement! Some reasons why you might want to write a blog for this feature include (but are not limited to) if this introduces breaking changes, is important to our users, or has been in progress for a long time and is graduating.

To opt-in, let us know and open a Feature Blog placeholder PR against the website repository by 30th Oct 2024. For more information about writing a blog see the blog contribution guidelines.

Note: In your placeholder PR, use XX characters for the blog date in the front matter and file name. We will work with you on updating the PR with the publication date once we have a final number of feature blogs for this release.

@sanposhiho
Member Author

I believe we don't need a blog post for this (at least, not within this release).

@wrkode
Member

wrkode commented Oct 25, 2024

Thanks for your prompt feedback @sanposhiho

@tjons
Contributor

tjons commented Nov 4, 2024

Hey again @sanposhiho 👋 v1.32 Enhancements team here,

Just checking in as we approach code freeze at 02:00 UTC Friday 8th November 2024 / 19:00 PDT Thursday 7th November 2024.

Here's where this enhancement currently stands:

  • All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
  • All PR/s are ready to be merged (they have approved and lgtm labels applied) by the code freeze deadline. This includes tests.

For this enhancement, it looks like the following PRs are open and need to be merged before code freeze (and we need to update the Issue description to include all the related PRs of this KEP):

Additionally, please let me know if there are any other PRs in k/k not listed in the description or not linked with this GitHub issue that we should track for this KEP, so that we can maintain accurate status.

The status of this enhancement is marked as at risk for code freeze.

If you anticipate missing code freeze, you can file an exception request in advance. Thank you!

@tjons moved this from Tracked for enhancements freeze to At risk for code freeze in 1.32 Enhancements Tracking on Nov 4, 2024
@tjons
Contributor

tjons commented Nov 8, 2024

Hello @sanposhiho 👋, Enhancements team here.

With all the implementation (code related) PRs merged as per the issue description:

This enhancement is now marked as tracked for code freeze for the 1.32 Code Freeze!

Please note that KEPs targeting stable need to have the status field marked as implemented in the kep.yaml file after code PRs are merged and the feature gates are removed.

@sanposhiho
Member Author

@alculquicondor Can we include it within this release cycle as well? Or would you suggest not?

@dipesh-rawat
Member

Hello @sanposhiho 👋, 1.33 Enhancements Lead here.

If you'd like to work on this enhancement in v1.33, please have the SIG lead opt-in by adding the lead-opted-in label, which ensures it gets added to the tracking board. Also, please set the milestone to v1.33 using /milestone v1.33.
Thanks!

/remove-label lead-opted-in

@k8s-ci-robot removed the lead-opted-in label on Jan 13, 2025
@alculquicondor
Member

You mean graduate to beta? I don't see why not.

@alculquicondor
Member

cc @macsko @dom4ha any viewpoints?

@sanposhiho
Member Author

Graduate to beta as it is, or potentially update the KEP with kubernetes/kubernetes#129449 if we discuss and decide to do that.

@dom4ha
Member

dom4ha commented Jan 14, 2025

Not sure for how long we usually wait to spot potential issues, so not certain what to suggest here.

As a side note, the preemption process itself has a few important issues. One of them is performance, which async preemption helps to mitigate, so that's an argument for moving forward.

However, the current process cannot preempt pods running on a node other than the one the candidate pod would be scheduled on. I'm looking into this issue, and I hope we can soon brainstorm how it could be addressed as well. Once we do that, we may have a clearer picture of what actions we should take and when.

@sanposhiho
Member Author

sanposhiho commented Jan 14, 2025

We have a plugin for that as a subproject.
https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg%2Fcrossnodepreemption%2FREADME.md

We can consider implementing the same if enough people want it. But I'm not sure we're getting such feedback from various folks.

@dom4ha
Member

dom4ha commented Jan 14, 2025

Thanks Kensei for the pointer. IIUC the plugin takes a brute-force approach, which would impact performance a lot, so I'd think about something more performance oriented.

Regarding the interest, I suspect that development around DRA and other more complex dependencies will force us to look closer into this problem.

@dom4ha
Member

dom4ha commented Jan 14, 2025

@sanposhiho Is there also an issue describing the problem and considered solutions?

@sanposhiho
Member Author

cc @Huang-Wei, might know more context for the plugin.

> I'd think about something more performance oriented

Nice, I actually hadn't grasped its algorithm.
It would be great to improve the algorithm, regardless of whether or not we decide to move it upstream.

> I suspect that development around DRA and other more complex dependencies will force us to look closer into this problem.
> ...
> Is there also an issue describing the problem and considered solutions?

DRA has its own PostFilter and hence I'm not sure if they really would.
But feel free to create an issue for the discussion (AFAIK, we don't have one). Either way, this issue isn't the appropriate place to continue the discussion.
There, we can discuss the solution, whether we need to implement it within the upstream scheduler, etc.

@alculquicondor
Member

Can we move discussions about cross-node preemption to a separate issue? It's not related to this enhancement.

In the meantime, I'm not hearing anything against moving to beta.

/label lead-opted-in

@k8s-ci-robot added the lead-opted-in label on Jan 14, 2025
@dipesh-rawat added this to the v1.33 milestone on Jan 14, 2025
@ArkaSaha30
Member

ArkaSaha30 commented Feb 5, 2025

Hello @sanposhiho 👋, v1.33 Enhancements team here.

Just checking in as we approach enhancements freeze at 02:00 UTC Friday 14th February 2025 / 19:00 PDT Thursday 13th February 2025.

This enhancement is targeting stage beta for v1.33 (correct me if otherwise).
/stage beta

Here's where this enhancement currently stands:

  • KEP readme using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable for latest-milestone: v1.33.
  • KEP readme has up-to-date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements. (For more information on the PRR process, check here). If your production readiness review is not completed yet, please make sure to fill the production readiness questionnaire in your KEP by the PRR Freeze deadline on Thursday 6th February 2025 so that the PRR team has enough time to review your KEP.

For this KEP, we would just need to update the following:

  • KEP needs to be updated with the target release
  • Description needs to reflect the correct Beta release target (x.y): v1.33
  • KEP needs to be updated with prr-approvers as per PRR yaml

The status of this enhancement is marked as At risk for enhancements freeze. Please keep the issue description up-to-date with appropriate stages as well.

If you anticipate missing enhancements freeze, you can file an exception request in advance. Thank you!

@ArkaSaha30 moved this to At risk for enhancements freeze in 1.33 Enhancements Tracking on Feb 5, 2025