DRA: control plane controller ("classic DRA") #3063

Open
36 of 38 tasks
pohly opened this issue Nov 30, 2021 · 153 comments
Assignees
Labels
lead-opted-in Denotes that an issue has been opted in to a release sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team
Milestone

Comments

@pohly
Contributor

pohly commented Nov 30, 2021

Enhancement Description

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 30, 2021
@pohly
Contributor Author

pohly commented Nov 30, 2021

/assign @pohly
/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 30, 2021
@ahg-g
Member

ahg-g commented Dec 20, 2021

do we have a discussion issue on this enhancement?

@pohly
Contributor Author

pohly commented Jan 10, 2022

@ahg-g: with discussion issue you mean a separate issue in some repo (where?) in which arbitrary comments are welcome?

No, not at the moment. I've also not seen that done elsewhere before. IMHO at this point the open KEP PR is a good place to collect feedback and questions. I also intend to come to the next SIG-Scheduling meeting.

@ahg-g
Member

ahg-g commented Jan 10, 2022

> @ahg-g: with discussion issue you mean a separate issue in some repo (where?) in which arbitrary comments are welcome?

Yeah, this is what I was looking for, the issue would be under k/k repo.

> No, not at the moment. I've also not seen that done elsewhere before.

That is actually the common practice, one starts a feature request issue where the community discusses initial ideas and the merits of the request (look for issues with label kind/feature). That is what I would expect in the discussion link.

> IMHO at this point the open KEP PR is a good place to collect feedback and questions. I also intend to come to the next SIG-Scheduling meeting.

But the community has no idea what this is about yet, so it is better to have an issue that discusses "What would you like to be added?" and "Why is this needed?" beforehand. Also, meetings are attended by fairly small groups of contributors, so having an issue that tracks the discussion is important IMO.

@pohly
Contributor Author

pohly commented Jan 10, 2022

In my work in SIG-Storage I've not seen much use of such a discussion issue. Instead I had the impression that the usage of "kind/feature" is discouraged nowadays.

https://github.com/kubernetes/kubernetes/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.yaml explicitly says

> Feature requests are unlikely to make progress as issues. Please consider engaging with SIGs on slack and mailing lists, instead. A proposal that works through the design along with the implications of the change can be opened as a KEP.

This proposal was discussed with various people beforehand, now we are in the formal KEP phase. But I agree, it is hard to provide a good link to those prior discussions.

@ahg-g
Member

ahg-g commented Jan 10, 2022

We use that in sig-scheduling, and it does serve as a very good place for initial rounds of discussion. Discussions on Slack and in meetings are hard to reference, as you pointed out.

I still have no idea what this is proposing, and I may not attend the next sig meeting for example...

@gracenng gracenng added the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Jan 17, 2022
@gracenng gracenng added this to the v1.24 milestone Jan 17, 2022
@gracenng
Member

gracenng commented Jan 30, 2022

Hi! 1.24 Enhancements team here.
Checking in as we approach enhancements freeze in less than a week, at 18:00 PT on Thursday, Feb 3rd.
Here’s where this enhancement currently stands:

  • Updated KEP file using the latest template has been merged into the k/enhancements repo. KEP-3063: dynamic resource allocation #3064
  • KEP status is marked as implementable for this release with latest-milestone: 1.24
  • KEP has a test plan section filled out.
  • KEP has up to date graduation criteria.
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

The status of this enhancement is tracked as at risk.
Thanks!

@gracenng gracenng added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Feb 4, 2022
@gracenng
Member

gracenng commented Feb 4, 2022

The Enhancements Freeze is now in effect and this enhancement is removed from the release.
Please feel free to file an exception.

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.24 milestone Feb 4, 2022
@gracenng
Member

gracenng commented Mar 1, 2022 via email

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 30, 2022
@kerthcet
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 31, 2022
@dchen1107 dchen1107 added the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Jun 9, 2022
@dchen1107 dchen1107 added this to the v1.25 milestone Jun 9, 2022
@Priyankasaggu11929 Priyankasaggu11929 added tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels Jun 10, 2022
@marosset
Contributor

Hello @pohly 👋, 1.25 Enhancements team here.

Just checking in as we approach enhancements freeze at 18:00 PST on Thursday, June 16, 2022.

Note: this enhancement is targeting stage alpha for 1.25 (correct me if otherwise).

Here's where this enhancement currently stands:

  • KEP file using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable
  • KEP has an updated, detailed test plan section filled out
  • KEP has up to date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

It looks like #3064 will address everything in this list.

Note that the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@marosset
Contributor

Hello @pohly 👋, just a quick check-in again, as we approach the 1.25 enhancements freeze.

Please plan to get #3064 reviewed and merged before enhancements freeze on Thursday, June 23, 2022 at 18:00 PT.

Note that the current status of the enhancement is at-risk. Thank you!

@sreeram-venkitesh
Member

Thanks! Marking this KEP as tracked for code freeze 🎉

@kannon92
Contributor

@pohly What is your intention for this feature going forward? DRA has split into sub issues but I guess you are keeping this open as the main feature to update?

@pohly
Contributor Author

pohly commented Aug 21, 2024

I don't have any particular plan for this aspect of DRA. Given that others want it removed, some potential user would need to speak up soon or it will get removed, probably in 1.32.

@kannon92
Contributor

So then technically we may want to do some work on this even if it's a deprecation?

@kannon92
Contributor

I'm trying to figure out for sig-node what to consider this feature.

"Proposed For Release", "Not for Release"?

@johnbelamaric
Member

I suggest we raise it in SIG Node and WG Dev Mgmt with the following plan:

  • In 1.32, unless we hear a compelling case before Enhancements Freeze, the plan will be to remove this feature.
  • We prepare a KEP update that would retire this feature and target that update for 1.32. If no one pipes up, we move forward with removal in 1.32.
  • If there are folks that need to keep this around for another release, we need to understand their use case. We then need to see if we will be able to meet it with any of the currently planned enhancements to the structured parameters code, or we need to determine the right path forward to otherwise meet it. That may entail new, different enhancements to structured parameters, or some version of this feature. In that event, we would withdraw the "removal" from 1.32.

@tjons
Contributor

tjons commented Sep 16, 2024

Hi, enhancements lead here - I inadvertently added this to the 1.32 tracking board 😀. Please re-add it if you wish to progress this enhancement in 1.32.

/remove-label lead-opted-in

@k8s-ci-robot k8s-ci-robot removed the lead-opted-in Denotes that an issue has been opted in to a release label Sep 16, 2024
@haircommander
Contributor

/milestone v1.32
/label lead-opted-in

@k8s-ci-robot k8s-ci-robot modified the milestones: v1.31, v1.32 Sep 17, 2024
@k8s-ci-robot k8s-ci-robot added the lead-opted-in Denotes that an issue has been opted in to a release label Sep 17, 2024
pohly added a commit to pohly/enhancements that referenced this issue Sep 24, 2024
Much of the PRR text that was originally written for "classic DRA" applies also
to "structured parameters". It gets moved from kubernetes#3063 to kubernetes#4381, with some minor
adaptions. The placeholder comments get restored in kubernetes#3063 because further work
on the KEP would be needed to move it forward - if it gets moved forward at all
instead of being abandoned.

The v1beta1 API will be almost identical to the v1alpha3 API, with just some
minor tweaks to fix oversights.

The kubelet gRPC gets bumped with no changes. Nonetheless, drivers should get
updated, which can be done by updating the Go dependencies and optionally
changing the API import.
@catblade

Request for leaving this here a little longer, @klueska . We would like some time to go evaluate what is best, from the scheduling side. If we can have some time to try to resolve the complexity of the structured parameters and maybe simplify the classic, having this to play with would be really helpful. 1.33 should be okay to remove because by then we'll have a plan. Spoke with @johnbelamaric already and he suggested I leave this comment and request. We are also looking at handling CPU, as was referenced in the original DRA doc here https://docs.google.com/document/d/1XNkTobkyz-MyXhidhTp5RfbMsM-uRCWDoflUMqNcYTk/ but I'm aware that that may make this scope too complex.

@johnbelamaric
Member

@pohly can you lay out here the implications of #4381 going beta without first removing #3063? It's important that we make beta for #4381 in 1.32.

@pohly
Contributor Author

pohly commented Sep 24, 2024

The two are independent since Kubernetes 1.31, with separate feature gates. Keeping #3063 as alpha does not block #4381 as beta. It also does not cause extra work (that was all already done for 1.31).
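
To make "separate feature gates" concrete, here is a minimal Python sketch of how a comma-separated `Name=true|false` gate list can leave one gate on while the other stays off. The gate names `DynamicResourceAllocation` (structured parameters) and `DRAControlPlaneController` (classic DRA) are assumptions that are not stated in this thread and should be checked against the release notes; `parse_feature_gates` is a hypothetical helper, not a Kubernetes API.

```python
def parse_feature_gates(flag_value: str) -> dict:
    """Parse a comma-separated 'Name=true|false' list into {name: bool}."""
    gates = {}
    for item in flag_value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep or value.lower() not in ("true", "false"):
            raise ValueError(f"invalid feature gate entry: {item!r}")
        gates[name.strip()] = value.lower() == "true"
    return gates


# Assumed gate names (not taken from this thread; verify before relying on them).
STRUCTURED_GATE = "DynamicResourceAllocation"
CLASSIC_GATE = "DRAControlPlaneController"

# Structured parameters enabled, classic DRA left off.
gates = parse_feature_gates(
    "DynamicResourceAllocation=true,DRAControlPlaneController=false"
)
structured_enabled = gates.get(STRUCTURED_GATE, False)
classic_enabled = gates.get(CLASSIC_GATE, False)
```

Because each feature is looked up under its own name, the maturity level of one gate does not constrain how the other is set in this sketch.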

@johnbelamaric
Member

> The two are independent since Kubernetes 1.31, with separate feature gates. Keeping #3063 as alpha does not block #4381 as beta. It also does not cause extra work (that was all already done for 1.31).

@klueska (or was it @SergeyKanzhelev?) mentioned that there are round tripping implications, is that accurate?

@pohly
Contributor Author

pohly commented Sep 25, 2024

Probably Jordan.

If we don't remove it now, the following fields remain reserved forever:

  • ResourceClaimSpec.Controller
  • ResourceClaimStatus.DeallocationRequested
  • AllocationResult.Controller

They don't get set, but the names are "burned" and cannot be used for something else in the future. I think that's okay and won't block future extensions.
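
As an illustration of what "burned" means in practice, the following Python sketch keeps the three reserved names from the list above in a lookup table and rejects a proposed new field that would reuse one of them. The helper `check_new_field` and the example field name `Devices` are hypothetical and only serve the illustration; this is not how Kubernetes itself enforces reservations.

```python
# Reserved (type, field) names, copied from the list in the comment above.
RESERVED_FIELDS = {
    "ResourceClaimSpec": frozenset({"Controller"}),
    "ResourceClaimStatus": frozenset({"DeallocationRequested"}),
    "AllocationResult": frozenset({"Controller"}),
}


def check_new_field(type_name: str, field_name: str) -> str:
    """Return field_name if it is free to use, else raise ValueError."""
    if field_name in RESERVED_FIELDS.get(type_name, frozenset()):
        raise ValueError(
            f"{type_name}.{field_name} is reserved and cannot be reused"
        )
    return field_name


# A name that is not reserved passes through unchanged.
accepted = check_new_field("ResourceClaimSpec", "Devices")

# Reusing a burned name is rejected.
try:
    check_new_field("AllocationResult", "Controller")
    rejected = False
except ValueError:
    rejected = True
```

The reservation applies per type, so the same name stays available on types where it was never used.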

@cyclinder
Contributor

We're still using Classic DRA at the moment, and if it's removed, it's a big breaking change for us, so before we move to the Structure Parameter, we want the Classic DRA to be here for a while, thanks

@sftim
Contributor

sftim commented Sep 25, 2024

  • We can rename the existing fields now and then keep them forever. The earlier such a rename happens, the fewer people who need to update their code / integrations.
  • We can have certain fields as alpha (behind their own gate) in APIs that are otherwise beta.
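
The second option can be sketched as follows: a field that sits behind its own gate is cleared on write whenever that gate is off, while the rest of the object is left alone. This is a simplified Python illustration under stated assumptions, not the API server's actual code; `drop_disabled_fields`, the gate name `ExampleClassicGate`, and the field layout are all hypothetical.

```python
import copy


def drop_disabled_fields(obj: dict, gated_fields: dict, enabled_gates: set) -> dict:
    """Return a copy of obj without top-level fields whose gate is disabled.

    gated_fields maps a field name to the gate that guards it.
    """
    result = copy.deepcopy(obj)
    for field, gate in gated_fields.items():
        if gate not in enabled_gates:
            result.pop(field, None)
    return result


# Hypothetical object: one ungated field and one alpha field behind its own gate.
spec = {"devices": {"requests": []}, "controller": "driver.example.com"}
gated = {"controller": "ExampleClassicGate"}

with_gate_off = drop_disabled_fields(spec, gated, enabled_gates=set())
with_gate_on = drop_disabled_fields(spec, gated, enabled_gates={"ExampleClassicGate"})
```

Working on a copy keeps the caller's object intact, which makes the gate-on and gate-off results easy to compare side by side.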

@alculquicondor
Member

> We're still using Classic DRA at the moment, and if it's removed, it's a big breaking change for us, so before we move to the Structure Parameter, we want the Classic DRA to be here for a while, thanks

Alpha features don't have backwards compatibility guarantees. I suggest you start the migration process to structured DRA or elaborate on why it's not possible for you to migrate.

@catblade

So we can wait another cycle, but perhaps expect a rename?

@aojea
Member

aojea commented Sep 25, 2024

> We can rename the existing fields now and then keep them forever. The earlier such a rename happens, the fewer people who need to update their code / integrations.

I don't think we need to find a technical solution to perpetuate an alpha feature, especially when we are developing the alternative that solves the problems with the original one. Also, if there is a bug in classic DRA, it will not be backported and most probably will not be fixed.

> We're still using Classic DRA at the moment, and if it's removed, it's a big breaking change for us, so before we move to the Structure Parameter, we want the Classic DRA to be here for a while, thanks

> Request for leaving this here a little longer, @klueska . We would like some time to go evaluate what is best, from the scheduling side. I

@catblade @cyclinder it seems you are working with old versions of Kubernetes. Can you explain which versions you are using now and how far you are from current development? We need more information than "please don't remove it" to objectively evaluate the cost of maintaining code that should not be used. It is also important to describe the exact problems and why you cannot use the new one. Is it a custom DRA driver you already have, or one you are using from a third party?

@pohly
Contributor Author

pohly commented Sep 25, 2024

I had a call with @catblade. She cannot share in public yet what she is working on, but I found it interesting and worth supporting by keeping classic DRA as alpha for another release. It's important to note that it's not about supporting some existing solution. Instead, she is currently exploring both classic DRA and structured parameters and wants to have all options available until she reaches a conclusion of that exploration.

@cyclinder already said on Slack that they will use classic DRA only with older Kubernetes and want to migrate to structured parameters for Kubernetes >= 1.31.

@cyclinder
Contributor

We haven't delved into whether structured parameters will be able to meet our needs, and that will take some time. I can give you feedback on the results. 1.31 removed the ResourceClass resource, so we had to make some changes to get classic DRA to run on 1.31, and we're therefore considering moving directly to structured parameters.

@aojea
Member

aojea commented Sep 29, 2024

> It's important to note that it's not about supporting some existing solution. Instead, she is currently exploring both classic DRA and structured parameters and wants to have all options available until she reaches a conclusion of that exploration.

@pohly but this is still confusing: classic DRA has some limitations, and we invested in and decided to move forward with structured DRA, so what is the point of exploring classic DRA?
What happens if the result of the exploration is to use classic DRA? Are we going to open the debate again?

@thockin
Member

thockin commented Sep 29, 2024

I agree. I do not see any possible future where classic DRA is revived. "Exploring" can be done on 1.31 or 1.30 or ... - why do we need to keep it in 1.32?

Projects
Status: Net New
Status: Tracked
Status: Tracked
Status: Removed from Milestone
Status: No status
Status: Considered for release
Status: Needs Triage
Status: Tracked for Doc Freeze
Development

No branches or pull requests