@capri-xiyue
Contributor

@capri-xiyue capri-xiyue commented Jul 31, 2025

fixed #1278
We want to support the CUJ (use case) in which users can specify either the "inference.networking.k8s.io" or the "inference.networking.x-k8s.io" InferencePool in EPP. Note that an EPP deployment and an InferencePool still have a 1:1 mapping.

  1. Added a --pool-group flag that lets users specify whether they are configuring an inference.networking.x-k8s.io or an inference.networking.k8s.io InferencePool.
  2. The reconciler watches either the inference.networking.x-k8s.io or the inference.networking.k8s.io InferencePool, depending on that flag.
  3. The datastore logic converts v1alpha2.InferencePool to v1.InferencePool, since the spec and status are unchanged between the two versions.
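
As a rough illustration of item 1, the flag could be parsed and checked along these lines. This is a minimal sketch using only the Go standard library `flag` package, not the PR's actual code: the helper names `parsePoolGroup` and `validatePoolGroup`, the help text, and the validation step itself are assumptions made for the example. Only the flag name and the two group strings come from this conversation.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

const (
	groupV1       = "inference.networking.k8s.io"   // GA API group
	groupV1Alpha2 = "inference.networking.x-k8s.io" // alpha API group
)

// validatePoolGroup (hypothetical helper) rejects anything other than the
// two InferencePool API groups discussed in this PR.
func validatePoolGroup(group string) error {
	switch group {
	case groupV1, groupV1Alpha2:
		return nil
	default:
		return fmt.Errorf("unsupported --pool-group %q: want %q or %q",
			group, groupV1, groupV1Alpha2)
	}
}

// parsePoolGroup (hypothetical helper) parses args with a private FlagSet
// so it can be exercised without touching the process-wide flags.
func parsePoolGroup(args []string) (string, error) {
	fs := flag.NewFlagSet("epp", flag.ContinueOnError)
	poolGroup := fs.String("pool-group", groupV1,
		"API group of the InferencePool to watch")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if err := validatePoolGroup(*poolGroup); err != nil {
		return "", err
	}
	return *poolGroup, nil
}

func main() {
	group, err := parsePoolGroup(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("watching InferencePool in group", group)
}
```

Failing fast on an unknown group keeps a typo in the flag from silently producing a reconciler that never sees its pool.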

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 31, 2025
@netlify

netlify bot commented Jul 31, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit d823af9
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/689260b6c3af530009aabf02
😎 Deploy Preview https://deploy-preview-1277--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 31, 2025
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and danehans July 31, 2025 22:44
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 31, 2025
@k8s-ci-robot
Contributor

Hi @capri-xiyue. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 31, 2025
@capri-xiyue
Contributor Author

Please don't review it as it is just a draft

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 1, 2025
@nirrozenbaum
Contributor

didn't start the review. just verifying:
are we in agreement that we should keep both InfPool CRDs, v1 and v1alpha2, to give the various gateways some time to adjust their code and pass conformance using v1 before we deprecate v1alpha2?

I think this is the right thing to do!
cc: @robscott @ahg-g @kfswain @danehans

@pierDipi
Contributor

pierDipi commented Aug 1, 2025

As an additional data point, supporting both for some time will make our life a lot easier, thanks for doing this!

@kfswain
Collaborator

kfswain commented Aug 1, 2025

👍 I'll review when @capri-xiyue pulls off the WIP tag! Thanks for your work on this! We will give you space to work for now; feel free to pull off that WIP tag when ready!

@kfswain
Collaborator

kfswain commented Aug 1, 2025

WRT the timeline, this is intended to be transitional and to give gateways time to migrate to v1. It won't be immediate, but it is not intended to be indefinite either. We will work with our upstream partners, but I'm thinking deprecation of v1a2 would probably land in the v1.1-v1.2 timeline.

@capri-xiyue capri-xiyue force-pushed the capri-xiyue/epp-support-both branch from 9cec6c5 to d4a8d67 Compare August 1, 2025 16:17
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 1, 2025
@kfswain
Collaborator

kfswain commented Aug 1, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 1, 2025
@capri-xiyue capri-xiyue force-pushed the capri-xiyue/epp-support-both branch from 288e1b3 to 43727d1 Compare August 1, 2025 19:13
@capri-xiyue capri-xiyue changed the title [WIP] changed to support both v1 and v1a2 ip [WIP] changed to support both v1 and v1a2 ip in EPP Aug 1, 2025
@capri-xiyue capri-xiyue changed the title [WIP] changed to support both v1 and v1a2 ip in EPP feat: changed to support both v1 and v1a2 ip in EPP Aug 1, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2025
@capri-xiyue
Contributor Author

capri-xiyue commented Aug 1, 2025

I feel it is too much to refactor the e2e and integration tests here. I will create another issue to track it: #1283

I added a basic UT here to make sure it works, and I also verified it e2e with a manual test.

[Screenshot 2025-08-01 at 4:56:21 PM] [Screenshot 2025-08-01 at 4:56:50 PM]

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 4, 2025
# Conflicts:
#	pkg/epp/controller/inferencemodel_reconciler.go
#	pkg/epp/datastore/datastore.go
#	pkg/epp/server/controller_manager.go
#	pkg/epp/server/runserver.go

# Conflicts:
#	cmd/epp/runner/runner.go
Signed-off-by: Xiyue Yu <[email protected]>
Signed-off-by: Xiyue Yu <[email protected]>
@capri-xiyue capri-xiyue force-pushed the capri-xiyue/epp-support-both branch from 76b97a9 to 3c84a17 Compare August 4, 2025 16:45
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 4, 2025
DefaultCertPath = "" // default for --cert-path
DefaultConfigFile = "" // default for --config-file
DefaultConfigText = "" // default for --config-text
DefaultPoolGroup = "inference.networking.k8s.io" // default for --pool-group
Contributor Author

I'm wondering whether I should use "inference.networking.x-k8s.io" instead, to avoid an unexpected breaking change when users move from release 0.5 to the main branch?

Member

I personally think it's safer to switch the defaults to the new GA API and leave an option to use the older alpha API for backwards compatibility.

@capri-xiyue capri-xiyue requested a review from robscott August 4, 2025 16:59
logger.Info("InferencePool not found. Clearing the datastore")
if c.PoolGKNN.Group == v1alpha2.GroupName {
infPool := &v1alpha2.InferencePool{}
if err := c.Get(ctx, req.NamespacedName, infPool); err != nil {
Collaborator

Can we combine this logic so we don't repeat ourselves? The only conditionals should be:

  • creation of the pool based on type
  • conversion of the v1a2 pool to v1 (do we need the unstructured middle step? Would a conversion function that we define be more reliable?)

LMKWYT
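
One way to picture the first conditional is a small factory that is the only place branching on the group. This is a sketch under stated assumptions, not the reconciler's real code: `poolObject`, `v1Pool`, `v1alpha2Pool`, and `newPoolFor` are hypothetical stand-ins (the real types live in the project's API packages), and the single `TargetPort` field is illustrative.

```go
package main

import "fmt"

const (
	groupV1       = "inference.networking.k8s.io"
	groupV1Alpha2 = "inference.networking.x-k8s.io"
)

// poolObject is a hypothetical stand-in for a shared interface that both
// pool versions satisfy.
type poolObject interface {
	GroupName() string
}

// v1Pool and v1alpha2Pool are simplified stand-ins for the two API types.
type v1Pool struct{ TargetPort int }
type v1alpha2Pool struct{ TargetPort int }

func (*v1Pool) GroupName() string       { return groupV1 }
func (*v1alpha2Pool) GroupName() string { return groupV1Alpha2 }

// newPoolFor returns an empty object of the right version for the group.
// Callers fetch into the returned value through the shared interface, so
// the fetch and not-found handling are written once.
func newPoolFor(group string) (poolObject, error) {
	switch group {
	case groupV1:
		return &v1Pool{}, nil
	case groupV1Alpha2:
		return &v1alpha2Pool{}, nil
	default:
		return nil, fmt.Errorf("unknown pool group %q", group)
	}
}

func main() {
	for _, g := range []string{groupV1, groupV1Alpha2} {
		obj, err := newPoolFor(g)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %T\n", g, obj)
	}
}
```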

Contributor Author

Updated to avoid duplicate code. To do so, I have to declare the variable as an interface type that both the v1a2 and v1 pool types satisfy; here I use client.Object.

I think using unstructured as a middle step is quite reliable, since the conversion is handled by the k8s runtime rather than by hand-written code. Hand-written conversion code would also be cumbersome, as I would need to write the deep-copy logic myself. Let me know what you think.
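
The idea of converting through a generic middle representation can be sketched with the standard library alone. Note the substitution: this example uses an `encoding/json` round trip through `map[string]interface{}` as an analog of the Kubernetes runtime's unstructured conversion mentioned above; it does not call any Kubernetes API. The types `alphaPool`, `gaPool`, `poolSpec`, their fields, and the function `toGA` are all hypothetical and do not reflect the real InferencePool schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// poolSpec is an illustrative spec shared by both stand-in versions.
type poolSpec struct {
	TargetPort int               `json:"targetPort"`
	Selector   map[string]string `json:"selector"`
}

// alphaPool and gaPool are structurally identical stand-ins for the
// alpha and GA pool types.
type alphaPool struct {
	Name string   `json:"name"`
	Spec poolSpec `json:"spec"`
}

type gaPool struct {
	Name string   `json:"name"`
	Spec poolSpec `json:"spec"`
}

// toGA converts by serializing the alpha object into a generic map and
// decoding that map into the GA type. Because every value is re-created
// during decoding, the result shares no memory with the input, which is
// the deep-copy property a hand-written converter would have to provide.
func toGA(in *alphaPool) (*gaPool, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	var generic map[string]interface{} // the "unstructured" middle step
	if err := json.Unmarshal(raw, &generic); err != nil {
		return nil, err
	}
	raw, err = json.Marshal(generic)
	if err != nil {
		return nil, err
	}
	out := &gaPool{}
	if err := json.Unmarshal(raw, out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	in := &alphaPool{
		Name: "example-pool",
		Spec: poolSpec{TargetPort: 8000, Selector: map[string]string{"app": "model-server"}},
	}
	out, err := toGA(in)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Printf("%+v\n", *out)
}
```

The approach only works while the two versions stay field-compatible; once the schemas diverge, fields without a counterpart would be dropped silently, which is the reliability trade-off raised in the question above.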

Collaborator

Sounds great, and looks great as well. Thanks!

@kfswain
Collaborator

kfswain commented Aug 5, 2025

Looks good for the most part, left a single comment in the primary change (infPool reconciliation)

@capri-xiyue capri-xiyue requested a review from kfswain August 5, 2025 19:49
@capri-xiyue
Contributor Author

> Looks good for the most part, left a single comment in the primary change (infPool reconciliation)

Updated the code based on the comment

@kfswain
Collaborator

kfswain commented Aug 5, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 5, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: capri-xiyue, kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 5, 2025
@k8s-ci-robot k8s-ci-robot merged commit 115aa85 into kubernetes-sigs:main Aug 5, 2025
9 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support CUJ that users can specify either "inference.networking.k8s.io" or "inference.networking.x-k8s.io" InferencePool when configure EPP

6 participants