@SinaChavoshi SinaChavoshi commented May 14, 2025

Description

This PR introduces a new conformance test, HTTPRouteMultipleRulesDifferentPools, which validates that a single Gateway and a single HTTPRoute with multiple rules can route traffic to multiple, distinct InferencePool backends.
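
To make the routing behavior under test concrete, here is a minimal Go sketch of path-based selection between two pools. It is not the conformance test code or the Gateway implementation: the `rule` type, the `selectPool` and `matchesPrefix` helpers, the `/primary` and `/secondary` paths, and the pool names are all hypothetical, and it assumes the route's rules use segment-wise path-prefix matching.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical, simplified stand-in for one HTTPRoute rule:
// a path prefix plus the name of the InferencePool it forwards to.
type rule struct {
	pathPrefix string
	pool       string
}

// matchesPrefix reports whether path matches prefix on whole path segments,
// so "/primary" matches "/primary" and "/primary/x" but not "/primaryx".
func matchesPrefix(path, prefix string) bool {
	p := strings.TrimSuffix(prefix, "/")
	return path == p || strings.HasPrefix(path, p+"/")
}

// selectPool returns the pool of the matching rule with the longest prefix,
// and false when no rule matches the request path.
func selectPool(rules []rule, path string) (string, bool) {
	best, bestLen, found := "", -1, false
	for _, r := range rules {
		if matchesPrefix(path, r.pathPrefix) && len(r.pathPrefix) > bestLen {
			best, bestLen, found = r.pool, len(r.pathPrefix), true
		}
	}
	return best, found
}

func main() {
	// Assumed rule set: one route, two rules, two distinct pools.
	rules := []rule{
		{pathPrefix: "/primary", pool: "primary-pool"},
		{pathPrefix: "/secondary", pool: "secondary-pool"},
	}
	for _, path := range []string{"/primary/v1", "/secondary", "/other"} {
		pool, ok := selectPool(rules, path)
		fmt.Printf("%s -> %q (matched=%v)\n", path, pool, ok)
	}
}
```

The property the test checks corresponds to the first two requests landing on different pools while sharing one route.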

Local run results (ran on commit daeb8e6):

go test -v ./conformance -args -debug -gateway-class gke-l7-regional-external-managed -cleanup-base-resources=false -run-test HTTPRouteMultipleRulesDifferentPools
=== RUN   TestConformance
...
    apply.go:279: 2025-06-17T00:17:03.69078318Z: Deleting pool-primary-epp Service
=== NAME  TestConformance
    suite.go:451: 2025-06-17T00:17:03.869842294Z: Sleeping 0s for test isolation
=== RUN   TestConformance/InferencePoolResolvedRefsCondition
    conformance.go:68: Skipping InferencePoolResolvedRefsCondition: test explicitly skipped
--- PASS: TestConformance (70.46s)
    --- SKIP: TestConformance/HTTPRouteInvalidInferencePoolRef (0.00s)
    --- SKIP: TestConformance/InferencePoolAccepted (0.00s)
    --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools (66.56s)
        --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/Wait_for_resources_to_be_accepted (34.50s)
        --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/Traffic_should_be_routed_to_the_correct_pool_based_on_path (28.87s)
            --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/Traffic_should_be_routed_to_the_correct_pool_based_on_path/request_to_primary_pool (28.20s)
            --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/Traffic_should_be_routed_to_the_correct_pool_based_on_path/request_to_secondary_pool (0.24s)
    --- SKIP: TestConformance/InferencePoolResolvedRefsCondition (0.00s)
PASS
ok      sigs.k8s.io/gateway-api-inference-extension/conformance 70.674s
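
For readers mapping the nested result lines above back to test code: Go's `testing` package joins parent and child subtest names with `/` and rewrites spaces as underscores. The sketch below approximates only that part of the naming (it ignores the escaping of non-printable characters and the suffixes added for duplicate names). The `subtestPath` helper is hypothetical, and the space-separated names passed to it are an assumption about how the subtests are declared.

```go
package main

import (
	"fmt"
	"strings"
)

// subtestPath approximates how nested t.Run names appear in `go test -v`
// output: spaces become underscores and levels are joined with "/".
func subtestPath(parts ...string) string {
	out := make([]string, len(parts))
	for i, p := range parts {
		out[i] = strings.ReplaceAll(p, " ", "_")
	}
	return strings.Join(out, "/")
}

func main() {
	// Assumed subtest names, reconstructed from the log output.
	fmt.Println(subtestPath(
		"TestConformance",
		"HTTPRouteMultipleRulesDifferentPools",
		"Traffic should be routed to the correct pool based on path",
		"request to primary pool",
	))
}
```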

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 14, 2025
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and kfswain May 14, 2025 22:40
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 14, 2025
@k8s-ci-robot
Contributor

Hi @SinaChavoshi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@netlify

netlify bot commented May 14, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 81562a2
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6851a00997ab93000885f394
😎 Deploy Preview https://deploy-preview-834--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 14, 2025
@spencerhance

/cc

@k8s-ci-robot k8s-ci-robot requested a review from spencerhance May 16, 2025 18:16
@ahg-g
Contributor

ahg-g commented May 20, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 20, 2025
@SinaChavoshi
Contributor Author

@robscott @danehans Once #866 is merged, I will update this PR to use the same helper methods for the route-accepted and pool-accepted checks. For now, I am leaving the PR as is to avoid copying code across multiple PRs.

@danehans danehans requested a review from robscott June 16, 2025 21:46
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 17, 2025
Member

@robscott robscott left a comment

Thanks @SinaChavoshi, this is great!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@danehans
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, robscott, SinaChavoshi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 17, 2025
@k8s-ci-robot k8s-ci-robot merged commit 68c73c0 into kubernetes-sigs:main Jun 17, 2025
8 of 9 checks passed
shmuelk pushed a commit to shmuelk/gateway-api-inference-extension that referenced this pull request Jun 18, 2025
…ernetes-sigs#834)

* copy of accepted inference pool test to start from.

* add yaml file for the test

* update time out

* update the yaml file to add port 9002

* read timeout config from local repo

* remove excess comments

* correct spelling for scenarios

* check route condition on RouteConditionResolvedRefs

* remove empty lines in yaml

* set optional/defaulted fields as unspecified

* fix timeout

* fix boilerplate header

* change variable names to use primary secondary consistently.

* remove extra comments

* factor out common code

* Add actual http traffic validation using echo-basic

* remove extra comments from manifest

* remove modifiedTimeoutConfig.HTTPRouteMustHaveCondition per review comment.

* intermediate update

* fix the test run

* factor out common code

* move epp def to shared manifest

* remove extra comments

* revert back to two epps

* add to do for epp image

* switch to GeneralMustHaveConditionTimeout

* undo gateway version changes

* remove unused HTTPRouteMustHaveConditions

* update doc string for GetPod

* update docstring

* Remove resource type from names in manifests.

* remove type from name

* remove health check

* add todo for combining getpod methods
k8s-ci-robot pushed a commit that referenced this pull request Jun 18, 2025
…e it easier to add plugins (#881)

* configuration implementation (after rebase...)

Signed-off-by: Shmuel Kallner <[email protected]>

* Moved plugin registry back to pkg/epp/plugins

Signed-off-by: Shmuel Kallner <[email protected]>

* Removed unneeded 'forced imports' of scorers

Signed-off-by: Shmuel Kallner <[email protected]>

* Changed 'profilepicker' to 'profilehandler' in new and old code

Signed-off-by: Shmuel Kallner <[email protected]>

* Pass the configured SchedulingProfiles to LoadSchedulerConfig

Signed-off-by: Shmuel Kallner <[email protected]>

* Ensure that both the configText and configFile flags are not specified

Signed-off-by: Shmuel Kallner <[email protected]>

* Load RequestControl plugins from the configuration

Signed-off-by: Shmuel Kallner <[email protected]>

* Register all plugin factories

Signed-off-by: Shmuel Kallner <[email protected]>

* Review fixes

Signed-off-by: Shmuel Kallner <[email protected]>

* Reverted unneeded change

Signed-off-by: Shmuel Kallner <[email protected]>

* Updates from review comments

Signed-off-by: Shmuel Kallner <[email protected]>

* Added a stub interface for plugins to get data from the EPP

Signed-off-by: Shmuel Kallner <[email protected]>

* Added a temporary implementation of plugins.Handle

Signed-off-by: Shmuel Kallner <[email protected]>

* Added pluginName and plugins.Handle to plugin factory interface

Signed-off-by: Shmuel Kallner <[email protected]>

* Updated plugin factory signatures to reflect new API

Signed-off-by: Shmuel Kallner <[email protected]>

* Updated plugin instantiation to reflect new API

Signed-off-by: Shmuel Kallner <[email protected]>

* Updated plugin instantiation to reflect new API

Signed-off-by: Shmuel Kallner <[email protected]>

* Updated tests to reflect new API

Signed-off-by: Shmuel Kallner <[email protected]>

* Do not rename the imported package

Signed-off-by: Shmuel Kallner <[email protected]>

* Only upper layer of code should log errors

Signed-off-by: Shmuel Kallner <[email protected]>

* Only pass what is needed to instantiate the plugins

Signed-off-by: Shmuel Kallner <[email protected]>

* Review updates

Signed-off-by: Shmuel Kallner <[email protected]>

* Review update

Signed-off-by: Shmuel Kallner <[email protected]>

* Review update. Make more clear that the code only checks for already defined names

Signed-off-by: Shmuel Kallner <[email protected]>

* fixed e2e doc in makefile (does not require GPUs) (#976)

Signed-off-by: Nir Rozenbaum <[email protected]>

* API: Adds 5xx Status Code for Invalid ExtRef (#991)

Signed-off-by: Daneyon Hansen <[email protected]>

* feat(conformance): Add test for invalid EPP service reference (#959)

* fix boilerplate header

* add tests for InferencePoolInvalidEPPService

* change to expect error on httproute refcond

* moved the creation of the context to main.go. (#995)

This is useful when writing a different main, such as llm-d, because it allows the same context to be propagated to the whole system.

Signed-off-by: Nir Rozenbaum <[email protected]>

* fix dead links (#989)

* feat: add health check for epp cluster (#966)

* feat: add health check for epp cluster

Signed-off-by: zhengkezhou1 <[email protected]>

* remove tls

Signed-off-by: zhengkezhou1 <[email protected]>

* don't use tls

Signed-off-by: zhengkezhou1 <[email protected]>

* health checking flag

Signed-off-by: zhengkezhou1 <[email protected]>

* fix import

Signed-off-by: zhengkezhou1 <[email protected]>

* add tls options

Signed-off-by: zhengkezhou1 <[email protected]>

---------

Signed-off-by: zhengkezhou1 <[email protected]>

* Server unit test and utility to help with such tests (#820)

Signed-off-by: Ira <[email protected]>

* Update dynamic-lora-sidecar to expose metrics to track loaded adapters (#980)

* Add a metric to track loaded adapters

* Update the sample manifests

* Add explanation of metrics from dynamic LoRA adapter sidecar

* Add explanation of metrics from dynamic LoRA adapter sidecar (take 2)

* Update metrics.md based on feedback

* refactor: Replace prefix cache structure with golang-lru (#928)

* refactor: Replace prefix cache structure with golang-lru

Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>

* fix: rename prefix scorer parameters and convert test to benchmark test

Signed-off-by: Kfir Toledo <[email protected]>

* feat: Add per server LRU capacity

Signed-off-by: Kfir Toledo <[email protected]>

* fix: Fix typos and error handle

Signed-off-by: Kfir Toledo <[email protected]>

* fix: add safety check for LRUCapacityPerServer

Signed-off-by: Kfir Toledo <[email protected]>

---------

Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>

* feat(conformance): Add HTTPRouteMultipleRulesDifferentPools test (#834)

* copy of accepted inference pool test to start from.

* add yaml file for the test

* update time out

* update the yaml file to add port 9002

* read timeout config from local repo

* remove excess comments

* correct spelling for scenarios

* check route condition on RouteConditionResolvedRefs

* remove empty lines in yaml

* set optional/defaulted fields as unspecified

* fix timeout

* fix boilerplate header

* change variable names to use primary secondary consistently.

* remove extra comments

* factor out common code

* Add actual http traffic validation using echo-basic

* remove extra comments from manifest

* remove modifiedTimeoutConfig.HTTPRouteMustHaveCondition per review comment.

* intermediate update

* fix the test run

* factor out common code

* move epp def to shared manifest

* remove extra comments

* revert back to two epps

* add to do for epp image

* switch to GeneralMustHaveConditionTimeout

* undo gateway version changes

* remove unused HTTPRouteMustHaveConditions

* update doc string for GetPod

* update docstring

* Remove resource type from names in manifests.

* remove type from name

* remove health check

* add todo for combining getpod methods

* configuration implementation (after rebase...)

Signed-off-by: Shmuel Kallner <[email protected]>

* After review, made code more obvious

Signed-off-by: Shmuel Kallner <[email protected]>

* Fixed merge issues

Signed-off-by: Shmuel Kallner <[email protected]>

---------

Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: Ira <[email protected]>
Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: sina chavoshi <[email protected]>
Co-authored-by: Xudong Wang <[email protected]>
Co-authored-by: Zhengke Zhou <[email protected]>
Co-authored-by: Ira Rosen <[email protected]>
Co-authored-by: Shotaro Kohama <[email protected]>
Co-authored-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>
@zetxqx zetxqx mentioned this pull request Jun 20, 2025
12 tasks

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants