-
Notifications
You must be signed in to change notification settings - Fork 194
feat: Add support to invoke PostResponse plugins #800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support to invoke PostResponse plugins #800
Conversation
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
…gins Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
|
Hi @shmuelk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
pkg/epp/scheduling/types/types.go
Outdated
| func NewSchedulingContext(ctx context.Context, req *LLMRequest, resp *LLMResponse, pods []Pod) *SchedulingContext { | ||
| var logger logr.Logger | ||
| if req != nil { | ||
| logger = log.FromContext(ctx).WithValues("request", req) | ||
| } else { | ||
| logger = log.FromContext(ctx).WithValues("response", resp) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it makes sense to separate SchedulingRequestContext and SchedulingResponseContext?
for example, when using context for response, do you use PodSnapshot or only the selected pod?
maybe we could put TargetPod in ResponseContext instead of calculating PodsSnapshot which could be expensive in large scale scenario.
would be good if each gets only the data it uses and not super-object that contains all.
|
Apologies for the delay! I think there is some refactoring we could do here to make this implementation a little cleaner, but these were decisions made before this PR. This does move the needle forward, and actually wires up the /lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kfswain, shmuelk The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/ok-to-test |
Signed-off-by: Nir Rozenbaum <[email protected]>
* generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * make linter happy Signed-off-by: Nir Rozenbaum <[email protected]> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]>
* remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <[email protected]> * rebase handling Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
…headers in them Signed-off-by: Shmuel Kallner <[email protected]>
|
/lgtm |
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <[email protected]> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates to tests Signed-off-by: Shmuel Kallner <[email protected]> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <[email protected]> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <[email protected]> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * make linter happy Signed-off-by: Nir Rozenbaum <[email protected]> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <[email protected]> * rebase handling Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <[email protected]> --------- Signed-off-by: Shmuel Kallner <[email protected]> Signed-off-by: Nir Rozenbaum <[email protected]> Co-authored-by: Nir Rozenbaum <[email protected]>
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <[email protected]> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates to tests Signed-off-by: Shmuel Kallner <[email protected]> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <[email protected]> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <[email protected]> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * make linter happy Signed-off-by: Nir Rozenbaum <[email protected]> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <[email protected]> * rebase handling Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <[email protected]> --------- Signed-off-by: Shmuel Kallner <[email protected]> Signed-off-by: Nir Rozenbaum <[email protected]> Co-authored-by: Nir Rozenbaum <[email protected]>
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <[email protected]> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Updates to tests Signed-off-by: Shmuel Kallner <[email protected]> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <[email protected]> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <[email protected]> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <[email protected]> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <[email protected]> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <[email protected]> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * make linter happy Signed-off-by: Nir Rozenbaum <[email protected]> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <[email protected]> * rebase handling Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <[email protected]> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <[email protected]> --------- Signed-off-by: Shmuel Kallner <[email protected]> Signed-off-by: Nir Rozenbaum <[email protected]> Co-authored-by: Nir Rozenbaum <[email protected]>
This PR adds the support to invoke the PostResponse scheduler plugins during the processing of response headers.
This PR with the PostResponse plugins invoked at response header processing time, enables developers to add headers to those that are sent to the client. This might include a session id or session token, usefull in session aware routing.
The changes included in this PR include:
A more complex test that includes sending the gRPC messages will be added in a future PR that we plan to submit.
Note: This PR is dependent on PR #799