Commit 7bd2053
feat: Load the SchedulerConfig from a configuration file/text and make it easier to add plugins (#881)
* configuration implementation (after rebase...)
Signed-off-by: Shmuel Kallner <[email protected]>
* Moved plugin registry back to pkg/epp/plugins
Signed-off-by: Shmuel Kallner <[email protected]>
* Removed unneeded 'forced imports' of scorers
Signed-off-by: Shmuel Kallner <[email protected]>
* Changed 'profilepicker' to 'profilehandler' in new and old code
Signed-off-by: Shmuel Kallner <[email protected]>
* Pass the configured SchedulingProfiles to LoadSchedulerConfig
Signed-off-by: Shmuel Kallner <[email protected]>
* Ensure that both the configText and configFile flags are not specified
Signed-off-by: Shmuel Kallner <[email protected]>
* Load RequestControl plugins from the configuration
Signed-off-by: Shmuel Kallner <[email protected]>
* Register all plugin factories
Signed-off-by: Shmuel Kallner <[email protected]>
* Review fixes
Signed-off-by: Shmuel Kallner <[email protected]>
* Reverted unneeded change
Signed-off-by: Shmuel Kallner <[email protected]>
* Updates from review comments
Signed-off-by: Shmuel Kallner <[email protected]>
* Added a stub interface for plugins to get data from the EPP
Signed-off-by: Shmuel Kallner <[email protected]>
* Added a temporary implementation of plugins.Handle
Signed-off-by: Shmuel Kallner <[email protected]>
* Added pluginName and plugins.Handle to plugin factory interface
Signed-off-by: Shmuel Kallner <[email protected]>
* Updated plugin factory signatures to reflect new API
Signed-off-by: Shmuel Kallner <[email protected]>
* Updated plugin instantiation to reflect new API
Signed-off-by: Shmuel Kallner <[email protected]>
* Updated plugin instantiation to reflect new API
Signed-off-by: Shmuel Kallner <[email protected]>
* Updated tests to reflect new API
Signed-off-by: Shmuel Kallner <[email protected]>
* Do not rename the imported package
Signed-off-by: Shmuel Kallner <[email protected]>
* Only upper layer of code should log errors
Signed-off-by: Shmuel Kallner <[email protected]>
* Only pass what is needed to instantiate the plugins
Signed-off-by: Shmuel Kallner <[email protected]>
* Review updates
Signed-off-by: Shmuel Kallner <[email protected]>
* Review update
Signed-off-by: Shmuel Kallner <[email protected]>
* Review update. Make more clear that the code only checks for already defined names
Signed-off-by: Shmuel Kallner <[email protected]>
* fixed e2e doc in makefile (does not require GPUs) (#976)
Signed-off-by: Nir Rozenbaum <[email protected]>
* API: Adds 5xx Status Code for Invalid ExtRef (#991)
Signed-off-by: Daneyon Hansen <[email protected]>
* feat(conformance): Add test for invalid EPP service reference (#959)
* fix boilerplate header
* add tests for InferencePoolInvalidEPPService
* change to expect error on httproute refcond
* moved the creation of the context to main.go. (#995)
This is useful when writing a different main, such as the one in llm-d, because it allows the same context to be propagated to the whole system.
Signed-off-by: Nir Rozenbaum <[email protected]>
* fix dead links (#989)
* feat: add health check for epp cluster (#966)
* feat: add health check for epp cluster
Signed-off-by: zhengkezhou1 <[email protected]>
* remove tls
Signed-off-by: zhengkezhou1 <[email protected]>
* don't use tls
Signed-off-by: zhengkezhou1 <[email protected]>
* health checking flag
Signed-off-by: zhengkezhou1 <[email protected]>
* fix import
Signed-off-by: zhengkezhou1 <[email protected]>
* add tls options
Signed-off-by: zhengkezhou1 <[email protected]>
---------
Signed-off-by: zhengkezhou1 <[email protected]>
* Server unit test and utility to help with such tests (#820)
Signed-off-by: Ira <[email protected]>
* Update dynamic-lora-sidecar to expose metrics to track loaded adapters (#980)
* Add a metric to track loaded adapters
* Update the sample manifests
* Add explanation of metrics from dynamic LoRA adapter sidecar
* Add explanation of metrics from dynamic LoRA adapter sidecar (take 2)
* Update metrics.md based on feedback
* refactor: Replace prefix cache structure with golang-lru (#928)
* refactor: Replace prefix cache structure with golang-lru
Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>
* fix: rename prefix scorer parameters and convert test to benchmark test
Signed-off-by: Kfir Toledo <[email protected]>
* feat: Add per server LRU capacity
Signed-off-by: Kfir Toledo <[email protected]>
* fix: Fix typos and error handling
Signed-off-by: Kfir Toledo <[email protected]>
* fix: add safety check for LRUCapacityPerServer
Signed-off-by: Kfir Toledo <[email protected]>
---------
Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>
* feat(conformance): Add HTTPRouteMultipleRulesDifferentPools test (#834)
* copy of accepted inference pool test to start from.
* add yaml file for the test
* update time out
* update the yaml file to add port 9002
* read timeout config from local repo
* remove excess comments
* correct spelling for scenarios
* check route condition on RouteConditionResolvedRefs
* remove empty lines in yaml
* set optional/defaulted fields as unspecified
* fix timeout
* fix boilerplate header
* change variable names to use primary secondary consistently.
* remove extra comments
* factor out common code
* Add actual http traffic validation using echo-basic
* remove extra comments from manifest
* remove modifiedTimeoutConfig.HTTPRouteMustHaveCondition per review comment.
* intermediate update
* fix the test run
* factor out common code
* move epp def to shared manifest
* remove extra comments
* revert back to two epps
* add to do for epp image
* switch to GeneralMustHaveConditionTimeout
* undo gateway version changes
* remove unused HTTPRouteMustHaveConditions
* update doc string for GetPod
* update docstring
* Remove resource type from names in manifests.
* remove type from name
* remove health check
* add todo for combining getpod methods
* configuration implementation (after rebase...)
Signed-off-by: Shmuel Kallner <[email protected]>
* After review, made code more obvious
Signed-off-by: Shmuel Kallner <[email protected]>
* Fixed merge issues
Signed-off-by: Shmuel Kallner <[email protected]>
---------
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: Ira <[email protected]>
Signed-off-by: Kfir Toledo <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: sina chavoshi <[email protected]>
Co-authored-by: Xudong Wang <[email protected]>
Co-authored-by: Zhengke Zhou <[email protected]>
Co-authored-by: Ira Rosen <[email protected]>
Co-authored-by: Shotaro Kohama <[email protected]>
Co-authored-by: Kfir Toledo <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>
1 parent 68c73c0 commit 7bd2053
File tree
31 files changed
+1728 -28 lines changed
- api/config/v1alpha1
- cmd/epp
- runner
- pkg/epp
- common/config
- plugins
- registry
- requestcontrol
- scheduling
- config
- framework/plugins
- filter
- multi/prefix
- picker
- profile
- scorer
- test/testdata