Closed
53 commits
717575c
(llmisvc): migrate to v1 InferencePool with v1alpha2 failover
KillianGolds Oct 2, 2025
7b048d8
Resolve pre-commit linting errors
KillianGolds Oct 2, 2025
a47c80f
Downgrade gateway-api-inference-extension to v1.0.0
KillianGolds Oct 2, 2025
79e6c27
upgrade controller-gen to v0.17.2
KillianGolds Oct 2, 2025
8c16e7c
Apply fork workaround for GIE v1.0.0 validation bug and regenerate ma…
KillianGolds Oct 6, 2025
86aeed4
Add GIE v1 InferencePool CRD for integration tests
KillianGolds Oct 7, 2025
4b4844b
Fix openapi_generated.go formatting
KillianGolds Oct 7, 2025
02b0abe
Add GIE v1alpha2 InferencePool CRD for integration tests
KillianGolds Oct 7, 2025
06565e5
fix(llmisvc): add required matchLabels to scheduler config
KillianGolds Oct 7, 2025
a1d8e51
fix(llmisvc): add required port field to endpointPickerRef
KillianGolds Oct 7, 2025
68c55f0
fix(llmisvc): fix integration tests for GIE v1 migration
KillianGolds Oct 7, 2025
c8353e1
chore: regenerate openapi_generated.go
KillianGolds Oct 7, 2025
7f0bebf
fix(llmisvc): fix GIE v1 InferencePool test helpers
KillianGolds Oct 7, 2025
3eb5d45
fix(llmisvc): use int64 for unstructured port numbers
KillianGolds Oct 7, 2025
4ffa44a
fix(llmisvc): restore config merge logic and update tests for GIE v1
KillianGolds Oct 8, 2025
b382549
fix(llmisvc): add GIE v1 scheme registration and InferencePool watch
KillianGolds Oct 9, 2025
920b8eb
fix(llmisvc): add watches for v1alpha2 InferencePool and InferenceModel
KillianGolds Oct 14, 2025
3bed0b4
chore: upgrade to Kubernetes v0.34, Gateway API v1.4, and KEDA v2.18
KillianGolds Oct 15, 2025
3581f80
fix: update Dockerfiles to use Go 1.24.7
KillianGolds Oct 15, 2025
60b5a4f
fix: allow Go 1.24.6 to build code requiring 1.24.7
KillianGolds Oct 15, 2025
7e2f7c8
fix: use GOTOOLCHAIN=auto to handle Go version mismatch
KillianGolds Oct 15, 2025
65536cd
add GOTOOLCHAIN=auto to all Dockerfiles for Go version compatibility
KillianGolds Oct 15, 2025
73ce255
fix(build): disable go-licenses checks in localmodel Dockerfiles
KillianGolds Oct 15, 2025
ff187b6
fix(rbac): add GIE v1 inferencemodels and inferenceobjectives permiss…
KillianGolds Oct 15, 2025
e416655
fix(e2e): upgrade Gateway API Inference Extension to v1.0.0
KillianGolds Oct 15, 2025
3e2bbcd
fix(llmisvc): implement dual-pool fallback in InferencePool readiness…
KillianGolds Oct 16, 2025
ef835ac
fix(llmisvc): auto-inject dual InferencePool backend refs for schedul…
KillianGolds Oct 16, 2025
9d04f60
Make Precommit failure
KillianGolds Oct 16, 2025
8f5a127
fix(llmisvc): use separate pool names for v1 and v1alpha2 backend refs
KillianGolds Oct 16, 2025
b63f4ec
fix(llmisvc): check Gateway Controller support before migrating to v1
KillianGolds Oct 16, 2025
f41abf5
fix(llmisvc): check ResolvedRefs condition in InferencePool readiness…
KillianGolds Oct 16, 2025
5786f15
fix(llmisvc): defer InferencePool readiness evaluation until after HT…
KillianGolds Oct 17, 2025
97449c2
fix(ci): install GIE controller in OpenShift E2E for InferencePool su…
KillianGolds Oct 17, 2025
2a8caa2
fix(ci): enable InferencePool support in Istio
KillianGolds Oct 17, 2025
bd989f7
fix(llmisvc): add retry logic for HTTPRoute update conflicts
KillianGolds Oct 17, 2025
4108db0
fix(ci): use openshift-default GatewayClass name for E2E tests
KillianGolds Oct 17, 2025
38bd1a9
chore(deps): switch to upstream gateway-api with merged validation fix
KillianGolds Oct 17, 2025
254ea6c
fix(llmisvc): add retry logic for finalizer operations and fix async …
KillianGolds Oct 17, 2025
7567270
chore: update generated files from precommit hook
KillianGolds Oct 17, 2025
ecb1a22
fix(llmisvc): trim HTTPRoute to single rule for GIE v1 migration
KillianGolds Oct 17, 2025
abb40d0
fix(typo)
KillianGolds Oct 17, 2025
b7caa8f
fix(llmisvc): initialize Config field in test reconciler
KillianGolds Oct 17, 2025
8ca895c
fix(typo): Update other typo
KillianGolds Oct 17, 2025
3b4c9ad
fix(llmisvc): add required fields to InferencePool status for GIE v1 …
KillianGolds Oct 17, 2025
f673d7b
fix(llmisvc): correct Service backend detection in extractRoutePath
KillianGolds Oct 18, 2025
2baf173
test(llmisvc): make auth tests conditional on RHCL availability
KillianGolds Oct 18, 2025
dc6c47c
fix(llmisvc): make InferencePool config compatible across KServe vers…
KillianGolds Oct 19, 2025
bb9a3aa
chore: run precommit fixes and codegen
KillianGolds Oct 19, 2025
578871d
fix(llmisvc): enable v1alpha2 fallback in dual-pool strategy
KillianGolds Oct 19, 2025
6e09281
fix(test): correct rhcl_available fixture usage in auth tests
KillianGolds Oct 19, 2025
2e0f069
refactor(test): remove duplicate backend ref builder functions
KillianGolds Oct 19, 2025
2f227a8
chore: run precommit
KillianGolds Oct 20, 2025
b68114a
fix(llmisvc): watch v1 InferencePool status changes for migration
KillianGolds Oct 20, 2025
2 changes: 2 additions & 0 deletions Dockerfile
@@ -6,6 +6,8 @@ WORKDIR /go/src/github.com/kserve/kserve
COPY go.mod go.mod
COPY go.sum go.sum

+# Allow Go to automatically download the required toolchain version
+ENV GOTOOLCHAIN=auto
RUN go mod download

COPY cmd/ cmd/
3 changes: 3 additions & 0 deletions Makefile
@@ -198,6 +198,9 @@ test: fmt vet manifests envtest test-qpext
test-qpext:
cd qpext && go test -v ./... -cover

+test-llmisvc: fmt vet manifests envtest
+KUBEBUILDER_ASSETS="$$($(ENVTEST) use $(ENVTEST_K8S_VERSION) -p path)" go test --timeout 20m ./pkg/controller/llmisvc/... -coverprofile coverage.out -coverpkg ./pkg/... ./cmd...
+
# Build manager binary
manager: generate fmt vet go-lint
go build -o bin/manager ./cmd/manager
2 changes: 1 addition & 1 deletion Makefile.tools.mk
@@ -17,7 +17,7 @@ POETRY = $(PYTHON_BIN)/poetry

## Tool versions.
GOLANGCI_LINT_VERSION ?= v1.64.8
-CONTROLLER_TOOLS_VERSION ?= v0.16.2
+CONTROLLER_TOOLS_VERSION ?= v0.17.2
ENVTEST_VERSION ?= latest
YQ_VERSION ?= v4.28.1
HELM_DOCS_VERSION ?= v1.12.0
2 changes: 2 additions & 0 deletions agent.Dockerfile
@@ -6,6 +6,8 @@ WORKDIR /go/src/github.com/kserve/kserve
COPY go.mod go.mod
COPY go.sum go.sum

+# Allow Go to automatically download the required toolchain version
+ENV GOTOOLCHAIN=auto
RUN go mod download

COPY cmd/ cmd/
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: clusterservingruntimes.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: clusterstoragecontainers.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: inferencegraphs.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: inferenceservices.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: llminferenceserviceconfigs.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: llminferenceservices.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: localmodelcaches.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: localmodelnodegroups.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: localmodelnodes.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: servingruntimes.serving.kserve.io
spec:
group: serving.kserve.io
@@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: trainedmodels.serving.kserve.io
spec:
group: serving.kserve.io
11 changes: 0 additions & 11 deletions charts/kserve-resources/templates/clusterrole.yaml
@@ -121,17 +121,6 @@ rules:
- watch
- apiGroups:
- inference.networking.k8s.io
resources:
- inferencepools
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- inference.networking.x-k8s.io
resources:
- inferencemodels
7 changes: 7 additions & 0 deletions cmd/manager/main.go
@@ -33,6 +33,7 @@ import (
istioclientv1beta1 "istio.io/client-go/pkg/apis/networking/v1beta1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
+"k8s.io/client-go/dynamic"
"k8s.io/client-go/kubernetes"
typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
@@ -311,10 +312,16 @@ func main() {
setupLog.Info("Setting up LLMInferenceService controller")
llmEventBroadcaster := record.NewBroadcaster()
llmEventBroadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{Interface: clientSet.CoreV1().Events("")})
+dynamicClient, err := dynamic.NewForConfig(mgr.GetConfig())
+if err != nil {
+setupLog.Error(err, "unable to create dynamic client")
+os.Exit(1)
+}
if err = (&llmisvc.LLMInferenceServiceReconciler{
Client: mgr.GetClient(),
Config: mgr.GetConfig(),
Clientset: clientSet,
+DynamicClient: dynamicClient,
EventRecorder: llmEventBroadcaster.NewRecorder(scheme, corev1.EventSource{Component: "LLMInferenceServiceController"}),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "v1beta1Controller", "InferenceService")
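The `DynamicClient` wired into the reconciler in this hunk operates on unstructured objects, whose content is a JSON-like `map[string]interface{}` tree. Commit 3eb5d45 ("use int64 for unstructured port numbers") points at a common pitfall with such trees: apimachinery helpers such as `unstructured.NestedInt64` only accept `int64`, so a plain Go `int` is rejected. The sketch below imitates that strictness using only the standard library. `nestedInt64` and the `spec.port` path are hypothetical stand-ins, not the PR's code and not the InferencePool schema.

```go
package main

import "fmt"

// nestedInt64 is a hypothetical, stdlib-only imitation of a strict nested
// lookup: it walks the field path and accepts the leaf only if it is int64.
func nestedInt64(obj map[string]any, fields ...string) (int64, bool, error) {
	var cur any = obj
	for _, f := range fields {
		m, ok := cur.(map[string]any)
		if !ok {
			return 0, false, fmt.Errorf("field %q: parent is %T, not a map", f, cur)
		}
		next, found := m[f]
		if !found {
			return 0, false, nil // missing field is not an error
		}
		cur = next
	}
	n, ok := cur.(int64)
	if !ok {
		return 0, false, fmt.Errorf("value is %T, expected int64", cur)
	}
	return n, true, nil
}

func main() {
	// An untyped constant stored in an interface becomes int, not int64.
	bad := map[string]any{"spec": map[string]any{"port": 9002}}
	good := map[string]any{"spec": map[string]any{"port": int64(9002)}}

	_, _, err := nestedInt64(bad, "spec", "port")
	fmt.Println("int literal:", err)

	port, found, err := nestedInt64(good, "spec", "port")
	fmt.Println("int64 value:", port, found, err)
}
```

Test helpers that build unstructured objects by hand are where this usually bites, because objects decoded from the API server already carry `int64` or `float64` numbers.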
137 changes: 136 additions & 1 deletion config/crd/full/serving.kserve.io_clusterservingruntimes.yaml
@@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
-controller-gen.kubebuilder.io/version: v0.16.2
+controller-gen.kubebuilder.io/version: v0.17.2
name: clusterservingruntimes.serving.kserve.io
spec:
group: serving.kserve.io
@@ -516,6 +516,23 @@ spec:
- fieldPath
type: object
x-kubernetes-map-type: atomic
+fileKeyRef:
+properties:
+key:
+type: string
+optional:
+default: false
+type: boolean
+path:
+type: string
+volumeName:
+type: string
+required:
+- key
+- path
+- volumeName
+type: object
+x-kubernetes-map-type: atomic
resourceFieldRef:
properties:
containerName:
@@ -604,6 +621,23 @@ spec:
- fieldPath
type: object
x-kubernetes-map-type: atomic
+fileKeyRef:
+properties:
+key:
+type: string
+optional:
+default: false
+type: boolean
+path:
+type: string
+volumeName:
+type: string
+required:
+- key
+- path
+- volumeName
+type: object
+x-kubernetes-map-type: atomic
resourceFieldRef:
properties:
containerName:
@@ -1032,6 +1066,29 @@ spec:
type: object
restartPolicy:
type: string
+restartPolicyRules:
+items:
+properties:
+action:
+type: string
+exitCodes:
+properties:
+operator:
+type: string
+values:
+items:
+format: int32
+type: integer
+type: array
+x-kubernetes-list-type: set
+required:
+- operator
+type: object
+required:
+- action
+type: object
+type: array
+x-kubernetes-list-type: atomic
securityContext:
properties:
allowPrivilegeEscalation:
@@ -1908,6 +1965,25 @@ spec:
type: array
x-kubernetes-list-type: atomic
type: object
+podCertificate:
+properties:
+certificateChainPath:
+type: string
+credentialBundlePath:
+type: string
+keyPath:
+type: string
+keyType:
+type: string
+maxExpirationSeconds:
+format: int32
+type: integer
+signerName:
+type: string
+required:
+- keyType
+- signerName
+type: object
secret:
properties:
items:
@@ -2585,6 +2661,23 @@ spec:
- fieldPath
type: object
x-kubernetes-map-type: atomic
+fileKeyRef:
+properties:
+key:
+type: string
+optional:
+default: false
+type: boolean
+path:
+type: string
+volumeName:
+type: string
+required:
+- key
+- path
+- volumeName
+type: object
+x-kubernetes-map-type: atomic
resourceFieldRef:
properties:
containerName:
@@ -3013,6 +3106,29 @@ spec:
type: object
restartPolicy:
type: string
+restartPolicyRules:
+items:
+properties:
+action:
+type: string
+exitCodes:
+properties:
+operator:
+type: string
+values:
+items:
+format: int32
+type: integer
+type: array
+x-kubernetes-list-type: set
+required:
+- operator
+type: object
+required:
+- action
+type: object
+type: array
+x-kubernetes-list-type: atomic
securityContext:
properties:
allowPrivilegeEscalation:
@@ -3855,6 +3971,25 @@ spec:
type: array
x-kubernetes-list-type: atomic
type: object
+podCertificate:
+properties:
+certificateChainPath:
+type: string
+credentialBundlePath:
+type: string
+keyPath:
+type: string
+keyType:
+type: string
+maxExpirationSeconds:
+format: int32
+type: integer
+signerName:
+type: string
+required:
+- keyType
+- signerName
+type: object
secret:
properties:
items: