Merged
2 changes: 1 addition & 1 deletion test/sidecar/README.md
@@ -8,7 +8,7 @@ a simple 1P1D sample application using the NIXL connector.
 To deploy this application in the `eval` cluster, run this command:

 ```
-$ kustomize build test/config/overlays/fmass/nixl | oc apply -f -
+$ kustomize build test/sidecar/config/overlays/llmd/nixl | oc apply -f -
 ```

 Wait a bit (up to 10mn) for the pods to be running.
10 changes: 0 additions & 10 deletions test/sidecar/config/nixl/inferencemodel.yaml

This file was deleted.

18 changes: 12 additions & 6 deletions test/sidecar/config/nixl/inferencepool.yaml
@@ -1,11 +1,17 @@
-apiVersion: inference.networking.x-k8s.io/v1alpha2
+apiVersion: inference.networking.k8s.io/v1
 kind: InferencePool
 metadata:
   name: qwen2-0-5b
 spec:
-  extensionRef:
-    name: qwen2-0-5b-epp
   selector:
-    llm-d.ai/inferenceServing: "true"
-    llm-d.ai/model: qwen2-0-5b
-  targetPortNumber: 8000
+    matchLabels:
+      llm-d.ai/inferenceServing: "true"
+      llm-d.ai/model: qwen2-0-5b
+  endpointPickerRef:
+    name: qwen2-0-5b-epp
+    kind: Service
+    port:
+      number: 9002
+  targetPorts:
+  - number: 8000
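
The field moves in this hunk (`extensionRef` to `endpointPickerRef`, a flat `selector` to `selector.matchLabels`, and `targetPortNumber` to a `targetPorts` list) can be summarized as a small mapping. The following is only a sketch in Python, not part of the PR: the function name `migrate_inference_pool` and its `epp_port` parameter are hypothetical, the nesting is assumed because the page scrape lost the YAML indentation, and the port value 9002 is simply the one that appears in the new manifest.

```python
from typing import Any, Dict


def migrate_inference_pool(old: Dict[str, Any], epp_port: int = 9002) -> Dict[str, Any]:
    """Map a v1alpha2-style InferencePool dict onto the v1-style layout in the diff.

    Hypothetical helper for illustration; epp_port defaults to the value
    shown in the new manifest (9002).
    """
    spec = old["spec"]
    return {
        "apiVersion": "inference.networking.k8s.io/v1",
        "kind": old["kind"],
        "metadata": dict(old["metadata"]),
        "spec": {
            # The flat label map moves under selector.matchLabels.
            "selector": {"matchLabels": dict(spec["selector"])},
            # extensionRef becomes endpointPickerRef with an explicit kind and port.
            "endpointPickerRef": {
                "name": spec["extensionRef"]["name"],
                "kind": "Service",
                "port": {"number": epp_port},
            },
            # The single targetPortNumber becomes a one-element targetPorts list.
            "targetPorts": [{"number": spec["targetPortNumber"]}],
        },
    }


# The old manifest from the diff, written as a plain dict.
old_pool = {
    "apiVersion": "inference.networking.x-k8s.io/v1alpha2",
    "kind": "InferencePool",
    "metadata": {"name": "qwen2-0-5b"},
    "spec": {
        "extensionRef": {"name": "qwen2-0-5b-epp"},
        "selector": {
            "llm-d.ai/inferenceServing": "true",
            "llm-d.ai/model": "qwen2-0-5b",
        },
        "targetPortNumber": 8000,
    },
}

new_pool = migrate_inference_pool(old_pool)
```

Applied to the old manifest, the sketch yields the same fields as the added lines of the hunk, which makes it a quick way to check that no value was dropped in the migration.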

1 change: 0 additions & 1 deletion test/sidecar/config/nixl/kustomization.yaml
@@ -1,7 +1,6 @@
 resources:
 - qwen-decoder-pod.yaml
 - qwen-prefiller-pod.yaml
-- inferencemodel.yaml
 - inferencepool.yaml
 - qwen-epp.yaml
 - qwen-epp-svc.yaml