feat: deploy SLA profiler to k8s #2030
Merged
76 commits
065cb2a feat: update k8s deploy yamls to use binary/python3 (hhzhang16)
aee478c config part working (tedzhouhk)
9455ad1 feat: add component type worker and bump image (hhzhang16)
f3dd01a fix: merge conflicts (mohammedabdulwahhab)
7de97ef fix: using health checks exposed by dynamo-run (mohammedabdulwahhab)
16fd7f2 Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dep-2… (hhzhang16)
3a29913 Merge branch 'hannahz/dep-216-create-deploy-crds-for-vllm_v1-example'… (hhzhang16)
51835db fix: check for message in logs (mohammedabdulwahhab)
39b377f Merge branch 'hannahz/dep-216-create-deploy-crds-for-vllm_v1-example'… (mohammedabdulwahhab)
dddb45f Merge branch 'hannahz/dep-216-create-deploy-crds-for-vllm_v1-example'… (tedzhouhk)
34bc79c define apis (tedzhouhk)
8c22d14 update script (tedzhouhk)
9856dde fix: add dynamodeployment lib (mohammedabdulwahhab)
61a215b fix: working client lib (mohammedabdulwahhab)
5141334 fix: working client lib (mohammedabdulwahhab)
8e25a29 integrate with utils.dynamo_deployment (tedzhouhk)
1d87164 fix: port forward works (mohammedabdulwahhab)
aaf4544 Merge branch 'hzhou/profile_vllmv1_k8s' of https://github.com/ai-dyna… (mohammedabdulwahhab)
65dec07 pc (tedzhouhk)
0af209b add dep; bug fix (tedzhouhk)
918733a Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou… (tedzhouhk)
3f900ef staging, port forward not working (tedzhouhk)
bd12d40 stage (tedzhouhk)
7ac43a9 Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou… (mohammedabdulwahhab)
9971acf fix: running script (mohammedabdulwahhab)
a5d8aca fix: fix (mohammedabdulwahhab)
7b1d99a Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou… (tedzhouhk)
f8f9363 add logic to find a free port (tedzhouhk)
8e292f6 feat: add Kubernetes service account configuration for SLA profiling … (hhzhang16)
d62731f feat: use service DNS for interfacing with deployments when profiling… (hhzhang16)
a1aea5a Revert "feat: use service DNS for interfacing with deployments when p… (hhzhang16)
06bfe3b feat: use service DNS instead of port forwarding for K8s-deployed SLA… (hhzhang16)
6a2dcd0 feat: add service account configuration files and deployment changes (hhzhang16)
606b4e3 feat: add profile_sla_rbac instead of the job (hhzhang16)
0980195 feat: wip of profiling vllm_v1 (hhzhang16)
babf639 feat: wip of profiling sla job (hhzhang16)
f934160 feat: use in-cluster service accounts if possible (hhzhang16)
e911248 feat: use sa instead of pullsecret in job (hhzhang16)
3d2284a feat: working serviceaccount (hhzhang16)
6062b8a feat: wip of using dns instead of portforward if running in k8s (hhzhang16)
fcfa5f4 feat: service dns fixes with k8s client (hhzhang16)
aec7f6f feat: fully replace port with base_url (hhzhang16)
0155042 feat: resize profiling pvc to make it larger (hhzhang16)
832f570 feat: wip of cleaning up deployments and testing (hhzhang16)
ff96b9e add try-catch waiting for deployment (tedzhouhk)
e95cecf feat: skipping sweeps if they exist in the output dir (hhzhang16)
cec3a0a feat: cleaning up sla profiler deployment (hhzhang16)
f16158a feat: update readme (hhzhang16)
5419885 Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou… (tedzhouhk)
d2b6b00 feat: clean up outlying DGDs upon SLA profiling failure (#2016) (hhzhang16)
df4795f Merge branch 'hzhou/profile_vllmv1_k8s' of github.com:ai-dynamo/dynam… (hhzhang16)
79c7e58 feat: newest deployment yamls (hhzhang16)
450d371 add debug info (tedzhouhk)
d8ffe1a Merge branch 'hzhou/profile_vllmv1_k8s' of https://github.com/ai-dyna… (tedzhouhk)
b66c347 feat: fixes for CI (hhzhang16)
615fdfb Merge branches 'main' and 'hzhou/profile_vllmv1_k8s' of github.com:ai… (hhzhang16)
e505288 feat: update deploy images (hhzhang16)
5772013 feat: remove k8s.sh (hhzhang16)
ad89cec feat: remove readme (hhzhang16)
8086166 chore: cleanup, add doc (#2053) (tedzhouhk)
65cc1e7 feat: add instructions on how to view images/profiling results (hhzhang16)
f6ef37e Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dep-2… (hhzhang16)
a26f1bd feat: shorten copyright headings (hhzhang16)
62155db Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dep-2… (hhzhang16)
dd41754 Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hanna… (tedzhouhk)
5f60d6a Merge branch 'hannahz/dep-233-deploy-sla-profiler-to-k8s' of https://… (tedzhouhk)
f84d3c5 mypy (tedzhouhk)
0bb3389 Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hanna… (tedzhouhk)
93d6734 docs: minor path change (hhzhang16)
1716dab docs: rewrite rbac (hhzhang16)
8ceb061 docs: remove mentions of dynamo serve (hhzhang16)
89d57ff add note (tedzhouhk)
a7d28c6 Merge branch 'hannahz/dep-233-deploy-sla-profiler-to-k8s' of https://… (tedzhouhk)
10d6426 typo (tedzhouhk)
9413ef1 increase timeout, update yaml (tedzhouhk)
0a586b1 pc (tedzhouhk)
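Several commit messages above describe switching from port forwarding to in-cluster service DNS and replacing a port argument with a base URL. The PR's actual code is not shown on this page, so the following is only a minimal sketch of that idea. The function names, the service name, and the port values are hypothetical. `KUBERNETES_SERVICE_HOST` is the variable Kubernetes sets inside pods, and `cluster.local` is the default cluster domain, which a cluster may override.

```python
import os
from typing import Mapping, Optional


def in_cluster(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the process appears to run inside a Kubernetes pod."""
    env = os.environ if env is None else env
    return bool(env.get("KUBERNETES_SERVICE_HOST"))


def deployment_base_url(
    service: str,
    namespace: str,
    port: int,
    local_port: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a base URL: service DNS in-cluster, a forwarded local port otherwise."""
    if in_cluster(env):
        # Assumes the default cluster domain "cluster.local".
        return f"http://{service}.{namespace}.svc.cluster.local:{port}"
    if local_port is None:
        raise ValueError("outside the cluster, a forwarded local port is required")
    return f"http://localhost:{local_port}"


if __name__ == "__main__":
    # Hypothetical service name and ports, for illustration only.
    print(deployment_base_url("frontend", "demo", 8000, local_port=18000))
```

Callers that only receive a base URL do not need to know whether a port forward is involved, which may be why the port argument was replaced.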
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: RoleBinding | ||
| metadata: | ||
|   name: profile-sla-binding | ||
|   namespace: ${NAMESPACE} | ||
| subjects: | ||
|   - kind: ServiceAccount | ||
|     name: profile-sla-sa | ||
|     namespace: ${NAMESPACE} | ||
| roleRef: | ||
|   kind: Role | ||
|   name: profile-sla-role | ||
|   apiGroup: rbac.authorization.k8s.io | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| apiVersion: batch/v1 | ||
| kind: Job | ||
| metadata: | ||
|   name: profile-sla | ||
|   namespace: ${NAMESPACE} | ||
| spec: | ||
|   template: | ||
|     spec: | ||
|       serviceAccountName: profile-sla-sa | ||
|       containers: | ||
|         - name: profile-sla | ||
|           image: ${DOCKER_IMAGE} | ||
|           resources: | ||
|             requests: | ||
|               cpu: "1" | ||
|               memory: "2Gi" | ||
|             limits: | ||
|               cpu: "2" | ||
|               memory: "4Gi" | ||
|           env: | ||
|             - name: HUGGING_FACE_HUB_TOKEN | ||
|               valueFrom: | ||
|                 secretKeyRef: | ||
|                   name: hf-token-secret | ||
|                   key: HF_TOKEN | ||
|             - name: NATS_SERVER | ||
|               value: nats://${NAMESPACE}-nats:4222 | ||
|             - name: ETCD_ENDPOINTS | ||
|               value: ${NAMESPACE}-etcd:2379 | ||
|           command: ["python", "/workspace/benchmarks/profiler/profile_sla.py"] | ||
|           args: | ||
|             - --config | ||
|             - ${DGD_CONFIG_FILE} | ||
|             - --output-dir | ||
|             - /workspace/profiling_results | ||
|             - --namespace | ||
|             - ${NAMESPACE} | ||
|           volumeMounts: | ||
|             - name: output-volume | ||
|               mountPath: /workspace/profiling_results | ||
|       restartPolicy: Never | ||
|       volumes: | ||
|         - name: output-volume | ||
|           persistentVolumeClaim: | ||
|             claimName: profiling-pvc | ||
|   backoffLimit: 0 |
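The Job manifest uses `${NAMESPACE}`, `${DOCKER_IMAGE}`, and `${DGD_CONFIG_FILE}` placeholders, so it has to be rendered before it can be applied. This page does not show how the project performs that substitution. As an illustration only, the sketch below fills the same placeholder syntax with Python's standard `string.Template`. The `render` helper and the example values are hypothetical.

```python
from string import Template

# Excerpt of the Job manifest above: only lines that carry placeholders.
JOB_EXCERPT = """\
metadata:
  name: profile-sla
  namespace: ${NAMESPACE}
spec:
  template:
    spec:
      containers:
        - name: profile-sla
          image: ${DOCKER_IMAGE}
          env:
            - name: NATS_SERVER
              value: nats://${NAMESPACE}-nats:4222
            - name: ETCD_ENDPOINTS
              value: ${NAMESPACE}-etcd:2379
"""


def render(manifest: str, values: dict) -> str:
    """Fill placeholders; raises KeyError if a variable is missing."""
    return Template(manifest).substitute(values)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render(JOB_EXCERPT, {
        "NAMESPACE": "demo",
        "DOCKER_IMAGE": "registry.example/profiler:tag",
    }))
```

`substitute` is used rather than `safe_substitute` so that a forgotten variable fails loudly instead of leaving a literal `${...}` in the manifest.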
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: Role | ||
| metadata: | ||
|   name: profile-sla-role | ||
|   namespace: ${NAMESPACE} | ||
| rules: | ||
|   # DynamoGraphDeployment custom resources - needed for create/get/delete operations | ||
|   - apiGroups: ["nvidia.com"] | ||
|     resources: ["dynamographdeployments"] | ||
|     verbs: ["get", "create", "delete"] | ||
|   # Pods - needed for listing pods by label selector and getting logs | ||
|   - apiGroups: [""] | ||
|     resources: ["pods"] | ||
|     verbs: ["list"] | ||
|   - apiGroups: [""] | ||
|     resources: ["pods/log"] | ||
|     verbs: ["get"] |
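Kubernetes RBAC is additive: anything a rule does not grant is denied. To make the scope of the Role above concrete, here is a small sketch that checks an operation against the three rules. It is a simplified model written for illustration. The `allows` helper is hypothetical and ignores wildcards, `resourceNames`, and ClusterRoles, which the real authorizer does handle.

```python
# Rules copied from the Role manifest above.
RULES = [
    {
        "apiGroups": ["nvidia.com"],
        "resources": ["dynamographdeployments"],
        "verbs": ["get", "create", "delete"],
    },
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["list"]},
    {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
]


def allows(rules: list, api_group: str, resource: str, verb: str) -> bool:
    """Return True if any single rule grants verb on resource in api_group."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


if __name__ == "__main__":
    checks = [
        ("nvidia.com", "dynamographdeployments", "create"),
        ("nvidia.com", "dynamographdeployments", "patch"),
        ("", "pods", "list"),
        ("", "pods", "get"),
        ("", "pods/log", "get"),
    ]
    for group, resource, verb in checks:
        print(f"{verb} {resource}: {allows(RULES, group, resource, verb)}")
```

Under this model the service account can list pods and read their logs but cannot get a single pod, and it cannot patch or list DynamoGraphDeployments. On a live cluster, `kubectl auth can-i` with `--as=system:serviceaccount:<namespace>:profile-sla-sa` gives the authoritative answer.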
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| apiVersion: v1 | ||
| kind: ServiceAccount | ||
| metadata: | ||
|   name: profile-sla-sa | ||
|   namespace: ${NAMESPACE} | ||
| imagePullSecrets: | ||
|   - name: nvcr-imagepullsecret |
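The RoleBinding, Job, Role, and ServiceAccount manifests above are connected only by name: the binding points at `profile-sla-role` and `profile-sla-sa`, and the Job runs as `profile-sla-sa`. A typo in any one of them would leave the Job without its permissions. The sketch below shows one way to check that wiring. The `wiring_problems` helper is hypothetical and the dictionaries are hand-copied excerpts rather than parsed YAML.

```python
# Name fields copied from the manifests above.
ROLE_BINDING = {
    "subjects": [{"kind": "ServiceAccount", "name": "profile-sla-sa"}],
    "roleRef": {"kind": "Role", "name": "profile-sla-role"},
}
ROLE_NAME = "profile-sla-role"
SERVICE_ACCOUNT_NAME = "profile-sla-sa"
JOB_SERVICE_ACCOUNT = "profile-sla-sa"


def wiring_problems(binding: dict, role: str, sa: str, job_sa: str) -> list:
    """Return human-readable mismatches between the four manifests."""
    problems = []
    if binding["roleRef"]["name"] != role:
        problems.append("RoleBinding roleRef does not name the Role")
    bound = {
        s["name"] for s in binding["subjects"] if s["kind"] == "ServiceAccount"
    }
    if sa not in bound:
        problems.append("RoleBinding does not bind the ServiceAccount")
    if job_sa != sa:
        problems.append("Job runs as a different ServiceAccount")
    return problems


if __name__ == "__main__":
    print(wiring_problems(
        ROLE_BINDING, ROLE_NAME, SERVICE_ACCOUNT_NAME, JOB_SERVICE_ACCOUNT
    ))
```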
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| apiVersion: v1 | ||
| kind: PersistentVolumeClaim | ||
| metadata: | ||
|   name: profiling-pvc | ||
|   namespace: ${NAMESPACE} | ||
| spec: | ||
|   accessModes: | ||
|     - ReadWriteOnce | ||
|   resources: | ||
|     requests: | ||
|       storage: 50Gi |