Skip to content

Conversation

@atchernych
Copy link
Contributor

@atchernych atchernych commented Jun 5, 2025

Overview:

Add the ability for a user to overwrite the entrypoint and PVCs in the sdk.
DEP-157

Details:

It is the first iteration of the feature.
A more generic approach will be developed.

Where should the reviewer start?

Still need a go test and type enforcement Model class for python

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features
    • Added support for specifying a single main container with detailed Kubernetes configuration in component and graph deployments.
    • Users can now override container commands and arguments via Kubernetes-specific configuration fields.
    • Extended CRDs to include a comprehensive mainContainer specification for more granular pod customization.
  • Bug Fixes
    • Corrected environment variable export syntax in deployment documentation.
  • Documentation
    • Improved deployment instructions and added port forwarding guidance in the Hello World example.
    • Provided commented examples for Kubernetes container overrides in sample code.
  • Chores
    • Updated YAML parsing dependency for improved compatibility and maintenance.
    • Updated attributions and dependency documentation to reflect new YAML library usage.
  • Style
    • Fixed trailing blank lines in CRD YAML files for consistency.
  • Enhancements
    • Added Docker socket mounting option for workspace support in container runtime.

@github-actions github-actions bot added the feat label Jun 5, 2025
@atchernych atchernych changed the title feat: kubernetes overwrites for PVC and entrypoint feat: kubernetes overwrites for PVC and entrypoint [WIP] Jun 5, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jun 5, 2025

Walkthrough

This update introduces support for specifying a dedicated mainContainer in various Kubernetes CRDs and associated Go and Python code. It extends CRD schemas, Go structs, and SDK models to allow precise overrides of container command and arguments. The YAML parsing library is switched from gopkg.in/yaml.v2 to github.com/goccy/go-yaml. Minor documentation and script improvements are also included.

Changes

Files / Areas Changed Summary
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml
deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml
Added comprehensive mainContainer field to extraPodSpec in the CRD schema, mirroring Kubernetes container spec structure.
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponents.yaml
deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponents.yaml
Added mainContainer object to CRD schemas, providing detailed single-container specification alongside existing arrays.
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml
deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml
Added mainContainer object to service extraPodSpec in CRD schemas, fully typed and required name field.
deploy/cloud/operator/api/dynamo/common/common.go
deploy/cloud/operator/api/dynamo/common/zz_generated.deepcopy.go
Added MainContainer pointer field to ExtraPodSpec struct and deep copy logic for it.
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go Added logic to override main container's Command and Args using ExtraPodSpec.MainContainer if specified.
deploy/cloud/operator/internal/controller/dynamocomponent_controller.go Switched YAML parsing library to github.com/goccy/go-yaml.
deploy/cloud/operator/internal/dynamo/graph.go Switched YAML library, added ExtraPodSpec to Config, and logic for main container command/args overrides in deployments.
deploy/cloud/operator/go.mod Replaced gopkg.in/yaml.v2 with github.com/goccy/go-yaml dependency.
ATTRIBUTIONS-Go.md Updated dependency references from gopkg.in/yaml.v2 to github.com/goccy/go-yaml.
deploy/sdk/src/dynamo/sdk/core/protocol/interface.py
deploy/sdk/src/dynamo/sdk/cli/build.py
Added kubernetes_overrides field to ServiceConfig, serialization logic for Kubernetes overrides, and docstring.
examples/hello_world/hello_world.py Added commented example for Kubernetes container overrides in the service decorator.
docs/examples/hello_world.md Improved deployment instructions and added explicit port-forwarding command.
deploy/cloud/helm/README.md Fixed shell syntax for setting PROJECT_ROOT.
container/run.sh Added Docker socket volume mount when --mount-workspace is enabled.
deploy/cloud/operator/internal/dynamo/graph_test.go Added test case verifying ExtraPodSpec.MainContainer command and args overrides in generated deployments.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant PythonSDK
    participant CRD/YAML
    participant GoOperator

    User->>PythonSDK: Define service with kubernetes_overrides
    PythonSDK->>CRD/YAML: Serialize to manifest (mainContainer overrides)
    CRD/YAML->>GoOperator: Apply CRD with mainContainer in extraPodSpec
    GoOperator->>GoOperator: On reconcile, override Command/Args if mainContainer specified
    GoOperator->>Kubernetes: Deploy Pod with overridden main container spec
Loading

Poem

🐇
A hop and a leap, a container appears,
With main specs defined, it now perseveres.
Commands and args, so neatly set,
No more confusion, no need to fret!
With YAMLs that parse and CRDs that grow,
The bunnies deploy—off to the cloud we go!
☁️🥕


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🔭 Outside diff range comments (1)
deploy/sdk/src/dynamo/sdk/lib/service.py (1)

330-345: 🛠️ Refactor suggestion

strings.Fields-style splitting will break quoted arguments

service() converts a string entry-point in the controller via strings.Fields (see controller comment below).
If the user supplies:
entrypoint="python -m myapp --arg='value with spaces'"
the quote is lost and the container receives three args. Consider using github.com/mattn/go-shellwords (import-size is tiny) or ask users for an []string instead of a single string.

🧹 Nitpick comments (3)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1)

113-116: Missing field-level validation on nested struct

Entrypoint and PVCSettings accept arbitrary strings/maps. Consider:

+kubebuilder:validation:MinLength=1

for Entrypoint and

+kubebuilder:validation:MinProperties=1

for PVCSettings to guard against empty overwrites that would otherwise be silently ignored by the controller.

deploy/sdk/src/dynamo/sdk/cli/build.py (1)

149-159: Potentially unused top-level copy of kubernetes_overwrites

ServiceInfo.from_service() stores the dict in the root of the Pydantic model, but ManifestInfo.to_dict() only serialises the copy you later inject into service_dict["config"].
The root-level value is therefore redundant and slightly inflates the YAML size. If it is not consumed elsewhere, drop it:

-        kubernetes_overwrites = None
-        if hasattr(service, "kubernetes_overwrites") and service.kubernetes_overwrites:
-            kubernetes_overwrites = service.kubernetes_overwrites.model_dump()
+        kubernetes_overwrites: t.Optional[t.Dict[str, t.Any]] = (
+            service.kubernetes_overwrites.model_dump()
+            if getattr(service, "kubernetes_overwrites", None)
+            else None
+        )

and delete the kubernetes_overwrites attribute from ServiceInfo altogether.

deploy/sdk/src/dynamo/sdk/lib/service.py (1)

49-65: Minor: embed simple doc-strings for field semantics

Pydantic models render field descriptions in generated docs / JSON-schema.
Adding Field(..., description="…") for size, volume_access_mode, etc., improves operator UX and prevents misuse (e.g. mistaking GiB vs MiB).

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 49-49: Too few public methods (0/2)

(R0903)


[refactor] 60-60: Too few public methods (0/2)

(R0903)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae40af and 1f10640.

📒 Files selected for processing (4)
  • deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2 hunks)
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (3 hunks)
  • deploy/sdk/src/dynamo/sdk/cli/build.py (3 hunks)
  • deploy/sdk/src/dynamo/sdk/lib/service.py (7 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1)
deploy/sdk/src/dynamo/sdk/lib/service.py (2)
  • KubernetesOverwrites (60-64)
  • PVC (49-57)
deploy/sdk/src/dynamo/sdk/cli/build.py (1)
deploy/sdk/src/dynamo/sdk/lib/service.py (1)
  • service (322-368)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (2)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2)
  • DynamoComponentDeployment (136-142)
  • KubernetesOverwrites (113-116)
deploy/sdk/src/dynamo/sdk/lib/service.py (2)
  • PVC (49-57)
  • KubernetesOverwrites (60-64)
🪛 Pylint (3.3.7)
deploy/sdk/src/dynamo/sdk/lib/service.py

[refactor] 49-49: Too few public methods (0/2)

(R0903)


[refactor] 60-60: Too few public methods (0/2)

(R0903)

🔇 Additional comments (1)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1)

82-85: Add CRD validation / +optional markers for the new field

KubernetesOverwrites is a pointer so it will be treated as optional, but the generated CRD/Open-API schema will miss that information unless you add an explicit marker:

// +optional
// +kubebuilder:validation:Optional
KubernetesOverwrites *KubernetesOverwrites `json:"kubernetesOverwrites,omitempty"`

Run make generate afterwards to refresh the CRD manifests.

@atchernych atchernych changed the title feat: kubernetes overwrites for PVC and entrypoint [WIP] Draft: feat: kubernetes overwrites for PVC and entrypoint [WIP] Jun 5, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🔭 Outside diff range comments (1)
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (1)

1565-1591: ⚠️ Potential issue

Fix recursiveReadOnly field type
In Kubernetes API, recursiveReadOnly is a boolean, but the schema declares it as a string. Update to:

- recursiveReadOnly:
-   type: string
+ recursiveReadOnly:
+   type: boolean
♻️ Duplicate comments (4)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (4)

1375-1384: Improve PVC validation as highlighted in past reviews.

The function should validate PVC names more thoroughly to prevent admission errors.

Guard against empty PVC names to fail fast and avoid admission errors:

 addPVC := func(crd metav1.Object, volumeName string, pvc *v1alpha1.PVC) {
-	if pvc == nil {
+	if pvc == nil || pvc.Name == nil || *pvc.Name == "" {
 		return
 	}

1406-1423: Add warning when overwriting existing volumes as noted in past reviews.

Users should be notified when their overwrites replace existing volumes/mounts.

Surface when an overwrite replaces an existing volume/mount rather than silently swapping:

 	if used[volumeName] {
+		// TODO: replace with your operator's logging or event recorder
+		klog.Warningf("PVC volume %q was already added; overwriting with new configuration", volumeName)
 		for i, v := range volumes {

1387-1390: Use more specific default mount path to avoid collisions.

The generic /mnt/default path mentioned in past reviews is indeed risky.

Use a more specific default mount path to avoid collisions:

-	mountPath := fmt.Sprintf("/mnt/%s", volumeName)
+	mountPath := fmt.Sprintf("/mnt/pvc/%s", volumeName)

1804-1820: Address entrypoint parsing concerns from past reviews.

The implementation uses shellwords.Parse (good improvement from past feedback) but still completely replaces Command/Args, discarding the carefully-built default arguments.

Prepend the original args instead of replacing them to preserve existing flags:

 			} else if len(parts) > 0 {
 				container.Command = []string{parts[0]}
 				if len(parts) > 1 {
-					container.Args = parts[1:]
+					container.Args = append(parts[1:], container.Args...)
 				}
 			}
🧹 Nitpick comments (6)
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponents.yaml (2)

1560-1582: Annotate volumeMounts with list mapping directives
The volumeMounts array correctly has type: array and items, but lacks x-kubernetes-list-type and a map key (e.g. mountPath or name). Add them to align with Kubernetes CRD best practices and avoid collisions.

   properties:
     volumeMounts:
       type: array
+      x-kubernetes-list-type: map
+      x-kubernetes-list-map-keys:
+      - name
       items:
         properties:
           mountPath:
             type: string
           name:
             type: string
@@
         required:
         - mountPath
         - name

1582-2364: Define list mapping and keys for volumes
The volumes block introduces a broad set of volume types under items, but omits x-kubernetes-list-type and x-kubernetes-list-map-keys (e.g. name). Including these ensures each volume entry is uniquely keyed and the CRD generator handles merges correctly.

   properties:
     volumes:
       type: array
+      x-kubernetes-list-type: map
+      x-kubernetes-list-map-keys:
+      - name
       items:
         properties:
           name:
             type: string
           awsElasticBlockStore:
             properties:
               volumeID:
                 type: string
@@
         required:
         - name
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml (2)

440-442: Add docs and optional marker for new entrypoint field.
The CRD defines entrypoint without a description or nullable: true, which would improve generated documentation and clarify its optionality. Add a description and mark the field as nullable.


1897-1918: Reuse core VolumeMount schema.
volumeMounts duplicates the standard core.v1.VolumeMount definition. To ensure future compatibility and avoid divergence, import the official schema via $ref or enable x-kubernetes-preserve-unknown-fields: true.

deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (2)

385-386: Add schema description for entrypoint
The new entrypoint field is useful, but consider adding an x-kubernetes-description (or description) and, if applicable, a default to clarify its purpose in the CRD and improve user guidance.


1864-2641: Reference upstream volume schemas
Inlining every Kubernetes volume type increases maintenance burden and risks drifting from the official API. Leverage x-kubernetes-embedded-resource or import a shared common CRD for volumes to ensure compatibility with upstream changes.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f10640 and 34b07a3.

⛔ Files ignored due to path filters (1)
  • deploy/cloud/operator/go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • deploy/cloud/operator/api/dynamo/common/common.go (1 hunks)
  • deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2 hunks)
  • deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (3 hunks)
  • deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponents.yaml (2 hunks)
  • deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml (3 hunks)
  • deploy/cloud/operator/config/rbac/role.yaml (0 hunks)
  • deploy/cloud/operator/go.mod (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (4 hunks)
  • deploy/sdk/src/dynamo/sdk/cli/build.py (3 hunks)
  • deploy/sdk/src/dynamo/sdk/lib/service.py (7 hunks)
  • examples/hello_world/hello_world.py (1 hunks)
💤 Files with no reviewable changes (1)
  • deploy/cloud/operator/config/rbac/role.yaml
✅ Files skipped from review due to trivial changes (2)
  • deploy/cloud/operator/go.mod
  • examples/hello_world/hello_world.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • deploy/sdk/src/dynamo/sdk/cli/build.py
  • deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go
🧰 Additional context used
🪛 golangci-lint (1.64.8)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go

1364-1364: cannot use corev1.ResourceRequirements{…} (value of struct type "k8s.io/api/core/v1".ResourceRequirements) as "k8s.io/api/core/v1".VolumeResourceRequirements value in struct literal

(typecheck)


1494-1494: undefined: claims

(typecheck)


1495-1495: undefined: claims

(typecheck)


1496-1496: undefined: claims

(typecheck)


1352-1352: declared and not used: createClaim

(typecheck)

🪛 Pylint (3.3.7)
deploy/sdk/src/dynamo/sdk/lib/service.py

[refactor] 49-49: Too few public methods (0/2)

(R0903)


[refactor] 60-60: Too few public methods (0/2)

(R0903)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (6)
deploy/cloud/operator/api/dynamo/common/common.go (1)

66-68: LGTM! Clean struct extension for volume support.

The new fields properly extend ExtraPodSpec to support Kubernetes volume management with appropriate types and optional semantics.

deploy/sdk/src/dynamo/sdk/lib/service.py (3)

49-65: Well-designed Pydantic models for Kubernetes overrides.

The PVC and KubernetesOverrides models follow good practices with proper typing, optional fields, and clear field names that map to Kubernetes concepts.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 49-49: Too few public methods (0/2)

(R0903)


[refactor] 60-60: Too few public methods (0/2)

(R0903)


98-98: Clean integration of kubernetes_overrides into DynamoService.

The parameter is properly added to the constructor and stored as an instance attribute, maintaining consistency with the existing pattern.

Also applies to: 163-163


330-330: Elegant handling of dict-to-object conversion in service decorator.

The decorator properly handles both dict and KubernetesOverrides instances, providing a user-friendly API while maintaining type safety.

Also applies to: 343-344, 365-365

deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)

1432-1444: Excellent! Shared memory volume restored.

Great work addressing the past review feedback - the /dev/shm in-memory volume is properly restored, preventing regression for libraries like PyTorch that depend on POSIX shared memory.

deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (1)

2761-2781: Verify pvcSettings naming consistency
The CRD defines mountPoint, volumeAccessMode, etc., here. Ensure these keys match the SDK and controller structs (e.g., renaming mountPoint to mountPath to align with volumeMounts) to avoid mapping mismatches at runtime.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)

1344-1512: ⚠️ Potential issue

Fix compilation errors and clean up unused code.

The function has critical compilation errors that prevent building:

  1. Undefined variable claims at lines 1494-1499: The variable is referenced but never declared
  2. Commented-out functionality: The createClaim function and claims handling logic are partially commented out but still referenced

Apply this diff to fix the compilation errors:

 func buildPVCVolumesAndMounts(
 	component *v1alpha1.DynamoComponentDeployment,
 	sharedMemorySizeLimit resource.Quantity,
-) (volumes []corev1.Volume, mounts []corev1.VolumeMount /* claims []corev1.PersistentVolumeClaim */) {
+) (volumes []corev1.Volume, mounts []corev1.VolumeMount) {

-	// TODO use constructPVC() from common?
-	// Helper: create PVC object from your custom PVC struct
-	// createClaim := func(volumeName string, pvc *v1alpha1.PVC) *corev1.PersistentVolumeClaim {
-	// 	if pvc == nil || pvc.Create == nil || *pvc.Name == "" || !*pvc.Create {
-	// 		return nil
-	// 	}
-	// 	claimName := getPvcName(component, pvc.Name)
-	// 	storageClassName := pvc.StorageClass
-	// 	return &corev1.PersistentVolumeClaim{
-	// 		ObjectMeta: metav1.ObjectMeta{
-	// 			Name: claimName,
-	// 		},
-	// 		Spec: corev1.PersistentVolumeClaimSpec{
-	// 			AccessModes: []corev1.PersistentVolumeAccessMode{pvc.VolumeAccessMode},
-	// 			Resources: corev1.VolumeResourceRequirements{
-	// 				Requests: corev1.ResourceList{
-	// 					corev1.ResourceStorage: pvc.Size,
-	// 				},
-	// 			},
-	// 			StorageClassName: &storageClassName,
-	// 		},
-	// 	}
-	// }

-		// TODO: not sure about the claims in the overwrites.
-		// Add PVCClaims from ExtraPodSpec
-		for _, claim := range extra.PVCClaims {
-			if used[claim.Name] {
-				// Overwrite existing claim with same name
-				for i := range claims {
-					if claims[i].Name == claim.Name {
-						claims[i] = claim
-						break
-					}
-				}
-			} else {
-				claims = append(claims, claim)
-				used[claim.Name] = true
-			}
-		}

-	return volumes, mounts /*. claims */
+	return volumes, mounts
🧰 Tools
🪛 golangci-lint (1.64.8)

1497-1497: undefined: claims

(typecheck)


1498-1498: undefined: claims

(typecheck)


1499-1499: undefined: claims

(typecheck)

🧹 Nitpick comments (1)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)

1377-1432: Improve PVC validation and default mount paths.

The function addresses some previous feedback but could benefit from additional improvements:

  1. Add validation for empty PVC names to fail fast:
 	// Use getPvcName to fall back to CRD name if pvc.Name is nil
 	claimName := getPvcName(crd, pvc.Name)
-	if claimName == "" {
+	if claimName == "" || (pvc.Name != nil && *pvc.Name == "") {
 		return
 	}
  1. Consider using a more specific default mount path to reduce collision risk:
-	mountPath := fmt.Sprintf("/mnt/%s", volumeName)
+	mountPath := fmt.Sprintf("/mnt/pvc/%s", volumeName)

The warning logging for overwrites (line 1409) is well implemented and addresses previous feedback.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34b07a3 and f47dc56.

📒 Files selected for processing (1)
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (5 hunks)
🧰 Additional context used
🪛 golangci-lint (1.64.8)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go

1497-1497: undefined: claims

(typecheck)


1498-1498: undefined: claims

(typecheck)


1499-1499: undefined: claims

(typecheck)

🔇 Additional comments (3)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (3)

36-36: LGTM: Import additions support new functionality.

The klog and shellwords imports are appropriate for the PVC warning logs and entrypoint parsing functionality added in this PR.

Also applies to: 67-67


1677-1679: LGTM: Clean integration of centralized volume handling.

The integration properly delegates PVC volume and mount creation to the new buildPVCVolumesAndMounts function while maintaining existing functionality.


1807-1822: LGTM: Robust entrypoint handling with proper quote parsing.

The implementation correctly uses shellwords.Parse for proper quote handling and includes appropriate error handling. The complete replacement of Command and Args gives users full control over the container startup, which aligns with the entrypoint override feature intent.

Note: This differs from previous feedback about preserving original args, but the current approach provides more flexibility for complete entrypoint customization.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)

1810-1825: Entrypoint overwrite still discards original arguments.

This implementation still has the issue flagged in previous reviews - it completely replaces container.Args instead of preserving the original uv run dynamo serve flags.

Unless the intent is to give full control, consider preserving the original arguments:

 				container.Command = []string{parts[0]}
 				if len(parts) > 1 {
-					container.Args = parts[1:]
+					container.Args = append(parts[1:], container.Args...)
 				}
🧹 Nitpick comments (1)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)

1347-1515: Remove commented code for better maintainability.

The core logic correctly addresses past feedback with proper PVC name validation, overwrite warnings, and specific mount paths. However, the extensive commented-out code should be removed.

Apply this diff to clean up the commented sections:

-	// TODO use constructPVC() from common?
-	// Helper: create PVC object from your custom PVC struct
-	// createClaim := func(volumeName string, pvc *v1alpha1.PVC) *corev1.PersistentVolumeClaim {
-	// 	if pvc == nil || pvc.Create == nil || *pvc.Name == "" || !*pvc.Create {
-	// 		return nil
-	// 	}
-	// 	claimName := getPvcName(component, pvc.Name)
-	// 	storageClassName := pvc.StorageClass
-	// 	return &corev1.PersistentVolumeClaim{
-	// 		ObjectMeta: metav1.ObjectMeta{
-	// 			Name: claimName,
-	// 		},
-	// 		Spec: corev1.PersistentVolumeClaimSpec{
-	// 			AccessModes: []corev1.PersistentVolumeAccessMode{pvc.VolumeAccessMode},
-	// 			Resources: corev1.VolumeResourceRequirements{
-	// 				Requests: corev1.ResourceList{
-	// 					corev1.ResourceStorage: pvc.Size,
-	// 				},
-	// 			},
-	// 			StorageClassName: &storageClassName,
-	// 		},
-	// 	}
-	// }

-		// TODO ?
-		// if claim := createClaim(volumeName, pvc); claim != nil {
-		// 	claims = append(claims, *claim)
-		// }

-		// TODO: not sure about the claims in the overwrites.
-		// Add PVCClaims from ExtraPodSpec
-		// for _, claim := range extra.PVCClaims {
-		// 	if used[claim.Name] {
-		// 		// Overwrite existing claim with same name
-		// 		for i := range claims {
-		// 			if claims[i].Name == claim.Name {
-		// 				claims[i] = claim
-		// 				break
-		// 			}
-		// 		}
-		// 	} else {
-		// 		claims = append(claims, claim)
-		// 		used[claim.Name] = true
-		// 	}
-		// }

-	return volumes, mounts /*. claims */
+	return volumes, mounts
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f47dc56 and 2eaa07e.

📒 Files selected for processing (4)
  • deploy/cloud/operator/api/dynamo/common/common.go (1 hunks)
  • deploy/cloud/operator/api/dynamo/common/zz_generated.deepcopy.go (1 hunks)
  • deploy/cloud/operator/api/v1alpha1/zz_generated.deepcopy.go (2 hunks)
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deploy/cloud/operator/api/dynamo/common/common.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
deploy/cloud/operator/api/v1alpha1/zz_generated.deepcopy.go (1)
deploy/sdk/src/dynamo/sdk/lib/service.py (1)
  • PVC (49-57)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (6)
deploy/cloud/operator/api/v1alpha1/zz_generated.deepcopy.go (2)

266-278: LGTM! Deepcopy logic correctly handles the PVC map structure.

The autogenerated deepcopy logic properly handles the transition from a single PVC pointer to a map of PVC pointers. The implementation correctly allocates a new map, handles nil values, and performs deep copying of each PVC instance.


318-322: LGTM! Standard deepcopy logic for optional string pointer.

The deepcopy implementation for the new Entrypoint field follows the correct pattern for optional pointer fields - checking for nil, allocating a new pointer, and copying the string value.

deploy/cloud/operator/api/dynamo/common/zz_generated.deepcopy.go (1)

154-174: LGTM! Correct deepcopy implementation for new volume-related fields.

The autogenerated deepcopy logic properly handles the three new slice fields (Volumes, VolumeMounts, PVCClaims) in ExtraPodSpec. Each implementation follows the standard pattern:

  • Nil-safety checks
  • Proper slice allocation
  • Element-wise deep copying using DeepCopyInto

These changes align with the broader enhancements for Kubernetes volume and PVC handling described in the PR.

deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (3)

36-36: LGTM on new imports.

The addition of k8s.io/klog/v2 and github.com/mattn/go-shellwords are appropriate for the logging and entrypoint parsing functionality respectively.

Also applies to: 67-67


818-854: LGTM on reconcilePVC refactor for multiple PVC support.

The function correctly handles the transition from single to multiple PVCs with proper error handling, logging context, and conditional PVC creation based on the Create flag.


1674-1682: LGTM on volume/mount integration.

The integration correctly uses the new centralized buildPVCVolumesAndMounts function and properly calculates the shared memory size limit.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
deploy/sdk/src/dynamo/sdk/core/protocol/interface.py (1)

80-93: Widen accepted sequence type & normalise to list for safer downstream handling

entrypoint / cmd end up being iterated over when we render the final
Container spec. Accepting any Sequence[str] (not just list) gives users
natural flexibility (tuples, pydantic FrozenSet, …) while still preserving
static types; normalising to an actual list inside the validator prevents
accidental surprises further down the pipeline.

-from typing import Any, Dict, Generic, List, Optional, Set, Tuple, Type, TypeVar
+from typing import Any, Dict, Generic, List, Optional, Sequence, Set, Tuple, Type, TypeVar-    entrypoint: List[str] | None = None
-    cmd: List[str] | None = None
+    entrypoint: Sequence[str] | None = None
+    cmd: Sequence[str] | None = None-        if v is None or isinstance(v, list):
+        if v is None:
+            return v
+        # Already a list → keep as-is; tuple → convert; anything else → error
+        if isinstance(v, list):
+            return v
+        if isinstance(v, tuple):
+            return list(v)
         if isinstance(v, str):
             return [v]
-        raise TypeError("Must be str or list[str]")
+        raise TypeError("Must be str, list[str] or tuple[str]")

Side benefit: the Pylint “too few public methods” warning disappears once
Sequence is imported, because the symbol count tips over the threshold.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbb901a and b1e6c35.

📒 Files selected for processing (5)
  • deploy/cloud/operator/internal/controller/dynamocomponent_controller.go (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (2 hunks)
  • deploy/cloud/operator/internal/dynamo/graph.go (5 hunks)
  • deploy/sdk/src/dynamo/sdk/cli/build.py (6 hunks)
  • deploy/sdk/src/dynamo/sdk/core/protocol/interface.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • deploy/cloud/operator/internal/controller/dynamocomponent_controller.go
  • deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go
  • deploy/cloud/operator/internal/dynamo/graph.go
  • deploy/sdk/src/dynamo/sdk/cli/build.py
🧰 Additional context used
🪛 Pylint (3.3.7)
deploy/sdk/src/dynamo/sdk/core/protocol/interface.py

[refactor] 77-77: Too few public methods (0/2)

(R0903)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
deploy/sdk/src/dynamo/sdk/core/protocol/interface.py (1)

104-104: Confirm kubernetes_overrides is serialised in all manifest paths

The new field hooks into build / deploy logic, but older call-sites might still
serialise ServiceConfig to YAML/JSON directly (e.g. cached manifests,
docs-generation). Double-check those helpers so the overrides don’t get
silently dropped.

@atchernych
Copy link
Contributor Author

/ok to test 7a2b227

@atchernych atchernych enabled auto-merge (squash) June 16, 2025 17:05
revert changes to run.sh

Signed-off-by: atchernych <[email protected]>
Copy link
Contributor

@biswapanda biswapanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

  • few comments doc/UX related comments

Copy link
Contributor

@nv-anants nv-anants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for updating the attributions

@atchernych atchernych merged commit fbf1ffd into main Jun 17, 2025
12 checks passed
@atchernych atchernych deleted the dep-157-extend-decor branch June 17, 2025 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants