@samzong samzong commented Jul 24, 2025

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #168

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @samzong, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've significantly expanded the documentation for OME, focusing on how models are managed and stored, particularly with Persistent Volume Claims (PVCs). This update introduces detailed architectural explanations, a comprehensive reference for all supported storage types, and practical user guides. The goal is to provide clear, in-depth information for users and developers on leveraging various storage backends, ensuring they can efficiently manage and serve models within the OME ecosystem.

Highlights

  • New PVC Storage Architecture Documentation: I've added comprehensive documentation detailing the architecture and data flow for using Persistent Volume Claims (PVCs) as a storage backend for models in OME. This explains how PVCs are handled differently from other storage types, bypassing the model agent and leveraging native Kubernetes volume management.
  • Comprehensive Storage Types Reference: I've introduced a new, exhaustive reference guide for all supported storage types within OME, including PVC, OCI Object Storage, HuggingFace Hub, AWS S3, Azure Blob, Google Cloud Storage, GitHub Releases, and custom Vendor Storage. This guide covers URI formats, authentication, examples, and selection criteria for each.
  • Dedicated PVC Storage Troubleshooting Guide: I've created a dedicated troubleshooting guide for common issues encountered when using PVC storage with OME models. This guide provides symptoms, causes, solutions, and diagnostic tools to help users resolve problems efficiently.
  • New User Guide for PVC Storage Configuration: I've added a user-focused guide on how to configure and utilize PVC storage for models, including step-by-step instructions, examples for different access modes, and best practices for performance and security.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds extensive documentation for using PVC storage and managing models in OME. The new documentation is well-structured and comprehensive, covering architecture, user guides, reference material, and troubleshooting. My review focuses on improving the correctness and consistency of the examples and diagnostic scripts. I've identified a bug in the troubleshooting scripts that would cause them to fail for ClusterBaseModels, an incomplete credential configuration in a migration example, and some minor inconsistencies in model names and formatting across the new documents. Overall, this is a great addition to the project's documentation.

Comment on lines 447 to 252
PVC_NAME=$(kubectl get basemodel $MODEL_NAME -n $NAMESPACE -o jsonpath='{.spec.storage.storageUri}' | sed 's/.*pvc:\/\/\([^\/]*\).*/\1/')
kubectl get pvc $PVC_NAME -n $NAMESPACE -o wide
Contributor

high

The script to extract the PVC name from the storageUri is not robust enough to handle ClusterBaseModel URIs, which have the format pvc://{namespace}:{pvc-name}/{sub-path}. The sed command will extract namespace:pvc-name as PVC_NAME, which is not a valid name for the kubectl get pvc command, causing the script to fail for ClusterBaseModel.

Consider replacing these lines with a more robust parsing logic, for example:

# Extract PVC info from BaseModel
PVC_URI=$(kubectl get basemodel "$MODEL_NAME" -n "$NAMESPACE" -o jsonpath='{.spec.storage.storageUri}')
PVC_URI_PATH=$(echo "$PVC_URI" | sed 's|pvc://||')
PVC_SPEC=$(echo "$PVC_URI_PATH" | cut -d/ -f1)

if [[ "$PVC_SPEC" == *":"* ]]; then
  # Handle ClusterBaseModel format: namespace:pvc-name
  PVC_NAMESPACE=$(echo "$PVC_SPEC" | cut -d: -f1)
  PVC_NAME=$(echo "$PVC_SPEC" | cut -d: -f2)
  kubectl get pvc "$PVC_NAME" -n "$PVC_NAMESPACE" -o wide
else
  # Handle BaseModel format: pvc-name
  PVC_NAME="$PVC_SPEC"
  kubectl get pvc "$PVC_NAME" -n "$NAMESPACE" -o wide
fi

Comment on lines 472 to 527
PVC_INFO=$(kubectl get basemodel $MODEL_NAME -n $NAMESPACE -o jsonpath='{.spec.storage.storageUri}')
echo "Storage URI: $PVC_INFO"

if [[ $PVC_INFO == pvc://* ]]; then
  PVC_NAME=$(echo $PVC_INFO | sed 's/pvc:\/\/\([^\/]*\).*/\1/')
  PVC_PATH=$(echo $PVC_INFO | sed 's/pvc:\/\/[^\/]*\/\(.*\)/\1/')

  echo "PVC Name: $PVC_NAME"
  echo "Sub Path: $PVC_PATH"

  # Check PVC status
  kubectl describe pvc $PVC_NAME -n $NAMESPACE
Contributor

high

This script has the same issue as the 'Quick Status Check' script. The logic to parse the storageUri on lines 472 and 476-477 doesn't correctly handle ClusterBaseModel URIs (pvc://namespace:pvc-name/sub-path), which will cause the script to fail. The logic should be updated to correctly parse the namespace and PVC name for both BaseModel and ClusterBaseModel.
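
One way to share the parsing between both scripts is a small helper. This is a minimal sketch, assuming only the two URI shapes described above (`pvc://{pvc-name}/{sub-path}` and `pvc://{namespace}:{pvc-name}/{sub-path}`); `parse_pvc_uri` and its output format are hypothetical and not part of OME.

```shell
#!/bin/sh
# Hypothetical helper: split a pvc:// storage URI into namespace, PVC name,
# and sub-path. The second argument is the namespace to fall back on when
# the URI does not carry one (the BaseModel form).
parse_pvc_uri() {
  uri=$1
  default_ns=$2

  case "$uri" in
    pvc://*) ;;
    *) echo "not a pvc:// URI: $uri" >&2; return 1 ;;
  esac

  rest=${uri#pvc://}   # drop the scheme
  spec=${rest%%/*}     # text before the first slash: [namespace:]pvc-name

  # Everything after the first slash is the sub-path (may be empty).
  case "$rest" in
    */*) sub_path=${rest#*/} ;;
    *)   sub_path= ;;
  esac

  # A colon in the spec marks the ClusterBaseModel form.
  case "$spec" in
    *:*) pvc_ns=${spec%%:*}; pvc_name=${spec#*:} ;;
    *)   pvc_ns=$default_ns; pvc_name=$spec ;;
  esac

  printf 'namespace=%s name=%s subPath=%s\n' "$pvc_ns" "$pvc_name" "$sub_path"
}

# Example call with made-up values:
parse_pvc_uri "pvc://models:llama-pvc/llama/70b" default
# prints: namespace=models name=llama-pvc subPath=llama/70b
```

The resulting namespace and name can then be passed to `kubectl get pvc` or `kubectl describe pvc` instead of reusing `$NAMESPACE` unconditionally.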

OME integrates with various Kubernetes ecosystem components:

- **Kueue** - Gang scheduling for multi-pod workloads
- **LeaderWorkerSet** - Resilient multi-node deployments
Contributor

medium

This line has trailing spaces. While most Markdown renderers will ignore this, it's best practice to remove them for cleaner source files. I've noticed this in a few other places in the new documentation files as well (e.g., user-guide/storage/pvc-storage.md).

Suggested change
- **LeaderWorkerSet** - Resilient multi-node deployments
- **LeaderWorkerSet** - Resilient multi-node deployments
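
To find the other occurrences mentioned above, a search along these lines may help. It is a sketch only: `find_trailing_ws` is a hypothetical helper, and the directory to scan is whatever holds the new Markdown files.

```shell
#!/bin/sh
# Hypothetical helper: list Markdown lines that end in whitespace.
# Usage: find_trailing_ws [directory]   (defaults to the current directory)
find_trailing_ws() {
  dir=${1:-.}
  # Passing /dev/null as an extra file makes grep print file names even
  # when find hands it a single match; -n adds line numbers.
  find "$dir" -type f -name '*.md' \
    -exec grep -nE '[[:space:]]+$' /dev/null {} +
}
```

Each hit is reported as `path:line:content`, so the list can be worked through file by file.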

name: llama-hf
spec:
  storage:
    storageUri: "hf://meta-llama/Llama-3.3-70B-Instruct"
Contributor

medium

The model name meta-llama/Llama-3.3-70B-Instruct appears to be a typo, as Llama-3.3 is not a known public model. Other examples in the documentation use Llama-3-70b-instruct or Llama-3.1-70b-instruct. For consistency and to provide a working example for users to copy, it would be better to use a real model name like meta-llama/Llama-3.1-70B-Instruct. This inconsistency appears in a few other places in the new documentation as well.

Suggested change
storageUri: "hf://meta-llama/Llama-3.3-70B-Instruct"
storageUri: "hf://meta-llama/Llama-3.1-70B-Instruct"

Comment on lines 573 to 577
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: access_key_id
Contributor

medium

The environment variables for the migrator container are incomplete. The aws s3 sync command requires both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for authentication with access keys. The current example only provides the access key ID, which will likely cause the job to fail for users who are not using IAM roles for service accounts. It would also be helpful to include AWS_REGION.

Suggested change
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: access_key_id
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: access_key_id
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: secret_access_key
- name: AWS_REGION
  value: "us-west-2" # Or your desired region
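
The migration script could also fail fast when these variables are absent, rather than partway through the sync. The following is a minimal sketch, assuming access-key authentication (it would not apply when IAM roles for service accounts are used); `require_aws_env` is a hypothetical helper, not something the PR defines.

```shell
#!/bin/sh
# Hypothetical preflight check: verify that the variables named in the
# suggestion above are set and non-empty before running `aws s3 sync`.
require_aws_env() {
  missing=""
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION; do
    # Indirect lookup that also works in plain POSIX sh.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      missing="$missing $var"
    fi
  done

  if [ -n "$missing" ]; then
    echo "missing environment variables:$missing" >&2
    return 1
  fi
}
```

Calling `require_aws_env || exit 1` at the top of the container command would surface an incomplete Secret immediately.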

@samzong samzong marked this pull request as ready for review July 25, 2025 02:50
@samzong samzong requested a review from slin1237 as a code owner July 25, 2025 02:50
Author

samzong commented Jul 25, 2025

@slin1237 I've finished the first version of this PR. There are probably many problems, so I'd appreciate everyone's review.

**Why?** DaemonSet pods cannot efficiently mount PVCs, especially ReadWriteOnce
volumes.

```mermaid
Collaborator

I don't think Mermaid works in Hugo at the moment.
I made multiple attempts to get it working; can you verify that it renders?

Collaborator

@slin1237 slin1237 left a comment

leaving other comments

| **PVC** | `pvc://[ns:]name/path` | Skipped | Controller+Job | None |
| **OCI** | `oci://n/ns/b/bucket/o/path` | Downloads | Agent | Instance/User Principal |
| **HuggingFace** | `hf://model[@branch]` | Downloads | Agent | Token (optional) |
| **S3** | `s3://bucket[@region]/path` | Downloads | Agent | AWS credentials |
Collaborator

Other cloud vendors are not officially supported yet; please remove them for now.

Development

Successfully merging this pull request may close these issues.

Task 8: Documentation

2 participants