new docs for PVC storage and model management #216
Summary of Changes
Hello @samzong, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've significantly expanded the documentation for OME, focusing on how models are managed and stored, particularly with Persistent Volume Claims (PVCs). This update introduces detailed architectural explanations, a comprehensive reference for all supported storage types, and practical user guides. The goal is to provide clear, in-depth information for users and developers on leveraging various storage backends, ensuring they can efficiently manage and serve models within the OME ecosystem.
Highlights
- New PVC Storage Architecture Documentation: I've added comprehensive documentation detailing the architecture and data flow for using Persistent Volume Claims (PVCs) as a storage backend for models in OME. This explains how PVCs are handled differently from other storage types, bypassing the model agent and leveraging native Kubernetes volume management.
- Comprehensive Storage Types Reference: I've introduced a new, exhaustive reference guide for all supported storage types within OME, including PVC, OCI Object Storage, HuggingFace Hub, AWS S3, Azure Blob, Google Cloud Storage, GitHub Releases, and custom Vendor Storage. This guide covers URI formats, authentication, examples, and selection criteria for each.
- Dedicated PVC Storage Troubleshooting Guide: I've created a dedicated troubleshooting guide for common issues encountered when using PVC storage with OME models. This guide provides symptoms, causes, solutions, and diagnostic tools to help users resolve problems efficiently.
- New User Guide for PVC Storage Configuration: I've added a user-focused guide on how to configure and utilize PVC storage for models, including step-by-step instructions, examples for different access modes, and best practices for performance and security.
Code Review
This pull request adds extensive documentation for using PVC storage and managing models in OME. The new documentation is well-structured and comprehensive, covering architecture, user guides, reference material, and troubleshooting. My review focuses on improving the correctness and consistency of the examples and diagnostic scripts. I've identified a bug in the troubleshooting scripts that would cause them to fail for ClusterBaseModels, an incomplete credential configuration in a migration example, and some minor inconsistencies in model names and formatting across the new documents. Overall, this is a great addition to the project's documentation.
PVC_NAME=$(kubectl get basemodel $MODEL_NAME -n $NAMESPACE -o jsonpath='{.spec.storage.storageUri}' | sed 's/.*pvc:\/\/\([^\/]*\).*/\1/')
kubectl get pvc $PVC_NAME -n $NAMESPACE -o wide
The script to extract the PVC name from the storageUri is not robust enough to handle ClusterBaseModel URIs, which have the format pvc://{namespace}:{pvc-name}/{sub-path}. The sed command will extract namespace:pvc-name as PVC_NAME, which is not a valid name for the kubectl get pvc command, causing the script to fail for ClusterBaseModel.
Consider replacing these lines with more robust parsing logic, for example:
# Extract PVC info from BaseModel
PVC_URI=$(kubectl get basemodel $MODEL_NAME -n $NAMESPACE -o jsonpath='{.spec.storage.storageUri}')
PVC_URI_PATH=$(echo "$PVC_URI" | sed 's|pvc://||')
PVC_SPEC=$(echo "$PVC_URI_PATH" | cut -d/ -f1)
if [[ "$PVC_SPEC" == *":"* ]]; then
  # Handle ClusterBaseModel format: namespace:pvc-name
  PVC_NAMESPACE=$(echo "$PVC_SPEC" | cut -d: -f1)
  PVC_NAME=$(echo "$PVC_SPEC" | cut -d: -f2)
  kubectl get pvc "$PVC_NAME" -n "$PVC_NAMESPACE" -o wide
else
  # Handle BaseModel format: pvc-name
  PVC_NAME="$PVC_SPEC"
  kubectl get pvc "$PVC_NAME" -n "$NAMESPACE" -o wide
fi

PVC_INFO=$(kubectl get basemodel $MODEL_NAME -n $NAMESPACE -o jsonpath='{.spec.storage.storageUri}')
echo "Storage URI: $PVC_INFO"

if [[ $PVC_INFO == pvc://* ]]; then
  PVC_NAME=$(echo $PVC_INFO | sed 's/pvc:\/\/\([^\/]*\).*/\1/')
  PVC_PATH=$(echo $PVC_INFO | sed 's/pvc:\/\/[^\/]*\/\(.*\)/\1/')

  echo "PVC Name: $PVC_NAME"
  echo "Sub Path: $PVC_PATH"

  # Check PVC status
  kubectl describe pvc $PVC_NAME -n $NAMESPACE
This script has the same issue as the 'Quick Status Check' script. The logic to parse the storageUri on lines 472 and 476-477 doesn't correctly handle ClusterBaseModel URIs (pvc://namespace:pvc-name/sub-path), which will cause the script to fail. The logic should be updated to correctly parse the namespace and PVC name for both BaseModel and ClusterBaseModel.
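One way to keep both diagnostic scripts consistent is to factor the URI parsing into a single helper. The sketch below is an illustration only, not code from this PR: the function name `parse_pvc_uri` and the sample PVC and namespace names are hypothetical, and it assumes the two URI forms described in this review (`pvc://pvc-name/sub-path` and `pvc://namespace:pvc-name/sub-path`). It uses only POSIX parameter expansion, so it does not depend on `sed` or `cut`.

```shell
#!/bin/sh
# Hypothetical helper (not part of OME): split a pvc:// storage URI into
# namespace, PVC name, and sub-path. Prints "namespace pvc-name sub-path";
# a missing namespace or sub-path is printed as "-".
parse_pvc_uri() {
  uri=$1
  case "$uri" in
    pvc://*) rest=${uri#pvc://} ;;
    *) echo "not a pvc:// URI: $uri" >&2; return 1 ;;
  esac

  # Everything before the first "/" is [namespace:]pvc-name.
  spec=${rest%%/*}
  case "$rest" in
    */*) sub_path=${rest#*/} ;;
    *)   sub_path="" ;;
  esac

  case "$spec" in
    *:*) ns=${spec%%:*}; name=${spec#*:} ;;  # ClusterBaseModel form
    *)   ns=""; name=$spec ;;                # BaseModel form
  esac

  printf '%s %s %s\n' "${ns:--}" "$name" "${sub_path:--}"
}

parse_pvc_uri "pvc://model-storage/llama"            # prints: - model-storage llama
parse_pvc_uri "pvc://models:model-storage/llama/v1"  # prints: models model-storage llama/v1
```

A caller could read the three fields back with `read -r ns name sub_path` and fall back to `$NAMESPACE` when the first field is `-`.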
OME integrates with various Kubernetes ecosystem components:

- **Kueue** - Gang scheduling for multi-pod workloads
- **LeaderWorkerSet** - Resilient multi-node deployments
This line has trailing spaces. While most Markdown renderers will ignore this, it's best practice to remove them for cleaner source files. I've noticed this in a few other places in the new documentation files as well (e.g., user-guide/storage/pvc-storage.md).
Suggested change:
- **LeaderWorkerSet** - Resilient multi-node deployments
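To find the other occurrences mentioned above, a small search over the Markdown sources may help. This is a sketch under assumptions: the function name `find_trailing_ws` is hypothetical, and the directory to scan is whatever holds the new documentation files.

```shell
#!/bin/sh
# Hypothetical helper: list lines in Markdown files under a directory that
# end in spaces or tabs, printed as file:line:content.
find_trailing_ws() {
  dir=${1:-.}
  # /dev/null is passed as an extra file so grep always prints file names,
  # even when find hands it a single file.
  find "$dir" -type f -name '*.md' \
    -exec grep -nE '[[:blank:]]+$' /dev/null {} +
}
```

Note that Markdown treats two trailing spaces as a hard line break, so each hit is worth a quick look before it is stripped.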
  name: llama-hf
spec:
  storage:
    storageUri: "hf://meta-llama/Llama-3.3-70B-Instruct"
The model name meta-llama/Llama-3.3-70B-Instruct appears to be a typo, as Llama-3.3 is not a known public model. Other examples in the documentation use Llama-3-70b-instruct or Llama-3.1-70b-instruct. For consistency and to provide a working example for users to copy, it would be better to use a real model name like meta-llama/Llama-3.1-70B-Instruct. This inconsistency appears in a few other places in the new documentation as well.
Suggested change:
- storageUri: "hf://meta-llama/Llama-3.3-70B-Instruct"
+ storageUri: "hf://meta-llama/Llama-3.1-70B-Instruct"
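Whichever model name is settled on, the inconsistency across files is easy to audit by counting every `hf://` URI that appears in the docs. The following is a rough sketch; the function name `list_hf_uris` is hypothetical, and it relies on `grep -o` and `-h`, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: count each distinct hf:// URI found in Markdown files
# under a directory, most frequent first.
list_hf_uris() {
  dir=${1:-.}
  find "$dir" -type f -name '*.md' \
    -exec grep -ohE 'hf://[A-Za-z0-9._/@-]+' {} + \
    | sort | uniq -c | sort -rn
}
```

URIs that show up only once or twice next to a dominant spelling are the likely candidates for cleanup.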
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: access_key_id
The environment variables for the migrator container are incomplete. The aws s3 sync command requires both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for authentication with access keys. The current example only provides the access key ID, which will likely cause the job to fail for users who are not using IAM roles for service accounts. It would also be helpful to include AWS_REGION.
Suggested change:
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: access_key_id
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: aws-credentials
      key: secret_access_key
- name: AWS_REGION
  value: "us-west-2" # Or your desired region
Signed-off-by: samzong <[email protected]>
@slin1237 I've finished the first version of this PR. There are probably many problems, so I'd appreciate everyone's review.
**Why?** DaemonSet pods cannot efficiently mount PVCs, especially ReadWriteOnce
volumes.

| ```mermaid |
I don't think Mermaid works in Hugo at the moment.
I made multiple attempts to get it to work. Can you verify that it renders?
slin1237
left a comment
leaving other comments
| **PVC** | `pvc://[ns:]name/path` | Skipped | Controller+Job | None |
| **OCI** | `oci://n/ns/b/bucket/o/path` | Downloads | Agent | Instance/User Principal |
| **HuggingFace** | `hf://model[@branch]` | Downloads | Agent | Token (optional) |
| **S3** | `s3://bucket[@region]/path` | Downloads | Agent | AWS credentials |
Other cloud vendors are not officially supported yet; please remove them for now.
What type of PR is this?
/kind documentation
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #168
Special notes for your reviewer:
Does this PR introduce a user-facing change?