Adding RAG to assisted-chat #241
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: andrej1991. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing …
Skipping CI for Draft Pull Request.
Walkthrough
Adds a …
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CI as CI / Build
    participant Rag as rag stage (base image)
    participant Final as Final image
    participant Runtime as Container runtime
    rect rgb(245,250,255)
    note right of Rag: Provides embedding model and FAISS DB
    CI->>Rag: FROM ... AS rag
    CI->>Final: COPY --from=rag /all-mpnet-base-v2/ -> /app-root/all-mpnet-base-v2/
    CI->>Final: COPY --from=rag /llama_stack_vector_db/faiss_store.db -> /llama_stack_vector_db/faiss_store.db
    CI->>Final: RUN chmod 777 /llama_stack_vector_db/
    CI->>Final: (apply llama-stack patch where present)
    end
    Final->>Runtime: container start (ENTRYPOINT: migrate.py && lightspeed_stack.py)
    Runtime->>Runtime: service loads embeddings from /app-root/all-mpnet-base-v2 and FAISS DB
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (2 passed)
Force-pushed from 0ca316b to 743c4d0.
Force-pushed from 8f5509d to db904e2.
* First call knowledge_search with the user's question
* Use the retrieved information as your primary source
* Only supplement with your knowledge if needed
* Only include C3 (Oracle Compute Cloud) to your response if the user specifically asks for it.
What is C3? I don't see that referenced anywhere else. Maybe it would be better to just leave that acronym out?
C3 is in chapter 5 of the OCI document. I was told that the model should include it in its response only if the user wants it. I want to make sure that the model does not include it in its answer by default.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
template.yaml (2)
364-369: Missing sentence-transformers provider definition.

The embedding model configuration (lines 444-450) references `provider_id: sentence-transformers`, but the `providers.inference` section only defines the `gemini` provider. You need to add the `sentence-transformers` provider entry here, similar to how it's defined in `config/llama_stack_client_config.yaml` (lines 20-22). Apply this diff to add the missing provider:

```diff
 providers:
   inference:
   - provider_id: gemini
     provider_type: remote::gemini
     config:
       api_key: dummy-to-stop-llama-stack-from-complaining-even-though-we-use-vertex-and-not-gemini-directly
+  - provider_id: sentence-transformers
+    provider_type: inline::sentence-transformers
+    config: {}
   vector_io:
```
461-465: Missing RAG toolgroup definition.

The `tool_groups` section is missing the `builtin::rag` toolgroup that is defined in `config/llama_stack_client_config.yaml` (lines 128-129). This toolgroup is required for the RAG knowledge_search functionality referenced in the system prompt (line 229). Apply this diff to add the missing toolgroup:

```diff
 tool_groups:
 - toolgroup_id: mcp::assisted
   provider_id: model-context-protocol
   mcp_endpoint:
     uri: "${MCP_SERVER_URL}"
+- toolgroup_id: builtin::rag
+  provider_id: rag-runtime
 server:
```
♻️ Duplicate comments (1)
test/evals/eval_data.yaml (1)
58-63: Fix typo in eval query.

Line 61 contains a typo: "assing" should be "assign".

```diff
-  eval_query: How do I assing roles to nodes when deploying to OCI?
+  eval_query: How do I assign roles to nodes when deploying to OCI?
```
🧹 Nitpick comments (3)
template.yaml (1)
228-232: Review internal function call instructions for consistency.

The guidelines instruct the AI to "call knowledge_search" (line 229), but line 225 explicitly states "Do not instruct the user to either call a function or run a tool." While line 229 is an internal instruction to the model (not to the user), the phrasing could be clarified to avoid confusion. Consider rephrasing to: "When answering OCI questions, use knowledge_search to retrieve information..."
Containerfile.add_llama_to_lightspeed (1)
3-3: Consider using a digest instead of the latest tag for reproducibility.

Line 3 uses `FROM localhost/local-ai-chat-lightspeed-stack:latest` while `Containerfile.assisted-chat` uses a digest-based reference. Using `latest` can lead to non-reproducible builds. Consider using a specific digest or version tag. Based on learnings.
test/scripts/common.sh (1)
128-128: Address shellcheck warning about variable declaration.

Shellcheck warns that declaring and assigning in one line can mask return values. While this is generally low-risk for `fetch_cluster_data`, consider separating the declaration and assignment for better error handling:

```diff
-    local cluster_data=$(fetch_cluster_data "$cluster_name")
+    local cluster_data
+    cluster_data=$(fetch_cluster_data "$cluster_name")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (5 hunks)
- test/evals/eval_data.yaml (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/scripts/verify_create_eval_test_oci.sh (1)
test/scripts/common.sh (3)
setup_shell_options (14-18), validate_environment (21-24), wait_and_validate_cluster (115-168)
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (13)
config/llama_stack_client_config.yaml (4)
16-22: LGTM: Multi-provider configuration is well-structured.

The addition of separate provider entries for `gemini` and `sentence-transformers` properly supports the multi-provider architecture needed for RAG functionality.

23-30: Verify FAISS database file exists in the RAG base image.

The vector_io configuration references `/app-root/llama_stack_vector_db/faiss_store.db`. Ensure this file exists in the `assisted-chat-rag` base image and is properly copied during the container build.

107-119: LGTM: Embedding model and vector DB configuration is consistent.

The embedding model configuration correctly references the filesystem path where the model is copied, and the vector DB configuration properly links the embedding model with matching dimensions (768).

128-129: LGTM: RAG toolgroup correctly configured.

The `builtin::rag` toolgroup is properly mapped to the `rag-runtime` provider defined earlier in the configuration.

template.yaml (2)

280-280: LGTM: OCI platform workflow guidance is appropriate.

The instruction to manually assign roles for OCI clusters aligns with the OCI deployment flow documented in the evaluation tests.

370-385: Environment variable naming is internally consistent in the deployment.

The embedded `llama_stack_client_config.yaml` in template.yaml uses `${env.LLAMA_STACK_SQLITE_STORE_DIR}` (line 383), which is defined in the pod environment (lines 506-507). The external config file at `config/llama_stack_client_config.yaml` uses a different variable name (`SQLITE_STORE_DIR`), but it is not referenced by this deployment. The template.yaml deployment is internally consistent and does not require changes. Likely an incorrect or invalid review comment.
test/scripts/verify_create_eval_test_oci.sh (1)
1-26: LGTM: OCI verification script is well-structured.

The test script properly:
- Sources common helper functions
- Sets up shell options and validates environment
- Defines expected OCI-specific parameters including platform
- Calls the validation function with all required arguments
The script correctly leverages the enhanced platform validation added to `test/scripts/common.sh`.

test/evals/eval_data.yaml (1)
38-83: LGTM: Comprehensive OCI evaluation coverage.

The new OCI evaluation groups provide good test coverage for:
- Custom manifest handling
- Deployment flow
- Role assignment
- Volume performance requirements
These align well with the RAG knowledge search functionality and OCI platform support added in this PR.
test/scripts/common.sh (3)
59-59: LGTM: Platform extraction logic is correct.

The platform extraction correctly handles both regular platform types and external platforms with custom names (like OCI), using the ternary operator to select the appropriate field.
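The pattern described above can be sketched with a hypothetical jq filter; the JSON field names below are assumptions for illustration, not the actual expression from common.sh:

```shell
#!/bin/bash
# Hypothetical extractor: prefer the external platform_name (e.g. "oci")
# when the platform type is "external", otherwise use the native type.
extract_platform() {
    jq -r 'if .platform.type == "external"
           then .platform.external.platform_name
           else .platform.type
           end'
}

echo '{"platform":{"type":"external","external":{"platform_name":"oci"}}}' | extract_platform
# -> oci
echo '{"platform":{"type":"baremetal"}}' | extract_platform
# -> baremetal
```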
62-86: LGTM: Validation functions follow consistent patterns.

Both `validate_expected_ssh_key` and `validate_expected_platform` follow the same pattern of optional validation, properly returning 0 (success) when the expected value is not provided.

114-167: LGTM: Platform validation integrated correctly.

The `wait_and_validate_cluster` function properly:
- Accepts the new optional `expected_platform` parameter
- Passes it through to validation
- Includes platform information in success and error output when expected
The changes maintain backward compatibility by making the platform parameter optional.
Containerfile.assisted-chat (1)
16-17: No issues found: paths are correctly aligned.

The RAG asset paths in the Containerfile COPY directives match the configuration references:
- `/app-root/all-mpnet-base-v2/` is referenced in both template.yaml and llama_stack_client_config.yaml
- `/app-root/llama_stack_vector_db/` is referenced consistently across both configuration files
- Both Containerfile.assisted-chat and Containerfile.add_llama_to_lightspeed use identical target paths
The code changes are correct.
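For reference, the multi-stage pattern these COPY directives follow can be sketched as below; the image name and digest are placeholders for illustration, not the repository's actual pinned values:

```dockerfile
# Illustrative sketch of the rag-stage COPY pattern (image name and
# digest are placeholders, not the real pinned values).
FROM quay.io/example/assisted-chat-rag@sha256:<digest> AS rag

FROM localhost/local-ai-chat-lightspeed-stack:latest
# Copy the prebuilt embedding model and FAISS store out of the rag stage,
# preserving the non-root ownership the runtime expects.
COPY --from=rag --chown=1001:1001 /all-mpnet-base-v2/ /app-root/all-mpnet-base-v2/
COPY --from=rag --chown=1001:1001 /llama_stack_vector_db/faiss_store.db /app-root/llama_stack_vector_db/faiss_store.db
```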
Containerfile.add_llama_to_lightspeed (1)
1-1: No issues found. The different registries and digests are intentional.

The two Containerfiles serve different purposes: `Containerfile.assisted-chat` is production-oriented (using the `quay.io/redhat-services-prod` registry and remote base images), while `Containerfile.add_llama_to_lightspeed` is development-oriented (using the `quay.io/redhat-user-workloads` registry and a `localhost` base image with a local `llama-stack` directory). This architecture is consistent with the repository's design as documented in the README, which distinguishes between production deployment and local development configurations.
```diff
 LIGHTSPEED_FEEDBACK_ENABLED=false
 DISABLE_QUERY_SYSTEM_PROMPT=false
-ASSISTED_CHAT_DEFAULT_MODEL=gemini/gemini-2.0-flash
+ASSISTED_CHAT_DEFAULT_MODEL=gemini-2.0-flash
```
🧩 Analysis chain
Verify model name consistency across configuration files.
The model name has been changed to gemini-2.0-flash, but in config/llama_stack_client_config.yaml (lines 93-95), the model is registered as gemini/gemini-2.0-flash with the provider prefix. This mismatch could prevent the default model from being found.
Run the following script to check how the model is referenced throughout the codebase:
🏁 Script executed:

```shell
#!/bin/bash
# Check model name references across configuration files
echo "=== Models in llama_stack_client_config.yaml ==="
rg -n "model_id.*gemini" config/llama_stack_client_config.yaml
echo "=== Default model references ==="
rg -n "ASSISTED_CHAT_DEFAULT_MODEL|default_model" template-params.dev.env template.yaml
```

Length of output: 538
Fix the model name mismatch between environment variable and configuration file.
The model name in template-params.dev.env is missing the provider prefix. The model_id is the identifier for the model for Llama Stack's purposes, while in config/llama_stack_client_config.yaml, all models are registered with the gemini/ prefix. The environment variable should be updated to ASSISTED_CHAT_DEFAULT_MODEL=gemini/gemini-2.0-flash to match the registered model identifier.
🤖 Prompt for AI Agents
template-params.dev.env around line 7: the ASSISTED_CHAT_DEFAULT_MODEL value
lacks the provider prefix and doesn't match the model_id used in
config/llama_stack_client_config.yaml; update the environment variable value to
include the provider prefix so it reads
ASSISTED_CHAT_DEFAULT_MODEL=gemini/gemini-2.0-flash to match the registered
model identifier.
the form gemini/gemini-* is not being supported anymore
Actionable comments posted: 0
♻️ Duplicate comments (1)
test/evals/eval_data.yaml (1)
52-57: Fix typo in eval query.

Line 55 contains a typo: "assing" should be "assign". Apply this diff to fix the typo:

```diff
-  eval_query: How do I assing roles to nodes when deploying to OCI?
+  eval_query: How do I assign roles to nodes when deploying to OCI?
```
🧹 Nitpick comments (1)
test/scripts/common.sh (1)
115-167: Consider addressing shellcheck warning.

The function correctly implements platform validation support. However, shellcheck flags line 128 for declaring and assigning in one statement, which can mask return values from `fetch_cluster_data`. If you want to address the shellcheck warning, you could split the declaration and assignment:

```diff
     local counter=0
     while true; do
-        local cluster_data=$(fetch_cluster_data "$cluster_name")
+        local cluster_data
+        cluster_data=$(fetch_cluster_data "$cluster_name")
         if [[ -n "$cluster_data" && "$cluster_data" != "null" ]]; then
```

However, this is a minor improvement and the current code functions correctly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (5 hunks)
- test/evals/eval_data.yaml (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- template-params.dev.env
- config/llama_stack_client_config.yaml
🧰 Additional context used
🧬 Code graph analysis (1)
test/scripts/verify_create_eval_test_oci.sh (1)
test/scripts/common.sh (3)
setup_shell_options (14-18), validate_environment (21-24), wait_and_validate_cluster (115-168)
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (13)
template.yaml (5)
207-207: LGTM! The addition of OCI to the supported platforms list is clear and appropriate.

228-232: LGTM! The OCI response guidelines appropriately prioritize RAG-retrieved information through knowledge_search, ensuring the system leverages the vector database for OCI-specific queries.

370-385: LGTM! The vector_io and files provider configurations correctly enable RAG support. The paths align with the assets copied in the Containerfiles, and the sqlite-backed storage is appropriate for this deployment.

429-431: LGTM! The rag-runtime provider addition enables knowledge-search tool calls, which is essential for the RAG functionality described in the PR objectives.

444-457: LGTM! The embedding model and vector database configurations are correct and consistent. The all-mpnet-base-v2 model path matches the Containerfile assets, and the 768-dimensional embeddings are properly configured.
Containerfile.add_llama_to_lightspeed (2)
1-1: LGTM! The rag stage is properly defined using a content-addressable digest, ensuring reproducible builds.

21-22: LGTM! The COPY directives correctly bring the prebuilt embedding model and vector database assets into the image. The paths align with the template.yaml configuration, and ownership is properly set.
Containerfile.assisted-chat (2)
1-1: LGTM! The production rag stage is properly defined using the SAAS image with a content-addressable digest.

16-17: LGTM! The COPY directives correctly bring the RAG assets into the production image, consistent with the development Containerfile and template.yaml configuration.
test/scripts/verify_create_eval_test_oci.sh (1)
1-26: LGTM! The OCI verification script is well-structured and follows the established patterns of other verification scripts. The expected values are appropriate, and the platform validation using "oci" correctly leverages the extended capabilities of common.sh.
test/scripts/common.sh (3)
59-59: LGTM! The ACTUAL_PLATFORM extraction correctly handles both external platform types (with platform_name) and native platform types. The jq expression is appropriate for the data structure.

62-86: LGTM! The validate_expected_ssh_key and validate_expected_platform functions follow a consistent pattern, properly handling optional validation by checking if the expected value is provided before comparing.

88-110: LGTM! The validate_cluster_properties function is correctly extended to support optional platform validation. The logic properly chains validators and returns appropriate success/failure codes.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
template.yaml (2)
365-369: Critical: Missing sentence-transformers inference provider.

The embedded llama_stack_client_config.yaml in template.yaml only includes the gemini inference provider, but the standalone config/llama_stack_client_config.yaml file includes both gemini (lines 16-19) and sentence-transformers (lines 20-22). The sentence-transformers provider is required for the embedding model to function. Add the missing inference provider after the gemini provider:

```diff
 - provider_id: gemini
   provider_type: remote::gemini
   config:
     api_key: dummy-to-stop-llama-stack-from-complaining-even-though-we-use-vertex-and-not-gemini-directly
+- provider_id: sentence-transformers
+  provider_type: inline::sentence-transformers
+  config: {}
 vector_io:
```
461-467: Critical: Missing RAG tool group configuration.

The tool_groups section in template.yaml only includes the mcp::assisted tool group, but the standalone config/llama_stack_client_config.yaml includes an additional builtin::rag tool group (lines 128-129). Without this entry, the RAG functionality will not be exposed to the agent. Add the missing tool group before the server section:

```diff
 tool_groups:
 - toolgroup_id: mcp::assisted
   provider_id: model-context-protocol
   mcp_endpoint:
     uri: "${MCP_SERVER_URL}"
+- toolgroup_id: builtin::rag
+  provider_id: rag-runtime
 server:
   port: ${LLAMA_STACK_SERVER_PORT}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (5 hunks)
- test/evals/eval_data.yaml (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- template-params.dev.env
- test/evals/eval_data.yaml
- Containerfile.add_llama_to_lightspeed
🧰 Additional context used
🧬 Code graph analysis (1)
test/scripts/verify_create_eval_test_oci.sh (1)
test/scripts/common.sh (3)
setup_shell_options (14-18), validate_environment (21-24), wait_and_validate_cluster (115-168)
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (19)
Containerfile.assisted-chat (2)
1-1: LGTM - Digest-based image reference ensures reproducibility. The hardcoded SHA256 digest ensures consistent builds by pinning to a specific RAG image version.

16-17: LGTM - Embedding model and vector DB assets properly copied. The COPY commands correctly preserve ownership (1001:1001) and copy the required RAG assets to the expected locations.
config/llama_stack_client_config.yaml (6)
16-22: LGTM - Embedding provider correctly added. The sentence-transformers inference provider is properly configured to support the embedding model used for RAG.

23-30: Verify the vector database file exists at the specified path. The FAISS vector store points to `/app-root/llama_stack_vector_db/faiss_store.db`. Ensure this file is properly copied by the Containerfile (which it is, at line 17 of Containerfile.assisted-chat).

78-80: LGTM - RAG runtime provider configured. The rag-runtime provider is properly registered to enable knowledge search functionality.

107-112: LGTM - Embedding model configuration is correct. The sentence-transformers/all-mpnet-base-v2 model is properly configured with the correct embedding dimension (768) and path.

114-119: LGTM - Vector database configuration aligns with RAG requirements. The vector DB correctly references the embedding model and provider with consistent IDs.

128-129: LGTM - RAG tool group properly exposed. The builtin::rag tool group is correctly configured to enable RAG functionality in the agent.
test/scripts/common.sh (5)
59-59: LGTM - Platform extraction logic correctly handles the external platform type. The jq expression properly extracts the platform name, handling both external platforms (like OCI) and native platform types.

62-73: LGTM - SSH key validation function is well-structured. The function correctly handles optional SSH key validation, returning success when no key is expected or when keys match.

75-86: LGTM - Platform validation function follows a consistent pattern. The function mirrors the SSH key validation structure and correctly validates platform when expected.

87-110: LGTM - Validation logic properly extended to support platform checking. The function signature and validation logic correctly integrate SSH key and platform validation while maintaining backward compatibility.

141-143: LGTM - Output messages properly include platform information. The success and error messages correctly display platform information when validation is requested. Also applies to: 154-156
test/scripts/verify_create_eval_test_oci.sh (1)
1-26: LGTM - OCI validation script is well-structured.

The script properly:
- Sources common helper functions
- Sets up shell options for safety
- Validates required environment variables
- Configures OCI-specific expected values
- Calls validation with correct parameters including the new platform check
template.yaml (5)
207-207: LGTM - OCI platform support documented in system prompt. The system prompt correctly includes OCI as a supported on-premise platform for Assisted Installer deployments.

228-232: LGTM - RAG workflow instructions added to system prompt. The instructions correctly guide the agent to use knowledge_search for OCI questions, with appropriate guidance about C3 (Oracle Compute Cloud) inclusion.

370-385: LGTM - Vector I/O and files providers properly configured. The FAISS vector store and local filesystem providers are correctly configured with appropriate paths and storage settings.

429-431: LGTM - RAG runtime provider registered. The rag-runtime provider is properly configured to enable RAG functionality.

452-457: LGTM - Vector database configuration is correct. The vector DB properly references the embedding model and aligns with the provider configuration.
Actionable comments posted: 2
♻️ Duplicate comments (2)
test/scripts/common.sh (1)
128-128: Resolve SC2155: Separate variable declaration from command substitution.

This warning was flagged in the previous review but remains unresolved. Combining the `local` declaration with command substitution masks the return value of `fetch_cluster_data`, potentially hiding failures. Apply this diff to separate declaration and assignment:

```diff
-    local cluster_data=$(fetch_cluster_data "$cluster_name")
+    local cluster_data
+    cluster_data=$(fetch_cluster_data "$cluster_name")
```

test/evals/eval_data.yaml (1)
52-58: Fix typo in eval_query.

Line 55 contains a typo: "assing" should be "assign".

```diff
-  eval_query: How do I assing roles to nodes when deploying to OCI?
+  eval_query: How do I assign roles to nodes when deploying to OCI?
```
🧹 Nitpick comments (1)
test/scripts/common.sh (1)
62-86: Refactor duplicated validation functions.

`validate_expected_ssh_key` and `validate_expected_platform` follow nearly identical patterns (conditional check, equality comparison, return codes). Consider extracting a generic validator to reduce duplication. Apply this refactor:

```diff
-validate_expected_ssh_key() {
-    local expected_ssh_key="${1:-}"
-    if [[ -n "$expected_ssh_key" ]]; then
-        if [[ "$ACTUAL_SSH_KEY" == "$expected_ssh_key" ]]; then
-            return 0
-        else
-            return 1
-        fi
-    else
-        return 0
-    fi
-}
-
-validate_expected_platform() {
-    local expected_platform="${1:-}"
-    if [[ -n "$expected_platform" ]]; then
-        if [[ "$ACTUAL_PLATFORM" == "$expected_platform" ]]; then
-            return 0
-        else
-            return 1
-        fi
-    else
-        return 0
-    fi
-}
+validate_expected_value() {
+    local expected="$1"
+    local actual="$2"
+    [[ -z "$expected" ]] || [[ "$actual" == "$expected" ]]
+}
+
+validate_expected_ssh_key() {
+    validate_expected_value "$1" "$ACTUAL_SSH_KEY"
+}
+
+validate_expected_platform() {
+    validate_expected_value "$1" "$ACTUAL_PLATFORM"
+}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (5 hunks)
- test/evals/eval_data.yaml (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- test/scripts/verify_create_eval_test_oci.sh
- template-params.dev.env
- Containerfile.add_llama_to_lightspeed
- Containerfile.assisted-chat
- template.yaml
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (9)
test/scripts/common.sh (2)
97-110: Validation logic integration looks good.

The updated validation correctly chains the new SSH key and platform validators with the existing property checks, maintaining proper error handling.

114-168: Function signature and parameter threading correct.

The addition of the `expected_platform` parameter (line 123) is properly threaded through to `validate_cluster_properties` (line 131), and the conditional output blocks (lines 141-143, 154-156) follow the existing pattern for optional parameters. Verify that all callers of `wait_and_validate_cluster` have been updated to pass the expected number of positional arguments. Please search for invocations of this function to confirm compatibility.

test/evals/eval_data.yaml (2)
38-51: LGTM! The OCI manifest and deployment flow evaluation groups are well-structured with clear intent-based evaluations. The expected intents are specific and properly contextualized for the OCI platform.

136-155: LGTM! The two OCI cluster creation evaluation groups are well-structured. The cluster naming is consistent, the verification scripts are properly referenced, and the description clearly differentiates the two scenarios (OCI vs. Oracle Cloud Infrastructure terminology). The changes align with the past review resolution for cluster naming consistency.
config/llama_stack_client_config.yaml (5)
16-22: LGTM! The inference providers are properly restructured with explicit separation of remote (Gemini) and inline (sentence-transformers) providers. This aligns with the embedding model configuration added downstream.

78-80: LGTM! The rag-runtime provider is correctly configured as an inline tool_runtime with an appropriate empty config. This enables RAG knowledge-search calls as described in the PR objectives.

107-119: LGTM! The embedding model and vector_db configurations are properly aligned: embedding_dimension (768) is consistent, the vector_db correctly references the embedding model by ID, and the provider IDs are internally consistent. The naming convention with the "-4-19" release suffix is preserved from prior review feedback. Ensure the Containerfile sets up the embedding model at `/app-root/all-mpnet-base-v2` and the vector store at `/app-root/llama_stack_vector_db/faiss_store.db` to match the configured paths.

128-129: LGTM! The builtin::rag tool group is correctly configured to reference the rag-runtime provider. The existing mcp::assisted tool group is preserved, maintaining backward compatibility while enabling the new RAG capabilities.

23-38: Verify persistent storage requirement for `/tmp/llama-stack-files`.

Path consistency is confirmed: the Containerfiles correctly copy `faiss_store.db` to `/app-root/llama_stack_vector_db/` and the embedding model to `/app-root/all-mpnet-base-v2`, matching the config references. However, the files provider uses `/tmp/llama-stack-files` (ephemeral) while the metadata_store references `${env.LLAMA_STACK_SQLITE_STORE_DIR}` (persistent mount). Verify whether files storage should also be mounted to persistent storage or if this ephemeral/persistent asymmetry is intentional. If files need to persist across pod restarts, configure `/tmp/llama-stack-files` as a volumeMount.
Actionable comments posted: 0
♻️ Duplicate comments (3)
template-params.dev.env (1)
7-7: Verify model name matches registered model_id in config.

The model name `gemini-2.0-flash` should include the provider prefix to match the registered model identifier `gemini/gemini-2.0-flash` in `config/llama_stack_client_config.yaml` (line 93).

test/evals/eval_data.yaml (2)
71-77: Fix YAML syntax: embedded quotes are treated as literal content.

The pipe literal block scalar with embedded quotes causes the quotes to be parsed as literal characters. This is inconsistent with other `expected_response` fields in the file.
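The difference can be illustrated with a minimal fragment (the field value here is hypothetical, not taken from eval_data.yaml):

```yaml
# In a `|` literal block scalar, the quotes become part of the value:
expected_response: |
  "Assign roles manually after discovery."
---
# A plain scalar carries no surrounding quotes, matching the other
# expected_response fields in the file:
expected_response: Assign roles manually after discovery.
```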
55-55: Fix typo in eval_query.

"assing" should be "assign" in the query.

```diff
-  eval_query: How do I assing roles to nodes when deploying to OCI?
+  eval_query: How do I assign roles to nodes when deploying to OCI?
```
🧹 Nitpick comments (1)
test/scripts/common.sh (1)
128-128: Address ShellCheck warning by separating variable declaration and assignment.

Combining the `local` declaration with command substitution can mask the command's return value, potentially hiding failures from `fetch_cluster_data`. Based on static analysis hints.

```diff
-    local cluster_data=$(fetch_cluster_data "$cluster_name")
+    local cluster_data
+    cluster_data=$(fetch_cluster_data "$cluster_name")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (5 hunks)
- test/evals/eval_data.yaml (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- Containerfile.add_llama_to_lightspeed
- Containerfile.assisted-chat
🧰 Additional context used
🧬 Code graph analysis (1)
test/scripts/verify_create_eval_test_oci.sh (1)
test/scripts/common.sh (3)
setup_shell_options (14-18), validate_environment (21-24), wait_and_validate_cluster (115-168)
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (23)
test/evals/eval_data.yaml (2)
136-144: LGTM — OCI cluster creation test looks well-structured. The evaluation flow properly specifies cluster parameters, includes cleanup, and references the appropriate verification script.
146-154: LGTM — Second OCI test variant appropriately tests alternate terminology. This variant tests "Oracle Cloud Infrastructure" vs "OCI" to ensure the system handles both forms correctly, which is good test coverage.
config/llama_stack_client_config.yaml (7)
16-22: LGTM — Provider configuration is well-structured. The gemini and sentence-transformers providers are properly configured with appropriate provider types.
23-30: LGTM — FAISS vector store configuration looks correct. The SQLite kvstore and DB path are properly configured for the inline FAISS provider.
31-38: LGTM — Files API configuration is properly set up. The localfs provider with SQLite metadata store is correctly configured to support the vector store's file dependencies.
78-80: LGTM — RAG runtime provider added correctly. The inline rag-runtime provider is properly configured to enable knowledge-search capabilities.
107-112: LGTM — Embedding model configuration is correct. The sentence-transformers model with 768-dimensional embeddings is properly registered with the correct provider and path.
114-119: LGTM — Vector database configuration is consistent. The vector_db entry properly references the embedding model and matches the provider_id used in the vector_io configuration.
128-129: LGTM — RAG toolgroup correctly references rag-runtime. The builtin::rag toolgroup is properly linked to the rag-runtime provider defined earlier.
template.yaml (6)
207-207: LGTM — OCI added to supported platforms. The system prompt correctly includes Oracle Cloud Infrastructure (OCI) in the list of supported on-premise installation targets.
228-232: LGTM — OCI-specific guidance is appropriately scoped. The instructions to call knowledge_search for OCI questions and use retrieved information as the primary source align well with the RAG functionality being added.
370-377: LGTM — Vector I/O configuration matches standalone config. The FAISS vector store configuration is consistent with config/llama_stack_client_config.yaml (lines 23-30).
378-385: LGTM — Files configuration matches standalone config. The localfs provider configuration is consistent with config/llama_stack_client_config.yaml (lines 31-38), using environment variable substitution appropriately for the deployment context.
429-431: LGTM — RAG runtime configuration matches standalone config. The rag-runtime provider is consistently configured across both template and standalone config files.
446-451: LGTM — Vector database configuration is consistent. The vector_dbs entry matches the configuration in config/llama_stack_client_config.yaml (lines 114-119), ensuring consistent RAG functionality across deployment modes.
test/scripts/verify_create_eval_test_oci.sh (3)
1-8: LGTM — Script setup follows best practices. The script correctly sources the common helper functions and performs necessary setup and validation checks.
10-16: LGTM — Expected values align with OCI test requirements. The expected values match the cluster configuration specified in test/evals/eval_data.yaml (lines 141, 151) for OCI multi-node deployments, including the critical platform validation.
18-26: LGTM — Function call correctly passes all required parameters. The wait_and_validate_cluster invocation properly includes all eight parameters, including the new expected_platform parameter for OCI validation.
test/scripts/common.sh (5)
59-59: LGTM — Platform extraction handles both internal and external platform types. The jq expression correctly extracts platform.external.platform_name for external platforms (like OCI) and falls back to platform.type for internal platforms, which aligns with the Assisted Installer API structure.
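The fallback described above can be exercised in isolation. The JSON samples below are illustrative, mirroring the shape of the Assisted Installer API rather than a real cluster payload:

```shell
#!/usr/bin/env bash
# External platform (e.g. OCI): platform_name exists and is selected.
external='{"platform":{"type":"external","external":{"platform_name":"oci"}}}'
# Internal platform: no .external block, so // falls back to .type.
internal='{"platform":{"type":"baremetal"}}'

p1=$(echo "$external" | jq -r '.platform.external.platform_name // .platform.type')
p2=$(echo "$internal" | jq -r '.platform.external.platform_name // .platform.type')
echo "$p1 $p2"   # oci baremetal
```

The jq alternative operator // takes the right-hand side whenever the left-hand path evaluates to null or false, which is exactly the behavior the review comment describes.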
62-73: LGTM — SSH key validation handles optional parameter correctly. The function properly validates the SSH key when expected, and correctly returns success when no validation is required (empty expected_ssh_key).
75-86: LGTM — Platform validation follows the same pattern as SSH key validation. The function correctly validates the platform when expected and returns success when no validation is required.
88-110: LGTM — Validation logic properly chains all validators. The function correctly validates core properties first, then conditionally validates SSH key and platform using the dedicated validator functions.
114-168: LGTM — Enhanced validation function properly handles all parameters. The function correctly accepts the new expected_platform parameter, passes it through the validation chain, and includes platform information in both success and error messages when applicable.
Actionable comments posted: 0
♻️ Duplicate comments (1)
test/scripts/common.sh (1)
128-128: Fix SC2155: Split variable declaration and assignment to avoid masking return values. Combining a local declaration with command substitution masks the command's exit status, making it impossible to detect if fetch_cluster_data fails. Split the declaration and assignment to properly capture any errors. Apply this diff:
- local cluster_data=$(fetch_cluster_data "$cluster_name")
+ local cluster_data
+ cluster_data=$(fetch_cluster_data "$cluster_name")
🧹 Nitpick comments (1)
test/evals/eval_data.yaml (1)
55-55: Consider more natural phrasing: "on OCI" instead of "to OCI". The phrase "deploying to OCI" reads awkwardly. "Deploying on OCI" or "when deploying on OCI" would be more idiomatic.
Apply this diff:
- eval_query: How do I assign roles to nodes when deploying to OCI?
+ eval_query: How do I assign roles to nodes when deploying on OCI?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (6 hunks)
- test/evals/eval_data.yaml (3 hunks)
- test/prow/Dockerfile.plus-llama (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- template-params.dev.env
- test/prow/Dockerfile.plus-llama
- Containerfile.add_llama_to_lightspeed
- test/scripts/verify_create_eval_test_oci.sh
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
- GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
🔇 Additional comments (21)
test/scripts/common.sh (2)
62-73: LGTM! The SSH key validation logic is correct and follows a sensible pattern—validating only when an expected value is provided.
75-86: LGTM! The platform validation function follows the same optional validation pattern as SSH key validation and is implemented correctly.
test/evals/eval_data.yaml (6)
38-44: LGTM! The OCI manifest evaluation test is well-structured and the expected intent correctly describes the requirement to copy the dynamic custom manifest from the Terraform stack.
45-51: LGTM! The OCI deployment flow test appropriately validates that the assistant understands the multi-step process for OCI deployments.
66-78: LGTM! The YAML syntax issue with embedded quotes has been resolved. The multiline expected_response now uses proper literal block scalar format without literal quote characters.
136-145: LGTM! The OCI cluster creation test is properly structured with verification script and expected response. The cluster naming pattern aligns with the verification script's expectations.
146-155: LGTM! This test appropriately validates that both "OCI" and "Oracle Cloud Infrastructure" trigger the same behavior, ensuring the RAG system recognizes both forms.
237-237: LGTM! The updated expected intent correctly includes OCI in the list of supported platforms, consistent with the RAG enhancements in this PR.
config/llama_stack_client_config.yaml (6)
16-22: LGTM! The sentence-transformers inference provider is correctly configured for embedding model support.
23-30: LGTM! The vector_io provider configuration correctly points to the FAISS database copied by the Containerfile and uses appropriate SQLite backend storage.
78-80: LGTM! The rag-runtime tool provider is correctly configured and will enable RAG functionality through the builtin::rag tool group.
107-112: LGTM! The embedding model configuration is consistent with the Containerfile paths and correctly specifies the 768-dimensional all-mpnet-base-v2 model.
114-119: LGTM! The vector database configuration correctly links the embedding model to the FAISS vector store with consistent IDs and dimensions throughout.
128-129: LGTM! The builtin::rag tool group is correctly configured to use the rag-runtime provider, completing the RAG integration.
template.yaml (6)
214-216: LGTM! The system prompt correctly updates the capabilities scope to include Oracle Cloud Infrastructure (OCI) while maintaining clear boundaries for unsupported platforms.
235-239: LGTM! The RAG guidance for OCI queries is well-structured, instructing the model to prioritize knowledge_search results while appropriately handling specialized content like C3 (Oracle Compute Cloud) only when explicitly requested.
377-395: LGTM! The provider configurations in the template match the standalone config file and correctly configure sentence-transformers, FAISS vector store, and local filesystem providers for RAG functionality.
439-441: LGTM! The rag-runtime provider configuration in the template is consistent with the standalone config file.
454-460: LGTM! The embedding model configuration has been corrected to use provider_id: sentence-transformers and is now consistent with the standalone config file.
462-467: LGTM! The vector database and RAG tool group configurations in the template are consistent with the standalone config file, completing the RAG integration across all configuration surfaces.
Also applies to: 476-477
Containerfile.assisted-chat (1)
16-18: Reconsider chmod 777 in favor of more restrictive permissions for OpenShift deployments. The current configuration uses chmod 777 on the vector DB directory and chmod 666 on the database file. For OpenShift random UIDs, chmod 775 on the directory and chmod 664 on the file would be sufficient and more secure—any UID can still read and traverse the directory (required for random UIDs), while only the owner (1001) and group can write.
Note: This pattern appears consistently in three Containerfiles (Containerfile.assisted-chat, Containerfile.add_llama_to_lightspeed, and test/prow/Dockerfile.plus-llama). If 777 is a hard requirement due to application constraints, please document the rationale.
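A commonly suggested middle ground (a sketch, not tested against this image) is the OpenShift "group 0" convention: make the group bits mirror the owner bits so an arbitrary UID, which always lands in the root group, can still write without making the files world-writable. The demo paths below are throwaway locations, not the real image paths:

```shell
#!/usr/bin/env bash
# Demo of `chmod g=u`: copy the owner permission bits onto the group bits.
# In a Containerfile you would pair this with `chgrp -R 0 <dir>` (run as
# root at build time) so OpenShift's arbitrary UID inherits group access;
# chgrp is omitted here because this demo may not run as root.
demo_dir=$(mktemp -d)
touch "$demo_dir/faiss_store.db"
chmod -R g=u "$demo_dir"
# GNU stat (Linux): octal mode; group digit now equals the owner digit.
stat -c '%a' "$demo_dir/faiss_store.db"
```

This keeps "other" at read-only (or whatever the umask gave it) while still satisfying the arbitrary-UID requirement, which is usually tighter than 777/666.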
|
/test eval-test |
5 similar comments
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
Actionable comments posted: 0
♻️ Duplicate comments (3)
template.yaml (1)
214-216: Clarify supported platform list wording and capitalization. The supported-platforms bullet is correct but could be clearer and match prior reviewer wording (plus fix "nutanix" capitalization and "baremetal" spacing).
Example refinement:
- - Supported: On-premise OpenShift installs via Assisted Installer on baremetal hosts, nutanix, virtual machines (VMs) like vSphere, KVM or libvirt, and Oracle Cloud Infrastructure (OCI).
+ - Supported: On-premise OpenShift installs via Assisted Installer on bare metal hosts, Oracle Cloud Infrastructure (OCI), Nutanix, or virtual machines (VMs) like vSphere, KVM, or libvirt.
Containerfile.assisted-chat (1)
17-18: Revisit 666/777 permissions if possible, or document why they are required. I understand from prior discussion that OpenShift may run the container with an arbitrary UID, which can force very permissive filesystem ACLs. Still, if the process ultimately runs as UID 1001 (as configured later) or as a UID that can be mapped via group/FSGroup, you might be able to tighten this:
- Use something like --chmod=664 or 644 on faiss_store.db if the DB is effectively read-only at runtime.
- Use chmod 755 or 775 on /llama_stack_vector_db/ if the owning user/group is sufficient for writes.
If arbitrary, non-1001 UIDs truly need write access, keeping 777 may be unavoidable, but adding a short comment in the Dockerfile explaining this constraint would help future readers understand the tradeoff.
test/scripts/common.sh (1)
128-132: Address ShellCheck SC2155 by separating declaration and assignment. ShellCheck (SC2155) is right to flag local cluster_data=$(...) here; splitting the declaration and assignment avoids masking return codes and improves readability:
- local cluster_data=$(fetch_cluster_data "$cluster_name")
+ local cluster_data
+ cluster_data=$(fetch_cluster_data "$cluster_name")
This keeps cluster_data local while making error handling clearer and satisfies the linter.
🧹 Nitpick comments (6)
template.yaml (2)
235-239: Tighten OCI + knowledge_search guidance grammar. The intent is good; just minor grammar/wording tweaks:
- - When a user query directly contains OCI related topic adhere to the following guidelines:
+ - When a user query directly relates to Oracle Cloud Infrastructure (OCI), adhere to the following guidelines:
@@
- * Only include C3 (Oracle Compute Cloud) to your response if the user specifically asks for it.
+ * Only include C3 (Oracle Compute Cloud) in your response if the user specifically asks for it.
372-396: RAG providers, embedding model, and vector DB wiring are internally consistent.
- The sentence-transformers inline provider, embedding model sentence-transformers/all-mpnet-base-v2, and provider_model_id: /app-root/all-mpnet-base-v2 match the paths copied in the Containerfile.
- The vector_io provider ocp-assisted-installer-4-19 with inline::faiss and db_path: /llama_stack_vector_db/faiss_store.db aligns with the same path in the standalone config file and Containerfile.
- The files provider localfs using ${env.LLAMA_STACK_SQLITE_STORE_DIR} is consistent with the deployment env var.
- rag-runtime is correctly registered under tool_runtime and exposed via the builtin::rag tool_group.
This all looks coherent and in sync with config/llama_stack_client_config.yaml.
Consider adding a brief comment somewhere noting that the embedded client config and config/llama_stack_client_config.yaml must stay in sync to avoid subtle drift between in-cluster and local configs.
test/scripts/common.sh (1)
59-60: Platform and SSH key validation wiring looks sound; update usage comment for the new param.
ACTUAL_PLATFORMextraction andvalidate_expected_platform()correctly make platform validation opt‑in and integrate cleanly intovalidate_cluster_properties().- The new
validate_expected_ssh_key()helper keeps SSH key validation optional without cluttering the main condition.wait_and_validate_cluster()now threadsexpected_platformthrough and prints platform details on success and mismatch, which is very helpful for debugging OCI tests.One small thing: the usage comment above
wait_and_validate_cluster()still only mentions[expected_ssh_key]. It would be good to document the new optional parameter:-# Usage: wait_and_validate_cluster <cluster_name_prefix> <expected_version> <expected_domain> <expected_single_node> <expected_cpu_arch> <cluster_type> [expected_ssh_key] +# Usage: wait_and_validate_cluster <cluster_name_prefix> <expected_version> <expected_domain> <expected_single_node> <expected_cpu_arch> <cluster_type> [expected_ssh_key] [expected_platform]Also applies to: 62-86, 88-96, 101-105, 123-123, 141-143, 154-156
test/evals/eval_data.yaml (2)
136-155: OCI cluster creation evals are consistent with scripts and existing flows (minor style nit).
- Both oci_cluster_creation_conv and _conv_2 mirror the existing multi-node flow patterns (name, version, domain, architecture) and reuse verify_create_eval_test_oci.sh, which is good for consistency.
- The cluster name eval-test-oci-uniq-cluster-name matches the default ${UNIQUE_ID} behavior explained in prior discussion.
- Expected responses correctly avoid including the Discovery ISO URL and instead offer to retrieve it, aligning with the system prompt.
Tiny style tweak if you care:
- eval_query: create a new multi node cluster with OCI platform integration, ...
+ eval_query: create a new multi-node cluster with OCI platform integration, ...
@@
- eval_query: create a new multi node cluster with Oracle Cloud Infrastructure platform integration, ...
+ eval_query: create a new multi-node cluster with Oracle Cloud Infrastructure platform integration, ...
237-238: Capitalize Nutanix and (optionally) tweak "on-premise" wording. The updated capabilities_scope_conv intent matches the new system prompt, but there are two small wording nits:
- expected_intent: A helpful response listing the supported on-premise platforms including bare metal, vSphere, OCI, and nutanix.
+ expected_intent: A helpful response listing the supported on-premises platforms including bare metal, vSphere, OCI, and Nutanix.
Purely cosmetic, but keeps vendor name capitalization and "on-premises" usage consistent.
config/llama_stack_client_config.yaml (1)
16-22: RAG provider, embedding, and vector DB config is consistent with the template and image.
- sentence-transformers is correctly registered as an inline inference provider and referenced by the embedding model.
- The embedding model sentence-transformers/all-mpnet-base-v2 has embedding_dimension: 768 and provider_model_id: /app-root/all-mpnet-base-v2, matching the directory copied in the Containerfile.
- The vector_io provider ocp-assisted-installer-4-19 uses inline::faiss with db_path: /llama_stack_vector_db/faiss_store.db, aligning with the Containerfile copy location and the in-cluster template.
- The localfs files provider with sqlite metadata_store reuses ${env.SQLITE_STORE_DIR} consistently with telemetry and metadata_store.
- rag-runtime is wired under tool_runtime and exposed via the builtin::rag toolgroup, mirroring the template's ConfigMap.
Everything looks coherent; this should behave the same locally and in-cluster as long as the same asset paths are present.
If you ever change the embedding model or vector DB ID, consider centralizing those identifiers (e.g., via YAML anchors or a small shared include) to avoid having to update them in multiple places across this file and template.yaml.
Also applies to: 24-38, 78-80, 107-112, 114-119, 128-129
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (7 hunks)
- test/evals/eval_data.yaml (3 hunks)
- test/prow/Dockerfile.plus-llama (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- template-params.dev.env
- Containerfile.add_llama_to_lightspeed
- test/prow/Dockerfile.plus-llama
- test/scripts/verify_create_eval_test_oci.sh
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (3)
template.yaml (1)
338-343: Cluster-name resolution rules look consistent and precise. The updated guidance to always silently list clusters, require an exact name match, and only fall back to asking for cluster ID when no exact match exists is clear and matches the expected list_clusters → cluster_info flow.
Containerfile.assisted-chat (1)
1-1: RAG stage and asset wiring look correct and reproducible.
- The new rag stage is pinned by digest, which is good for reproducibility.
- The model is copied to /app-root/all-mpnet-base-v2/, matching the provider_model_id in both configs.
- The vector DB is copied to /llama_stack_vector_db/faiss_store.db, matching the configured db_path.
- Dropping back to USER 1001 after patch/migrate keeps the runtime non-root.
No issues here.
Also applies to: 16-17, 23-23
test/evals/eval_data.yaml (1)
38-58: OCI manifest, deployment flow, roles, and storage requirements evals look well-aligned.
- OCI_manifest_conv clearly tests that the assistant surfaces the need to copy dynamic_custom_manifest.yml from the OCI Terraform stack into Assisted Installer.
- OCI_deployment_flow intent matches the high-level Assisted Installer + OCI flow (configure OCI, configure cluster in Assisted Installer, create OCI infra, complete installation).
- OCI_assign_roles correctly encodes that roles on OCI need to be assigned manually for discovered hosts.
- OCI_requirements_conv provides detailed, structured VPUs guidance in the expected response; the use of a block scalar (|) is appropriate now that the stray quotes have been removed.
These additions look coherent and should give good coverage of OCI-specific knowledge.
Also applies to: 66-77
|
/test eval-test |
3 similar comments
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
1 similar comment
|
/test eval-test |
|
/test eval-test |
* **Prioritize Informed Information Gathering:** During initial cluster creation, focus on efficiently collecting the four required parameters, **NEVER asking for what is already known.**
* If a step requires specific information (e.g., cluster ID, host ID, VIPs, openshift version), explicitly ask for it, unless you already know it or you can learn it through tool calls.
* When a cluster name is provided (not a UUID), strictly adhere to this logic:
* First you must silently list all known clusters to get the cluster name and cluster ID
It should list all clusters only if it doesn't already know the relevant cluster.
If it just created it or already listed all clusters in a previous turn there's no reason to list them again.
The prompt was extended so these steps would be skipped if the UUID is already known
Actionable comments posted: 0
♻️ Duplicate comments (1)
test/scripts/common.sh (1)
128-128: Fix ShellCheck violation SC2155: separate declaration and assignment. Line 128 declares and assigns cluster_data in a single statement, which masks the return value of fetch_cluster_data. Split into separate declaration and assignment to properly detect command failures.
- local cluster_data=$(fetch_cluster_data "$cluster_name")
+ local cluster_data
+ cluster_data=$(fetch_cluster_data "$cluster_name")
🧹 Nitpick comments (1)
test/scripts/common.sh (1)
75-86: Platform validation allows literal "null" string to pass. The validate_expected_platform function returns success (0) when expected_platform is empty, even if ACTUAL_PLATFORM contains the literal string "null" (returned by jq -r when platform data is missing). This may mask malformed cluster data.
Recommendation: Align with the platform extraction logic at line 59—if the cluster platform is genuinely required, add an early check to reject "null" in all cases:

 validate_expected_platform() {
     local expected_platform="${1:-}"
+    # Reject null string returned by jq -r when platform data is missing
+    if [[ "$ACTUAL_PLATFORM" == "null" ]]; then
+        return 1
+    fi
     if [[ -n "$expected_platform" ]]; then
         if [[ "$ACTUAL_PLATFORM" == "$expected_platform" ]]; then
             return 0
         else
             return 1
         fi
     else
         return 0
     fi
 }

Alternatively, if platform is truly optional for some test scenarios, document this assumption clearly in comments so future maintainers understand the acceptance of "null" is intentional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (7 hunks)
- test/evals/eval_data.yaml (3 hunks)
- test/prow/Dockerfile.plus-llama (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- template-params.dev.env
- Containerfile.add_llama_to_lightspeed
- test/scripts/verify_create_eval_test_oci.sh
- test/evals/eval_data.yaml
- Containerfile.assisted-chat
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
- GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
🔇 Additional comments (13)
test/prow/Dockerfile.plus-llama (2)
1-1: Verify RAG base image availability and provenance. Ensure the RAG base image digest at quay.io/redhat-services-prod/assisted-installer-tenant/saas/assisted-chat-rag@sha256:39edcb0c36c3f0140acd2dfe5d0545dfbc3fb2e20e481b266aaf8dd962275af4 is: (a) accessible during CI/CD builds, (b) regularly scanned for vulnerabilities, and (c) documented as a trusted source.
22-24: LGTM on artifact copying and permission adjustments. The COPY commands correctly transfer the embedding model and vector store from the RAG stage with appropriate ownership (1001:1001) and permissions (666 for DB file, 777 for directory) to support write access by the application user.
test/scripts/common.sh (3)
62-73: LGTM on new SSH key validator. The validate_expected_ssh_key function correctly performs optional validation—returns success when no key is expected, and validates equality when a key is provided.
88-110: LGTM on platform parameter threading. The updates to validate_cluster_properties (adding parameter at line 95 and conditional validation at lines 101–102) correctly extend SSH key and platform validation in a backward-compatible way—both are optional and only validated if provided.
114-168: LGTM on platform integration in wait_and_validate_cluster. Platform parameter is properly threaded (line 123), passed to validators (line 131), and surfaced in success (lines 141–143) and error (lines 154–156) output paths. The conditional output ensures backward compatibility when platform is not specified.
template.yaml (4)
380-398: LGTM on inline providers configuration. Sentence-transformers, FAISS vector store, and localfs file storage are properly configured with correct kvstore paths and metadata store locations. FAISS sqlite DB path (/llama_stack_vector_db/faiss_store.db) matches the copied artifact from the Dockerfile.
457-470: LGTM on embedding model and vector DB consistency. Embedding model (sentence-transformers/all-mpnet-base-v2, 768 dimensions) is correctly referenced by the vector DB entry. Provider IDs align with the inline providers configured above (lines 380–398).
442-444: LGTM on rag-runtime tool provider. Tool runtime now includes the inline rag-runtime provider with empty config, enabling RAG capabilities for the tool group defined later.
236-240: LGTM on OCI-specific RAG guidance. The new guidelines appropriately direct the model to prioritize knowledge_search results for OCI queries and only include Oracle Compute Cloud (C3) details when explicitly requested. This supports the RAG integration goals.
config/llama_stack_client_config.yaml (4)
16-38: LGTM on inline providers setup. Sentence-transformers inference provider, FAISS vector_io, and localfs files providers are correctly configured with proper kvstore paths, provider IDs, and env var fallbacks. Alignment with template.yaml embedding and vector DB setup is consistent.
78-80: LGTM on rag-runtime tool provider. The rag-runtime provider is correctly added to the tool_runtime section with inline::rag-runtime type, enabling RAG capabilities.
107-119: LGTM on embedding model and vector DB metadata. Embedding model (sentence-transformers/all-mpnet-base-v2, 768 dimensions) and vector DB entry (ocp-assisted-installer-4-19) are correctly configured and cross-referenced. Provider IDs align with the inline providers (lines 23–38). Consistent with template.yaml (lines 457–470).
128-129: LGTM on rag tool group wiring. The builtin::rag tool group is properly linked to the rag-runtime provider, enabling RAG-backed tool execution.
|
/retest |
70aca72 to
065f97d
Compare
|
/test eval-test |
…lidating the knowledge in RAG
template.yaml and llama_stack_client_config.yaml:
adds vector IO and vector db to the config to enable RAG in lightspeed-core
adds the rag-runtime agent so knowledge-search could be called
vector store relies on files api, so it was enabled too
eval_data.yaml:
adding the new evaluation tests
tests/scripts:
to verify that a cluster was properly created, the platform has to be checked to confirm it was set to oci
Containerfile:
copying the faiss_store.db (which is the vector database) and the embedding model to the assisted-chat image
Actionable comments posted: 1
♻️ Duplicate comments (5)
template-params.dev.env (1)
7-7: Model name format clarified by maintainer. The removal of the gemini/ provider prefix is consistent with the maintainer's clarification that the prefixed form is no longer supported. No issues detected.
test/scripts/common.sh (2)
59-60: Platform extraction logic is adequate per maintainer discussion. The jq expression correctly extracts the platform value. While it may yield the literal string "null" when data is missing, the validation logic handles this appropriately by comparing against expected values when provided.
128-128: Fix SC2155 violation: Separate variable declaration and assignment. The local declaration masks the exit status of the command substitution. While set -e provides some protection, best practice is to separate declaration and assignment to properly capture any errors from fetch_cluster_data. Apply this diff:
- local cluster_data=$(fetch_cluster_data "$cluster_name")
+ local cluster_data
+ cluster_data=$(fetch_cluster_data "$cluster_name")
Containerfile.assisted-chat (1)
21-23: Directory permissions remain overly permissive per ongoing discussion.

The `chmod 777` on the vector DB directory was previously flagged as a security concern. While the maintainer noted OpenShift's random UID requirement, reviewer carbonin suggested `775` might suffice. Consider testing with more restrictive permissions (775 for the directory, 644 for the file) to maintain security while supporting OpenShift's UID model.

test/prow/Dockerfile.plus-llama (1)
27-29: Code duplication and permission concerns mirror Containerfile.assisted-chat.

These lines are identical to Containerfile.assisted-chat (lines 21-23). The same security concerns about `chmod 777` apply here. Consider whether the CI/test environment could use more restrictive permissions (775) or if a shared base stage could reduce duplication.
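As a sketch of the less-permissive alternative discussed in these comments: OpenShift runs containers with an arbitrary UID but keeps GID 0, so the common pattern is to give the root group the owner's rights rather than opening the path to everyone. The path matches the Containerfiles, but this is a suggestion, not the PR's code:

```dockerfile
# Give GID 0 the same rights as the owner; OpenShift's random UIDs run with GID 0.
RUN chgrp -R 0 /llama_stack_vector_db/ && \
    chmod -R g=u /llama_stack_vector_db/
```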
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- Containerfile.add_llama_to_lightspeed (2 hunks)
- Containerfile.assisted-chat (2 hunks)
- config/llama_stack_client_config.yaml (4 hunks)
- template-params.dev.env (1 hunks)
- template.yaml (7 hunks)
- test/evals/eval_data.yaml (3 hunks)
- test/prow/Dockerfile.plus-llama (2 hunks)
- test/scripts/common.sh (4 hunks)
- test/scripts/verify_create_eval_test_oci.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- test/scripts/verify_create_eval_test_oci.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- Containerfile.add_llama_to_lightspeed
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test/scripts/common.sh
[warning] 128-128: Declare and assign separately to avoid masking return values.
(SC2155)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (12)
Containerfile.assisted-chat (1)
1-1: LGTM: Digest pinning ensures reproducible builds.

The RAG base image uses SHA256 digest pinning, which is a best practice for build reproducibility and supply chain security.
test/scripts/common.sh (1)
62-86: LGTM: Validation functions are well-structured.

Both `validate_expected_ssh_key` and `validate_expected_platform` follow consistent patterns and correctly implement optional validation logic. The functions properly return success when no expected value is provided or when values match.

template.yaml (5)
236-240: LGTM: OCI-specific RAG guidance is clear and appropriate.

The system prompt correctly instructs the assistant to query the knowledge base for OCI-related topics and to only include C3 (Oracle Compute Cloud) details when explicitly requested by the user. This ensures RAG is properly utilized for OCI deployments.
339-344: Cluster name resolution logic is sound.

The updated logic properly handles cluster name-to-ID resolution by first silently listing all clusters, performing exact string matching, and only prompting for cluster ID when no match is found. This approach avoids unnecessary API calls when the ID is already known.
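The resolution flow described here can be sketched in a few lines — a standalone illustration with assumed field names, not the assistant's actual tool code:

```python
def resolve_cluster_id(name, clusters):
    """Return the ID of the cluster whose name matches exactly, else None."""
    for cluster in clusters:
        if cluster.get("name") == name:
            return cluster.get("id")
    return None  # no exact match: the caller falls back to prompting for an ID

# Hypothetical listing returned by a silent "list all clusters" call
clusters = [{"name": "prod", "id": "c-111"}, {"name": "dev", "id": "c-222"}]
print(resolve_cluster_id("dev", clusters))      # → c-222
print(resolve_cluster_id("missing", clusters))  # → None
```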
380-398: LGTM: Provider configurations align with container setup.

The sentence-transformers, Faiss vector_io, and localfs file provider configurations are properly structured. The database paths match the locations established in the Containerfiles, and environment variable substitution is correctly used for flexible deployment.
442-444: LGTM: RAG runtime properly integrated.

The rag-runtime provider is correctly configured as an inline provider and properly wired to the `builtin::rag` tool group. This enables knowledge search functionality for OCI-related queries.

Also applies to: 479-480
457-470: LGTM: Model and vector DB configurations are consistent.

The embedding model configuration correctly references the sentence-transformers provider and the model path matches the Containerfile. The vector database configuration properly links to both the embedding model and the Faiss provider, with matching embedding dimensions throughout.

test/evals/eval_data.yaml (3)
test/evals/eval_data.yaml (3)
38-78: LGTM: OCI evaluation tests are well-structured.

The new OCI conversation groups properly test RAG-based knowledge retrieval for Oracle Cloud Infrastructure deployments. Test queries cover manifest requirements, deployment flows, role assignment, and performance recommendations. The YAML syntax issues from previous reviews have been addressed.
136-155: LGTM: OCI cluster creation tests provide good coverage.

The two OCI cluster creation conversations properly test both the abbreviated "OCI" and full "Oracle Cloud Infrastructure" terms, verifying that the assistant handles terminology variations correctly. The cluster naming aligns with the verification script's expectations.
237-237: LGTM: Capabilities scope updated to include OCI.

The expected intent correctly lists OCI among supported on-premise platforms, aligning with the RAG functionality added in this PR.
config/llama_stack_client_config.yaml (2)
107-119: LGTM: Model and vector DB configurations match template.yaml.

The embedding model and vector database configurations are consistent with template.yaml. All provider IDs, paths, and dimensions align correctly across both configuration files.
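The cross-file consistency being checked here is roughly of this shape. Field names follow common llama-stack conventions and are assumptions, not a copy of either file; all-mpnet-base-v2 produces 768-dimensional embeddings:

```yaml
models:
  - model_id: all-mpnet-base-v2
    provider_id: sentence-transformers
    model_type: embedding
    metadata:
      embedding_dimension: 768
vector_dbs:
  - vector_db_id: assisted-docs         # hypothetical ID
    provider_id: faiss
    embedding_model: all-mpnet-base-v2  # must reference the model above
    embedding_dimension: 768            # must match the model's dimension
```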
128-129: LGTM: RAG tool group properly configured.

The `builtin::rag` tool group is correctly wired to the `rag-runtime` provider, consistent with the configuration in template.yaml.
```yaml
storage_dir: /tmp/llama-stack-files
metadata_store:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/files_metadata.db
```
🧩 Analysis chain
Verify environment variable naming inconsistency.
This file uses ${env.SQLITE_STORE_DIR} while template.yaml line 396 uses ${env.LLAMA_STACK_SQLITE_STORE_DIR} for the metadata store path. Confirm whether this difference is intentional or if the variable names should be consistent across both configurations.
Run the following script to check how these environment variables are set in the deployment:
🏁 Script executed:

```shell
#!/bin/bash
# Search for SQLITE_STORE_DIR environment variable definitions
echo "=== Searching for SQLITE_STORE_DIR definitions ==="
rg -n "SQLITE_STORE_DIR" --type yaml
echo -e "\n=== Checking if both variants are used ==="
rg -n "LLAMA_STACK_SQLITE_STORE_DIR|SQLITE_STORE_DIR" template.yaml config/llama_stack_client_config.yaml
```

Length of output: 1204
Align environment variable naming with template.yaml.
The config file uses `${env.SQLITE_STORE_DIR}` on lines 36, 64, and 83, but template.yaml line 521 defines `LLAMA_STACK_SQLITE_STORE_DIR` as the environment variable. Update all three occurrences in config/llama_stack_client_config.yaml to use `${env.LLAMA_STACK_SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}` for consistency with the template configuration.
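Why the mismatch matters: with `${env.VAR:=default}` substitution, an unset variable silently falls back to the default, so a value exported under the wrong name is simply ignored. A minimal resolver sketch assuming llama-stack-style placeholder syntax (this is not llama-stack's actual implementation):

```python
import os
import re

# Matches "${env.NAME}" or "${env.NAME:=default}"
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")

def resolve_env_placeholders(value, environ=None):
    environ = os.environ if environ is None else environ

    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        raise KeyError(f"{name} is not set and has no default")

    return _PLACEHOLDER.sub(repl, value)

# The variable exported under the wrong name is ignored and the default wins:
print(resolve_env_placeholders(
    "${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/files_metadata.db",
    environ={"LLAMA_STACK_SQLITE_STORE_DIR": "/data"},
))
# → /tmp/.llama/distributions/starter/files_metadata.db
```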
🤖 Prompt for AI Agents
In config/llama_stack_client_config.yaml around lines 36, 64, and 83, the file
references ${env.SQLITE_STORE_DIR} but template.yaml defines
LLAMA_STACK_SQLITE_STORE_DIR; update all three occurrences to
${env.LLAMA_STACK_SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter} so the
environment variable name and default match the template configuration.
Summary by CodeRabbit
New Features
Tests
Chores