LCORE-169: Provide initial set of opinionated & tested llama-stack co… #424
Conversation
Walkthrough

Adds duplicated RAG Configuration entries to README, expands docs/rag_guide.md with FAISS, pgvector, provider/tooling notes and examples, and adds multiple new example YAMLs demonstrating OpenAI, pgvector, FAISS, vLLM (Granite, Llama, Qwen3) and Gemini setups. All edits are documentation and examples only.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Agent
    participant RAG_Tool as RAG Tool Runtime
    participant VecIO as Vector I/O
    participant Emb as Embedding Model
    participant LLM as Inference Provider
    User->>Agent: Query
    Agent->>RAG_Tool: Invoke RAG
    RAG_Tool->>Emb: Embed(query)
    RAG_Tool->>VecIO: Search(embedding, top_k)
    VecIO-->>RAG_Tool: Retrieved chunks
    RAG_Tool->>Agent: Contextualized prompt
    Agent->>LLM: Prompt + Context
    LLM-->>Agent: Completion
    Agent-->>User: Answer
    note over VecIO,Emb: VecIO provider: FAISS / pgvector
    note over LLM: Provider: vLLM / OpenAI / Gemini
```
```mermaid
sequenceDiagram
    autonumber
    actor Operator
    participant Stack as Llama Stack
    participant Remote as Remote Provider
    Operator->>Stack: Deploy/Configure model (url, token, env)
    Stack->>Remote: /v1/chat/completions (auth)
    Remote-->>Stack: Response
    Stack-->>Operator: Served completion
    note over Stack,Remote: Applies to Granite, Llama, Qwen3, Gemini examples
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 5
🧹 Nitpick comments (34)
examples/openai-faiss-run.yaml (5)
2-6: Clean up trailing spaces to satisfy YAML linters and CI.

Multiple lines have trailing spaces flagged by yamllint. Remove them to keep CI green.
```diff
-# ␠
+#
-# ␠
+#
-# - You will need an OpenAI API key␠
+# - You will need an OpenAI API key
-# - You can generate the vector index with the rag-content tool (https://github.com/lightspeed-core/rag-content)␠
+# - You can generate the vector index with the rag-content tool (https://github.com/lightspeed-core/rag-content)
-  - model_id: gpt-test ␠
+  - model_id: gpt-test
-    provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2 ␠
+    provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2
-  - provider_id: sentence-transformers ␠
+  - provider_id: sentence-transformers
-  - provider_id: ocp-docs ␠
+  - provider_id: ocp-docs
-  - provider_id: rag-runtime ␠
+  - provider_id: rag-runtime
-    embedding_model: sentence-transformers/all-mpnet-base-v2 ␠
+    embedding_model: sentence-transformers/all-mpnet-base-v2
```

Also applies to: 18-18, 28-29, 32-32, 36-36, 59-59, 68-68, 81-81
72-78: Drop null-valued fields in tool_groups for brevity.

`args` and `mcp_endpoint` set to null are optional noise; removing them keeps the example minimal.
```diff
 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null
```
23-29: Parameterize user-specific absolute paths.

Hard-coded /home/USER paths make the example brittle. Prefer env vars with sensible defaults.
```diff
-    provider_id: sentence-transformers # This ID is a reference to 'providers.inference'
-    provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2
+    provider_id: sentence-transformers # This ID is a reference to 'providers.inference'
+    provider_model_id: ${env.EMBEDDING_MODEL_PATH:=/opt/lightspeed/embedding_models/all-mpnet-base-v2}
@@
-      db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
+      db_path: ${env.VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/ocp_docs/faiss_store.db}
```

Also applies to: 62-65
83-83: Ensure newline at end of file.

POSIX newline at EOF avoids tooling diffs and linter errors.
```diff
-  vector_db_id: openshift-index # This ID was defined during index generation
+  vector_db_id: openshift-index # This ID was defined during index generation
+
```
1-16: Provide runnable verification steps for this example.

Given PR objectives, please include steps to build the FAISS index and validate the flow end-to-end.
Proposed outline:
- Prereqs: export OPENAI_API_KEY; install rag-content per https://github.com/lightspeed-core/rag-content.
- Build FAISS index: rag-content build --input ./docs --embedding-model sentence-transformers/all-mpnet-base-v2 --out ${VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/ocp_docs}/openshift-index
- Start server: llamastack serve --config examples/openai-faiss-run.yaml
- Sanity query: use the builtin RAG toolgroup via your agents API to ask a doc-grounded question and attach the vector_db_id openshift-index. Capture status 200, model output, and top-k doc IDs as artifacts.
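For context, a sketch of the registration the sanity query would exercise (IDs and the embedding model are taken from this example; `embedding_dimension` is assumed to be the model's 768-dim output):

```yaml
# Sketch only: the vector_dbs entry the smoke test queries against.
vector_dbs:
  - vector_db_id: openshift-index        # ID defined during index generation
    provider_id: ocp-docs                # FAISS vector_io provider in this example
    embedding_model: sentence-transformers/all-mpnet-base-v2
    embedding_dimension: 768             # all-mpnet-base-v2 output size
```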
Do you want me to add a short TESTING.md section and wire it from README?

examples/vllm-qwen3-run.yaml (6)
3-5: Remove trailing spaces in comments.

Tiny lint cleanups on comments.
```diff
-# ␠
+#
-# ␠
+#
```
56-58: Avoid zero-width space as default service name.

Using \u200B yields an invisible string that’s hard to spot in logs. Prefer empty string or a descriptive default.
```diff
-      service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:=vllm-qwen3}"
```
27-31: Externalize the API token.

Replace placeholder with an environment variable to prevent accidental commits of secrets and ease deployment.
```diff
-      api_token: <add your api key>
+      api_token: ${env.QWEN_API_TOKEN}
```
39-45: Caution: example Postgres defaults are insecure.

Defaults include user/password “user/password”. Fine for local testing, but please add a warning comment to avoid using in production.
```diff
-        user: ${env.POSTGRES_USER:=user}
-        password: ${env.POSTGRES_PASSWORD:=password}
+        user: ${env.POSTGRES_USER:=user}  # Do not use default creds in production
+        password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env
@@
-        user: ${env.POSTGRES_USER:=user}
-        password: ${env.POSTGRES_PASSWORD:=password}
+        user: ${env.POSTGRES_USER:=user}  # Do not use default creds in production
+        password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env
@@
-        user: ${env.POSTGRES_USER:=user}
-        password: ${env.POSTGRES_PASSWORD:=password}
+        user: ${env.POSTGRES_USER:=user}  # Do not use default creds in production
+        password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env
```

Also applies to: 46-52, 76-83, 84-90
108-108: Ensure newline at end of file.

Adds POSIX newline to silence linter.
```diff
 server:
   port: 8321
+
```
1-11: Add minimal verification steps for this config.

Given that the PR asks for testing artifacts, include a snippet on how to hit vLLM and confirm a basic completion path.
Suggested steps:
- Set POSTGRES_* env vars and QWEN_API_TOKEN.
- Start server: llamastack serve --config examples/vllm-qwen3-run.yaml
- Smoke test: call the inference chat/completions endpoint with model_id qwen3-32b-maas; capture response time and a 200 result.
I can contribute a short README section with curl examples if helpful.

examples/vllm-llama-faiss-run.yaml (4)
2-2: Strip trailing spaces.

Minor lint cleanups.
```diff
-# ␠
+#
-␠
+
```

Also applies to: 27-27
76-80: Add newline at EOF.

Silences linter and prevents spurious diffs.
```diff
   provider_id: rhel-db
   vector_db_id: rhel-docs
+
```
65-75: Optional: drop nulls in tool_groups.

Keeps the example concise without losing meaning.
```diff
 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null
```
1-8: Add quick verification guidance.

Please include minimal steps to verify RAG wiring with FAISS on vLLM.
Proposed:
- Start vLLM at http://localhost:8000/v1 with Llama 3.1 Instruct.
- Export VLLM_API_TOKEN (if enforced) and VECTOR_DB_PATH; build FAISS index matching vector_db_id rhel-docs.
- llamastack serve --config examples/vllm-llama-faiss-run.yaml
- Query via agents API using builtin::rag; collect a 200 result and top-k references as artifacts.
examples/gemini-run.yaml (2)
2-2: Trim trailing spaces in comments.

Minor lint-only fix.
```diff
-# ␠
+#
-# ␠
+#
```

Also applies to: 5-5
23-29: Provide minimal TESTING steps inline with this example.

Given the PR checklist request, please add a short “Testing” comment block or README pointer.
Proposed steps:
- export GEMINI_API_KEY and Postgres env vars (LLAMA_STACK_POSTGRES_*).
- Start server: llamastack serve --config examples/gemini-run.yaml
- Send a simple prompt with one of the configured model_ids; attach artifacts (HTTP 200, response JSON snippet).
I can draft the commands if you confirm the intended REST paths.

Also applies to: 37-43, 44-50, 77-83
examples/openai-pgvector-run.yaml (5)
2-2: Remove trailing spaces to satisfy yamllint.

Multiple trailing-space instances flagged; remove to keep CI clean.
```diff
-# ␠
+#
-# ␠
+#
-␠
+
-  - provider_id: pgvector-example ␠
+  - provider_id: pgvector-example
-    provider_id: pgvector-example ␠
+    provider_id: pgvector-example
```

Also applies to: 7-7, 29-29, 58-58, 84-84
23-29: Parameterize embedding model path.

Avoid user-specific absolute paths to improve portability.
```diff
-    provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2
+    provider_model_id: ${env.EMBEDDING_MODEL_PATH:=/opt/lightspeed/embedding_models/all-mpnet-base-v2}
```
75-80: Drop nulls in tool_groups to simplify.

Optional but keeps example cleaner.
```diff
 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null
```
81-87: Add newline at EOF and keep vector_dbs block consistent.

Also, slight consistency tweak with env-parameterized connection earlier.
```diff
   provider_id: pgvector-example
   # A unique ID that becomes the PostgreSQL table name, prefixed with 'vector_store_'.
   # e.g., 'rhdocs' will create the table 'vector_store_rhdocs'.
   vector_db_id: rhdocs
+
```
1-17: Request verification: end-to-end pgvector smoke test steps.

Please include in the docs a brief test plan and artifacts.
Suggested:
- Ensure PostgreSQL with pgvector extension: CREATE EXTENSION IF NOT EXISTS vector;
- Export PGVECTOR_* envs and OPENAI_API_KEY.
- Create table via the stack (vector_db_id rhdocs => table vector_store_rhdocs).
- Generate embeddings and insert rows using the rag tool, referencing embedding_model sentence-transformers/all-mpnet-base-v2.
- Start: llamastack serve --config examples/openai-pgvector-run.yaml
- Query via agents API using builtin::rag, expect 200 and top-k matches. Attach logs and a sample response.
I can supply the exact SQL schema expected by llama-stack if you want it included in rag_guide.md.

Also applies to: 57-69, 81-87
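For orientation, a sketch of the wiring those steps exercise (env var names follow the suggestions above; the exact field set is an assumption, not verified against llama-stack):

```yaml
# Sketch only: pgvector provider plus the DB registration the test plan touches.
vector_io:
  - provider_id: pgvector-example
    provider_type: remote::pgvector
    config:
      host: ${env.PGVECTOR_HOST:=localhost}
      port: ${env.PGVECTOR_PORT:=5432}
      db: ${env.PGVECTOR_DB:=pgvector_example}
      user: ${env.PGVECTOR_USER:=lightspeed}
      password: ${env.PGVECTOR_PASSWORD}
vector_dbs:
  - vector_db_id: rhdocs                 # maps to table vector_store_rhdocs
    provider_id: pgvector-example
    embedding_model: sentence-transformers/all-mpnet-base-v2
    embedding_dimension: 768
```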
examples/vllm-granite-run.yaml (7)
3-5: Fix YAML lint issues: trailing spaces and missing EOF newline.

Remove trailing spaces (Lines 3, 5) and add a newline at end of file to satisfy yamllint.
```diff
-# ␠
+#
-# ␠
+#
@@
-server:
-  port: 8321
+server:
+  port: 8321
+
```

Also applies to: 148-148
117-120: Remove unusable model-context-protocol provider or supply required config.
`remote::model-context-protocol` has an empty config and is not referenced by any tool_group. This will likely error at runtime if loaded. Remove it or provide the required endpoint configuration.

Option A — remove unused provider:
```diff
-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}
```

Option B — keep it and add a URL (example):
```diff
-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}
+  - provider_id: model-context-protocol
+    provider_type: remote::model-context-protocol
+    config:
+      url: ${env.MCP_ENDPOINT_URL}
```
28-30: Confirm vLLM URL points to the OpenAI-compatible /v1 endpoint and token behavior.

Ensure `${env.VLLM_URL}` includes the `/v1` prefix (e.g., http://host:8000/v1/) expected by `remote::vllm`, and verify whether your deployment actually requires `api_token`. If not, consider omitting it to avoid confusion.

Would you like me to add a brief comment in-file clarifying that `VLLM_URL` must target the OpenAI-compatible `/v1` endpoint?
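If useful, a minimal sketch of that clarifying comment in place (the env default shown is an assumption, not from the example file):

```yaml
inference:
  - provider_id: granite
    provider_type: remote::vllm
    config:
      # VLLM_URL must target the OpenAI-compatible /v1 endpoint,
      # e.g. http://localhost:8000/v1/ (assumed default below).
      url: ${env.VLLM_URL:=http://localhost:8000/v1/}
      api_token: ${env.VLLM_API_TOKEN:fake}  # omit if your deployment does not enforce auth
```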
131-134: Verify model registration: provider_model_id is null.

Some stacks expect `provider_model_id` to match the served model name (e.g., meta-llama/Llama-3.1-8B-Instruct) returned by the vLLM `/v1/models` endpoint. If null is acceptable with your version, OK; otherwise set it explicitly.

```diff
 models:
 - metadata: {}
   model_id: ${env.INFERENCE_MODEL}
   provider_id: granite
-  provider_model_id: null
+  provider_model_id: ${env.INFERENCE_MODEL}
```
40-47: Use provider-appropriate default storage paths for clarity.

Default paths reference “ollama” directories in an example meant for vLLM/Granite. Consider renaming to avoid confusion.
```diff
-      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/agents_store.db
+      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/agents_store.db
@@
-      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/responses_store.db
+      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/responses_store.db
@@
-      storage_dir: ${env.FILES_STORAGE_DIR:~/.llama/distributions/ollama/files}
+      storage_dir: ${env.FILES_STORAGE_DIR:~/.llama/distributions/vllm/files}
@@
-      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/files_metadata.db
+      db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/files_metadata.db
```

Also applies to: 92-95
114-116: If this is truly “no RAG”, drop rag runtime and tool group.

The config description says “no RAG”, but `rag-runtime` and a `builtin::rag` tool group are included. Remove both to keep the example minimal and avoid unused providers.

Remove rag runtime provider:

```diff
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
```

Remove rag tool group:

```diff
-  - toolgroup_id: builtin::rag
-    provider_id: rag-runtime
```

Also applies to: 141-146
1-22: Provide quickstart test steps and required env vars for this example.

Add a brief, runnable checklist so users can validate the config quickly.
Proposed snippet to add as a comment block near the top:
```yaml
# Quickstart (example)
#   export VLLM_URL="http://localhost:8000/v1/"
#   export INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct"  # or your served model
#   optional: export VLLM_API_TOKEN="..."
#   llama stack run examples/vllm-granite-run.yaml
#   curl -s ${VLLM_URL}/models | jq .
```

Would you like me to submit a commit adding this quickstart block?
README.md (1)
462-462: Tighten wording: “configuration examples.”

Minor copy edit for readability and consistency.
```diff
-The following configurations are llama-stack config examples from production deployments:
+The following are llama-stack configuration examples from production deployments:
```

docs/rag_guide.md (4)
52-54: Fix capitalization typo (“LLama” → “Llama”).

```diff
-LLama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
+Llama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
```
118-120: Unify list style to pass markdownlint (MD004).

Switch dashes to asterisks to match the rest of the document.
```diff
-- `id` (`text`): UUID identifier of the chunk
-- `document` (`jsonb`): json containing content and metadata associated with the embedding
-- `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and must match the model's output size (e.g. 768 for `all-mpnet-base-v2`)
+* `id` (`text`): UUID identifier of the chunk
+* `document` (`jsonb`): json containing content and metadata associated with the embedding
+* `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and must match the model's output size (e.g. 768 for `all-mpnet-base-v2`)
```
158-176: Clarify vLLM authentication expectations.

`vllm-openai` often runs without an auth token by default. Confirm whether your setup requires `api_token` and note it accordingly to prevent confusion.

I can add a one-line note: “If your vLLM endpoint does not require authentication, omit api_token.”
Also applies to: 204-206
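If that note is added, a sketch of the no-auth variant it would describe (provider ID is a placeholder):

```yaml
# vLLM endpoint without authentication: api_token omitted entirely.
inference:
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      url: http://localhost:8000/v1/
```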
257-257: Fix grammar: “were” not “where”.

```diff
-The RAG tool calls where not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
+The RAG tool calls were not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (8)
- README.md (2 hunks)
- docs/rag_guide.md (3 hunks)
- examples/gemini-run.yaml (1 hunks)
- examples/openai-faiss-run.yaml (1 hunks)
- examples/openai-pgvector-run.yaml (1 hunks)
- examples/vllm-granite-run.yaml (1 hunks)
- examples/vllm-llama-faiss-run.yaml (1 hunks)
- examples/vllm-qwen3-run.yaml (1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
examples/vllm-granite-run.yaml
[error] 3-3: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
[error] 148-148: no new line character at the end of file
(new-line-at-end-of-file)
examples/vllm-qwen3-run.yaml
[error] 3-3: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
[error] 108-108: no new line character at the end of file
(new-line-at-end-of-file)
examples/openai-pgvector-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 7-7: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[warning] 38-38: wrong indentation: expected 8 but found 6
(indentation)
[warning] 44-44: wrong indentation: expected 8 but found 6
(indentation)
[warning] 45-45: wrong indentation: expected 10 but found 8
(indentation)
[warning] 48-48: wrong indentation: expected 10 but found 8
(indentation)
[warning] 55-55: wrong indentation: expected 8 but found 6
(indentation)
[error] 58-58: trailing spaces
(trailing-spaces)
[warning] 61-61: wrong indentation: expected 8 but found 6
(indentation)
[warning] 67-67: wrong indentation: expected 10 but found 8
(indentation)
[error] 84-84: trailing spaces
(trailing-spaces)
[error] 87-87: no new line character at the end of file
(new-line-at-end-of-file)
examples/vllm-llama-faiss-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 27-27: trailing spaces
(trailing-spaces)
[warning] 29-29: wrong indentation: expected 4 but found 2
(indentation)
[warning] 36-36: wrong indentation: expected 8 but found 6
(indentation)
[warning] 43-43: wrong indentation: expected 8 but found 6
(indentation)
[warning] 44-44: wrong indentation: expected 10 but found 8
(indentation)
[warning] 47-47: wrong indentation: expected 10 but found 8
(indentation)
[warning] 54-54: wrong indentation: expected 8 but found 6
(indentation)
[warning] 60-60: wrong indentation: expected 8 but found 6
(indentation)
[warning] 61-61: wrong indentation: expected 10 but found 8
(indentation)
[error] 80-80: no new line character at the end of file
(new-line-at-end-of-file)
examples/gemini-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
examples/openai-faiss-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 6-6: trailing spaces
(trailing-spaces)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 28-28: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[error] 32-32: trailing spaces
(trailing-spaces)
[error] 36-36: trailing spaces
(trailing-spaces)
[warning] 39-39: wrong indentation: expected 8 but found 6
(indentation)
[warning] 45-45: wrong indentation: expected 8 but found 6
(indentation)
[warning] 46-46: wrong indentation: expected 10 but found 8
(indentation)
[warning] 49-49: wrong indentation: expected 10 but found 8
(indentation)
[warning] 56-56: wrong indentation: expected 8 but found 6
(indentation)
[error] 59-59: trailing spaces
(trailing-spaces)
[warning] 62-62: wrong indentation: expected 8 but found 6
(indentation)
[warning] 63-63: wrong indentation: expected 10 but found 8
(indentation)
[error] 68-68: trailing spaces
(trailing-spaces)
[error] 81-81: trailing spaces
(trailing-spaces)
[error] 83-83: no new line character at the end of file
(new-line-at-end-of-file)
🪛 LanguageTool
README.md
[grammar] ~464-~464: There might be a mistake here.
Context: ...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe...
(QB_NEW_EN)
[grammar] ~465-~465: There might be a mistake here.
Context: ...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml...
(QB_NEW_EN)
[grammar] ~468-~468: There might be a mistake here.
Context: ...le](examples/gemini-run.yaml) > [!NOTE] > RAG functionality is not tested fo...
(QB_NEW_EN)
docs/rag_guide.md
[grammar] ~122-~122: There might be a mistake here.
Context: ...768 for all-mpnet-base-v2) > [!NOTE] > The vector_db_id (e.g. rhdocs) is used...
(QB_NEW_EN)
[grammar] ~178-~178: There might be a mistake here.
Context: ...ion options, see the vLLM documentation: > [vLLM: Tool Calling](https://docs.vllm...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
docs/rag_guide.md
118-118: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
119-119: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
120-120: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (3)
examples/gemini-run.yaml (1)
25-29: Standardize on `${env.VAR}` for all environment-driven YAML values

Other examples in this PR consistently use the `${env.*}` syntax. Mixing in bare `${VAR}` may result in literal strings if your loader doesn’t support that form. Please verify whether your config loader handles both styles; if not, switch all bare references to `${env.VAR}`.

This applies to: examples/gemini-run.yaml
• Lines 25–29, 54–56, 75–76, 84–98, 105–112

Example change:

```diff
-  - provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
+  - provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
```

If bare `${VAR}` is intentionally supported, please add a comment clarifying that behavior.

README.md (2)
462-467: All referenced example files are present

Verified that the following files exist in the `examples/` directory, so the README links won’t 404:
- examples/vllm-granite-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/gemini-run.yaml
No further action required.
38-38: No duplicate “RAG Configuration” entries found

Verification confirms exactly one `# RAG Configuration` header (line 455) and one TOC entry (line 38). No further action needed.
```
See the full working [config example](examples/openai-faiss-run.yaml) for more details.
```
Fix relative links to examples from docs/ to examples/.
From docs/, the correct relative path is ../examples/...
Update FAISS example reference:
```diff
-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.
```

Update pgvector example reference:

```diff
-See the full working [config example](examples/openai-pgvector-run.yaml) for more details.
+See the full working [config example](../examples/openai-pgvector-run.yaml) for more details.
```

Update vLLM Llama example reference:

```diff
-See the full working [config example](examples/vllm-llama-faiss-run.yaml) for more details.
+See the full working [config example](../examples/vllm-llama-faiss-run.yaml) for more details.
```

Update OpenAI example reference:

```diff
-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.
```

Also applies to: 152-153, 208-209, 241-242
🤖 Prompt for AI Agents
docs/rag_guide.md lines 103-104 (and also update similar occurrences at 152-153,
208-209, 241-242): the example links currently point to examples/... which are
incorrect when referenced from docs/; change each link to use the correct
relative path prefix ../examples/ (e.g., ../examples/openai-faiss-run.yaml) for
the FAISS, pgvector, vLLM Llama, and OpenAI example references so they resolve
correctly from the docs directory.
```yaml
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}

  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY}

  agents:
```
🛠️ Refactor suggestion
Fix indentation to 4 spaces per level under providers and nested maps.
YAML lint expects 4-space indentation. Current blocks under providers/config/stores are under-indented by 2 spaces.
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
-
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
-
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
-
-  vector_io:
-  - provider_id: ocp-docs
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
-        namespace: null
-
-  tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+
+    - provider_id: openai
+      provider_type: remote::openai
+      config:
+        api_key: ${env.OPENAI_API_KEY}
+
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
+
+    safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []
+
+    vector_io:
+    - provider_id: ocp-docs
+      provider_type: inline::faiss
+      config:
+        kvstore:
+          type: sqlite
+          db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
+          namespace: null
+
+    tool_runtime:
+    - provider_id: rag-runtime
+      provider_type: inline::rag-runtime
+      config: {}
```
🧰 Tools
🪛 YAMLlint (1.37.1)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[error] 32-32: trailing spaces
(trailing-spaces)
[error] 36-36: trailing spaces
(trailing-spaces)
[warning] 39-39: wrong indentation: expected 8 but found 6
(indentation)
🤖 Prompt for AI Agents
In examples/openai-faiss-run.yaml around lines 31 to 41 (and also apply to
45-51, 56-66, 67-71), the YAML blocks under "inference", provider entries,
"config", and other nested maps use 2-space indentation; update these to use 4
spaces per indentation level consistently (each nested key/value should be
indented by 4 spaces from its parent) so that provider entries, their
provider_type and config keys and values, and any stores/nested maps align with
YAML lint expectations.
```yaml
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}
  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY}
```
🛠️ Refactor suggestion
Indentation: switch to 4 spaces per nesting level.
Aligns with yamllint expectations and your other examples.
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
-
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
-
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
-
-  vector_io:
-  - provider_id: pgvector-example
-    provider_type: remote::pgvector
-    config:
-      host: localhost
-      port: 5432
-      db: pgvector_example # PostgreSQL database (psql -d pgvector_example)
-      user: lightspeed # PostgreSQL user
-      password: empty
-      kvstore:
-        type: sqlite
-        db_path: .llama/distributions/pgvector/pgvector_registry.db
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+    - provider_id: openai
+      provider_type: remote::openai
+      config:
+        api_key: ${env.OPENAI_API_KEY}
+
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
+
+    safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []
+
+    vector_io:
+    - provider_id: pgvector-example
+      provider_type: remote::pgvector
+      config:
+        host: ${env.PGVECTOR_HOST:=localhost}
+        port: ${env.PGVECTOR_PORT:=5432}
+        db: ${env.PGVECTOR_DB:=pgvector_example} # PostgreSQL database (psql -d pgvector_example)
+        user: ${env.PGVECTOR_USER:=lightspeed} # PostgreSQL user
+        password: ${env.PGVECTOR_PASSWORD:=empty}
+        kvstore:
+          type: sqlite
+          db_path: .llama/distributions/pgvector/pgvector_registry.db
```

Also applies to: 44-50, 51-56, 61-69
🧰 Tools
🪛 YAMLlint (1.37.1)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[warning] 38-38: wrong indentation: expected 8 but found 6
(indentation)
🤖 Prompt for AI Agents
In examples/openai-pgvector-run.yaml around lines 31 to 39 (and similarly for
44-50, 51-56, 61-69), the YAML uses 2-space or inconsistent indentation; update
the file to use 4 spaces per nesting level consistently (e.g., under "inference"
indent each list item and their nested keys by 4 spaces, and ensure nested
mapping entries like "config:" and "api_key:" are indented an additional 4
spaces relative to their parent) so it conforms to yamllint expectations and
matches other examples.
```yaml
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      url: http://localhost:8000/v1/
      api_token: key
```
🛠️ Refactor suggestion
Normalize indentation to 4 spaces and tweak vLLM URL/token.
- Align to 4-space indent per yamllint.
- Prefer URL without trailing slash to avoid double-slash issues with some clients.
- Externalize api_token.
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: vllm
-    provider_type: remote::vllm
-    config:
-      url: http://localhost:8000/v1/
-      api_token: key
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+    - provider_id: vllm
+      provider_type: remote::vllm
+      config:
+        url: http://localhost:8000/v1
+        api_token: ${env.VLLM_API_TOKEN}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
@@
-  vector_io:
-  - provider_id: rhel-db
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/vector_dbs/rhel_index/faiss_store.db
-        namespace: null
+    vector_io:
+    - provider_id: rhel-db
+      provider_type: inline::faiss
+      config:
+        kvstore:
+          type: sqlite
+          db_path: ${env.VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/rhel_index/faiss_store.db}
+          namespace: null
```
🧰 Tools
🪛 YAMLlint (1.37.1)
[warning] 29-29: wrong indentation: expected 4 but found 2
(indentation)
[warning] 36-36: wrong indentation: expected 8 but found 6
(indentation)
🤖 Prompt for AI Agents
In examples/vllm-llama-faiss-run.yaml around lines 29-37 (and likewise adjust
blocks at 43-49 and 56-64), normalize indentation to 4 spaces for all YAML
entries, remove the trailing slash from the vllm url (use http://localhost:8000)
to avoid double-slash issues, and externalize the api_token (replace the inline
token with a reference to an environment variable or secret placeholder such as
${VLLM_API_TOKEN} or a secrets key) so no secret is stored directly in the file.
Actionable comments posted: 0
♻️ Duplicate comments (1)
README.md (1)
455-458: Fixed: RAG guide link now points to docs/rag_guide.md

The previously broken link has been corrected to use the underscore path.
🧹 Nitpick comments (2)
README.md (2)
459-466: Align list style and capitalization; minor polish
- Use asterisks for unordered lists here to satisfy MD004 and keep style consistent with the rest of the README.
- Capitalize “Llama Stack” for consistency with prior usage in the document.
Apply this diff:
```diff
-## Example configurations for inference
-
-The following configurations are llama-stack config examples from production deployments:
-
-- [Granite on vLLM example](examples/vllm-granite-run.yaml)
-- [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
-- [Gemini example](examples/gemini-run.yaml)
+## Example configurations for inference
+
+The following configurations are Llama Stack config examples from production deployments:
+
+* [Granite on vLLM example](examples/vllm-granite-run.yaml)
+* [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
+* [Gemini example](examples/gemini-run.yaml)
```
467-468: Optional wording tweak

Current wording is fine; if you want to be slightly more formal, consider “validated” instead of “tested.”
```diff
-> RAG functionality is **not tested** for these configurations.
+> RAG functionality is **not validated** for these configurations.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- README.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[grammar] ~463-~463: There might be a mistake here.
Context: ...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe...
(QB_NEW_EN)
[grammar] ~464-~464: There might be a mistake here.
Context: ...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml...
(QB_NEW_EN)
[grammar] ~467-~467: There might be a mistake here.
Context: ...le](examples/gemini-run.yaml) > [!NOTE] > RAG functionality is not tested fo...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
README.md
463-463: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
464-464: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
465-465: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (2)
README.md (2)
38-38: TOC entry for RAG looks correct

Anchor matches the section header and placement in TOC is consistent.
455-466: Verified README references – all linked files exist

I’ve confirmed that:
- docs/rag_guide.md
- examples/vllm-granite-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/gemini-run.yaml

are all present in the repository. No broken links found.
tisnik
left a comment
LGTM thank you very much
matysek
left a comment
/lgtm
Akrog
left a comment
PR looks good in general, my only concern is that you are using env var EXPORTED_PORT but then assuming afterwards that this has been set to 8000, which doesn't feel quite right.
The rest of the comments are just nits.
…nfigurations

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Added VertexAI configuration example
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/rag_guide.md (1)
1-357: Add CI Checks for Documentation and Example Config Validity

The new CI script now verifies that all example links in docs/rag_guide.md exist, that each example YAML parses correctly, and that every declared API has a corresponding provider. It successfully ran but flagged missing provider definitions in two example files. Please address these before merging:

• examples/vertexai-run.yaml – Declares `apis: ['safety']` but has no `providers.safety` entry. Add a `safety` provider config or remove the API declaration.
• examples/vllm-granite-run.yaml – Declares `apis: ['vector_io']` but has no `providers.vector_io` entry. Add a `vector_io` provider config or remove the API declaration.

Once these are fixed, the CI step will pass and prevent future regressions.
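As a sketch of what such a CI step could look like (the workflow layout and the scripts/check_example_providers.py helper are assumptions, not part of this PR):

```yaml
# Hypothetical GitHub Actions job validating the example configs.
name: validate-example-configs
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint example YAMLs
        run: |
          pip install yamllint
          yamllint examples/*.yaml
      - name: Verify every declared API has a provider
        run: python scripts/check_example_providers.py examples/*.yaml  # assumed helper script
```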
♻️ Duplicate comments (7)
examples/vllm-llama-faiss-run.yaml (2)
36-38: URL normalization: remove trailing slash to avoid double-slash issues when clients join paths.

Change http://localhost:8000/v1/ → http://localhost:8000/v1 as shown in the diff above.
29-36: Reindent to 4 spaces per level under providers; current indentation violates yamllint and harms readability.

Also remove the trailing slash in vLLM URL and externalize api_token. This mirrors prior feedback.
Apply:
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: vllm
-    provider_type: remote::vllm
-    config:
-      url: http://localhost:8000/v1/
-      api_token: key
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+    - provider_id: vllm
+      provider_type: remote::vllm
+      config:
+        url: ${env.VLLM_URL:=http://localhost:8000/v1}
+        api_token: ${env.VLLM_API_TOKEN}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
@@
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []
@@
-  vector_io:
-  - provider_id: rhel-db
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/vector_dbs/rhel_index/faiss_store.db
-        namespace: null
+    vector_io:
+    - provider_id: rhel-db
+      provider_type: inline::faiss
+      config:
+        kvstore:
+          type: sqlite
+          db_path: ${env.VECTOR_DB_PATH:=/home/USER/vector_dbs/rhel_index/faiss_store.db}
+          namespace: null
```
examples/openai-pgvector-run.yaml (1)
31-39: Normalize indentation to 4 spaces and parameterize pgvector connection.

This mirrors earlier feedback on inconsistent indentation. Also remove trailing space in provider_id and externalize pgvector credentials.
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+    - provider_id: openai
+      provider_type: remote::openai
+      config:
+        api_key: ${env.OPENAI_API_KEY}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
@@
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []
@@
-  vector_io:
-  - provider_id: pgvector-example
-    provider_type: remote::pgvector
-    config:
-      host: localhost
-      port: 5432
-      db: pgvector_example # PostgreSQL database (psql -d pgvector_example)
-      user: lightspeed # PostgreSQL user
-      password: empty
-      kvstore:
-        type: sqlite
-        db_path: .llama/distributions/pgvector/pgvector_registry.db
+    vector_io:
+    - provider_id: pgvector-example
+      provider_type: remote::pgvector
+      config:
+        host: ${env.PGVECTOR_HOST:=localhost}
+        port: ${env.PGVECTOR_PORT:=5432}
+        db: ${env.PGVECTOR_DB:=pgvector_example} # PostgreSQL database
+        user: ${env.PGVECTOR_USER:=lightspeed} # PostgreSQL user
+        password: ${env.PGVECTOR_PASSWORD:=empty}
+        kvstore:
+          type: sqlite
+          db_path: .llama/distributions/pgvector/pgvector_registry.db
@@
-  provider_id: pgvector-example ␠
+  provider_id: pgvector-example
```

Also applies to: 40-50, 51-56, 57-69, 81-87
examples/openai-faiss-run.yaml (1)
31-41: Reindent to 4 spaces per level and remove inline style issues.

Matches earlier feedback.
Apply:
```diff
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
+providers:
+    inference:
+    - provider_id: sentence-transformers
+      provider_type: inline::sentence-transformers
+      config: {}
+    - provider_id: openai
+      provider_type: remote::openai
+      config:
+        api_key: ${env.OPENAI_API_KEY}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+    - provider_id: meta-reference
+      provider_type: inline::meta-reference
+      config:
+        persistence_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/agents_store.db
+        responses_store:
+          type: sqlite
+          db_path: .llama/distributions/ollama/responses_store.db
@@
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []
@@
-  vector_io:
-  - provider_id: ocp-docs
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
-        namespace: null
+    vector_io:
+    - provider_id: ocp-docs
+      provider_type: inline::faiss
+      config:
+        kvstore:
+          type: sqlite
+          db_path: ${env.VECTOR_DB_PATH:=/home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db}
+          namespace: null
@@
-  tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
+    tool_runtime:
+    - provider_id: rag-runtime
+      provider_type: inline::rag-runtime
+      config: {}
```

Also applies to: 45-51, 52-57, 58-66, 67-71
README.md (1)
457-457: Broken-link fix from earlier review is correctly applied.

docs/rag_guide.md is now referenced with underscore; previous rag-guide typo resolved.
docs/rag_guide.md (2)
103-104: Fix relative links to examples/ from docs/.

Links must go up one level from docs/ to examples/. This mirrors the earlier review; applying here keeps the doc navigable.
```diff
-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.
@@
-See the full working [config example](examples/openai-pgvector-run.yaml) for more details.
+See the full working [config example](../examples/openai-pgvector-run.yaml) for more details.
@@
-See the full working [config example](examples/vllm-llama-faiss-run.yaml) for more details.
+See the full working [config example](../examples/vllm-llama-faiss-run.yaml) for more details.
@@
-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.
```

Also applies to: 153-154, 206-207, 239-240
239-240: Duplicate relative link fix repeated here; will update with the others.

Covered in the earlier multi-hunk diff.
🧹 Nitpick comments (28)
examples/vllm-qwen3-run.yaml (4)
56-59: Suspicious zero‑width character in service_name default; set a safe explicit default.

The current default contains a zero‑width char (\u200B), which is easy to miss and can complicate debugging. Prefer a readable default.
```diff
-      service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:=vllm-qwen3}"
```
90-95: Explicit provider_model_id recommended for vLLM to avoid ambiguity.

Leaving provider_model_id null relies on the vLLM server’s default. Recommend explicitly setting it to the served model name on your vLLM instance.
```diff
-    provider_model_id: null
+    # Use the exact model name as loaded by your vLLM server (example shown):
+    provider_model_id: ${env.QWEN3_VLLM_MODEL:=Qwen3-32B-Instruct}
```

If the server enforces a single model and ignores this, keep it but confirm behavior.
52-59: Ensure telemetry SQLite path exists at runtime.

sqlite_db_path points under /tmp/.llama/...; create the parent directory on startup, or allow overriding via env.
```diff
-      sqlite_db_path: ${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/trace_store.db
+      sqlite_db_path: ${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/trace_store.db
+      # Ensure ${env.SQLITE_STORE_DIR} exists before booting the stack.
```
23-31: Consistency: normalize empty provider lists and toolgroups formatting.
- Prefer explicit empty lists ([]) where shown, but keep indentation consistent across examples.
- tool_groups already set; consider aligning order/format with other example files.
No functional bug; purely readability and consistency.
Also applies to: 62-74, 100-107
examples/vllm-llama-faiss-run.yaml (3)
2-2: Remove trailing spaces to satisfy yamllint.

Lines 2 and 27 contain trailing spaces. This tends to fail strict linters.
Also applies to: 27-27
16-26: Model/provider mapping check.
- For vLLM, consider explicitly setting provider_model_id to the actual served model (even if server defaults it). Keeps examples copy/paste‑safe.
```diff
-    provider_model_id: null
+    provider_model_id: ${env.VLLM_MODEL:=meta-llama/Llama-3.1-8B-Instruct}
```
80-80: Add newline at EOF.

Some linters enforce this.
examples/gemini-run.yaml (5)
2-2: Remove trailing spaces to satisfy yamllint.

Lines 2 and 5 have trailing spaces after the comment marker.
Also applies to: 5-5
24-29: Environment placeholder style: standardize on ${env.VAR} for consistency across examples.

This file mixes ${LLAMA_STACK_} and ${env.}. If the config loader expects ${env.*} (as used in other examples), align these:
```diff
-  - provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
+  - provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
@@
-  model_id: ${LLAMA_STACK_2_0_FLASH_MODEL}
-  provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
-  provider_model_id: ${LLAMA_STACK_2_0_FLASH_MODEL}
+  model_id: ${env.LLAMA_STACK_2_0_FLASH_MODEL}
+  provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
+  provider_model_id: ${env.LLAMA_STACK_2_0_FLASH_MODEL}
```

Repeat similarly for 2_5_PRO and 2_5_FLASH.
If both syntaxes are supported, consider adding a comment stating so.
Also applies to: 84-98
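For completeness, a sketch of the remaining model entries under the same convention (the 2_5_PRO/2_5_FLASH variable names are inferred from the pattern above, not confirmed against the file):

```yaml
- model_id: ${env.LLAMA_STACK_2_5_PRO_MODEL}
  provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
  provider_model_id: ${env.LLAMA_STACK_2_5_PRO_MODEL}
- model_id: ${env.LLAMA_STACK_2_5_FLASH_MODEL}
  provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
  provider_model_id: ${env.LLAMA_STACK_2_5_FLASH_MODEL}
```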
50-57: Telemetry defaults: provide safe fallbacks and ensure path exists.

If LLAMA_STACK_TELEMETRY_SINKS is unset, the loader may error on an empty value. Provide defaults and ensure sqlite dir exists.
```diff
-      service_name: "${LLAMA_STACK_OTEL_SERVICE_NAME}"
-      sinks: ${LLAMA_STACK_TELEMETRY_SINKS}
-      sqlite_db_path: ${STORAGE_MOUNT_PATH}/sqlite/trace_store.db
+      service_name: "${env.LLAMA_STACK_OTEL_SERVICE_NAME:=gemini}"
+      sinks: ${env.LLAMA_STACK_TELEMETRY_SINKS:=console,sqlite}
+      sqlite_db_path: ${env.STORAGE_MOUNT_PATH:=/var/lib/lightspeed}/sqlite/trace_store.db
+      # Ensure ${env.STORAGE_MOUNT_PATH}/sqlite exists before boot.
```
73-76: Consider Postgres for metadata_store for parity with inference_store.

Using sqlite for metadata_store while inference_store is Postgres is fine, but mixing stores can surprise operators. Consider making both Postgres or add a comment explaining the choice.
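If parity is preferred, a sketch of a Postgres-backed metadata_store (env names mirror the LLAMA_STACK_POSTGRES_* convention used for inference_store; the exact key set is an assumption):

```yaml
metadata_store:
  type: postgres
  host: ${env.LLAMA_STACK_POSTGRES_HOST}
  port: ${env.LLAMA_STACK_POSTGRES_PORT}
  db: ${env.LLAMA_STACK_POSTGRES_DB}
  user: ${env.LLAMA_STACK_POSTGRES_USER}
  password: ${env.LLAMA_STACK_POSTGRES_PASSWORD}
```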
111-112: Server port placeholder should have a default to ease local runs.

```diff
-  port: ${LLAMA_STACK_SERVER_PORT}
+  port: ${env.LLAMA_STACK_SERVER_PORT:=8321}
```

examples/openai-faiss-run.yaml (1)
72-83: Vector DB block looks consistent; minor nits only.
- Keep the comment explaining provider_id reference — helpful.
- Consider moving the explanatory comment above vector_db_id to avoid very long lines.
No functional changes needed.
examples/vertexai-run.yaml (4)
71-75: Remove unused module: null to avoid schema validation noise.

module: null under tool_runtime is unnecessary and can be rejected by strict validators.
```diff
   - provider_id: model-context-protocol
     provider_type: remote::model-context-protocol
     config: {}
-    module: null
```
19-31: Tame very chatty logging categories for production-like examples.

Global DEBUG plus uvicorn/fastapi DEBUG will spam logs and slow I/O. Recommend INFO for server frameworks, keep component-level DEBUG if needed.
```diff
 level: DEBUG  # Set root logger to DEBUG
 category_levels:
@@
-  uvicorn: DEBUG
-  uvicorn.access: INFO  # Keep HTTP requests at INFO to reduce noise
-  fastapi: DEBUG
+  uvicorn: INFO
+  uvicorn.access: INFO  # Keep HTTP requests at INFO to reduce noise
+  fastapi: INFO
```
2-2: Clean up trailing spaces and add newline at EOF.

YAMLlint flagged trailing spaces and missing newline at EOF.
```diff
-# ␠
+#
@@
-# ␠
+#
@@
-    db_path: /tmp/inference_store.db
\ No newline at end of file
+    db_path: /tmp/inference_store.db
+
```

Also applies to: 5-5, 91-91
64-69: Call out required Vertex AI credentials and auth flow.

project has no default and requires GCP ADC or service account. Add a brief comment to prevent confusion when testing.
```diff
 - provider_id: google-vertex
   provider_type: remote::vertexai
   config:
-    project: ${env.VERTEXAI_PROJECT}
+    project: ${env.VERTEXAI_PROJECT}  # Required. Use a valid GCP project id.
     region: ${env.VERTEXAI_REGION:=us-east5}
+    # Authentication uses Google ADC (Application Default Credentials).
+    # Ensure 'gcloud auth application-default login' or set GOOGLE_APPLICATION_CREDENTIALS.
```

Would you like me to add a short “How to auth with Vertex AI” snippet to docs/rag_guide.md?
examples/vllm-granite-run.yaml (4)
3-3: Fix trailing spaces and add newline at EOF.

These are minor but keep the repo lint-clean.
```diff
-# ␠
+#
@@
-# ␠
+#
@@
-  port: 8321
\ No newline at end of file
+  port: 8321
+
```

Also applies to: 5-5, 148-148
9-23: Description says “no RAG” but apis include vector_io and tool_runtime (used for RAG).

This is confusing for users. Either (a) remove RAG-related APIs/providers, or (b) tweak wording to “RAG not configured by default.”
Option A — remove RAG bits:
```diff
 apis:
 - agents
 - datasetio
 - eval
 - files
 - inference
 - post_training
 - safety
 - scoring
 - telemetry
-- tool_runtime
-- vector_io
+# no RAG in this example
+# - tool_runtime
+# - vector_io
```

And below, comment out rag-runtime and tool_groups:

```diff
 tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
-  - provider_id: model-context-protocol
+  - provider_id: model-context-protocol
     provider_type: remote::model-context-protocol
     config: {}
@@
-tool_groups:
-  - toolgroup_id: builtin::websearch
-    provider_id: tavily-search
-  - toolgroup_id: builtin::rag
-    provider_id: rag-runtime
-  - toolgroup_id: builtin::wolfram_alpha
-    provider_id: wolfram-alpha
+tool_groups: []
```

Option B — keep config as-is but clarify:

```diff
-# Example llama-stack configuration for IBM Granite using vLLM (no RAG)
+# Example llama-stack configuration for IBM Granite using vLLM (RAG not configured by default)
```
28-31: Consider adding tls_verify with secure default for remote vLLM.

Mirror the Vertex example and promote safer defaults.
```diff
       url: ${env.VLLM_URL}
       api_token: ${env.VLLM_API_TOKEN:fake}
       max_tokens: 10000
+      tls_verify: ${env.VLLM_TLS_VERIFY:true}
```
51-53: Avoid zero‑width space as default service_name; use an explicit string or omit.

"${env.OTEL_SERVICE_NAME:\u200B}" is non-obvious. Prefer a readable default or leave unset.
```diff
-      service_name: "${env.OTEL_SERVICE_NAME:\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:llama-stack}"
```

README.md (2)
463-466: Fix markdown list style to satisfy markdownlint (MD004).

Earlier lists use asterisks. Switch dashes to asterisks for consistency.
```diff
-- [Granite on vLLM example](examples/vllm-granite-run.yaml)
-- [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
-- [Gemini example](examples/gemini-run.yaml)
-- [VertexAI example](examples/vertexai-run.yaml)
+* [Granite on vLLM example](examples/vllm-granite-run.yaml)
+* [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
+* [Gemini example](examples/gemini-run.yaml)
+* [Vertex AI example](examples/vertexai-run.yaml)
```
466-466: Nit: use “Vertex AI” brand spelling.

Included in the diff above.
docs/rag_guide.md (5)
118-121: Match list style with repo convention to satisfy markdownlint (MD004).

Use asterisks like other lists.
```diff
-- `id` (`text`): UUID identifier of the chunk
-- `document` (`jsonb`): json containing content and metadata associated with the embedding
-- `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and will match the model's output size (e.g. 768 for `all-mpnet-base-v2`)
+* `id` (`text`): UUID identifier of the chunk
+* `document` (`jsonb`): json containing content and metadata associated with the embedding
+* `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and matches the model's output size (e.g., 768 for `all-mpnet-base-v2`)
```
53-54: Fix capitalization: “Llama Stack”, not “LLama Stack”.

-LLama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
+Llama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
255-256: Grammar: “were not” instead of “where not”.

-The RAG tool calls where not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
+The RAG tool calls were not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
202-204: URL note: parameterize host/port once and reuse.

You already use ${env.EXPORTED_PORT:=8000}. Consider also surfacing the host as ${env.VLLM_HOST:=localhost} for clarity.

-      url: http://localhost:${env.EXPORTED_PORT:=8000}/v1/  # Replace localhost with the url of the vLLM instance
+      url: http://${env.VLLM_HOST:=localhost}:${env.EXPORTED_PORT:=8000}/v1/  # Replace host with the URL of the vLLM instance
122-124: Tighten the pgvector notes for clarity and consistency with earlier reviewer nits.

Earlier comments suggested wording tweaks; incorporating them improves readability.

-> [!NOTE]
-> The `vector_db_id` (e.g. `rhdocs`) is used to point to the table named `vector_store_rhdocs` in the specified database, which stores the vector embeddings.
+> [!NOTE]
+> The `vector_db_id` (e.g., `rhdocs`) maps to a PostgreSQL table named `vector_store_rhdocs` in the selected database, which stores the embeddings.
@@
-  # If the table was already created, this value must match the ID used at creation.
+  # If the table was already created, this value will match the ID used at creation.

Also applies to: 147-151
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (9)
- README.md (2 hunks)
- docs/rag_guide.md (3 hunks)
- examples/gemini-run.yaml (1 hunks)
- examples/openai-faiss-run.yaml (1 hunks)
- examples/openai-pgvector-run.yaml (1 hunks)
- examples/vertexai-run.yaml (1 hunks)
- examples/vllm-granite-run.yaml (1 hunks)
- examples/vllm-llama-faiss-run.yaml (1 hunks)
- examples/vllm-qwen3-run.yaml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/rag_guide.md
[grammar] ~122-~122: There might be a mistake here.
Context: ...768 for all-mpnet-base-v2) > [!NOTE] > The vector_db_id (e.g. rhdocs) is ...
(QB_NEW_EN)
[grammar] ~161-~161: There might be a mistake here.
Context: ...n RHEL AI (Llama 3.1) example > [!NOTE] > The following example assumes that pod...
(QB_NEW_EN)
[grammar] ~182-~182: There might be a mistake here.
Context: ...ion options, see the vLLM documentation: > [vLLM: Tool Calling](https://docs.vllm...
(QB_NEW_EN)
README.md
[grammar] ~463-~463: There might be a mistake here.
Context: ...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe...
(QB_NEW_EN)
[grammar] ~464-~464: There might be a mistake here.
Context: ...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml...
(QB_NEW_EN)
[grammar] ~465-~465: There might be a mistake here.
Context: ...s/vllm-qwen3-run.yaml) - Gemini example - [VertexAI example](examples/vertexai-run....
(QB_NEW_EN)
[grammar] ~468-~468: There might be a mistake here.
Context: ...](examples/vertexai-run.yaml) > [!NOTE] > RAG functionality is not tested fo...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
docs/rag_guide.md
118-118: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
119-119: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
120-120: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
README.md
463-463: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
464-464: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
465-465: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
466-466: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
🪛 YAMLlint (1.37.1)
examples/gemini-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
examples/openai-faiss-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 6-6: trailing spaces
(trailing-spaces)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 28-28: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[error] 32-32: trailing spaces
(trailing-spaces)
[error] 36-36: trailing spaces
(trailing-spaces)
[warning] 39-39: wrong indentation: expected 8 but found 6
(indentation)
[warning] 45-45: wrong indentation: expected 8 but found 6
(indentation)
[warning] 46-46: wrong indentation: expected 10 but found 8
(indentation)
[warning] 49-49: wrong indentation: expected 10 but found 8
(indentation)
[warning] 56-56: wrong indentation: expected 8 but found 6
(indentation)
[error] 59-59: trailing spaces
(trailing-spaces)
[warning] 62-62: wrong indentation: expected 8 but found 6
(indentation)
[warning] 63-63: wrong indentation: expected 10 but found 8
(indentation)
[error] 68-68: trailing spaces
(trailing-spaces)
[error] 81-81: trailing spaces
(trailing-spaces)
[error] 83-83: no new line character at the end of file
(new-line-at-end-of-file)
examples/openai-pgvector-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 7-7: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2
(indentation)
[warning] 38-38: wrong indentation: expected 8 but found 6
(indentation)
[warning] 44-44: wrong indentation: expected 8 but found 6
(indentation)
[warning] 45-45: wrong indentation: expected 10 but found 8
(indentation)
[warning] 48-48: wrong indentation: expected 10 but found 8
(indentation)
[warning] 55-55: wrong indentation: expected 8 but found 6
(indentation)
[error] 58-58: trailing spaces
(trailing-spaces)
[warning] 61-61: wrong indentation: expected 8 but found 6
(indentation)
[warning] 67-67: wrong indentation: expected 10 but found 8
(indentation)
[error] 84-84: trailing spaces
(trailing-spaces)
[error] 87-87: no new line character at the end of file
(new-line-at-end-of-file)
examples/vertexai-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
[warning] 43-43: wrong indentation: expected 4 but found 2
(indentation)
[error] 91-91: no new line character at the end of file
(new-line-at-end-of-file)
examples/vllm-granite-run.yaml
[error] 3-3: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
[error] 148-148: no new line character at the end of file
(new-line-at-end-of-file)
examples/vllm-llama-faiss-run.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 27-27: trailing spaces
(trailing-spaces)
[warning] 29-29: wrong indentation: expected 4 but found 2
(indentation)
[warning] 36-36: wrong indentation: expected 8 but found 6
(indentation)
[warning] 43-43: wrong indentation: expected 8 but found 6
(indentation)
[warning] 44-44: wrong indentation: expected 10 but found 8
(indentation)
[warning] 47-47: wrong indentation: expected 10 but found 8
(indentation)
[warning] 54-54: wrong indentation: expected 8 but found 6
(indentation)
[warning] 60-60: wrong indentation: expected 8 but found 6
(indentation)
[warning] 61-61: wrong indentation: expected 10 but found 8
(indentation)
[error] 80-80: no new line character at the end of file
(new-line-at-end-of-file)
examples/vllm-qwen3-run.yaml
[error] 3-3: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
[error] 108-108: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (10)
examples/vllm-qwen3-run.yaml (2)
1-108: One-shot validation script for all example YAMLs in this PR

Run this from the repo root to:
- Lint all examples
- Detect trailing slashes in remote::vllm URLs
- Ensure no inline secrets remain
- Verify EOF newline
#!/usr/bin/env bash
set -euo pipefail

echo "== yamllint examples =="
find examples -type f \( -name '*.yml' -o -name '*.yaml' \) -print0 \
  | xargs -0 yamllint

echo "== check trailing slashes in vLLM URLs =="
grep -RPn 'provider_type:\s*remote::vllm' examples \
  | cut -d: -f1 \
  | uniq \
  | xargs -I{} grep -Pn 'url:\s*.+/v1/$' {}

echo "== check inline tokens/keys =="
grep -RPn 'api_token:\s*(<add your api key>|[A-Za-z0-9._-]+)|api_key:\s*(sk-[A-Za-z0-9]+|key)' examples || true

echo "== check EOF newline =="
find examples -type f \( -name '*.yml' -o -name '*.yaml' \) -print0 \
  | xargs -0 -n1 bash -c '
    file="$0"
    tail -c1 "$file" | od -An -t x1 | grep -q "0a" || echo "missing newline at EOF: $file"
  '

Flagged issues to address:

- examples/vllm-llama-faiss-run.yaml (line 36): URL ends with /v1/ – remove the trailing slash so it ends with /v1
- examples/vllm-qwen3-run.yaml (line 29): placeholder <add your api key> – replace with documentation on where/how to inject a real token
- examples/vllm-llama-faiss-run.yaml (line 37): api_token: key – remove this default or replace it with a placeholder
- Ensure every example YAML ends with a newline at EOF (the script may report any missing files)
After fixing the above, re-run the script to confirm no failures.
27-31: Externalize VLLM configuration via environment variables

In examples/vllm-qwen3-run.yaml (lines 27–31), replace the hard-coded values with env vars, keeping tls_verify: true:

-      url: https://qwen3.rosa.openshiftapps.com/v1
-      max_tokens: 32768
-      api_token: <add your api key>
+      url: ${env.VLLM_URL:=https://qwen3.rosa.openshiftapps.com/v1}
+      max_tokens: ${env.VLLM_MAX_TOKENS:=32768}
+      api_token: ${env.VLLM_API_TOKEN}
       tls_verify: true

• This matches how the other VLLM examples externalize VLLM_URL, VLLM_MAX_TOKENS and VLLM_API_TOKEN.
• No default for api_token ensures you must explicitly set it in your environment.

examples/openai-pgvector-run.yaml (1)
18-28: Embedding model dimension matches (768) — good.

The embedding metadata aligns with all-mpnet-base-v2’s 768-d vectors and the vector_dbs block. No changes needed.
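A one-liner to re-verify the alignment if the example changes:

grep -n "embedding_dimension" examples/openai-pgvector-run.yaml   # each hit should read 768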
examples/vertexai-run.yaml (1)
9-9: Confirm the config schema version; it likely should be '2' for current Llama Stack distributions.

The other examples use version: '2'. If this should be '3', please cite the schema change. Otherwise, align to '2' to avoid runtime schema errors.

-version: '3'
+version: '2'

Would you like me to run a quick schema check script across all new YAMLs to ensure version consistency?
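That check can be as simple as:

grep -Hn '^version:' examples/*.yaml   # every file should report the same schema version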
examples/vllm-granite-run.yaml (2)
29-29: Normalize environment interpolation syntax.

This file uses ${env.VAR:default} while others use ${env.VAR:=default}. Please standardize across the examples to reduce confusion; confirm which form llama-stack officially supports in 0.2.x.
I can submit a follow-up commit to normalize once you confirm the preferred syntax.
Also applies to: 51-51, 53-53, 61-61, 69-69, 76-76, 87-87, 95-95, 101-101, 107-108, 112-113, 118-119, 123-123, 126-129
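To see where each form is used today, a grep sketch (GNU grep with -P assumed; the patterns match the ${env.NAME:default} shapes in these files):

grep -RnP '\$\{env\.\w+:=' examples      # colon-equals form
grep -RnP '\$\{env\.\w+:(?!=)' examples  # colon-only form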
131-147: Model registration sanity: provider_model_id null may be fine for remote::vllm, but verify.

Some distributions accept null and use model_id only; others expect provider_model_id. Ensure runtime registration succeeds.
If needed, set provider_model_id to the served model (or remove it entirely if schema allows).
-  provider_model_id: null
+  provider_model_id: ${env.INFERENCE_MODEL}

README.md (2)
38-38: TOC entry added — looks good and consistent with section header.
455-470: RAG section content LGTM; ensure the section is present only once.

The AI summary mentions a second identical “RAG Configuration” section near the bottom. Please dedupe to avoid reader confusion and keep a single TOC anchor.
If duplication exists, remove the duplicate block; I can provide a cleanup diff once you confirm its exact range.
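To confirm the duplication before cutting a cleanup diff:

grep -n "RAG Configuration" README.md   # expect the TOC entry plus exactly one section heading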
docs/rag_guide.md (2)
168-179: Verify podman flags for GPU; use CDI guidance consistently.

The command mixes --device and --gpus. For Podman, GPU enablement typically relies on CDI; some environments don’t support --gpus. Please confirm and document the recommended approach for RHEL AI.
If needed, add a short preface with the exact podman GPU flags and CDI setup steps (link already provided).
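A sketch of the CDI-based form, if that becomes the documented approach (assumes nvidia-container-toolkit is installed and `nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml` has been run; image and model names are placeholders):

# The CDI device name replaces --gpus and exposes all GPUs to the container
podman run --rm \
  --device nvidia.com/gpu=all \
  -p 8000:8000 \
  <vllm-image> --model <model-id>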
64-96: FAISS example is clear; nice job calling out embedding_dimension and vector_db_id.
examples/openai-faiss-run.yaml

@@ -0,0 +1,83 @@
# Example llama-stack configuration for OpenAI inference + FAISS (RAG)
#
Trim trailing spaces and add newline at EOF (yamllint/CI).
Multiple trailing spaces; missing newline at EOF. These are typical CI blockers.
Also applies to: 6-6, 18-18, 28-29, 32-32, 36-36, 59-59, 68-68, 81-81, 83-83
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 2-2: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In examples/openai-faiss-run.yaml around lines 2 (and additionally lines 6, 18,
28-29, 32, 36, 59, 68, 81, 83), remove any trailing spaces at the ends of those
lines and ensure the file ends with a single newline character; update each
affected line to have no trailing whitespace and save the file with a final
newline (LF) so yamllint/CI passes.
examples/openai-pgvector-run.yaml

@@ -0,0 +1,87 @@
# Example llama-stack configuration for OpenAI inference + PSQL (pgvector) vector index (RAG)
#
Whitespace and EOF fixes (yamllint blockers).
- Trailing spaces at Lines 2, 7, 29, 58, 84.
- Missing newline at EOF (Line 87).
Apply trims and add a final newline.
Also applies to: 7-7, 29-29, 58-58, 84-84, 87-87
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 2-2: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In examples/openai-pgvector-run.yaml around lines 2, 7, 29, 58, 84 and EOF at
87: remove trailing whitespace on lines 2, 7, 29, 58 and 84, and ensure the file
ends with a single newline character (add missing final newline at EOF). Trim
each indicated line so there are no trailing spaces and save the file with a
terminating newline.
examples/vertexai-run.yaml

 - agents
 - inference
 - safety
 - telemetry
 - tool_runtime
 - vector_io
Safety API enabled but no safety provider configured — add one or drop the API.
apis includes safety, but providers has no safety block. Llama Stack will fail to start or the API will be unavailable. Add a minimal inline::llama-guard provider or remove safety from apis.
Apply one of the following diffs.
Option A — add a safety provider:
 providers:
@@
   telemetry:
@@
     provider_type: inline::meta-reference
+
+  safety:
+  - provider_id: llama-guard
+    provider_type: inline::llama-guard
+    config:
+      excluded_categories: []
 apis:
 - agents
 - inference
-- safety
 - telemetry
 - tool_runtime
 - vector_io

Also applies to: 76-83
🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 12 to 17 (and also apply the same fix
at lines 76 to 83), apis lists "safety" but providers contains no safety
configuration; either add a minimal safety provider or remove "safety" from
apis. To fix, either (A) add a providers.safety entry with an
inline::llama-guard provider and any required minimal config (e.g.,
enable/threshold settings) so the safety API is backed by a provider, or (B)
remove the "safety" item from the apis list in both locations if you do not
intend to enable the safety API; ensure the file remains valid YAML after the
change.
examples/vertexai-run.yaml

  - config:
      persistence_store:
        db_path: /tmp/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: /tmp/responses_store.db
        type: sqlite
    provider_id: meta-reference
🛠️ Refactor suggestion
Fix indentation under providers.agents to satisfy YAML parsers and linters.
The list item under agents is indented with 2 spaces; expected 4. YAMLlint flagged this. Incorrect indentation can change structure silently.
   agents:
-  - config:
-      persistence_store:
-        db_path: /tmp/agents_store.db
-        namespace: null
-        type: sqlite
-      responses_store:
-        db_path: /tmp/responses_store.db
-        type: sqlite
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
+    - config:
+        persistence_store:
+          db_path: /tmp/agents_store.db
+          namespace: null
+          type: sqlite
+        responses_store:
+          db_path: /tmp/responses_store.db
+          type: sqlite
+      provider_id: meta-reference
+      provider_type: inline::meta-reference
🧰 Tools
🪛 YAMLlint (1.37.1)
[warning] 43-43: wrong indentation: expected 4 but found 2
(indentation)
🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 43 to 51, the list item under
providers.agents is indented 2 spaces but must be indented 4 to satisfy YAML
parsers; adjust the indentation so the entire "config:" block (and its nested
keys like persistence_store and responses_store) is shifted two additional
spaces to align as a proper list element under agents, ensuring consistent
4-space indentation for list items and their children.
examples/vertexai-run.yaml

      url: ${env.VLLM_URL:=http://localhost:8000/v1}
      max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
      api_token: ${env.VLLM_API_TOKEN:=fake}
      tls_verify: ${env.VLLM_TLS_VERIFY:=false}
🛠️ Refactor suggestion
Defaulting tls_verify to false is insecure; default to true and allow override.
Disable TLS verification only when required. Safer defaults help avoid accidental MITM exposure when pointing to remote vLLM.
api_token: ${env.VLLM_API_TOKEN:=fake}
- tls_verify: ${env.VLLM_TLS_VERIFY:=false}
+ tls_verify: ${env.VLLM_TLS_VERIFY:=true}
🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 59 to 63, tls_verify is defaulted to
false which is insecure; change the default to true and allow callers to
override via the VLLM_TLS_VERIFY environment variable. Update the YAML to set
tls_verify: ${env.VLLM_TLS_VERIFY:=true} (and if your system treats env vars as
strings, ensure downstream code parses the value as a boolean so "false"/"0"/""
properly disable TLS verification while any other value enables it).
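If the string-to-boolean concern needs a concrete shape, a shell sketch of the normalization (hypothetical helper; llama-stack's actual parsing may differ):

tls="${VLLM_TLS_VERIFY:-true}"   # default to secure when unset or empty
case "$(printf '%s' "$tls" | tr '[:upper:]' '[:lower:]')" in
  0|false|no|off) tls=false ;;
  *)              tls=true  ;;
esac
echo "tls_verify=$tls"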
examples/vllm-qwen3-run.yaml

#
# Contributed by @eranco74 (2025-08).
#
Fix trailing spaces and add missing newline at EOF to satisfy yamllint/CI.
- Trailing spaces at Line 3 and Line 5.
- Missing newline at end of file (Line 108).
Apply:
-#
+#
-#
+#
- port: 8321
+ port: 8321
+

Also applies to: 108-108
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 3-3: trailing spaces
(trailing-spaces)
[error] 5-5: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In examples/vllm-qwen3-run.yaml around lines 3-5 and at EOF line 108, there are
trailing spaces on lines 3 and 5 and the file lacks a final newline; remove the
trailing spaces from those lines and ensure the file ends with a single newline
character (add a newline at EOF) so it passes yamllint/CI.
…nfigurations
Description
Added configuration examples (llama-stack) for inference (OpenAI, Llama 3.1 - RHEL AI vLLM, Granite - vLLM, Gemini, Qwen - vLLM) and vector stores (FAISS, pgvector).
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Documentation
Examples